loki-mode 7.18.1 → 7.18.3
- package/SKILL.md +2 -2
- package/VERSION +1 -1
- package/autonomy/crash.sh +164 -0
- package/autonomy/lib/crash_capture.py +286 -0
- package/autonomy/lib/crash_redact.py +509 -0
- package/autonomy/loki +248 -12
- package/autonomy/run.sh +56 -2
- package/autonomy/telemetry.sh +11 -0
- package/bin/loki +3 -1
- package/bin/postinstall.js +15 -1
- package/dashboard/__init__.py +1 -1
- package/dashboard/telemetry.py +15 -0
- package/docs/CRASH-REPORTING-PLAN.md +527 -0
- package/docs/INSTALLATION.md +1 -1
- package/docs/PRIVACY.md +145 -0
- package/loki-ts/dist/loki.js +265 -226
- package/mcp/__init__.py +1 -1
- package/package.json +1 -1
|
@@ -0,0 +1,527 @@
|
|
|
1
|
+
# Crash Reporting and Auto-Fix Pipeline -- Implementation Plan
|
|
2
|
+
|
|
3
|
+
Status: DESIGN ONLY. No code, no version bumps, no commits in this pass.
|
|
4
|
+
Author role: Architect. Date: 2026-06-06. Target repo: asklokesh/loki-mode.
|
|
5
|
+
|
|
6
|
+
This plan designs a disclosed, anonymous, frictionless crash-reporting plus
|
|
7
|
+
auto-fix pipeline. It is grounded in the existing codebase; every file path
|
|
8
|
+
and function name below was verified by reading the repo. Where a thing does
|
|
9
|
+
NOT yet exist, it is marked MUST ADD. Where the spec wording and the best
|
|
10
|
+
engineering choice diverge, the deviation is called out explicitly.
|
|
11
|
+
|
|
12
|
+
--------------------------------------------------------------------------------
|
|
13
|
+
## 1. What already exists (verified, not assumed)
|
|
14
|
+
--------------------------------------------------------------------------------
|
|
15
|
+
|
|
16
|
+
### 1a. Telemetry already ships today -- and is currently UNDISCLOSED on first run
|
|
17
|
+
This is the headline finding. Loki Mode already collects anonymous usage data
|
|
18
|
+
via PostHog. There is no first-run disclosure line in the code today.
|
|
19
|
+
|
|
20
|
+
- `autonomy/telemetry.sh` -- bash PostHog client. Hardcoded ingest key
|
|
21
|
+
`phc_ya0vGBru41AJWtGNfZZ8H9W4yjoZy4KON0nnayS7s87`, host
|
|
22
|
+
`https://us.i.posthog.com`, path `/capture/`. Fire-and-forget curl with
|
|
23
|
+
`--max-time 3`. Distinct id persisted at `~/.loki-telemetry-id`.
|
|
24
|
+
Gated only by `LOKI_TELEMETRY_DISABLED=true` and `DO_NOT_TRACK=1`
|
|
25
|
+
(`_loki_telemetry_enabled`, line 9). Sourced by `autonomy/run.sh:648-651`.
|
|
26
|
+
Events fired: `session_start` (run.sh:13310), `session_end` (run.sh:13410).
|
|
27
|
+
- `dashboard/telemetry.py` -- Python equivalent. Same key/host, same opt-out
|
|
28
|
+
vars, `send_telemetry()` on a daemon thread. Called from
|
|
29
|
+
`dashboard/server.py:755` (`dashboard_start`).
|
|
30
|
+
- `bin/postinstall.js:182-209` -- npm install-time event to the same PostHog
|
|
31
|
+
host/key, same opt-out vars.
|
|
32
|
+
- `docs/WELCOME-OPENER-PLAN.md` -- an EXISTING (unimplemented in code) plan for
|
|
33
|
+
a first-run welcome that reuses this same PostHog contract and an opt-in form.
|
|
34
|
+
No sentinel/welcome is wired in code yet; only the plan doc exists.
|
|
35
|
+
|
|
36
|
+
Consequence for this feature: the honesty invariant must cover the PostHog path
|
|
37
|
+
too. PRIVACY.md and the first-run line cannot describe only the new crash
|
|
38
|
+
pipeline while `session_start` / `session_end` / install events fire silently.
|
|
39
|
+
The opt-out must be UNIFIED so one switch disables both PostHog usage telemetry
|
|
40
|
+
and crash reporting. See section 6.
|
|
41
|
+
|
|
42
|
+
### 1b. There is a production-grade redactor already -- the keystone of this plan
|
|
43
|
+
- `autonomy/lib/proof_redact.py` -- the single security chokepoint for the
|
|
44
|
+
proof-of-run feature. Verified contents:
|
|
45
|
+
- `RULES_VERSION = "1.0"` (frozen, bump-on-behavior-change).
|
|
46
|
+
- `redact_value(s)` -- pure function, redacts one string.
|
|
47
|
+
- `redact_tree(obj)` -- recurses dict keys+values, lists, nested structures;
|
|
48
|
+
returns `(new_obj, total_redactions_count)`.
|
|
49
|
+
- Ordered, ReDoS-hardened patterns: Anthropic `sk-ant-`, GitHub `gh[pousr]_`,
|
|
50
|
+
Slack `xox[baprs]-`, AWS `AKIA...` + typed secret assign, JWT `eyJ...`,
|
|
51
|
+
Google `AI...`, generic OpenAI `sk-`, Bearer (keeps scheme), PEM PRIVATE KEY
|
|
52
|
+
blocks (dropped whole), `_ENV_ASSIGN` secret-keyed assignments (bare / JSON /
|
|
53
|
+
YAML quoted), `_URI_CREDENTIAL` (scheme://user:PASS@host), and
|
|
54
|
+
`_UNIX_HOME` `/(Users|home)/<name>` + `_WIN_HOME` `C:\Users\<name>` ->
|
|
55
|
+
`~`, with optional `set_context(home, repo_root)` for repo-relative paths.
|
|
56
|
+
- Parity model is ALREADY a shared Python module, not a TS port. `loki-ts`
|
|
57
|
+
reaches the redactor by shelling out: `loki-ts/src/runner/proof.ts:27` resolves
|
|
58
|
+
`autonomy/lib/proof-generator.py` via `findPython3` (`loki-ts/src/util/python.ts`)
|
|
59
|
+
and bash calls `"$SCRIPT_DIR/lib/proof-generator.py"`. Both routes call the
|
|
60
|
+
SAME python, so redaction can never drift between routes.
|
|
61
|
+
- Fail-closed precedent: `loki-ts/src/commands/proof.ts:216-227` refuses to
|
|
62
|
+
publish unless `redaction.applied` is confirmed -- "never publish an
|
|
63
|
+
unredacted artifact." This plan adopts the identical posture.
|
|
64
|
+
- An older bash inline privacy guard exists around `autonomy/run.sh:9047`
|
|
65
|
+
(referenced in `proof_redact.py` comments as the source the ENV-assign rule
|
|
66
|
+
mirrors).
|
|
67
|
+
|
|
68
|
+
### 1c. Event bus surface
|
|
69
|
+
- `events/bus.py`, `events/bus.ts`, `events/emit.sh`. TS `EventType` enum
|
|
70
|
+
(bus.ts:12): `state | memory | task | metric | error | session | command |
|
|
71
|
+
user`. `EventSource`: `cli | api | vscode | mcp | skill | hook | dashboard |
|
|
72
|
+
memory | runner`. Exports include `emitErrorEvent` (bus.ts:467).
|
|
73
|
+
- `emit.sh` exposes `safe_append_event_jsonl()` (flock or mkdir-mutex serialized
|
|
74
|
+
append to `.loki/events.jsonl`), sourceable with `LOKI_EMIT_LIB_ONLY=1`.
|
|
75
|
+
- `autonomy/run.sh:1138` defines `emit_event_json()`; `emit_event_pending` is
|
|
76
|
+
used at the iteration-complete site.
|
|
77
|
+
|
|
78
|
+
### 1d. Metrics / KPI collector
|
|
79
|
+
- `loki-ts/src/metrics/kpis.ts` and `loki-ts/src/metrics/trust.ts` exist, with
|
|
80
|
+
command parity in `loki-ts/src/commands/kpis.ts`, `stats.ts`, `trust.ts`.
|
|
81
|
+
- `.loki/metrics/` holds efficiency + reward data. `autonomy/context-tracker.py`
|
|
82
|
+
exists. No crash-specific collector exists. MUST ADD.
|
|
83
|
+
|
|
84
|
+
### 1e. The doctor command
|
|
85
|
+
- bash `cmd_doctor` / `cmd_doctor_json` (`autonomy/loki`, per header in
|
|
86
|
+
`loki-ts/src/commands/doctor.ts:1` referencing bash line 6216 / 6534).
|
|
87
|
+
- TS port: `loki-ts/src/commands/doctor.ts`. Good surface to add a "telemetry:
|
|
88
|
+
on/off, crash buffer: N pending" line later.
|
|
89
|
+
|
|
90
|
+
### 1f. Naming and dispatch collisions (already resolved below)
|
|
91
|
+
- `loki report` is TAKEN: `cmd_report` (`autonomy/loki:25091`) is a SESSION
|
|
92
|
+
report generator (text/markdown/html). The manual crash submit command must
|
|
93
|
+
NOT reuse `report`. Decision: use `loki crash` with subcommands
|
|
94
|
+
(`loki crash` = show pending, `loki crash submit`, `loki crash show <id>`).
|
|
95
|
+
- `loki telemetry` is TAKEN: `cmd_telemetry` (`autonomy/loki:17946`) is the
|
|
96
|
+
OTEL tracing config (`status` / `enable`), dispatched `telemetry|otel`
|
|
97
|
+
(`autonomy/loki:13437`). Decision: ADD `off` / `on` / `status` (extended)
|
|
98
|
+
subcommands to the EXISTING `cmd_telemetry`. The spec-mandated
|
|
99
|
+
`loki telemetry off` thus lives inside the existing command and drives the
|
|
100
|
+
unified opt-out. Do not create a second `telemetry` command.
|
|
101
|
+
|
|
102
|
+
### 1g. Capture hook points (verified; mostly MUST ADD)
|
|
103
|
+
- TS: `loki-ts/src/cli.ts` has `process.on("SIGINT", ...)` (line 224),
|
|
104
|
+
`process.on("SIGTERM", ...)` (line 225), and a single terminal
|
|
105
|
+
`process.exit(code)` (line 228). There is NO `uncaughtException` /
|
|
106
|
+
`unhandledRejection` handler. MUST ADD both, plus a wrapper around the
|
|
107
|
+
terminal exit to capture nonzero exits.
|
|
108
|
+
- bash: traps are EXIT/INT/TERM cleanups only (run.sh:186, 199, 2891-2892,
|
|
109
|
+
12734, 12843, 12914, 13192; loki:6160). There is a natural capture point at
|
|
110
|
+
the iteration-complete block (run.sh ~11968-11989) where `$exit_code` is
|
|
111
|
+
known and `status=...error` is already emitted, and `auto_capture_episode`
|
|
112
|
+
(run.sh:12206) already records per-iteration outcome. MUST ADD an ERR/EXIT
|
|
113
|
+
crash hook in `main()` (run.sh:12913) and a friction hook at the existing
|
|
114
|
+
retry/rate-limit/gate sites.
|
|
115
|
+
|
|
116
|
+
### 1h. Issue-mode (auto-fix trigger primitive) already exists
|
|
117
|
+
- `gh issue ...` plumbing in `autonomy/run.sh` (create at 2200, comment at
|
|
118
|
+
2078/2087, close at 2092, list at 1828). The product statement that
|
|
119
|
+
`loki start owner/repo#123` runs in issue-mode is consistent with this
|
|
120
|
+
surface; the auto-fix loop reuses it rather than inventing a runner.
|
|
121
|
+
|
|
122
|
+
### 1i. Release mechanics (do not reconstruct from memory)
|
|
123
|
+
- `scripts/release.sh` is the canonical bump tool. It bumps `VERSION`,
|
|
124
|
+
`package.json`, `vscode-extension/package.json` directly (release.sh:209-211),
|
|
125
|
+
then runs `scripts/update-changelog.sh`. The "14 version files" figure is the
|
|
126
|
+
full release process across docs/wiki/mcp/dashboard `__init__.py`/SKILL.md etc.
|
|
127
|
+
Each phase below says "follow the standard release bump + CHANGELOG"; it does
|
|
128
|
+
NOT enumerate the 14 files from memory. Use the canonical scripts.
|
|
129
|
+
- Gate: `bash scripts/local-ci.sh` must pass (the bun-parity matrix is at
|
|
130
|
+
local-ci.sh:250). 3-reviewer council (2 Opus + 1 Sonnet) unanimous.
|
|
131
|
+
|
|
132
|
+
--------------------------------------------------------------------------------
|
|
133
|
+
## 2. Architecture (ASCII)
|
|
134
|
+
--------------------------------------------------------------------------------
|
|
135
|
+
|
|
136
|
+
CLIENT (bash route OR Bun route -- identical behavior via shared python)
|
|
137
|
+
+--------------------------------------------------------------+
|
|
138
|
+
| capture hook |
|
|
139
|
+
| - TS: uncaughtException / unhandledRejection / nonzero exit | MUST ADD
|
|
140
|
+
| - bash: ERR/EXIT trap in main(); iteration-complete error | MUST ADD
|
|
141
|
+
| - provider invocation failure; friction (retry/ratelimit/gate)|
|
|
142
|
+
+----------------------------+---------------------------------+
|
|
143
|
+
| raw context (in-process only)
|
|
144
|
+
v
|
|
145
|
+
+--------------------------------------------------------------+
|
|
146
|
+
| SHARED SCRUBBER autonomy/lib/crash_redact.py | MUST ADD
|
|
147
|
+
| imports proof_redact.redact_tree (1b) + crash allow/deny |
|
|
148
|
+
| -> emits WHITELIST-ONLY dict + stable fingerprint |
|
|
149
|
+
| FAIL CLOSED: if python3 missing -> write local, NO egress |
|
|
150
|
+
+----------------------------+---------------------------------+
|
|
151
|
+
|
|
|
152
|
+
+----------------+-----------------+
|
|
153
|
+
v v
|
|
154
|
+
+-----------------------+ +-----------------------------+
|
|
155
|
+
| LOCAL SELF-INSPECT | | OUTBOUND QUEUE (later phase) |
|
|
156
|
+
| .loki/crash/<id>.json | | .loki/crash/outbox/*.json |
|
|
157
|
+
| exactly what would be | | drained by `loki crash |
|
|
158
|
+
| sent (Phase 0 proof) | | submit` / background flush |
|
|
159
|
+
+-----------------------+ +--------------+--------------+
|
|
160
|
+
| HTTPS POST (Phase 1+)
|
|
161
|
+
v
|
|
162
|
+
+--------------------------------------------------------------+
|
|
163
|
+
| INGESTION BACKEND (FastAPI, reuse dashboard/ python stack) | MUST ADD
|
|
164
|
+
| POST /v1/crash (anon, rate-limited, no client write token) |
|
|
165
|
+
| 1. SECOND server-side scrub: import crash_redact.redact_tree |
|
|
166
|
+
| 2. validate against whitelist schema; reject unknown fields |
|
|
167
|
+
| 3. fingerprint -> dedup store (sqlite/KV) |
|
|
168
|
+
| 4. holds GitHub App / PAT token (never on clients) |
|
|
169
|
+
+----------------------------+---------------------------------+
|
|
170
|
+
| novel fingerprint -> create
|
|
171
|
+
| known fingerprint -> bump counter
|
|
172
|
+
v
|
|
173
|
+
+--------------------------------------------------------------+
|
|
174
|
+
| PRIVATE TRIAGE REPO asklokesh/loki-telemetry (raw intake) |
|
|
175
|
+
| one issue per novel fingerprint + occurrence counter |
|
|
176
|
+
+----------------------------+---------------------------------+
|
|
177
|
+
| human or rule confirms "real bug"
|
|
178
|
+
v PROMOTION (sanitized title/body only)
|
|
179
|
+
+--------------------------------------------------------------+
|
|
180
|
+
| AUTO-FIX AGENT loki start asklokesh/loki-telemetry#<n> |
|
|
181
|
+
| reproduce -> fix -> bash scripts/local-ci.sh -> open PR |
|
|
182
|
+
+----------------------------+---------------------------------+
|
|
183
|
+
| PR targets PUBLIC repo, sanitized desc
|
|
184
|
+
v
|
|
185
|
+
+--------------------------------------------------------------+
|
|
186
|
+
| PUBLIC REPO github.com/asklokesh/loki-mode |
|
|
187
|
+
| auto-created PR, NOT auto-merged. council + local-ci gate. |
|
|
188
|
+
| human merge approval (CLAUDE.md). Public issue mirrors the |
|
|
189
|
+
| promise shown in the first-run line. |
|
|
190
|
+
+--------------------------------------------------------------+
|
|
191
|
+
|
|
192
|
+
--------------------------------------------------------------------------------
|
|
193
|
+
## 3. Phased ship plan (smallest-first; each phase = one PATCH, shippable in a day)
|
|
194
|
+
--------------------------------------------------------------------------------
|
|
195
|
+
|
|
196
|
+
### Phase 0 -- LOCAL ONLY: capture + scrub + .loki/crash/ + `loki crash` (NO egress)
|
|
197
|
+
Goal: prove the capture+scrub layer with ZERO backend and ZERO network egress.
|
|
198
|
+
Resolves the spec's apparent tension ("manual-submit" vs "no egress"): the
|
|
199
|
+
manual command writes the scrubbed artifact locally and shows the user exactly
|
|
200
|
+
what WOULD be sent; it can optionally open a prefilled GitHub issue URL the user
|
|
201
|
+
submits by hand. No backend POST exists yet.
|
|
202
|
+
|
|
203
|
+
Behavior:
|
|
204
|
+
- On a captured crash/friction event, write the scrubbed whitelist payload to
|
|
205
|
+
`.loki/crash/<fingerprint>-<ts>.json`. Never write unscrubbed data anywhere.
|
|
206
|
+
- `loki crash` lists pending local reports; `loki crash show <id>` prints one;
|
|
207
|
+
`loki crash submit` (Phase 0) prints the payload and a prefilled
|
|
208
|
+
`github.com/asklokesh/loki-mode/issues/new?...` URL for manual submission.
|
|
209
|
+
- FAIL CLOSED: if python3 is unavailable, capture sends nothing over the network; a
|
|
210
|
+
local file is written only if scrub ran.
|
|
211
|
+
|
|
212
|
+
Files to ADD:
|
|
213
|
+
- `autonomy/lib/crash_redact.py` -- shared scrubber + fingerprint (section 5).
|
|
214
|
+
Imports `proof_redact.redact_tree` / `redact_value`.
|
|
215
|
+
- `autonomy/lib/crash_capture.py` -- builds the raw context dict, calls
|
|
216
|
+
crash_redact, writes `.loki/crash/...`. Pure-ish, no network in Phase 0.
|
|
217
|
+
- `autonomy/crash.sh` -- bash hook helpers: `loki_crash_capture` (sourced by
|
|
218
|
+
run.sh), `loki_crash_friction`. Calls python3 crash_capture.
|
|
219
|
+
- `loki-ts/src/runner/crash.ts` -- TS hook: registers `uncaughtException` /
|
|
220
|
+
`unhandledRejection` in cli.ts and on nonzero exit; shells to crash_capture.py
|
|
221
|
+
via `findPython3` (mirrors proof.ts:19,27).
|
|
222
|
+
- `loki-ts/src/commands/crash.ts` -- `loki crash` command (Bun route).
|
|
223
|
+
- `docs/PRIVACY.md` -- honest disclosure doc (ships in Phase 0).
|
|
224
|
+
|
|
225
|
+
Files to MODIFY:
|
|
226
|
+
- `autonomy/run.sh` -- source `crash.sh` near telemetry source (648-651); add
|
|
227
|
+
ERR/EXIT crash hook in `main()` (12913); call `loki_crash_capture` at the
|
|
228
|
+
iteration-complete error branch (~11968-11989) and `loki_crash_friction` at
|
|
229
|
+
existing retry/rate-limit sites.
|
|
230
|
+
- `autonomy/loki` -- add `crash)` to dispatch (near report at 13472); add
|
|
231
|
+
`cmd_crash`.
|
|
232
|
+
- `loki-ts/src/cli.ts` -- register crash handlers; wrap terminal `process.exit`
|
|
233
|
+
(228); route `crash` command.
|
|
234
|
+
- `autonomy/telemetry.sh` + `dashboard/telemetry.py` + `bin/postinstall.js` --
|
|
235
|
+
NO behavior change yet, but add a code comment pointer to the unified opt-out
|
|
236
|
+
(full unification lands in section 6, can be Phase 0 since it is local).
|
|
237
|
+
|
|
238
|
+
New tests:
|
|
239
|
+
- `tests/crash/test_crash_redact.py` -- golden vectors: every secret class from
|
|
240
|
+
proof_redact PLUS new crash fields; assert WHITELIST-only output and stable
|
|
241
|
+
fingerprint across two synthetic machines (different home paths -> same hash).
|
|
242
|
+
- `tests/crash/test_crash_redact_negative.py` -- ReDoS / huge-stack guard;
|
|
243
|
+
assert no `/Users/`, no env values, no emails/IPs survive.
|
|
244
|
+
- `loki-ts/tests/commands/crash.test.ts` -- `loki crash` lists/show/submit.
|
|
245
|
+
- Add a bun-parity entry so `loki crash --help` and `loki crash show` match
|
|
246
|
+
byte-for-byte across routes (local-ci.sh:250 matrix).
|
|
247
|
+
|
|
248
|
+
CHANGELOG honest "NOT tested" disclosure (Phase 0):
|
|
249
|
+
- Tested: client-side scrub on golden vectors; local artifact write;
|
|
250
|
+
`loki crash` list/show/submit; fingerprint stability.
|
|
251
|
+
- NOT tested: network egress (none exists in Phase 0); backend dedup;
|
|
252
|
+
cross-machine real-world fingerprint collisions beyond synthetic fixtures;
|
|
253
|
+
auto-fix loop.
|
|
254
|
+
|
|
255
|
+
### Phase 1 -- BACKEND ingest with SECOND scrub (egress behind unified opt-out)
|
|
256
|
+
Goal: add the ingestion backend and turn on disclosed egress, gated by the unified opt-out.
|
|
257
|
+
- Stand up FastAPI backend (section 7) reusing the dashboard Python stack so it
|
|
258
|
+
can `import crash_redact` for the mandated second scrub.
|
|
259
|
+
- Client gains a background flush of `.loki/crash/outbox/*.json` to `POST
|
|
260
|
+
/v1/crash`, gated by the unified opt-out (section 6) and rate-limited.
|
|
261
|
+
- Still NO server-side issue creation beyond storing reports; at most, create issues in the
|
|
262
|
+
PRIVATE triage repo only.
|
|
263
|
+
|
|
264
|
+
Files to ADD: `dashboard/crash_ingest.py` (or `web-app/` route -- see 7 for the
|
|
265
|
+
host decision), `dashboard/crash_store.py` (dedup store), backend tests.
|
|
266
|
+
Files to MODIFY: `autonomy/crash.sh`, `loki-ts/src/runner/crash.ts` (add flush),
|
|
267
|
+
`loki-ts/src/commands/crash.ts` + `cmd_crash` (add `submit` real POST).
|
|
268
|
+
CHANGELOG NOT tested: production GitHub token custody under load; abuse/spam at
|
|
269
|
+
scale; promotion path (not yet built).
|
|
270
|
+
|
|
271
|
+
### Phase 2 -- DEDUP + fingerprint store + PRIVATE issue creation + counter
|
|
272
|
+
- Backend creates one private issue per novel fingerprint; bumps an occurrence
|
|
273
|
+
counter comment on repeats. GitHub token held server-side only.
|
|
274
|
+
Files: extend `dashboard/crash_store.py`, add `dashboard/crash_github.py`.
|
|
275
|
+
CHANGELOG NOT tested: promotion to public repo; auto-fix.
|
|
276
|
+
|
|
277
|
+
### Phase 3 -- PROMOTION path (private -> public, sanitized) + AUTO-FIX loop
|
|
278
|
+
- Confirmed bugs promoted to public `asklokesh/loki-mode` with sanitized
|
|
279
|
+
title/body. Auto-fix agent runs `loki start asklokesh/loki-telemetry#<n>`,
|
|
280
|
+
fixes, runs local-ci, opens a PR to PUBLIC repo. NOT auto-merged.
|
|
281
|
+
Files: `dashboard/crash_promote.py`, `scripts/crash-autofix.sh` (or a routine).
|
|
282
|
+
CHANGELOG NOT tested: human-merge gate behavior in the wild; regression rate of
|
|
283
|
+
auto-fixes (council + local-ci gate is the guard, see section 8).
|
|
284
|
+
|
|
285
|
+
--------------------------------------------------------------------------------
|
|
286
|
+
## 4. (folded into section 3 per-phase: files + tests + CHANGELOG disclosure)
|
|
287
|
+
--------------------------------------------------------------------------------
|
|
288
|
+
|
|
289
|
+
--------------------------------------------------------------------------------
|
|
290
|
+
## 5. The scrubber spec (allow/deny, regex, fingerprint)
|
|
291
|
+
--------------------------------------------------------------------------------
|
|
292
|
+
|
|
293
|
+
### 5a. Deliberate deviation from spec wording (state plainly)
|
|
294
|
+
The spec asks for "the scrubber as a testable pure function in BOTH routes,"
|
|
295
|
+
"designed identically for bash and TS." The chosen design is ONE shared Python
|
|
296
|
+
module, `autonomy/lib/crash_redact.py`, that BOTH routes call -- bash via
|
|
297
|
+
python3, Bun via `findPython3` (exactly the `proof_redact.py` /
|
|
298
|
+
`proof-generator.py` precedent, verified at proof.ts:27).
|
|
299
|
+
- Why this is better: it makes drift between routes impossible, and the SAME
|
|
300
|
+
module is importable by the FastAPI backend for the mandated SECOND scrub.
|
|
301
|
+
One module, three call sites: bash client, Bun client, backend.
|
|
302
|
+
- Spirit of the requirement is met: `redact_value` / `redact_tree` ARE pure
|
|
303
|
+
functions, covered by shared golden-vector tests.
|
|
304
|
+
- Contingency if the council demands strict per-route code: port the rules to
|
|
305
|
+
`loki-ts/src/util/crash_redact.ts` and test it against the IDENTICAL fixture
|
|
306
|
+
set as the Python module (same golden vectors), so parity is proven by tests.
|
|
307
|
+
|
|
308
|
+
### 5b. FAIL CLOSED
|
|
309
|
+
If python3 is unavailable, the no-leak guarantee cannot be enforced, so NO
|
|
310
|
+
egress happens. The client may write a local note that capture was skipped, but
|
|
311
|
+
must never POST. This mirrors proof.ts:216-227 ("never publish an unredacted
|
|
312
|
+
artifact").
|
|
313
|
+
|
|
314
|
+
### 5c. WHITELIST-only emit (deny-by-default)
|
|
315
|
+
The payload that leaves the machine contains ONLY these fields. Anything not on
|
|
316
|
+
this list is dropped, not redacted:
|
|
317
|
+
- os (uname -s), arch (uname -m)
|
|
318
|
+
- loki_version (from VERSION)
|
|
319
|
+
- runtime: node_version and/or bun_version
|
|
320
|
+
- error_class (e.g. TypeError, ENOENT, NonZeroExit)
|
|
321
|
+
- stack_signature: list of top N (default 5) normalized frame signatures
|
|
322
|
+
(function/symbol names only; file paths, line numbers, columns stripped)
|
|
323
|
+
- rarv_phase (REASON/ACT/REVIEW/VERIFY/CLOSE/iteration)
|
|
324
|
+
- exit_code
|
|
325
|
+
- friction_kind (retry_loop | rate_limit_loop | gate_failure) when applicable
|
|
326
|
+
- project_id_hash (section 5e)
|
|
327
|
+
- fingerprint (section 5f)
|
|
328
|
+
- rules_version (from crash_redact) and redactions_count
|
|
329
|
+
- captured_at (UTC, second precision)
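
A minimal sketch of the deny-by-default emit step, under stated assumptions: the function name `whitelist_only` and the constant `ALLOWED_FIELDS` are hypothetical, and the runtime entry is assumed to be two flat keys (`node_version`, `bun_version`) rather than a nested object. The plan does not fix either choice.

```python
# Hypothetical sketch: field names follow the list above; the helper name,
# the constant name, and the flat layout of the runtime keys are assumptions.
ALLOWED_FIELDS = frozenset({
    "os", "arch", "loki_version", "node_version", "bun_version",
    "error_class", "stack_signature", "rarv_phase", "exit_code",
    "friction_kind", "project_id_hash", "fingerprint",
    "rules_version", "redactions_count", "captured_at",
})


def whitelist_only(raw: dict) -> dict:
    """Return a new dict holding only allowed top-level keys.

    Keys that are not on the list are dropped outright, not redacted.
    """
    return {key: value for key, value in raw.items() if key in ALLOWED_FIELDS}
```

Because the filter copies only named keys, a new free-text field added to the raw context later is excluded by default until someone adds it to the list.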
|
|
330
|
+
|
|
331
|
+
### 5d. Deny rules (reuse proof_redact, plus crash additions)
|
|
332
|
+
crash_redact imports and applies `proof_redact.redact_tree` first (all rules in
|
|
333
|
+
1b: keys, Bearer, PEM, env-assign, URI creds, `/Users/` and `/home/` -> `~`, Windows
|
|
334
|
+
home). Then crash-specific additions BEFORE whitelisting:
|
|
335
|
+
- emails: `[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}` -> [REDACTED:EMAIL]
|
|
336
|
+
- IPv4: `\b(?:\d{1,3}\.){3}\d{1,3}\b` -> [REDACTED:IP]
|
|
337
|
+
- IPv6: standard colon-hex form -> [REDACTED:IP]
|
|
338
|
+
- repo names: any `owner/repo` derived from the local git remote and any value
|
|
339
|
+
matching the configured public/private repo names -> [REDACTED:REPO]
|
|
340
|
+
- prompt/PRD/code/file-content fields: never whitelisted, so dropped by 5c.
|
|
341
|
+
Because emit is whitelist-only, free-text fields (briefs, prompts, diffs) can
|
|
342
|
+
never reach the payload even if a deny rule missed them.
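
A sketch of the email and IPv4 additions, using the two patterns quoted above verbatim. The function name `redact_crash_extras` and its return shape are assumptions for illustration; the IPv6 and repo-name rules are omitted because the plan gives no pattern for them.

```python
import re
from typing import Tuple

# Patterns copied from the deny rules above.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact_crash_extras(text: str) -> Tuple[str, int]:
    """Replace emails and IPv4 addresses; return (new_text, replacements).

    Hypothetical helper: in the plan these rules run after
    proof_redact.redact_tree and before whitelisting.
    """
    text, email_hits = EMAIL_RE.subn("[REDACTED:EMAIL]", text)
    text, ip_hits = IPV4_RE.subn("[REDACTED:IP]", text)
    return text, email_hits + ip_hits
```

Note that the IPv4 pattern matches any four-part dotted number, so a four-part version string would also be redacted. That is over-redaction, which is the safe direction for this pipeline.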
|
|
343
|
+
|
|
344
|
+
### 5e. Hashed non-reversible project id
|
|
345
|
+
- Do NOT hash the local filesystem path (it contains `/Users/<name>/`, which is
|
|
346
|
+
reversible-ish and stripped anyway).
|
|
347
|
+
- Hash the git remote origin URL (normalized: strip scheme, `.git` suffix,
|
|
348
|
+
trailing slash, lowercase host).
|
|
349
|
+
- Use SHA-256, UNSALTED. Tradeoff stated explicitly: unsalted gives cross-user
|
|
350
|
+
dedup (two users hitting the same bug in the same public repo collapse to one
|
|
351
|
+
triage issue, which is the whole point of the occurrence counter); a per-user
|
|
352
|
+
salt would kill that dedup. Unsalted is dictionary-attackable for known public
|
|
353
|
+
repos, but the project id reveals only "which public repo," which is already
|
|
354
|
+
public, so the privacy cost is acceptable. Private-repo origins still hash to
|
|
355
|
+
an opaque value with no path/name leakage.
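
The normalization and hash above can be sketched as follows. The function names are hypothetical, and the sketch only handles `scheme://host/path` remotes; scp-style remotes (`git@host:owner/repo`) and embedded credentials are not covered by the plan text and would need their own handling.

```python
import hashlib


def normalize_remote(url: str) -> str:
    """Strip scheme, trailing slash, and .git suffix; lowercase the host only."""
    u = url.strip()
    if "://" in u:
        u = u.split("://", 1)[1]      # strip scheme
    u = u.rstrip("/")                  # strip trailing slash
    if u.endswith(".git"):
        u = u[: -len(".git")]          # strip .git suffix
    host, sep, rest = u.partition("/")
    return host.lower() + sep + rest   # path case is left untouched


def project_id_hash(url: str) -> str:
    """Unsalted SHA-256 of the normalized remote, as a hex string."""
    return hashlib.sha256(normalize_remote(url).encode("utf-8")).hexdigest()
```

Two users with differently spelled remotes for the same repo (scheme, `.git`, trailing slash, host case) get the same hash, which is what the cross-user dedup depends on.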
|
|
356
|
+
|
|
357
|
+
### 5f. Fingerprint (dedup key)
|
|
358
|
+
- Computed in crash_redact AFTER scrub, on the REDACTED data, so client and
|
|
359
|
+
backend derive the identical value.
|
|
360
|
+
- `fingerprint = sha256(error_class + "\n" + "\n".join(top_N_stack_signatures))`
|
|
361
|
+
where each stack signature is the symbol/function name only, with file paths,
|
|
362
|
+
line numbers, columns, and addresses stripped (or it would differ per machine).
|
|
363
|
+
- N defaults to 5; configurable constant in the module. Same hash function and N
|
|
364
|
+
on client and server -> stable cross-machine dedup.
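
A direct sketch of the formula above. The function name `fingerprint`, the constant name `TOP_N`, and the hex-string output encoding are assumptions; the inputs are assumed to be already-normalized symbol names.

```python
import hashlib
from typing import Sequence

TOP_N = 5  # default N from the plan; the constant name is an assumption


def fingerprint(error_class: str,
                stack_signatures: Sequence[str],
                n: int = TOP_N) -> str:
    """sha256(error_class + "\\n" + "\\n".join(top N signatures)), hex-encoded.

    Callers must pass symbol names only (no paths, line numbers, columns,
    or addresses), otherwise the value differs per machine.
    """
    material = error_class + "\n" + "\n".join(stack_signatures[:n])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Frames beyond the first N do not affect the result, so two crashes that share an error class and their top N frames collapse to one dedup key.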
|
|
365
|
+
|
|
366
|
+
--------------------------------------------------------------------------------
|
|
367
|
+
## 6. First-run disclosure UX + unified opt-out
|
|
368
|
+
--------------------------------------------------------------------------------
|
|
369
|
+
|
|
370
|
+
### 6a. Copy (no emojis, no em dashes -- both are banned)
|
|
371
|
+
Shown once, on the first run, before any egress:
|
|
372
|
+
|
|
373
|
+
Loki Mode auto-creates the issues you hit at github.com/asklokesh/loki-mode
|
|
374
|
+
and tries to auto-resolve them. If it cannot, we encourage you to open an
|
|
375
|
+
issue for anything causing hesitation.
|
|
376
|
+
We send anonymous diagnostics only (os, arch, version, error type, sanitized
|
|
377
|
+
stack signatures). Never your code, prompts, paths, keys, or repo names.
|
|
378
|
+
See docs/PRIVACY.md. Turn this off anytime with: loki telemetry off
|
|
379
|
+
|
|
380
|
+
### 6b. Where the one-time flag lives (separate from opt-out)
|
|
381
|
+
- Disclosure sentinel: a `DISCLOSURE_SHOWN=true` key in `~/.loki/config` (the
|
|
382
|
+
same global config already read at `autonomy/run.sh:643`). Shown once
|
|
383
|
+
regardless of enable/disable state. Never re-shown after opt-out.
|
|
384
|
+
- Reuse `~/.loki/config`; do NOT invent a new sentinel file.
|
|
385
|
+
|
|
386
|
+
### 6c. Unified opt-out (gates BOTH PostHog usage telemetry AND crash reporting)
|
|
387
|
+
Map all switches to the SAME persisted key, and keep honoring the community
|
|
388
|
+
standard:
|
|
389
|
+
- `LOKI_TELEMETRY=off` (new, spec-mandated) -> treated as disabled.
|
|
390
|
+
- `loki telemetry off` (new subcommand on existing `cmd_telemetry`,
|
|
391
|
+
autonomy/loki:17946) -> writes `TELEMETRY_DISABLED=true` to `~/.loki/config`.
|
|
392
|
+
- `loki telemetry on` -> removes/sets it false; `loki telemetry status` shows
|
|
393
|
+
BOTH OTEL state (existing) and collection state (new) + pending crash count.
|
|
394
|
+
- Existing `LOKI_TELEMETRY_DISABLED=true` and `DO_NOT_TRACK=1` -> still honored.
|
|
395
|
+
- The unified check (a single helper, e.g. `loki_collection_enabled` in
|
|
396
|
+
crash.sh and a TS mirror) must be consulted by: `autonomy/telemetry.sh`
|
|
397
|
+
(`_loki_telemetry_enabled`), `dashboard/telemetry.py` (`_is_enabled`),
|
|
398
|
+
`bin/postinstall.js`, AND the new crash flush. Otherwise the disclosure is a
|
|
399
|
+
lie about PostHog events that keep firing.
|
|
400
|
+
- Never re-prompt once disabled (the sentinel is independent and never re-shown).
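
A sketch of the unified check as one pure function. The name `collection_enabled` is hypothetical, the on-disk format of `~/.loki/config` is not specified here (the caller is assumed to pass already-parsed key/value pairs), and the case-insensitive comparison is an assumption.

```python
from typing import Mapping


def collection_enabled(env: Mapping[str, str],
                       config: Mapping[str, str]) -> bool:
    """Return False if ANY of the documented switches disables collection.

    env is the process environment; config is the parsed ~/.loki/config.
    """
    if env.get("LOKI_TELEMETRY", "").lower() == "off":
        return False
    if env.get("LOKI_TELEMETRY_DISABLED", "").lower() == "true":
        return False
    if env.get("DO_NOT_TRACK") == "1":
        return False
    if config.get("TELEMETRY_DISABLED", "").lower() == "true":
        return False
    return True
```

Usage would look like `collection_enabled(os.environ, parsed_config)` at every egress site, so PostHog events and the crash flush consult the same answer.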
|
|
401
|
+
|
|
402
|
+
--------------------------------------------------------------------------------
|
|
403
|
+
## 7. Backend design
|
|
404
|
+
--------------------------------------------------------------------------------
|
|
405
|
+
|
|
406
|
+
### 7a. Host recommendation: reuse the FastAPI Python stack (dashboard/)
|
|
407
|
+
Recommendation: a small FastAPI service, deployed separately from the local
|
|
408
|
+
dashboard (dashboard/server.py runs on port 57374 locally; the ingest service
|
|
409
|
+
is a hosted deployment, e.g. on the existing `web-app/` / `deploy/` infra).
|
|
410
|
+
Justification (strongest single argument): the scrubber is Python, so the
|
|
411
|
+
backend can `import crash_redact` and run the EXACT SAME second scrub the client
|
|
412
|
+
ran -- no reimplementation, no drift, identical RULES_VERSION. A serverless
|
|
413
|
+
function in another language would force a second, divergent scrubber, which is
|
|
414
|
+
the primary data-leak risk. Reusing Python keeps one source of truth.
|
|
415
|
+
- Alternative considered: serverless (e.g. a single function). Rejected as the
|
|
416
|
+
default precisely because it tends toward a re-implemented scrubber; acceptable
|
|
417
|
+
ONLY if it runs the same Python module.
|
|
418
|
+
|
|
419
|
+
### 7b. Endpoints
|
|
420
|
+
- `POST /v1/crash` -- accept one scrubbed report. Returns 202 always (privacy:
|
|
421
|
+
no confirmation that reveals dedup state to the client).
|
|
422
|
+
- `GET /healthz` -- liveness.
|
|
423
|
+
- (internal/admin, authn-gated) `GET /v1/crash/stats`, `POST /v1/crash/promote`.
|
|
424
|
+
|
|
425
|
+
### 7c. Auth model (clients carry NO write secret)
|
|
426
|
+
- Clients POST UNAUTHENTICATED but heavily constrained:
|
|
427
|
+
- Strict rate limiting per source IP and per project_id_hash.
|
|
428
|
+
- Body size cap; whitelist-schema validation; reject unknown fields.
|
|
429
|
+
- Server-side second scrub regardless of client claims.
|
|
430
|
+
- Optionally a PUBLIC anon ingest key (like the PostHog public key already in
|
|
431
|
+
the repo) for coarse routing/quota -- it is not a secret and grants no write
|
|
432
|
+
access to GitHub. The GitHub write token is NEVER on clients.
|
|
433
|
+
- Rationale: any secret shipped in the client is exfiltratable (the repo already
|
|
434
|
+
treats `phc_...` as public for this reason). So the trust boundary is: clients
|
|
435
|
+
can only enqueue scrubbed diagnostics; only the server can write to GitHub.
|
|
436
|
+
|
|
437
|
+
### 7d. Dedup store
|
|
438
|
+
- Start with SQLite (file-backed) keyed by `fingerprint`: columns fingerprint,
|
|
439
|
+
first_seen, last_seen, occurrence_count, private_issue_number, status
|
|
440
|
+
(new|confirmed|promoted|fixed). Trivially swappable for a hosted KV later.
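
A sketch of the SQLite store with the columns listed above. The table name `crash_fingerprints`, the helper `record_occurrence`, and the ISO-8601 text timestamps are assumptions. The update-then-insert sequence assumes a single writer; a multi-writer deployment would need a transaction or an upsert.

```python
import sqlite3

# Hypothetical schema: column names follow the plan, table name is assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS crash_fingerprints (
    fingerprint          TEXT PRIMARY KEY,
    first_seen           TEXT NOT NULL,
    last_seen            TEXT NOT NULL,
    occurrence_count     INTEGER NOT NULL DEFAULT 1,
    private_issue_number INTEGER,
    status               TEXT NOT NULL DEFAULT 'new'
        CHECK (status IN ('new', 'confirmed', 'promoted', 'fixed'))
)
"""


def record_occurrence(conn: sqlite3.Connection,
                      fingerprint: str,
                      seen_at: str) -> bool:
    """Bump the counter for a known fingerprint, or insert a novel one.

    Returns True when the fingerprint was novel (caller would create the
    private triage issue), False when it was a repeat.
    """
    cur = conn.execute(
        "UPDATE crash_fingerprints "
        "SET last_seen = ?, occurrence_count = occurrence_count + 1 "
        "WHERE fingerprint = ?",
        (seen_at, fingerprint),
    )
    if cur.rowcount:
        conn.commit()
        return False
    conn.execute(
        "INSERT INTO crash_fingerprints (fingerprint, first_seen, last_seen) "
        "VALUES (?, ?, ?)",
        (fingerprint, seen_at, seen_at),
    )
    conn.commit()
    return True
```

The boolean return maps onto the architecture diagram: novel fingerprint -> create, known fingerprint -> bump counter.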
|
|
441
|
+
|
|
442
|
+
### 7e. GitHub token custody
|
|
443
|
+
- A GitHub App installation token or a fine-grained PAT, scoped to the PRIVATE
|
|
444
|
+
triage repo (issues:write) and the PUBLIC repo (pull-requests:write for the
|
|
445
|
+
auto-fix PR). Stored in the backend secret store / env, never returned to
|
|
446
|
+
clients, never logged. Token rotation documented in PRIVACY.md ops notes.

### 7f. Second server-side scrub (mandatory)
- On ingest, before any storage or issue creation: `crash_redact.redact_tree`
  the entire body, then validate against the whitelist schema and DROP unknown
  fields. Record server `redactions_count`; if it is > 0 on a payload the client
  claimed was clean, log a scrubber-miss metric (a client-rule gap to fix).
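
The ingest-side order of operations can be sketched as follows. The signature
of the real `crash_redact.redact_tree` is not shown in this section, so the
example uses a stand-in, `redact_tree_stub`, that only masks email-like
strings; it, the `error_message` field, and `server_scrub` are hypothetical
names for illustration.

```python
import re
from typing import Any, Tuple

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Illustrative whitelist; the real schema is defined by the plan (5c).
ALLOWED_FIELDS = frozenset({"fingerprint", "project_id_hash", "error_message"})


def redact_tree_stub(node: Any) -> Tuple[Any, int]:
    """Stand-in for crash_redact.redact_tree: returns (clean, redactions)."""
    if isinstance(node, str):
        return EMAIL_RE.subn("[REDACTED]", node)
    if isinstance(node, dict):
        out, total = {}, 0
        for key, value in node.items():
            out[key], count = redact_tree_stub(value)
            total += count
        return out, total
    if isinstance(node, list):
        items, total = [], 0
        for value in node:
            clean, count = redact_tree_stub(value)
            items.append(clean)
            total += count
        return items, total
    return node, 0


def server_scrub(body: dict, client_claimed_clean: bool) -> Tuple[dict, int, bool]:
    """Scrub first, then drop unknown fields; flag client scrubber misses."""
    scrubbed, redactions_count = redact_tree_stub(body)
    kept = {k: v for k, v in scrubbed.items() if k in ALLOWED_FIELDS}
    scrubber_miss = client_claimed_clean and redactions_count > 0
    return kept, redactions_count, scrubber_miss
```

Scrubbing the whole body before the whitelist filter means the miss metric
also counts leaks in fields that would have been dropped anyway, which is the
signal needed to find gaps in the client rules.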

--------------------------------------------------------------------------------
## 8. The auto-fix loop
--------------------------------------------------------------------------------

Trigger and flow:
1. A novel fingerprint creates a PRIVATE triage issue
   (asklokesh/loki-telemetry). A repeat bumps an occurrence-counter comment.
2. Confirmation gate: a bug is PROMOTED only when confirmed (rule-based on
   occurrence threshold + reproducibility, or a maintainer label). Promotion
   creates/links a sanitized PUBLIC issue (title/body scrubbed; no triage-repo
   internals leak to the public repo).
3. Auto-fix run: the backend (or a scheduled routine) invokes
   `loki start asklokesh/loki-telemetry#<n>` in issue-mode, reusing the existing
   issue plumbing (gh create/comment/close in run.sh: 2200/2078/2092). The agent
   reproduces, fixes, and runs `bash scripts/local-ci.sh`.
4. PR: the resulting PR TARGETS the PUBLIC repo with a SANITIZED description
   (links the public issue, never the raw triage payload). The PR is
   auto-created but NOT auto-merged.
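
Steps 1 and 2 reduce to two small decisions, sketched here under stated
assumptions: the function names, the action strings, and the default threshold
of 3 occurrences are hypothetical; the plan only says the rule combines an
occurrence threshold with reproducibility, or a maintainer label.

```python
def triage_action(novel: bool) -> str:
    """Step 1: a novel fingerprint opens an issue, a repeat bumps the counter."""
    return "create_private_issue" if novel else "bump_occurrence_comment"


def should_promote(
    occurrence_count: int,
    reproducible: bool,
    maintainer_confirmed: bool,
    min_occurrences: int = 3,  # assumed threshold, not specified by the plan
) -> bool:
    """Step 2: promote only when rule-confirmed or maintainer-confirmed."""
    rule_confirmed = occurrence_count >= min_occurrences and reproducible
    return rule_confirmed or maintainer_confirmed
```

A frequently seen but non-reproducible crash stays private under this rule
until a maintainer labels it, which keeps noisy environment-specific failures
out of the public tracker.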

Guards that prevent a bad auto-fix from shipping:
- local-ci gate: PR cannot be opened unless `bash scripts/local-ci.sh` passes
  (the same 42/42 + bun-parity matrix at local-ci.sh:250) on the fix branch.
- council: the standard 3-reviewer council (2 Opus + 1 Sonnet) unanimous, per
  CLAUDE.md, applies to the auto-fix PR like any other.
- NO auto-merge: human merge approval still gates (CLAUDE.md). The pipeline ends
  at "PR open + green + council-approved."
- Idempotency: one open auto-fix PR per fingerprint; the dedup store records
  private_issue_number and PR linkage to avoid duplicate PR storms.
- Sanitization at the boundary: the promotion step re-runs crash_redact on any
  text copied from private -> public, so the public PR/issue can never carry
  raw triage content.
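
The ordering of these guards can be made explicit in code. In this sketch,
which uses hypothetical function names, opening a PR depends on the local-ci
result and the one-PR-per-fingerprint rule, while the pipeline's terminal
state additionally requires a unanimous council. There is deliberately no
merge function.

```python
from typing import Set


def may_open_pr(
    fingerprint: str,
    local_ci_passed: bool,
    fingerprints_with_open_pr: Set[str],
) -> bool:
    """Gate PR creation: green local-ci and no existing PR for this bug."""
    if not local_ci_passed:
        return False
    return fingerprint not in fingerprints_with_open_pr


def ready_for_human_merge(
    pr_open: bool,
    ci_green: bool,
    council_approvals: int,
    council_size: int = 3,  # the standard 3-reviewer council
) -> bool:
    """Terminal pipeline state: PR open + green + unanimous council.

    Merging itself is outside the pipeline; a human always performs it.
    """
    return pr_open and ci_green and council_approvals == council_size
```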

--------------------------------------------------------------------------------
## 9. Risks (named + mitigated)
--------------------------------------------------------------------------------

| Risk | Mitigation |
| --- | --- |
| Scrubber miss -> data leak | Whitelist-ONLY emit (deny by default, 5c) so unlisted fields never ship even if a regex misses; reuse hardened proof_redact (1b); SECOND server-side scrub via the same module (7f); fail closed if python3 missing (5b); golden-vector + negative tests. |
| Undisclosed existing PostHog telemetry | Unified opt-out (6c) gates PostHog AND crash; PRIVACY.md + first-run line describe BOTH; honesty invariant covers existing events. |
| Backend abuse / spam | Unauthenticated-but-rate-limited (per IP + project_id_hash), body-size cap, whitelist-schema validation, 202-always (no oracle), occurrence counter collapses floods into one issue. |
| Auto-fix ships a regression | local-ci gate (42/42 + bun-parity) BEFORE PR; unanimous council; NO auto-merge; one PR per fingerprint; human merge gate per CLAUDE.md. |
| GitHub token exfiltration | Token only on backend (7c/7e); clients carry no write secret; only a public anon ingest key at most. |
| GDPR / CCPA compliance | Anonymous-by-design (no PII in whitelist; emails/IPs denied); disclosed default-on with friction-free persistent opt-out (LOKI_TELEMETRY=off, loki telemetry off, DO_NOT_TRACK=1); project_id_hash is non-reversible; PRIVACY.md documents data categories, retention, opt-out, and deletion-by-fingerprint on request. Default-on is defensible only WITH the disclosure; covert would not be. |
| Dual-route parity burden | Single shared Python scrubber called by both routes (5a) eliminates redaction drift; bun-parity matrix entries for `loki crash` (local-ci.sh:250); commands ported in both `autonomy/loki` and `loki-ts/src/commands/`. |
| Over-reporting normal operation | Conservative capture: only uncaught/nonzero-exit/provider-failure/explicit friction signals (retry loop, rate-limit loop, gate failure), not routine retries; thresholds before friction fires. |
| python3 absence breaks capture | Fail closed: local write only if scrub ran, never egress without scrub (5b). |
| Fingerprint instability across machines | Hash computed post-scrub on path/line-stripped frame signatures (5f); synthetic two-machine test in Phase 0. |
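
The last row is the easiest to illustrate. A minimal sketch of a
machine-stable fingerprint, assuming frames arrive as `(file, line, function)`
tuples and that "path/line-stripped" means keeping only the file's basename
and the function name; the exact signature format is defined in 5f and may
differ. The function names and the choice of SHA-256 are assumptions.

```python
import hashlib
import re
from typing import Iterable, Tuple

Frame = Tuple[str, int, str]  # (file path, line number, function name)


def frame_signature(frame: Frame) -> str:
    """Strip directories and the line number; keep basename and function."""
    path, _line, function = frame
    basename = re.split(r"[\\/]", path)[-1]  # handles POSIX and Windows paths
    return f"{basename}:{function}"


def fingerprint(error_class: str, frames: Iterable[Frame]) -> str:
    """Hash the error class plus stripped frame signatures (SHA-256 hex)."""
    parts = [error_class] + [frame_signature(f) for f in frames]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


# Synthetic two-machine check: different home directories and line numbers
# must not change the fingerprint of the same logical crash.
machine_a = [("/Users/alice/loki-mode/autonomy/run.sh", 120, "main")]
machine_b = [("C:\\work\\loki-mode\\autonomy\\run.sh", 131, "main")]
same = fingerprint("ExitNonZero", machine_a) == fingerprint("ExitNonZero", machine_b)
```

Dropping line numbers trades some precision for stability: two distinct bugs
in the same function collapse into one fingerprint, which the occurrence
counter and human triage are expected to absorb.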

--------------------------------------------------------------------------------
## 10. Non-goals (explicit)
--------------------------------------------------------------------------------

- NOT auto-merging auto-fix PRs. Human merge approval always gates.
- NOT collecting any PII, code, prompts, PRDs, file contents, paths, or repo
  names. Whitelist-only.
- NOT replacing the existing OTEL `cmd_telemetry` tracing feature; this adds
  subcommands and a separate crash pipeline.
- NOT replacing the existing PostHog usage telemetry; this UNIFIES its opt-out
  and discloses it, but does not rip it out.
- NOT building a real-time crash dashboard UI in this plan (the local dashboard
  may surface a count later; out of scope here).
- NOT a public bug bounty or external contributor intake flow.
- NOT cross-product telemetry beyond loki-mode.
- NOT shipping egress in Phase 0 (local-only proof first).

--------------------------------------------------------------------------------
## Critical Files for Implementation
--------------------------------------------------------------------------------
- /Users/lokesh/git/loki-mode/autonomy/lib/proof_redact.py (reuse / import; the keystone redactor)
- /Users/lokesh/git/loki-mode/autonomy/run.sh (bash capture hooks + telemetry sourcing)
- /Users/lokesh/git/loki-mode/loki-ts/src/cli.ts (TS uncaughtException/unhandledRejection/exit hook + command routing)
- /Users/lokesh/git/loki-mode/autonomy/loki (cmd_crash + telemetry off/on subcommands + dispatch)
- /Users/lokesh/git/loki-mode/dashboard/server.py (FastAPI host to extend for the ingest backend + second scrub)