loki-mode 6.71.1 → 6.72.0

Files changed (91)
  1. package/README.md +9 -1
  2. package/SKILL.md +2 -2
  3. package/VERSION +1 -1
  4. package/autonomy/hooks/migration-hooks.sh +26 -0
  5. package/autonomy/loki +429 -92
  6. package/autonomy/run.sh +219 -38
  7. package/dashboard/__init__.py +1 -1
  8. package/dashboard/server.py +101 -19
  9. package/docs/INSTALLATION.md +20 -11
  10. package/docs/bug-fixes/agent-01-cli-fixes.md +101 -0
  11. package/docs/bug-fixes/agent-02-purplelab-fixes.md +88 -0
  12. package/docs/bug-fixes/agent-03-dashboard-fixes.md +119 -0
  13. package/docs/bug-fixes/agent-04-memory-fixes.md +105 -0
  14. package/docs/bug-fixes/agent-05-provider-fixes.md +86 -0
  15. package/docs/bug-fixes/agent-06-integration-fixes.md +101 -0
  16. package/docs/bug-fixes/agent-07-dash-run-fixes.md +101 -0
  17. package/docs/bug-fixes/agent-08-docker-fixes.md +164 -0
  18. package/docs/bug-fixes/agent-09-e2e-build-fixes.md +69 -0
  19. package/docs/bug-fixes/agent-10-e2e-fullstack-fixes.md +102 -0
  20. package/docs/bug-fixes/agent-11-e2e-session-fixes.md +70 -0
  21. package/docs/bug-fixes/agent-12-scenario-fixes.md +120 -0
  22. package/docs/bug-fixes/agent-13-enterprise-fixes.md +143 -0
  23. package/docs/bug-fixes/agent-14-uat-newuser-fixes.md +88 -0
  24. package/docs/bug-fixes/agent-15-uat-poweruser-fixes.md +132 -0
  25. package/docs/bug-fixes/agent-19-code-review.md +316 -0
  26. package/docs/bug-fixes/agent-20-architecture-review.md +331 -0
  27. package/docs/competitive/bolt-new-analysis.md +579 -0
  28. package/docs/competitive/emergence-others-analysis.md +605 -0
  29. package/docs/competitive/replit-lovable-analysis.md +622 -0
  30. package/docs/test-scenarios/edge-cases.md +813 -0
  31. package/docs/test-scenarios/enterprise-scenarios.md +732 -0
  32. package/mcp/__init__.py +1 -1
  33. package/mcp/server.py +49 -5
  34. package/memory/consolidation.py +33 -0
  35. package/memory/embeddings.py +10 -1
  36. package/memory/engine.py +83 -38
  37. package/memory/retrieval.py +36 -0
  38. package/memory/storage.py +56 -4
  39. package/memory/token_economics.py +14 -2
  40. package/memory/vector_index.py +36 -7
  41. package/package.json +1 -1
  42. package/providers/gemini.sh +89 -2
  43. package/templates/README.md +1 -1
  44. package/templates/cli-tool.md +30 -0
  45. package/templates/dashboard.md +4 -0
  46. package/templates/data-pipeline.md +4 -0
  47. package/templates/discord-bot.md +47 -0
  48. package/templates/game.md +4 -0
  49. package/templates/microservice.md +4 -0
  50. package/templates/npm-library.md +4 -0
  51. package/templates/rest-api-auth.md +50 -20
  52. package/templates/rest-api.md +15 -0
  53. package/templates/saas-starter.md +1 -1
  54. package/templates/slack-bot.md +36 -0
  55. package/templates/static-landing-page.md +9 -1
  56. package/templates/web-scraper.md +4 -0
  57. package/web-app/dist/assets/Badge-CeBkFjo6.js +1 -0
  58. package/web-app/dist/assets/Button-yuhqo8Fq.js +1 -0
  59. package/web-app/dist/assets/{Card-B1bV4syB.js → Card-BG17vsX0.js} +1 -1
  60. package/web-app/dist/assets/{HomePage-CZTV6Nea.js → HomePage-BMSQ7Apj.js} +3 -3
  61. package/web-app/dist/assets/{LoginPage-D4UdURJc.js → LoginPage-aH_6iolg.js} +1 -1
  62. package/web-app/dist/assets/{NotFoundPage-CCLSeL6j.js → NotFoundPage-Di8cNtB1.js} +1 -1
  63. package/web-app/dist/assets/ProjectPage-BtRssmw9.js +285 -0
  64. package/web-app/dist/assets/ProjectsPage-B-FTFagc.js +6 -0
  65. package/web-app/dist/assets/{SettingsPage-Xuv8EfAg.js → SettingsPage-DIJPBla4.js} +1 -1
  66. package/web-app/dist/assets/TeamsPage--19fNX7w.js +36 -0
  67. package/web-app/dist/assets/TemplatesPage-ChUQNOOv.js +11 -0
  68. package/web-app/dist/assets/TerminalOutput-Dwrzecyl.js +31 -0
  69. package/web-app/dist/assets/activity-BNRWeu9N.js +6 -0
  70. package/web-app/dist/assets/{arrow-left-CaGtolHc.js → arrow-left-Ce6g1_YE.js} +1 -1
  71. package/web-app/dist/assets/circle-alert-LIndawHL.js +11 -0
  72. package/web-app/dist/assets/clock-Bpj4VPlP.js +6 -0
  73. package/web-app/dist/assets/{external-link-CazyUyav.js → external-link-BhhdF0iQ.js} +1 -1
  74. package/web-app/dist/assets/folder-open-CM2LgfxI.js +11 -0
  75. package/web-app/dist/assets/index-8-KpWWq7.css +1 -0
  76. package/web-app/dist/assets/index-kPDW4e_b.js +236 -0
  77. package/web-app/dist/assets/lock-sAk3Xe54.js +16 -0
  78. package/web-app/dist/assets/search-CR-2i9by.js +6 -0
  79. package/web-app/dist/assets/server-DuFh4ymA.js +26 -0
  80. package/web-app/dist/assets/trash-2-BmkkT8V_.js +11 -0
  81. package/web-app/dist/index.html +2 -2
  82. package/web-app/server.py +1321 -53
  83. package/web-app/dist/assets/Badge-CBUx2PjL.js +0 -6
  84. package/web-app/dist/assets/Button-DsRiznlh.js +0 -21
  85. package/web-app/dist/assets/ProjectPage-D0w_X9tG.js +0 -237
  86. package/web-app/dist/assets/ProjectsPage-ByYxDlKC.js +0 -16
  87. package/web-app/dist/assets/TemplatesPage-BKWN07mc.js +0 -1
  88. package/web-app/dist/assets/TerminalOutput-Dj98V8Z-.js +0 -51
  89. package/web-app/dist/assets/clock-C_CDmobx.js +0 -11
  90. package/web-app/dist/assets/index-D452pFGl.css +0 -1
  91. package/web-app/dist/assets/index-Df4_kgLY.js +0 -196
@@ -0,0 +1,102 @@
1
+ # Agent 10: Full-Stack Project E2E Testing - Bug Fixes
2
+
3
+ ## Summary
4
+
5
+ Audited all 21 PRD templates, the CLI `loki init` scaffolding system, the web-app file browser, and the file watcher subsystem. Found and fixed 12 bugs across templates, CLI, server, and tests.
6
+
7
+ ## Known Bugs Fixed
8
+
9
+ ### BUG-TPL-001: SaaS template references inconsistent NextAuth.js patterns
10
+ - **File:** `templates/saas-starter.md`
11
+ - **Issue:** Template specifies NextAuth.js v5 in tech stack but uses v4-style route pattern in API docs (`/api/auth/[...nextauth]` without mentioning the v5 `auth.ts` config approach).
12
+ - **Fix:** Updated the OAuth route description to reference both the Auth.js v5 config file (`src/lib/auth.ts`) and the catch-all route, making the pattern consistent.
13
+
14
+ ### BUG-TPL-002: CLI template missing shebang and bin configuration
15
+ - **File:** `templates/cli-tool.md`
16
+ - **Issue:** The CLI tool template had no mention of `#!/usr/bin/env node` shebang, no `bin` field in package.json, and no tsup banner configuration. A CLI tool built from this template would fail on `npm install -g` because the entry point wouldn't be executable.
17
+ - **Fix:** Added a "Package Configuration" section with the required `bin` field in package.json, the shebang requirement, and tsup `banner` configuration to auto-inject the shebang into compiled output.
18
+
19
+ ### BUG-TPL-003: Discord bot template missing environment variable handling
20
+ - **File:** `templates/discord-bot.md`
21
+ - **Issue:** Template referenced `dotenv` in tech stack and `.env.example` in project structure but never specified what environment variables are needed. The AI agent would have to guess `DISCORD_TOKEN`, `DISCORD_CLIENT_ID`, etc.
22
+ - **Fix:** Added a comprehensive "Environment Variables" section with required vars (`DISCORD_TOKEN`, `DISCORD_CLIENT_ID`), optional vars (`DISCORD_GUILD_ID`, `LOG_CHANNEL_ID`, etc.), a complete `.env.example` template, and startup validation code.
23
+
24
+ ### BUG-PROJ-001: File tree too shallow for monorepo structures
25
+ - **File:** `web-app/server.py` (line 2366)
26
+ - **Issue:** `_build_file_tree()` had `max_depth=4`, which is insufficient for monorepo structures like `packages/frontend/src/components/ui/Button.tsx` (6 levels). Files beyond 4 levels were silently omitted from the file browser.
27
+ - **Fix:** Increased `max_depth` from 4 to 8. Added a `MAX_CHILDREN=500` per-directory cap with a "... (N more items)" indicator to prevent memory issues on very large projects. Also added more noise directories to the ignore list: `vendor`, `.turbo`, `.nx`, `coverage`, `.parcel-cache`.
28
+
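The depth limit and per-directory cap described in BUG-PROJ-001 can be sketched as below. This is a minimal illustration, not the code in `web-app/server.py`: the function name `build_file_tree`, the node shape, and the `.git` / `node_modules` entries in the ignore set are assumptions; only `max_depth=8`, `MAX_CHILDREN=500`, the "... (N more items)" indicator, and the five newly ignored directories come from the report.

```python
import os

MAX_DEPTH = 8        # value stated in the fix
MAX_CHILDREN = 500   # per-directory cap stated in the fix
# The last five names come from the report; ".git" and "node_modules" are assumed.
IGNORED = {".git", "node_modules", "vendor", ".turbo", ".nx", "coverage", ".parcel-cache"}


def build_file_tree(root, max_depth=MAX_DEPTH, max_children=MAX_CHILDREN, _depth=0):
    """Return a nested list of node dicts describing the contents of `root`."""
    if _depth >= max_depth:
        return []  # deeper levels are omitted
    try:
        names = sorted(n for n in os.listdir(root) if n not in IGNORED)
    except OSError:
        return []  # unreadable directory: show it as empty
    nodes = []
    for name in names[:max_children]:
        path = os.path.join(root, name)
        if os.path.isdir(path):
            children = build_file_tree(path, max_depth, max_children, _depth + 1)
            nodes.append({"name": name, "type": "directory", "children": children})
        else:
            nodes.append({"name": name, "type": "file"})
    hidden = len(names) - max_children
    if hidden > 0:
        # Tell the user that entries were cut off instead of dropping them silently.
        nodes.append({"name": f"... ({hidden} more items)", "type": "truncated"})
    return nodes
```

With `max_depth=4` the sketch stops listing at `components/`, so `ui/Button.tsx` never appears; with `max_depth=8` it does.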
29
+ ### BUG-PROJ-002: File tree doesn't update after directory moves/renames
30
+ - **File:** `web-app/server.py` (line 429)
31
+ - **Issue:** The `FileChangeHandler.on_any_event()` method filtered directory events to only `("created", "deleted")`, dropping `"moved"` events. When the AI renamed or moved directories during development, the file tree in the browser would not update until a manual refresh.
32
+ - **Fix:** Added `"moved"` to the allowed directory event types: `("created", "deleted", "moved")`.
33
+
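The filter change in BUG-PROJ-002 amounts to one extra entry in the allowed tuple. The sketch below isolates that decision in a pure function so it can be checked without a watcher library; the name `should_forward` is hypothetical, and passing every file-level event through is an assumption, since the report only describes the directory branch.

```python
# Directory event types allowed after the fix, as quoted in the report.
DIRECTORY_EVENT_TYPES = ("created", "deleted", "moved")


def should_forward(event_type: str, is_directory: bool) -> bool:
    """Decide whether a watcher event should trigger a file-tree refresh."""
    if is_directory:
        # Before the fix, "moved" was missing here, so renames were dropped.
        return event_type in DIRECTORY_EVENT_TYPES
    return True  # assumption: file events are not filtered in this sketch
```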
34
+ ## Additional Bugs Discovered and Fixed
35
+
36
+ ### BUG-TPL-004: Phantom `saas-app` template entry (CRITICAL)
37
+ - **Files:** `autonomy/loki` (line 7387), `tests/test-init-command.sh`
38
+ - **Issue:** The `TEMPLATE_NAMES` array in `cmd_init()` contained `saas-app` which has no corresponding template file. Only `saas-starter.md` exists. This caused `loki init --template saas-app` to pass validation (the name is in the array) but then fail at file lookup with a confusing "Unknown template" error. The init tests were also broken, asserting `saas-app` in config.
39
+ - **Fix:** Removed `saas-app` from the `TEMPLATE_NAMES` array, removed its label from `_get_template_label()`, updated the help text examples to reference `saas-starter`, and updated the test file to use `saas-starter` in all 4 affected test cases.
40
+
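A consistency check along the following lines would catch a phantom entry such as `saas-app` before it ships. It is a sketch under stated assumptions: the real `TEMPLATE_NAMES` array lives in the bash CLI, so the Python function `find_phantom_templates` and the `<name>.md` lookup convention are illustrative only.

```python
from pathlib import Path


def find_phantom_templates(names, templates_dir):
    """Return registered template names that have no matching <name>.md file."""
    templates_dir = Path(templates_dir)
    return sorted(n for n in names if not (templates_dir / f"{n}.md").is_file())
```

Run against a directory that contains only `saas-starter.md`, a name list of `saas-starter` and `saas-app` would report `saas-app` as the one entry without a file.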
41
+ ### BUG-TPL-005: REST API Auth template uses .js extensions with "TypeScript throughout"
42
+ - **File:** `templates/rest-api-auth.md`
43
+ - **Issue:** The template says "TypeScript throughout" in requirements but lists all files with `.js` extensions in the project structure. It also referenced `Jest + supertest` instead of `Vitest + supertest`, which is inconsistent with the other templates.
44
+ - **Fix:** Changed all file extensions from `.js` to `.ts` in the project structure, added `tsconfig.json` to the file tree, updated testing framework from "Jest + supertest" to "Vitest + supertest".
45
+
46
+ ### BUG-TPL-006: Templates missing environment variable specifications
47
+ - **Files:** `templates/slack-bot.md`, `templates/rest-api.md`, `templates/rest-api-auth.md`
48
+ - **Issue:** Templates referenced `.env.example` in their project structures but never specified what environment variables are needed. The autonomous agent would have to invent variable names and defaults.
49
+ - **Fix:** Added "Environment Variables" sections with complete `.env.example` content and startup validation requirements to all three templates.
50
+
51
+ ### BUG-TPL-007: api-only README entry says Jest, template uses Vitest
52
+ - **File:** `templates/README.md`
53
+ - **Issue:** The README template gallery listed "Express, in-memory, Jest" for api-only.md but the actual template specifies Vitest.
54
+ - **Fix:** Updated README to say "Vitest" instead of "Jest".
55
+
56
+ ### BUG-TPL-008: Template count mismatch across documentation
57
+ - **Files:** `CLAUDE.md`, `autonomy/loki`
58
+ - **Issue:** CLAUDE.md said "13 PRD templates" but there are 21. The loki CLI said "22 built-in template names" but there are 21 (after removing the phantom `saas-app`).
59
+ - **Fix:** Updated CLAUDE.md to "21 PRD templates". Updated loki CLI comment to "21 built-in template names" and help text to "21 PRD templates".
60
+
61
+ ### BUG-TPL-009: 7 templates missing purpose footer
62
+ - **Files:** `dashboard.md`, `data-pipeline.md`, `game.md`, `microservice.md`, `npm-library.md`, `slack-bot.md`, `web-scraper.md`
63
+ - **Issue:** The README says every template should have a "Purpose Footer" explaining what it tests. 7 templates were missing this section entirely. Also missing: estimated execution time.
64
+ - **Fix:** Added purpose footer with description and time estimate to all 7 templates.
65
+
66
+ ### BUG-TPL-010: static-landing-page missing Success Criteria
67
+ - **File:** `templates/static-landing-page.md`
68
+ - **Issue:** This template had no "Success Criteria" section and no "Testing" section, unlike all other templates. The autonomous agent would not know when to stop.
69
+ - **Fix:** Added a "Success Criteria" section with 6 measurable criteria. Also added estimated time to the purpose footer.
70
+
71
+ ## Validation Results
72
+
73
+ - `bash -n autonomy/loki` -- PASS
74
+ - `python3 -c "import ast; ast.parse(open('web-app/server.py').read())"` -- PASS
75
+ - `bash -n tests/test-init-command.sh` -- PASS
76
+ - All 21 template markdown files have properly closed code blocks -- PASS
77
+ - All 21 templates now have purpose footers with time estimates -- PASS
78
+ - No remaining references to phantom `saas-app` template in source code -- PASS
79
+
80
+ ## Files Changed
81
+
82
+ | File | Change |
83
+ |------|--------|
84
+ | `web-app/server.py` | Fixed file watcher to handle directory moves; increased file tree depth to 8; added monorepo-friendly ignore list and child cap |
85
+ | `autonomy/loki` | Removed phantom `saas-app` template; fixed template count (22->21); updated help text examples |
86
+ | `tests/test-init-command.sh` | Updated 4 test cases from `saas-app` to `saas-starter` |
87
+ | `templates/saas-starter.md` | Fixed NextAuth.js v5 route pattern reference |
88
+ | `templates/cli-tool.md` | Added shebang, bin field, and tsup banner configuration |
89
+ | `templates/discord-bot.md` | Added environment variables section with required/optional vars |
90
+ | `templates/slack-bot.md` | Added environment variables section |
91
+ | `templates/rest-api-auth.md` | Fixed .js to .ts extensions; added env vars section; fixed Jest to Vitest |
92
+ | `templates/rest-api.md` | Added environment variables section |
93
+ | `templates/README.md` | Fixed "Jest" to "Vitest" for api-only entry |
94
+ | `templates/static-landing-page.md` | Added Success Criteria section and time estimate |
95
+ | `templates/dashboard.md` | Added purpose footer |
96
+ | `templates/data-pipeline.md` | Added purpose footer |
97
+ | `templates/game.md` | Added purpose footer |
98
+ | `templates/microservice.md` | Added purpose footer |
99
+ | `templates/npm-library.md` | Added purpose footer |
100
+ | `templates/slack-bot.md` | Added purpose footer |
101
+ | `templates/web-scraper.md` | Added purpose footer |
102
+ | `CLAUDE.md` | Fixed template count from 13 to 21 |
@@ -0,0 +1,70 @@
1
+ # Agent 11: Session Lifecycle E2E Testing - Bug Fixes
2
+
3
+ ## Summary
4
+
5
+ Investigated and fixed 5 bugs in the session lifecycle (start, pause, resume, stop, restart, monitor). Also discovered and fixed 1 new bug. All 3 files modified pass syntax validation.
6
+
7
+ ## Bugs Fixed
8
+
9
+ ### BUG-ST-002: Pause signal not checked between quality gates
10
+ - **File**: `autonomy/run.sh` (lines ~9811, ~9833)
11
+ - **Problem**: Three quality gates (static analysis, test coverage, code review) run sequentially with no pause/stop check between them. If a user sends PAUSE during static analysis, execution continues through all remaining gates before the pause is processed on the next loop iteration. Code review alone can take 30+ seconds.
12
+ - **Fix**: Added pause/stop file checks between each quality gate. If a signal is detected, partial gate failures are saved and `continue` exits to the main loop, which will handle the pause on the next iteration.
13
+
14
+ ### BUG-ST-004: Stop endpoint returns before processes are actually killed
15
+ - **File**: `dashboard/server.py` (line ~2863, `stop_session()`)
16
+ - **Problem**: The `/api/control/stop` endpoint sent SIGTERM via `os.kill(pid, 15)` and immediately returned `{"success": True, "message": "Stop signal sent"}`. The caller (dashboard UI) would show "stopped" while the process was still running and cleaning up. This could lead to users starting a new session while the old one was still shutting down.
17
+ - **Fix**: Added a polling loop (`await asyncio.sleep(0.5)` per check, up to 5s) that waits for the process to actually exit. If the process doesn't exit gracefully within 5s, the endpoint escalates to SIGKILL. The response now includes a `process_stopped` boolean and an accurate message ("Session stopped" vs "Stop signal sent").
18
+
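The wait-then-escalate behaviour described for BUG-ST-004 could look roughly like this. It is a POSIX-only sketch, not the body of `stop_session()`: the helper names `pid_alive` and `stop_process` and the injectable `is_alive` parameter are assumptions added so the logic can be exercised in isolation.

```python
import asyncio
import os
import signal


def pid_alive(pid: int) -> bool:
    """Best-effort probe: signal 0 checks that the PID exists without signalling it.

    Note: an exited child that its parent has not yet reaped (a zombie) still counts
    as existing, so a parent process should check its own children another way.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


async def stop_process(pid, is_alive=pid_alive, grace=5.0, interval=0.5):
    """Send SIGTERM, poll until the process exits, escalate to SIGKILL after `grace`."""
    stopped_reply = {"success": True, "process_stopped": True, "message": "Session stopped"}
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return stopped_reply  # already gone
    waited = 0.0
    while waited < grace:
        await asyncio.sleep(interval)
        waited += interval
        if not is_alive(pid):
            return stopped_reply
    try:
        os.kill(pid, signal.SIGKILL)  # graceful shutdown did not finish in time
    except ProcessLookupError:
        pass  # exited between the last poll and the escalation
    await asyncio.sleep(interval)
    stopped = not is_alive(pid)
    message = "Session stopped" if stopped else "Stop signal sent"
    return {"success": True, "process_stopped": stopped, "message": message}
```

Because the function awaits between polls, other requests to the server can still be served while a stop is in progress.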
19
+ ### BUG-ST-006: Resume doesn't validate checkpoint integrity
20
+ - **File**: `autonomy/run.sh` (`load_state()` at line ~7956)
21
+ - **Problem**: `load_state()` loaded `retryCount` and `iterationCount` from `autonomy-state.json` without validating that the file contained valid JSON or that the values were sane (non-negative integers). A corrupted or truncated state file (from a crash during save, disk full, etc.) could cause the shell to use non-numeric values, leading to arithmetic errors or infinite loops.
22
+ - **Fix**: Added a pre-validation step using Python that checks: (1) the file is valid JSON, (2) `retryCount` and `iterationCount` are numeric, (3) the values are non-negative. If validation fails, it backs up the corrupted file with a `.corrupt.<timestamp>` suffix and starts fresh with both counts reset to 0.
23
+
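The three validation checks and the quarantine step from BUG-ST-006 can be sketched as follows. The function name `load_counters` is hypothetical, and treating a missing key or a non-integer number as corruption is an assumption; the report only says the values must be numeric and non-negative.

```python
import json
import os
import time

COUNTERS = ("retryCount", "iterationCount")


def load_counters(path):
    """Return validated counters, or zeros after quarantining a corrupt state file."""
    fresh = {name: 0 for name in COUNTERS}
    if not os.path.exists(path):
        return fresh
    try:
        with open(path, encoding="utf-8") as fh:
            state = json.load(fh)  # raises ValueError on invalid or truncated JSON
        counters = {}
        for name in COUNTERS:
            value = state[name]  # KeyError/TypeError if absent or state is not an object
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value < 0:
                raise ValueError(f"{name} must be a non-negative integer")
            counters[name] = value
        return counters
    except (OSError, ValueError, KeyError, TypeError):
        # Keep the bad file for later inspection instead of deleting it.
        os.replace(path, f"{path}.corrupt.{int(time.time())}")
        return fresh
```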
24
+ ### BUG-ST-007: Multiple concurrent pause signals cause state corruption
25
+ - **File**: `autonomy/run.sh` (`handle_pause()` at line ~10111)
26
+ - **Problem**: `handle_pause()` had no re-entrancy guard. If a signal handler triggered a second pause while one was already being handled (e.g., signal handler calling cleanup which checks pause state), two concurrent pause handlers could run, both trying to read/write PAUSE files and state. The function also did not save state on pause entry, so a crash during pause would lose the "paused" status.
27
+ - **Fix**: Added `_PAUSE_IN_PROGRESS` guard flag (checked at entry, cleared at all exit paths). Added `save_state` call at pause entry so the "paused" status persists across crashes.
28
+
29
+ ### BUG-ST-008: Non-atomic session.json update in loki CLI
30
+ - **File**: `autonomy/loki` (`cmd_stop()` at line ~1354)
31
+ - **Problem**: While `run.sh` was already fixed to use atomic temp-file + `os.replace()` for session.json updates, the `loki` CLI `cmd_stop()` still used the old pattern: `f.seek(0); f.truncate(); json.dump(d, f)`. This is non-atomic -- if the process is killed between `truncate()` and the `json.dump()` completing, session.json is left empty or partially written. The next `loki status` would fail to parse it.
32
+ - **Fix**: Replaced with the same atomic pattern used in `run.sh`: `tempfile.mkstemp()` + `json.dump()` + `os.replace()`.
33
+
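The `tempfile.mkstemp()` + `json.dump()` + `os.replace()` pattern named in the BUG-ST-008 fix can be sketched like this. The function name `write_json_atomic`, the temp-file prefix, and the `fsync` call are assumptions added for illustration; the report does not show the actual code.

```python
import json
import os
import tempfile


def write_json_atomic(path, data):
    """Write JSON so readers see the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so os.replace never crosses filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp_", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())  # assumption: flush to disk before the rename
        os.replace(tmp_path, path)  # atomic swap on POSIX and Windows
    except BaseException:
        # On any failure the original file is untouched; drop the temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

The contrast with `f.seek(0); f.truncate(); json.dump(d, f)` is that the target path is never opened for writing, so a kill at any point leaves the previous contents readable.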
34
+ ### BUG-ST-010 (NEW): ITERATION_COUNT spuriously incremented on pause resume
35
+ - **File**: `autonomy/run.sh` (`run_autonomous()` main loop at line ~9313)
36
+ - **Problem**: The main while loop incremented `ITERATION_COUNT` at the top of each iteration, BEFORE checking for pause/stop signals. When `check_human_intervention` returned 1 (pause handled, then resumed), the `continue` statement jumped back to the top of the loop, incrementing `ITERATION_COUNT` again without actually running an AI provider iteration. Same issue occurred with `check_budget_limit` returning true. Over a session with multiple pauses, this inflated the iteration count, causing premature `max_iterations_reached` exits and incorrect RARV tier selection.
37
+ - **Fix**: Moved pause/stop and budget checks BEFORE the `ITERATION_COUNT++` increment. Now the count only increments when an actual iteration will execute.
38
+
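The effect of reordering the increment in BUG-ST-010 is easiest to see in a small model. The real loop is bash in `run.sh`; the Python function below is only a simulation, and `run_loop`, `signals`, and `increment_first` are hypothetical names.

```python
def run_loop(signals, increment_first):
    """Model the main loop.

    `signals` holds one boolean per loop pass: True means a pause was handled on
    that pass and the loop hit `continue`. Returns (iteration_count, executed).
    """
    iteration_count = 0
    executed = 0
    for paused in signals:
        if increment_first:
            iteration_count += 1  # old order: count first, then check signals
            if paused:
                continue
        else:
            if paused:
                continue          # new order: skipped passes are not counted
            iteration_count += 1
        executed += 1             # stands in for a real provider iteration
    return iteration_count, executed
```

For two real iterations separated by two pause passes, the old order reports a count of 4 while the new order reports 2, matching the number of iterations that actually ran.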
39
+ ## Bugs Verified Already Fixed
40
+
41
+ ### BUG-ST-001: save_state not atomic
42
+ - **Status**: Already fixed (line 7938). Uses temp file with PID suffix + `mv -f`.
43
+
44
+ ### BUG-ST-003: ITERATION_COUNT not restored on resume
45
+ - **Status**: Already fixed (line 7964). Duplicate of BUG-RUN-003.
46
+
47
+ ### BUG-ST-005: Gate escalation PAUSE writes to wrong path
48
+ - **Status**: Already fixed (line 9804). Writes to `${TARGET_DIR:-.}/.loki/PAUSE`.
49
+
50
+ ## Files Modified
51
+
52
+ | File | Changes |
53
+ |------|---------|
54
+ | `autonomy/run.sh` | BUG-ST-002, BUG-ST-006, BUG-ST-007, BUG-ST-010 |
55
+ | `autonomy/loki` | BUG-ST-008 |
56
+ | `dashboard/server.py` | BUG-ST-004 |
57
+
58
+ ## Validation
59
+
60
+ - `bash -n autonomy/run.sh` -- PASS
61
+ - `bash -n autonomy/loki` -- PASS
62
+ - `python3 -c "import ast; ast.parse(open('dashboard/server.py').read())"` -- PASS
63
+
64
+ ## Edge Cases Considered
65
+
66
+ 1. **Crash during save_state**: Atomic write via temp+mv means the file is either fully written or not written at all. No partial state.
67
+ 2. **Concurrent stop+pause**: The pause handler checks for STOP file in its wait loop. If both arrive simultaneously, STOP takes precedence (handle_pause returns 1, which maps to return 2/stop in check_human_intervention).
68
+ 3. **Disk full during session.json write**: `tempfile.mkstemp` will fail, caught by the `except (json.JSONDecodeError, OSError): pass` handler. The original file is untouched.
69
+ 4. **OOM kill during pause**: State is saved to "paused" status at pause entry. On restart, `load_state()` will restore the paused state and the session will resume from the correct iteration.
70
+ 5. **Rapid pause/resume cycling**: The `_PAUSE_IN_PROGRESS` guard prevents re-entrant pause handling. The iteration count fix prevents count inflation during rapid pause/resume cycles.
@@ -0,0 +1,120 @@
1
+ # Agent 12 Bug Fixes - Discovered During Scenario Writing
2
+
3
+ Date: 2026-03-24 | Version: v6.71.1
4
+
5
+ ---
6
+
7
+ ## Bugs Fixed (5 fixes across 2 files)
8
+
9
+ ### 1. BUG-EP-012: Corrupted memory index/timeline not auto-recovered
10
+
11
+ **File:** `memory/storage.py` (lines 170-192)
12
+ **Severity:** Medium
13
+ **Symptom:** If `.loki/memory/index.json` or `timeline.json` becomes corrupted (invalid JSON from a crash or disk error), all memory operations silently fail permanently. The `_ensure_index()` method only recreates the file when it does not exist, not when it exists but contains invalid JSON.
14
+
15
+ **Fix:** Added JSON validity checks in `_ensure_index()` and `_ensure_timeline()`. When the file exists but is corrupted (JSONDecodeError), it is now logged and recreated from scratch. This restores memory system functionality without requiring manual file deletion.
16
+
17
+ **Before:**
18
+ ```python
+ def _ensure_index(self) -> None:
+     index_path = self.base_path / "index.json"
+     if not index_path.exists():
+         # ... create initial index
+ ```
24
+
25
+ **After:**
26
+ ```python
+ def _ensure_index(self) -> None:
+     index_path = self.base_path / "index.json"
+     needs_init = not index_path.exists()
+     if not needs_init:
+         try:
+             text = index_path.read_text(encoding="utf-8", errors="replace")
+             json.loads(text)
+         except (json.JSONDecodeError, OSError):
+             logging.getLogger(__name__).warning(
+                 "Corrupted index.json detected, recreating from scratch"
+             )
+             needs_init = True
+     if needs_init:
+         # ... create initial index
+ ```
42
+
43
+ ---
44
+
45
+ ### 2. BUG-EP-015: Orphaned temp files accumulate after kill -9
46
+
47
+ **Files:** `memory/storage.py`, `autonomy/run.sh`
48
+ **Severity:** Low
49
+ **Symptom:** When a process is killed with SIGKILL during an atomic write (temp file + rename), the temp file is left behind because the rename never completes. These `.tmp_*.json` files in the memory directory and `.tmp.*` files in `.loki/` accumulate indefinitely.
50
+
51
+ **Fix (memory/storage.py):** Added `_cleanup_stale_tmp_files()` method that runs on MemoryStorage initialization. Removes `.tmp_*.json` files older than 5 minutes.
52
+
53
+ **Fix (autonomy/run.sh):** Added cleanup in `load_state()` that runs `find .loki/ -name "*.tmp.*" -mmin +5 -delete` on session startup. This catches orphaned temp files from previous kill -9 events.
54
+
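The age-based cleanup described for `memory/storage.py` might look like the sketch below. It is not the actual `_cleanup_stale_tmp_files()` method: the standalone function, its `now` parameter, and the choice to scan only the top level of the directory are assumptions. The 5-minute threshold and the `.tmp_*.json` pattern come from the report.

```python
import time
from pathlib import Path

STALE_AFTER_SECONDS = 5 * 60  # threshold stated in the fix


def cleanup_stale_tmp_files(directory, max_age=STALE_AFTER_SECONDS, now=None):
    """Delete .tmp_*.json files older than `max_age`; return the removed names."""
    now = time.time() if now is None else now
    removed = []
    for path in Path(directory).glob(".tmp_*.json"):
        try:
            if now - path.stat().st_mtime > max_age:
                path.unlink()
                removed.append(path.name)
        except OSError:
            continue  # the file vanished or cannot be removed; skip it
    return sorted(removed)
```

The age threshold is what keeps this safe alongside a live writer: a temp file that another process created seconds ago is left alone.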
55
+ ---
56
+
57
+ ### 3. BUG-EC-013: Empty provider output silently treated as success
58
+
59
+ **File:** `autonomy/run.sh` (after provider invocation, ~line 9691)
60
+ **Severity:** Medium
61
+ **Symptom:** When a provider returns exit code 0 but produces zero output (0 bytes in iter_output), the system treats it as a successful iteration. This wastes iterations -- the system continues to the next iteration without detecting that nothing happened. If the provider consistently returns empty output (broken prompt, API issue), the stagnation detector does not kick in for 5+ iterations.
62
+
63
+ **Fix:** Added a post-invocation check: if `$iter_output` exists, is empty (0 bytes), and exit_code is 0, the exit_code is overridden to 1 with a warning log message. This ensures the iteration is treated as a failure, triggering appropriate retry/backoff logic.
64
+
65
+ ```bash
+ # BUG-EC-013: Detect empty provider output (0 bytes = no work done)
+ if [ -f "$iter_output" ] && [ ! -s "$iter_output" ] && [ $exit_code -eq 0 ]; then
+     log_warn "Provider returned empty output (0 bytes) despite exit code 0 -- treating as error"
+     exit_code=1
+ fi
+ ```
72
+
73
+ ---
74
+
75
+ ### 4. BUG-EC-014: Quality gate subprocesses have no timeout
76
+
77
+ **File:** `autonomy/run.sh` (enforce_test_coverage, ~line 5529)
78
+ **Severity:** High
79
+ **Symptom:** Test runner invocations (vitest, jest, mocha) inside quality gates have no timeout. A hanging test runner (e.g., waiting for user input, network timeout, infinite loop in tests) blocks the entire autonomous iteration indefinitely. The system becomes unresponsive.
80
+
81
+ **Fix:** Wrapped all test runner invocations with the `timeout` command, defaulting to 300 seconds (5 minutes), configurable via `LOKI_GATE_TIMEOUT` environment variable. When the timeout fires, the test runner is killed and the gate reports failure, allowing the system to continue.
82
+
83
+ ```bash
+ local gate_timeout="${LOKI_GATE_TIMEOUT:-300}"  # 5 minutes default
+ output=$(cd "${TARGET_DIR:-.}" && timeout "$gate_timeout" npx vitest run --reporter=json 2>&1) || test_passed=false
+ ```
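For comparison, the same "timeout counts as gate failure" rule can be expressed in Python. This is an illustrative analogue of the bash wrapper above, not code from the package: `gate_timeout` and `run_gate` are hypothetical names, and only the `LOKI_GATE_TIMEOUT` variable and its 300-second default come from the report.

```python
import os
import subprocess


def gate_timeout(env=None):
    """Read the gate timeout in seconds, defaulting to 300 (5 minutes)."""
    env = os.environ if env is None else env
    return int(env.get("LOKI_GATE_TIMEOUT", "300"))


def run_gate(cmd, timeout_seconds):
    """Run a gate command and return (passed, output); a timeout is a failure."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr, like 2>&1 in the shell version
            text=True,
            timeout=timeout_seconds,   # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return False, f"gate timed out after {timeout_seconds}s"
    return proc.returncode == 0, proc.stdout
```

One caveat applies to this sketch: `subprocess.run` kills only the direct child on timeout, so a runner that spawns its own workers may need process-group handling.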
87
+
88
+ ---
89
+
90
+ ## Bugs Identified But Not Fixed (4 bugs, require design decisions)
91
+
92
+ ### BUG-EP-004: check_provider_health() validates key exists, not validity
93
+ - **Location:** run.sh:6864
94
+ - **Reason not fixed:** Validating key validity requires an API call to each provider, which has cost/rate-limit implications. Requires design decision on whether to add a lightweight health check endpoint call.
95
+
96
+ ### BUG-CU-002: No automatic dashboard port increment
97
+ - **Location:** run.sh dashboard startup
98
+ - **Reason not fixed:** Changing port allocation logic requires coordination between the dashboard server, the CLI status display, and the web frontend (which connects to a hardcoded port). Needs design discussion on port discovery mechanism.
99
+
100
+ ### BUG-CU-005: Export reads state files without cross-file consistency
101
+ - **Location:** loki:5034
102
+ - **Reason not fixed:** True cross-file consistency requires either a snapshot mechanism or a single monolithic state file. The current multi-file approach is by design for performance. Low impact since export is typically used after pausing.
103
+
104
+ ### BUG-EC-002: No PRD size limit or truncation before context injection
105
+ - **Location:** run.sh build_prompt
106
+ - **Reason not fixed:** The PRD is passed as a file path reference, not inline content. Truncation would lose requirements. The AI provider handles context window overflow. However, a warning for very large PRDs (> 50KB) would be useful.
107
+
108
+ ---
109
+
110
+ ## Test Impact
111
+
112
+ The fixes touch two files:
113
+ 1. `memory/storage.py` - Memory system initialization (covered by `tests/test-memory-engine.sh`, `tests/test-unified-memory.sh`)
114
+ 2. `autonomy/run.sh` - Core orchestration loop (covered by `tests/test-state-recovery.sh`, `tests/test-v6-features.sh`)
115
+
116
+ All fixes are backward-compatible:
117
+ - Memory corruption recovery only triggers on actual corruption (no behavioral change for healthy systems)
118
+ - Temp file cleanup only removes files older than 5 minutes (safe with concurrent processes)
119
+ - Empty output detection is narrowly scoped (only overrides exit_code when output is literally 0 bytes AND exit was 0)
120
+ - Quality gate timeout defaults to 5 minutes (long enough for most test suites; configurable via env var)
@@ -0,0 +1,143 @@
1
+ # Agent 13 - Enterprise Bug Fixes
2
+
3
+ Bugs discovered during enterprise scenario writing. Each includes root cause
4
+ analysis, affected files, and applied fix (where applicable).
5
+
6
+ ---
7
+
8
+ ## BUG-E01: Helm Chart appVersion Severely Out of Date
9
+
10
+ **Severity:** Medium
11
+ **Status:** Fixed
12
+
13
+ **Description:**
14
+ The Helm chart `Chart.yaml` has `appVersion: "5.52.0"` while the actual product
15
+ version is `6.71.1`. This means `helm install` without an explicit `--set image.tag`
16
+ will pull the Docker image tagged `5.52.0`, which is a full major version behind.
17
+ The `_helpers.tpl` `autonomi.image` template defaults to `Chart.appVersion` when
18
+ `image.tag` is empty, so this directly affects production deployments.
19
+
20
+ **Root Cause:**
21
+ The Helm chart `appVersion` is not included in the 14-location version bump
22
+ checklist in CLAUDE.md. It has drifted since the chart was first created.
23
+
24
+ **Affected Files:**
25
+ - `deploy/helm/autonomi/Chart.yaml` (line 6)
26
+
27
+ **Fix Applied:**
28
+ Updated `appVersion` from `"5.52.0"` to `"6.71.1"`.
29
+
30
+ ---
31
+
32
+ ## BUG-E02: automountServiceAccountToken Conflict
33
+
34
+ **Severity:** Low
35
+ **Status:** Documented (intentional override but inconsistent intent)
36
+
37
+ **Description:**
38
+ The ServiceAccount template (`serviceaccount.yaml:12`) sets
39
+ `automountServiceAccountToken: false` (security best practice -- do not mount
40
+ the SA token unless needed). However, the controlplane deployment template
41
+ (`deployment-controlplane.yaml:29`) explicitly sets
42
+ `automountServiceAccountToken: true` at the pod spec level. The pod-level
43
+ setting overrides the SA-level setting, so the token IS mounted in controlplane
44
+ pods.
45
+
46
+ The worker deployment does NOT set `automountServiceAccountToken` at the pod
47
+ level, so it inherits the SA-level `false` setting. This means:
48
+ - Controlplane pods: SA token IS mounted (explicit true)
49
+ - Worker pods: SA token is NOT mounted (inherits SA false)
50
+
51
+ This is likely intentional (controlplane needs K8s API access for the RBAC role
52
+ to query pods/logs/configmaps/events), but the inconsistency should be
53
+ documented. If the controlplane needs the token, the SA-level `false` is
54
+ misleading.
55
+
56
+ **Affected Files:**
57
+ - `deploy/helm/autonomi/templates/serviceaccount.yaml` (line 12)
58
+ - `deploy/helm/autonomi/templates/deployment-controlplane.yaml` (line 29)
59
+
60
+ **Recommendation:**
61
+ Add a comment in `serviceaccount.yaml` explaining that the controlplane
62
+ overrides this at the pod level. Alternatively, make the SA-level setting
63
+ configurable via values.yaml.
64
+
65
+ ---
66
+
67
+ ## BUG-E03: Agent Card Reports "sso": false Despite OIDC Implementation
68
+
69
+ **Severity:** Low
70
+ **Status:** Fixed
71
+
72
+ **Description:**
73
+ The A2A Agent Card endpoint (`GET /.well-known/agent.json`) in
74
+ `dashboard/server.py:516` hardcodes `"sso": False` in the enterprise
75
+ capabilities section. However, OIDC/SSO support is fully implemented in
76
+ `dashboard/auth.py` with:
77
+ - OIDC issuer discovery
78
+ - JWKS key fetching and caching
79
+ - JWT validation (with PyJWT when available)
80
+ - Support for Okta, Azure AD, Google Workspace
81
+
82
+ The `sso` field should dynamically reflect whether OIDC is configured.
83
+
84
+ **Affected Files:**
85
+ - `dashboard/server.py` (line 516)
86
+
87
+ **Fix Applied:**
88
+ Changed `"sso": False` to `"sso": auth.is_oidc_mode()` so the agent card
89
+ accurately reflects the current OIDC configuration state.
90
+
91
+ ---
92
+
93
+ ## BUG-E04: Worker Deployment Missing Audit Logs Volume Mount
94
+
95
+ **Severity:** Low
96
+ **Status:** Documented
97
+
98
+ **Description:**
99
+ The controlplane deployment mounts both `checkpoints` and `audit-logs` volumes
100
+ (lines 79-86 in deployment-controlplane.yaml). The worker deployment only
101
+ mounts `checkpoints` (lines 73-77 in deployment-worker.yaml). If workers
102
+ perform any audit-worthy actions that write to the audit log path
103
+ (`/data/audit/audit.log`), those writes will fail silently or go to the
104
+ ephemeral container filesystem.
105
+
106
+ This may be intentional (only the controlplane/dashboard writes audit logs),
107
+ but if RARV iteration actions should be audited at the worker level, the
108
+ volume mount is needed.
109
+
110
+ **Affected Files:**
111
+ - `deploy/helm/autonomi/templates/deployment-worker.yaml` (missing audit volume mount)
112
+
113
+ **Recommendation:**
114
+ If workers should write audit logs, add the audit-logs volume mount. If only
115
+ the controlplane audits, add a comment in the worker template explaining the
116
+ intentional omission.
117
+
118
+ ---
119
+
120
+ ## BUG-E05: Helm Test test-health.yaml Expects python3 in curl Image
121
+
122
+ **Severity:** Medium
123
+ **Status:** Fixed
124
+
125
+ **Description:**
126
+ The Helm test `test-health.yaml` uses the `curlimages/curl:8.5.0` image and
127
+ attempts to pipe the API response through `python3 -c "import sys, json;
128
+ json.load(sys.stdin)"`. The `curlimages/curl` image is Alpine-based and does
129
+ NOT include Python, so the python3 JSON validation step always fails.
130
+
131
+ The fallback `grep -q '{'` check partially compensates, but the logic flow is
132
+ incorrect: the `||` chain means it tries python3 first, and if python3 is not
133
+ found (exit code 127), it falls through to grep. This works accidentally but
134
+ is fragile and misleading.
135
+
136
+ **Affected Files:**
137
+ - `deploy/helm/autonomi/tests/test-health.yaml` (lines 22-23)
138
+
139
+ **Fix Applied:**
140
+ Replaced the python3 JSON validation with a pure-shell approach that only uses
141
+ tools available in the curl image (grep for JSON structure verification).
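A minimal sketch of what a grep-only structure check can look like, assuming the response body has already been captured into a variable. The helper name `looks_like_json_object` is hypothetical and is not taken from `test-health.yaml`; note that this only checks that the body begins like a JSON object, it does not validate JSON.

```shell
# Hypothetical helper, not the code in test-health.yaml.
# Succeeds when some line of the body starts with "{" (optionally after
# whitespace). This is a structure sniff, not real JSON validation.
looks_like_json_object() {
  printf '%s\n' "$1" | grep -q '^[[:space:]]*{'
}
```

Because it relies only on `printf` and `grep`, a check of this shape has no dependency on an interpreter that the curl image does not ship.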
142
+
143
+ ---
@@ -0,0 +1,88 @@
1
+ # Agent 14: First-Time User Acceptance Testing -- Bug Fixes
2
+
3
+ **Date:** 2026-03-24
4
+ **Scope:** Full first-time user journey audit (install through first build)
5
+ **Files Modified:** `autonomy/loki`, `docs/INSTALLATION.md`, `README.md`
6
+
7
+ ---
8
+
9
+ ## Bugs Fixed
10
+
11
+ ### BUG-FTU-001: `loki init` does not tell user to set up AI provider
12
+ **Severity:** High -- first-time users scaffold a project but have no idea they need a provider CLI
13
+ **Location:** `autonomy/loki`, `cmd_init()` (line ~7793)
14
+ **Fix:** Added post-scaffold check that detects whether any AI provider CLI (claude, codex, gemini, cline, aider) is installed. If none found, prints clear installation instructions and suggests running `loki doctor`.
15
+
16
+ ### BUG-FTU-002: `loki web` opens browser before server is ready
17
+ **Severity:** Medium -- user sees a blank page or connection refused on first launch
18
+ **Location:** `autonomy/loki`, `cmd_web_start()` (line ~3336)
19
+ **Root Cause:** The readiness loop (`curl` against `/api/session/status`) ran up to 15 retries, but its result was never checked. The browser opened regardless of whether the server actually responded.
20
+ **Fix:** Track readiness in a `server_ready` flag and open the browser only when `server_ready=true`. If the server has not responded by the end of the retries, print a message telling the user to open the URL manually or refresh.
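The pattern described in this fix can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `cmd_web_start()`: the function names `wait_for_server` and `browser_action` are hypothetical, and the probe is passed in as a command name so the loop can be exercised without a running server.

```shell
# Hypothetical sketch, not the code in autonomy/loki.
# wait_for_server PROBE [RETRIES] [DELAY_SECONDS]
# Runs PROBE until it succeeds or RETRIES attempts are used up, and records
# the outcome in the global flag server_ready.
wait_for_server() {
  probe=$1
  retries=${2:-15}
  delay=${3:-1}
  server_ready=false
  attempt=0
  while [ "$attempt" -lt "$retries" ]; do
    if "$probe" >/dev/null 2>&1; then
      server_ready=true
      break
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  [ "$server_ready" = true ]
}

# Prints the action the caller should take for URL $1, based on the flag.
browser_action() {
  if [ "$server_ready" = true ]; then
    echo "open $1"
  else
    echo "Server is still starting. Open $1 manually or refresh."
  fi
}
```

In real use the probe would presumably be a small function wrapping `curl -sf` against `/api/session/status`. The key point is that the loop's result is stored and then consulted before the browser is opened.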
21
+
22
+ ### BUG-FTU-003: `loki quick` with no provider CLI gives unhelpful error
23
+ **Severity:** High -- user sees a cryptic `run.sh` error instead of actionable guidance
24
+ **Location:** `autonomy/loki`, `cmd_quick()` (line ~7050)
25
+ **Fix:** Added pre-flight provider CLI check before `exec "$RUN_SH"`. If the provider CLI is missing, prints the specific install command for that provider (e.g., `npm install -g @anthropic-ai/claude-code` for claude).
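A pre-flight check of this kind might look like the sketch below. The function names are hypothetical, and only the `claude` install command comes from the report above; the fallback text for other providers is a placeholder rather than the message the CLI actually prints.

```shell
# Hypothetical sketch, not the code in cmd_quick().
# Prints an install hint for provider $1. Only the claude command is taken
# from the bug report; the fallback text is a placeholder.
provider_install_hint() {
  case "$1" in
    claude) echo "npm install -g @anthropic-ai/claude-code" ;;
    *)      echo "see the installation docs for '$1'" ;;
  esac
}

# Returns non-zero with an actionable message when the CLI $1 is not on PATH.
require_provider_cli() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "Error: provider CLI '$1' not found." >&2
    echo "Install it with: $(provider_install_hint "$1")" >&2
    return 1
  fi
}
```

Calling a check like `require_provider_cli "$provider" || exit 1` before `exec "$RUN_SH"` means the user sees the install hint instead of a failure from deep inside `run.sh`.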
26
+
27
+ ### BUG-FTU-005: `loki start` with no provider CLI gives unhelpful error
28
+ **Severity:** High -- same root cause as BUG-FTU-003 but for the main `start` command
29
+ **Location:** `autonomy/loki`, `cmd_start()` (line ~1095)
30
+ **Fix:** Added pre-flight provider CLI check before `exec "$RUN_SH"`. Clear error message with install command.
31
+
32
+ ### BUG-FTU-006: `loki doctor` does not check API keys or "no provider at all"
33
+ **Severity:** Medium -- doctor gives green output even when no provider is usable
34
+ **Location:** `autonomy/loki`, `cmd_doctor()` (line ~5902)
35
+ **Fix:** Added two new sections to doctor output:
36
+ 1. After listing all provider CLIs, check if zero providers are installed and show a FAIL with install instructions.
37
+ 2. New "API Keys" section showing status of `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`. For provider CLIs that use their own login sessions, a note is shown instead of a failure.
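The zero-provider check from item 1 can be sketched as below. The provider list is the one named under BUG-FTU-001; the function names and the exact wording of the output are assumptions, not the real `loki doctor` output.

```shell
# Hypothetical sketch, not the code in cmd_doctor().
# Prints how many of the given CLI names are available on PATH.
count_installed() {
  found=0
  for cli in "$@"; do
    if command -v "$cli" >/dev/null 2>&1; then
      found=$((found + 1))
    fi
  done
  echo "$found"
}

# Prints a FAIL line and returns non-zero when no provider CLI is installed.
check_any_provider() {
  if [ "$(count_installed claude codex gemini cline aider)" -eq 0 ]; then
    echo "FAIL: no AI provider CLI installed (at least one is required)"
    return 1
  fi
  echo "PASS: at least one AI provider CLI installed"
}
```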
38
+
39
+ ### BUG-FTU-004/BUG-FTU-007: INSTALLATION.md contains inaccurate references
40
+ **Severity:** Medium -- confuses new users with nonexistent paths
41
+ **Location:** `docs/INSTALLATION.md`
42
+ **Fixes:**
43
+ - **Wrong license**: File structure section claimed "MIT License" but actual license is "Business Source License 1.1". Fixed.
44
+ - **Wrong directory**: Referenced `examples/` directory (which does not exist) instead of `templates/`. Fixed to show `templates/` with accurate description.
45
+ - **Broken next steps**: "Next Steps" section referenced `./autonomy/run.sh examples/simple-todo-app.md` which is a path that does not exist. Replaced with the standard workflow: `loki doctor` -> `loki init` -> `loki start`.
46
+ - **Stale note**: In the note "Some files/directories (autonomy, tests, examples)", `examples` was changed to `templates`.
47
+ - **Broken relative link**: `[README.md](README.md)` from `docs/` should be `[README.md](../README.md)`. Fixed.
48
+
49
+ ---
50
+
51
+ ## README.md Improvements
52
+
53
+ ### Improved "Get Started in 30 Seconds" section
54
+ **Problem:** The quick start jumped directly from `npm install` to `loki start ./prd.md` without explaining where a first-time user gets a PRD file. This was a dead end for anyone who does not already have a PRD.
55
+ **Fix:** Added `loki init my-app --template simple-todo-app` and `cd my-app` steps to bridge the gap. Also added a `loki quick` alternative for users who want to skip PRD creation entirely.
56
+
57
+ ---
58
+
59
+ ## Bugs Verified as Already Fixed
60
+
61
+ ### BUG-CLI-001: `--port` flag crashes (unbound variable)
62
+ **Status:** Already fixed in current codebase.
63
+ **Evidence:** Both `cmd_web_start()` and `cmd_dashboard_start()` guard the `--port` flag with `[[ -z "${2:-}" ]]` checks and define default port variables (`PURPLE_LAB_DEFAULT_PORT=57375`, `DASHBOARD_DEFAULT_PORT=57374`). All port references use the `${LOKI_DASHBOARD_PORT:-57374}` pattern, so there is no unbound-variable risk.
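For readers unfamiliar with the guard, the sketch below shows why `${2:-}` matters when `set -u` (nounset) is active: a bare `$2` would abort the script when `--port` is the last argument, whereas `${2:-}` expands to an empty string that can be tested. The function `resolve_port` is hypothetical, and it uses the portable `[ ... ]` test rather than the `[[ ... ]]` form quoted above.

```shell
DASHBOARD_DEFAULT_PORT=57374

# Hypothetical sketch, not the code in autonomy/loki.
# The "( ... )" subshell body keeps "set -u" local to this function.
resolve_port() (
  set -u
  port=${LOKI_DASHBOARD_PORT:-$DASHBOARD_DEFAULT_PORT}
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --port)
        # ${2:-} is safe under nounset even when no value follows --port.
        if [ -z "${2:-}" ]; then
          echo "Error: --port requires a value" >&2
          exit 1
        fi
        port=$2
        shift 2
        ;;
      *)
        shift
        ;;
    esac
  done
  echo "$port"
)
```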
64
+
65
+ ---
66
+
67
+ ## New Bugs Discovered (Not Fixed -- Documenting Only)
68
+
69
+ ### BUG-FTU-008: `INSTALLATION.md` "What's New" section is stale
70
+ The section header says "What's New in v6.7.0" but the current version is v6.71.1. The content describes features from v5.15.0 through v6.1.0, all of them many versions old. This misleads first-time users about the product's current state. Recommendation: either update the section to show recent highlights or remove version-specific "what's new" content from the installation guide entirely (it belongs in the CHANGELOG).
71
+
72
+ ### BUG-FTU-009: `loki doctor` providers all marked "optional"
73
+ All five AI providers show as "optional" in doctor output. For a first-time user, this implies that none of them is needed, when in fact at least one is required for any functionality. The BUG-FTU-006 fix (checking for zero providers) mitigates this, but the individual items could be marked "at least one required" for clarity.
74
+
75
+ ---
76
+
77
+ ## Test Matrix
78
+
79
+ | Journey Step | Before | After |
80
+ |---|---|---|
81
+ | `loki init my-app` with no provider CLI | No guidance | Prints install instructions |
82
+ | `loki start prd.md` with no provider CLI | Cryptic run.sh error | Clear error with install command |
83
+ | `loki quick "task"` with no provider CLI | Cryptic run.sh error | Clear error with install command |
84
+ | `loki web` on slow server start | Browser opens to blank page | Browser deferred; user told to refresh |
85
+ | `loki doctor` with no providers | All green (misleading) | Explicit FAIL + API key section |
86
+ | INSTALLATION.md file structure | References nonexistent `examples/` | References correct `templates/` |
87
+ | INSTALLATION.md license | Claims MIT | Correctly says BSL 1.1 |
88
+ | README.md quick start | Assumes user has a PRD | Guides through `loki init` |