loki-mode 6.71.1 → 6.72.0
- package/README.md +9 -1
- package/SKILL.md +2 -2
- package/VERSION +1 -1
- package/autonomy/hooks/migration-hooks.sh +26 -0
- package/autonomy/loki +429 -92
- package/autonomy/run.sh +219 -38
- package/dashboard/__init__.py +1 -1
- package/dashboard/server.py +101 -19
- package/docs/INSTALLATION.md +20 -11
- package/docs/bug-fixes/agent-01-cli-fixes.md +101 -0
- package/docs/bug-fixes/agent-02-purplelab-fixes.md +88 -0
- package/docs/bug-fixes/agent-03-dashboard-fixes.md +119 -0
- package/docs/bug-fixes/agent-04-memory-fixes.md +105 -0
- package/docs/bug-fixes/agent-05-provider-fixes.md +86 -0
- package/docs/bug-fixes/agent-06-integration-fixes.md +101 -0
- package/docs/bug-fixes/agent-07-dash-run-fixes.md +101 -0
- package/docs/bug-fixes/agent-08-docker-fixes.md +164 -0
- package/docs/bug-fixes/agent-09-e2e-build-fixes.md +69 -0
- package/docs/bug-fixes/agent-10-e2e-fullstack-fixes.md +102 -0
- package/docs/bug-fixes/agent-11-e2e-session-fixes.md +70 -0
- package/docs/bug-fixes/agent-12-scenario-fixes.md +120 -0
- package/docs/bug-fixes/agent-13-enterprise-fixes.md +143 -0
- package/docs/bug-fixes/agent-14-uat-newuser-fixes.md +88 -0
- package/docs/bug-fixes/agent-15-uat-poweruser-fixes.md +132 -0
- package/docs/bug-fixes/agent-19-code-review.md +316 -0
- package/docs/bug-fixes/agent-20-architecture-review.md +331 -0
- package/docs/competitive/bolt-new-analysis.md +579 -0
- package/docs/competitive/emergence-others-analysis.md +605 -0
- package/docs/competitive/replit-lovable-analysis.md +622 -0
- package/docs/test-scenarios/edge-cases.md +813 -0
- package/docs/test-scenarios/enterprise-scenarios.md +732 -0
- package/mcp/__init__.py +1 -1
- package/mcp/server.py +49 -5
- package/memory/consolidation.py +33 -0
- package/memory/embeddings.py +10 -1
- package/memory/engine.py +83 -38
- package/memory/retrieval.py +36 -0
- package/memory/storage.py +56 -4
- package/memory/token_economics.py +14 -2
- package/memory/vector_index.py +36 -7
- package/package.json +1 -1
- package/providers/gemini.sh +89 -2
- package/templates/README.md +1 -1
- package/templates/cli-tool.md +30 -0
- package/templates/dashboard.md +4 -0
- package/templates/data-pipeline.md +4 -0
- package/templates/discord-bot.md +47 -0
- package/templates/game.md +4 -0
- package/templates/microservice.md +4 -0
- package/templates/npm-library.md +4 -0
- package/templates/rest-api-auth.md +50 -20
- package/templates/rest-api.md +15 -0
- package/templates/saas-starter.md +1 -1
- package/templates/slack-bot.md +36 -0
- package/templates/static-landing-page.md +9 -1
- package/templates/web-scraper.md +4 -0
- package/web-app/dist/assets/Badge-CeBkFjo6.js +1 -0
- package/web-app/dist/assets/Button-yuhqo8Fq.js +1 -0
- package/web-app/dist/assets/{Card-B1bV4syB.js → Card-BG17vsX0.js} +1 -1
- package/web-app/dist/assets/{HomePage-CZTV6Nea.js → HomePage-BMSQ7Apj.js} +3 -3
- package/web-app/dist/assets/{LoginPage-D4UdURJc.js → LoginPage-aH_6iolg.js} +1 -1
- package/web-app/dist/assets/{NotFoundPage-CCLSeL6j.js → NotFoundPage-Di8cNtB1.js} +1 -1
- package/web-app/dist/assets/ProjectPage-BtRssmw9.js +285 -0
- package/web-app/dist/assets/ProjectsPage-B-FTFagc.js +6 -0
- package/web-app/dist/assets/{SettingsPage-Xuv8EfAg.js → SettingsPage-DIJPBla4.js} +1 -1
- package/web-app/dist/assets/TeamsPage--19fNX7w.js +36 -0
- package/web-app/dist/assets/TemplatesPage-ChUQNOOv.js +11 -0
- package/web-app/dist/assets/TerminalOutput-Dwrzecyl.js +31 -0
- package/web-app/dist/assets/activity-BNRWeu9N.js +6 -0
- package/web-app/dist/assets/{arrow-left-CaGtolHc.js → arrow-left-Ce6g1_YE.js} +1 -1
- package/web-app/dist/assets/circle-alert-LIndawHL.js +11 -0
- package/web-app/dist/assets/clock-Bpj4VPlP.js +6 -0
- package/web-app/dist/assets/{external-link-CazyUyav.js → external-link-BhhdF0iQ.js} +1 -1
- package/web-app/dist/assets/folder-open-CM2LgfxI.js +11 -0
- package/web-app/dist/assets/index-8-KpWWq7.css +1 -0
- package/web-app/dist/assets/index-kPDW4e_b.js +236 -0
- package/web-app/dist/assets/lock-sAk3Xe54.js +16 -0
- package/web-app/dist/assets/search-CR-2i9by.js +6 -0
- package/web-app/dist/assets/server-DuFh4ymA.js +26 -0
- package/web-app/dist/assets/trash-2-BmkkT8V_.js +11 -0
- package/web-app/dist/index.html +2 -2
- package/web-app/server.py +1321 -53
- package/web-app/dist/assets/Badge-CBUx2PjL.js +0 -6
- package/web-app/dist/assets/Button-DsRiznlh.js +0 -21
- package/web-app/dist/assets/ProjectPage-D0w_X9tG.js +0 -237
- package/web-app/dist/assets/ProjectsPage-ByYxDlKC.js +0 -16
- package/web-app/dist/assets/TemplatesPage-BKWN07mc.js +0 -1
- package/web-app/dist/assets/TerminalOutput-Dj98V8Z-.js +0 -51
- package/web-app/dist/assets/clock-C_CDmobx.js +0 -11
- package/web-app/dist/assets/index-D452pFGl.css +0 -1
- package/web-app/dist/assets/index-Df4_kgLY.js +0 -196
|
@@ -0,0 +1,331 @@
|
|
|
1
|
+
# Agent 20: Architecture Review Report
|
|
2
|
+
|
|
3
|
+
**Date:** 2026-03-24
|
|
4
|
+
**Scope:** Systemic architecture issues, component boundaries, security, complexity
|
|
5
|
+
**Files Reviewed:** 40+ files across all major subsystems
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## Executive Summary
|
|
10
|
+
|
|
11
|
+
Loki Mode's architecture follows a filesystem-as-IPC pattern where all components
|
|
12
|
+
communicate through `.loki/` state files. While pragmatic for a CLI-first tool,
|
|
13
|
+
this design has accumulated several systemic issues as the codebase scaled to
|
|
14
|
+
40,000+ lines across Bash, Python, and TypeScript. The most critical findings
|
|
15
|
+
are: (1) three inconsistent LOKI_DIR resolution mechanisms within the dashboard
|
|
16
|
+
package, (2) a dual event system where events are written to two different
|
|
17
|
+
locations with different formats, (3) non-atomic JSON writes in `set_phase()`
|
|
18
|
+
that can corrupt state under concurrent access, and (4) extreme file sizes
|
|
19
|
+
that hinder maintainability.
|
|
20
|
+
|
|
21
|
+
---
|
|
22
|
+
|
|
23
|
+
## CRITICAL -- Bugs Found and Fixed
|
|
24
|
+
|
|
25
|
+
### BUG ARCH-001: Non-atomic JSON write in `set_phase()` (SEVERITY: HIGH)
|
|
26
|
+
|
|
27
|
+
**File:** `autonomy/run.sh`, lines 3254-3261
|
|
28
|
+
|
|
29
|
+
The `set_phase()` function updates `.loki/state/orchestrator.json` using a
|
|
30
|
+
direct read-then-write pattern without atomicity:
|
|
31
|
+
|
|
32
|
+
```python
|
|
33
|
+
with open(sys.argv[1], 'r') as f:
|
|
34
|
+
    data = json.load(f)
|
|
35
|
+
data['currentPhase'] = sys.argv[2]
|
|
36
|
+
with open(sys.argv[1], 'w') as f: # <-- Non-atomic: truncates then writes
|
|
37
|
+
    json.dump(data, f, indent=2)
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
If the process is killed (Ctrl+C, OOM) between truncation and write completion,
|
|
41
|
+
`orchestrator.json` will be empty or contain partial JSON. The dashboard's
|
|
42
|
+
`_safe_json_read()` handles corrupt reads gracefully, but the state remains
|
|
43
|
+
lost until the next `write_dashboard_state()` call re-derives it.
|
|
44
|
+
|
|
45
|
+
This is the same class of bug already fixed elsewhere (BUG-XC-004, BUG-ST-008)
|
|
46
|
+
using temp-file-plus-rename. The fix pattern is already established in the codebase.
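
The temp-file-plus-rename pattern can be sketched as follows. This is a minimal illustration rather than the code in `run.sh`: the function name `set_phase_atomic` is hypothetical, and it assumes the temp file can be created in the same directory as `orchestrator.json` so the rename stays on one filesystem.

```python
import json
import os
import tempfile


def set_phase_atomic(state_path: str, phase: str) -> None:
    """Update currentPhase without ever exposing a truncated file (hypothetical helper)."""
    with open(state_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    data["currentPhase"] = phase

    # Create the temp file next to the target so os.replace() stays on one
    # filesystem, where the rename is atomic on POSIX systems.
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before publishing
        os.replace(tmp_path, state_path)
    except BaseException:
        # The original file is untouched; remove the partial temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A reader that opens `orchestrator.json` during the update sees either the old content or the new content, never an empty file.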
|
|
47
|
+
|
|
48
|
+
**Impact:** State corruption on crash during phase transition. The dashboard
|
|
49
|
+
and completion council may see stale or missing phase data.
|
|
50
|
+
|
|
51
|
+
### BUG ARCH-002: Three inconsistent LOKI_DIR resolution mechanisms (SEVERITY: HIGH)
|
|
52
|
+
|
|
53
|
+
Within the `dashboard/` package, three different files resolve the `.loki/`
|
|
54
|
+
directory path using three different mechanisms:
|
|
55
|
+
|
|
56
|
+
| File | Env Var | Default | Resolution |
|
|
57
|
+
|------|---------|---------|------------|
|
|
58
|
+
| `dashboard/control.py:30` | `LOKI_DIR` | `.loki` (CWD-relative) | Static at import time |
|
|
59
|
+
| `dashboard/api_v2.py:68` | `LOKI_DATA_DIR` | `~/.loki` (user home) | Static at import time |
|
|
60
|
+
| `dashboard/server.py:1869` | `LOKI_DIR` | `.loki` then `~/.loki` | Dynamic 4-step resolution per call |
|
|
61
|
+
|
|
62
|
+
This means:
|
|
63
|
+
- `control.py` reads/writes state from CWD's `.loki/`
|
|
64
|
+
- `api_v2.py` reads policies from `~/.loki/`
|
|
65
|
+
- `server.py` dynamically resolves based on env, API override, CWD, or home
|
|
66
|
+
|
|
67
|
+
The `api_v2.py` one is particularly problematic because it uses `LOKI_DATA_DIR`
|
|
68
|
+
(not `LOKI_DIR`) and resolves at module import time to `~/.loki`, which is the
|
|
69
|
+
global directory, not the project directory. If a user has per-project `.loki/`
|
|
70
|
+
directories (the normal case), the policy endpoints in api_v2 will look in the
|
|
71
|
+
wrong location.
|
|
72
|
+
|
|
73
|
+
**Impact:** Policy endpoints may read from wrong directory. State written by
|
|
74
|
+
`control.py` may not be visible to `server.py` if CWD differs from project
|
|
75
|
+
directory at dashboard startup time.
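
One way to remove the inconsistency is a single resolver that every dashboard module calls at request time. The sketch below is an assumption-laden illustration: `resolve_loki_dir` is a hypothetical name, and the precedence order (override, then `LOKI_DIR`, then CWD, then home) is a guess, since the report lists the four inputs `server.py` uses but not their order.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_loki_dir(override: Optional[str] = None) -> Path:
    """Resolve the .loki directory at call time (hypothetical shared helper)."""
    # 1. Explicit override, for example one supplied through an API call.
    if override:
        return Path(override).expanduser()
    # 2. LOKI_DIR environment variable.
    env_dir = os.environ.get("LOKI_DIR")
    if env_dir:
        return Path(env_dir).expanduser()
    # 3. Project-local directory under the current working directory.
    project_dir = Path.cwd() / ".loki"
    if project_dir.is_dir():
        return project_dir
    # 4. Global fallback in the user's home directory.
    return Path.home() / ".loki"
```

Because nothing is computed at import time, a change of working directory or environment after startup is picked up on the next call.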
|
|
76
|
+
|
|
77
|
+
### BUG ARCH-003: Dual event system with incompatible formats (SEVERITY: MEDIUM)
|
|
78
|
+
|
|
79
|
+
The codebase has two parallel event systems:
|
|
80
|
+
|
|
81
|
+
1. **JSONL append log** (`run.sh` lines 893-966): Events appended to
|
|
82
|
+
`.loki/events.jsonl` with format `{"timestamp":..., "type":..., "data":...}`
|
|
83
|
+
- Used by `emit_event()` and `emit_event_json()` -- 28 call sites in run.sh
|
|
84
|
+
- Consumed by: dashboard (reads JSONL), CLI (reads JSONL)
|
|
85
|
+
|
|
86
|
+
2. **File-per-event directory** (`events/bus.py`, `events/emit.sh`):
|
|
87
|
+
Events written as individual JSON files to `.loki/events/pending/` with
|
|
88
|
+
format `{"id":..., "type":..., "source":..., "timestamp":..., "payload":...}`
|
|
89
|
+
- Used by `emit_event_pending()` -- separate call sites
|
|
90
|
+
- Consumed by: EventBus subscribers, MCP server, state manager
|
|
91
|
+
|
|
92
|
+
These two systems have different schemas, different consumers, and no bridge
|
|
93
|
+
between them. Events emitted via `emit_event()` are invisible to the EventBus
|
|
94
|
+
and vice versa. The `emit_event_pending()` function in run.sh (line 971) writes
|
|
95
|
+
to the pending directory but most call sites still use the JSONL functions.
|
|
96
|
+
|
|
97
|
+
**Impact:** Components subscribing to the EventBus miss events emitted to JSONL.
|
|
98
|
+
Dashboard state pushes are based on file polling rather than events, adding
|
|
99
|
+
unnecessary latency.
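
A bridge between the two schemas could be as small as a field mapping. The sketch below only uses the field names quoted above; the function name `jsonl_to_bus_event`, the `source` default, and the use of a random hex string as `id` are assumptions, not the behavior of `events/bus.py`.

```python
import json
import uuid
from typing import Any, Dict


def jsonl_to_bus_event(line: str, source: str = "run.sh") -> Dict[str, Any]:
    """Convert one events.jsonl record into the pending-event shape (hypothetical)."""
    record = json.loads(line)
    return {
        "id": uuid.uuid4().hex,  # assumption: any unique string is acceptable
        "type": record["type"],
        "source": source,  # the JSONL format carries no source field
        "timestamp": record["timestamp"],
        "payload": record.get("data", {}),  # "data" is renamed to "payload"
    }
```

Writing the returned dictionary as a JSON file into `.loki/events/pending/` would make JSONL-only events visible to EventBus subscribers, under the assumptions stated above.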
|
|
100
|
+
|
|
101
|
+
---
|
|
102
|
+
|
|
103
|
+
## Systemic Architecture Issues
|
|
104
|
+
|
|
105
|
+
### ISSUE 1: Extreme file sizes impair maintainability (SEVERITY: HIGH)
|
|
106
|
+
|
|
107
|
+
| File | Lines | Functions | Concern |
|
|
108
|
+
|------|-------|-----------|---------|
|
|
109
|
+
| `autonomy/loki` | 20,300 | 130+ `cmd_*` functions | Single-file CLI with 130 subcommands |
|
|
110
|
+
| `autonomy/run.sh` | 10,869 | 150+ functions | Orchestration engine |
|
|
111
|
+
| `dashboard/server.py` | 5,244 | 121+ routes | Monolithic API server |
|
|
112
|
+
|
|
113
|
+
The `loki` CLI has nearly doubled from the 10,820 lines documented in CLAUDE.md
|
|
114
|
+
to 20,300 lines. At this size, bash's lack of namespacing means every function
|
|
115
|
+
and variable is global. A naming collision between `cmd_test()` helpers and
|
|
116
|
+
`cmd_report()` helpers is a constant risk.
|
|
117
|
+
|
|
118
|
+
**Recommendation:** Split `autonomy/loki` by command group:
|
|
119
|
+
- `autonomy/commands/start.sh` -- start/stop/pause/resume
|
|
120
|
+
- `autonomy/commands/dashboard.sh` -- dashboard/web commands
|
|
121
|
+
- `autonomy/commands/github.sh` -- github/import/issue commands
|
|
122
|
+
- `autonomy/commands/memory.sh` -- memory subcommands
|
|
123
|
+
- `autonomy/commands/config.sh` -- config/setup commands
|
|
124
|
+
- Each file sources into the main `loki` dispatcher
|
|
125
|
+
|
|
126
|
+
### ISSUE 2: 79 inline `python3 -c` calls in run.sh (SEVERITY: MEDIUM)
|
|
127
|
+
|
|
128
|
+
The orchestrator shell script (`run.sh`) contains 79 inline Python one-liners
|
|
129
|
+
for JSON parsing, state manipulation, and data extraction. For example, reading
|
|
130
|
+
a single field from `orchestrator.json` spawns a new Python process each time:
|
|
131
|
+
|
|
132
|
+
```bash
|
|
133
|
+
current_phase=$(python3 -c "import json; print(json.load(open('.loki/state/orchestrator.json')).get('currentPhase', 'BOOTSTRAP'))" 2>/dev/null || echo "BOOTSTRAP")
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
In `write_dashboard_state()` alone, there are 8+ separate `python3 -c` calls
|
|
137
|
+
that each open, parse, and close the same JSON files. Each call has ~50ms
|
|
138
|
+
startup overhead, so one state write spends roughly 400ms or more on interpreter startup alone.
|
|
139
|
+
|
|
140
|
+
**Recommendation:** Create a single helper script
|
|
141
|
+
`autonomy/json-helper.py` that accepts commands like
|
|
142
|
+
`json-helper.py get orchestrator.json currentPhase BOOTSTRAP` and can batch
|
|
143
|
+
multiple reads in one invocation. Or use `jq` consistently (already used in
|
|
144
|
+
some places).
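
A possible shape for the proposed helper is sketched below. The `get FILE KEY DEFAULT` form comes from the recommendation; accepting repeated `KEY DEFAULT` pairs for batching, printing one value per line, and the function names are assumptions made for illustration.

```python
import json
import sys
from typing import List


def get_fields(path: str, pairs: List[str]) -> List[str]:
    """Return one value per (key, default) pair, parsing the file only once."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):
        data = {}  # missing or corrupt file: every key falls back to its default
    if not isinstance(data, dict):
        data = {}
    values = []
    for key, default in zip(pairs[0::2], pairs[1::2]):
        values.append(str(data.get(key, default)))
    return values


def main(argv: List[str]) -> int:
    """Entry point: argv is the argument list without the script name."""
    if len(argv) < 4 or argv[0] != "get" or len(argv[2:]) % 2 != 0:
        print("usage: json-helper.py get FILE KEY DEFAULT [KEY DEFAULT ...]",
              file=sys.stderr)
        return 2
    print("\n".join(get_fields(argv[1], argv[2:])))
    return 0
```

A script file would end with `sys.exit(main(sys.argv[1:]))`. With this layout, the 8+ reads in `write_dashboard_state()` that target the same file would cost one interpreter start instead of eight.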
|
|
145
|
+
|
|
146
|
+
### ISSUE 3: Dashboard has two separate FastAPI apps (SEVERITY: MEDIUM)
|
|
147
|
+
|
|
148
|
+
The dashboard package contains two independent FastAPI applications:
|
|
149
|
+
|
|
150
|
+
1. `dashboard/server.py` -- Main API with 121+ routes, SQLAlchemy DB, auth, WebSocket
|
|
151
|
+
2. `dashboard/control.py` -- Separate FastAPI app with its own CORS, models, routes
|
|
152
|
+
|
|
153
|
+
Both define their own `CORSMiddleware` configurations (with slightly different
|
|
154
|
+
wildcard policies -- control.py uses `allow_methods=["*"]` and `allow_headers=["*"]`
|
|
155
|
+
while server.py restricts both). Both define their own status/health models.
|
|
156
|
+
The control.py `app` is never mounted into server.py's `app`; it appears to be
|
|
157
|
+
a legacy standalone server that was partially superseded by server.py.
|
|
158
|
+
|
|
159
|
+
The `atomic_write_json` function from `control.py` is imported by `server.py`
|
|
160
|
+
(line 55: `from .control import atomic_write_json`), but the rest of control.py's
|
|
161
|
+
routes are unreachable when the dashboard is run via server.py.
|
|
162
|
+
|
|
163
|
+
**Recommendation:** Either mount control.py's router into server.py, or extract
|
|
164
|
+
`atomic_write_json` into a standalone utility module and deprecate the separate
|
|
165
|
+
app.
|
|
166
|
+
|
|
167
|
+
### ISSUE 4: State manager (state/manager.py) is underutilized (SEVERITY: LOW)
|
|
168
|
+
|
|
169
|
+
A proper `StateManager` class exists at `state/manager.py` (1,896 lines) with:
|
|
170
|
+
- File-based caching with watchdog
|
|
171
|
+
- Thread-safe operations with file locking
|
|
172
|
+
- Event bus integration
|
|
173
|
+
- Version vectors for conflict resolution
|
|
174
|
+
- Subscription system
|
|
175
|
+
|
|
176
|
+
However, only the MCP server uses it (`mcp/server.py:48`). The main orchestrator
|
|
177
|
+
(`run.sh`) uses raw file I/O. The dashboard (`server.py`) uses its own
|
|
178
|
+
`_safe_json_read()`. This means the architecture has a proper abstraction layer
|
|
179
|
+
that is largely bypassed.
|
|
180
|
+
|
|
181
|
+
**Recommendation:** Gradually migrate dashboard state reads to use StateManager.
|
|
182
|
+
The shell-based orchestrator cannot use it directly (Python vs Bash boundary),
|
|
183
|
+
but the Python components should converge on this single state access layer.
|
|
184
|
+
|
|
185
|
+
### ISSUE 5: No file locking on orchestrator.json writes (SEVERITY: MEDIUM)
|
|
186
|
+
|
|
187
|
+
While `save_state()` (line 7938) uses atomic temp-file+mv for
|
|
188
|
+
`autonomy-state.json`, and `control.py` uses `fcntl.flock` for atomic writes,
|
|
189
|
+
the `set_phase()` function and multiple inline Python snippets write to
|
|
190
|
+
`.loki/state/orchestrator.json` without any locking or atomicity.
|
|
191
|
+
|
|
192
|
+
The `write_dashboard_state()` function (line 3272) reads from `orchestrator.json`
|
|
193
|
+
which `set_phase()` may also write. Since these run in
|
|
194
|
+
the same process (run.sh), there is no parallelism risk in the normal case.
|
|
195
|
+
However, during parallel mode (worktrees), multiple run.sh instances could
|
|
196
|
+
write to the same orchestrator.json if they share a `.loki/` directory.
|
|
197
|
+
|
|
198
|
+
**Impact:** Potential data corruption in parallel mode.
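
For the parallel-mode case, an advisory lock around the whole read-modify-write cycle would serialize writers that share a `.loki/` directory. The following is a sketch under assumptions: the helper names and the `.lock` sidecar file are hypothetical, `fcntl` is Unix-only, and the lock only protects against other processes that take the same lock.

```python
import fcntl
import json
import os
import tempfile
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


@contextmanager
def exclusive_lock(lock_path: str) -> Iterator[None]:
    """Hold an exclusive advisory lock for the duration of the with-block."""
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)


def update_json_locked(path: str, mutate: Callable[[Dict], None]) -> None:
    """Read-modify-write a JSON file under a lock, publishing via rename."""
    # Lock a sidecar file: the data file itself is replaced by the rename,
    # so a lock held on its old inode would not protect later writers.
    with exclusive_lock(path + ".lock"):
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
        mutate(data)
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp, indent=2)
        os.replace(tmp_path, path)
```

A caller would pass a small mutation such as `lambda d: d.update(currentPhase="QA")`. The lock prevents lost updates between instances, while the rename keeps readers from seeing partial JSON.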
|
|
199
|
+
|
|
200
|
+
---
|
|
201
|
+
|
|
202
|
+
## Security Architecture Review
|
|
203
|
+
|
|
204
|
+
### Strengths
|
|
205
|
+
- **Path traversal protection** in MCP server (`validate_path()` with symlink chain checking)
|
|
206
|
+
- **PRD path validation** in control.py (blocks `..`, verifies file exists within allowed dirs)
|
|
207
|
+
- **Provider name validation** prevents shell injection in loader.sh
|
|
208
|
+
- **CORS restricted to localhost** by default (both server.py and control.py)
|
|
209
|
+
- **OIDC/SSO support** with proper JWT validation (PyJWT + cryptography)
|
|
210
|
+
- **Token-based auth** with role/scope hierarchy
|
|
211
|
+
- **Rate limiting** on control and read endpoints
|
|
212
|
+
- **Atomic writes** in many critical paths (BUG-XC-004, BUG-ST-008)
|
|
213
|
+
|
|
214
|
+
### Concerns
|
|
215
|
+
|
|
216
|
+
1. **CORS wildcard inconsistency** (LOW): `control.py` uses `allow_methods=["*"]`
|
|
217
|
+
and `allow_headers=["*"]` while `server.py` restricts to specific methods/headers.
|
|
218
|
+
If control.py routes are ever exposed, the broader CORS policy applies.
|
|
219
|
+
|
|
220
|
+
2. **subprocess.Popen in control.py** (LOW): The `start_session` endpoint
|
|
221
|
+
(line 410) passes `request.provider` into a command-line argument list.
|
|
222
|
+
This IS validated: `request.validate_provider()` is called at line 379
|
|
223
|
+
and raises ValueError (caught at line 381) before the Popen call. The
|
|
224
|
+
validation is correct. However, `validate_provider()` is a manual method
|
|
225
|
+
call rather than a Pydantic `@field_validator`, so it could be bypassed
|
|
226
|
+
if someone adds a new endpoint that creates a StartRequest without calling
|
|
227
|
+
validate. Converting to a Pydantic validator would make this defense
|
|
228
|
+
automatic.
|
|
229
|
+
|
|
230
|
+
3. **No auth on event bus** (LOW): Any process that can write to
|
|
231
|
+
`.loki/events/pending/` can inject events. This is acceptable for
|
|
232
|
+
local single-user use but should be noted for multi-tenant deployments.
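
For concern 2, the conversion to an automatic validator might look like the sketch below. It assumes Pydantic v2 (where `field_validator` is available); the allow-list contents and the single-field model are hypothetical stand-ins, not the actual `StartRequest` definition in `control.py`.

```python
from pydantic import BaseModel, field_validator

# Hypothetical allow-list; the real set of providers lives in the project.
ALLOWED_PROVIDERS = {"claude", "codex", "gemini"}


class StartRequest(BaseModel):
    """Illustrative request model with validation attached to the field."""

    provider: str = "claude"

    @field_validator("provider")
    @classmethod
    def provider_must_be_known(cls, value: str) -> str:
        # Runs whenever a provider value is supplied to the model, so a new
        # endpoint cannot forget to call a separate validate method.
        if value not in ALLOWED_PROVIDERS:
            raise ValueError(f"unknown provider: {value!r}")
        return value
```

With this shape, constructing `StartRequest(provider="gemini; rm -rf /")` raises a `ValidationError` before any endpoint code runs.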
|
|
233
|
+
|
|
234
|
+
---
|
|
235
|
+
|
|
236
|
+
## Component Boundary Analysis
|
|
237
|
+
|
|
238
|
+
### CLI (loki) to Orchestrator (run.sh)
|
|
239
|
+
|
|
240
|
+
**Boundary:** Clean. CLI `exec`s run.sh as a subprocess (via `cmd_start()`),
|
|
241
|
+
passing args via command-line flags. State handoff is filesystem-based.
|
|
242
|
+
|
|
243
|
+
**Issue:** The CLI re-implements some orchestrator functionality (e.g., memory
|
|
244
|
+
loading at `loki:274`, status file reading) rather than delegating to run.sh.
|
|
245
|
+
This creates subtle divergence risk.
|
|
246
|
+
|
|
247
|
+
### Web Server (server.py) to CLI
|
|
248
|
+
|
|
249
|
+
**Boundary:** Indirect via filesystem. The dashboard reads `.loki/` state files
|
|
250
|
+
that the orchestrator writes. For control operations, `control.py` spawns
|
|
251
|
+
run.sh via `subprocess.Popen`.
|
|
252
|
+
|
|
253
|
+
**Issue:** The dashboard does NOT call the CLI (`loki` binary) -- it calls
|
|
254
|
+
`run.sh` directly. This bypasses any CLI-level validation, setup, or event
|
|
255
|
+
emission that `cmd_start()` performs.
|
|
256
|
+
|
|
257
|
+
### Dashboard API to Orchestrator
|
|
258
|
+
|
|
259
|
+
**Boundary:** File-based polling. The orchestrator writes `dashboard-state.json`
|
|
260
|
+
every iteration. The dashboard reads it via `_push_loki_state_loop()` (every 2s
|
|
261
|
+
when running, 30s when idle) and pushes to WebSocket clients.
|
|
262
|
+
|
|
263
|
+
**Issue:** This is polling-based, not event-driven. The event bus exists but is
|
|
264
|
+
not used for this communication path. Adding event bus integration would reduce
|
|
265
|
+
latency from up to 2s to near-instant.
|
|
266
|
+
|
|
267
|
+
### Memory System to Everything
|
|
268
|
+
|
|
269
|
+
**Boundary:** Clean Python API via `memory/engine.py`. The shell orchestrator
|
|
270
|
+
bridges to it via Python one-liners.
|
|
271
|
+
|
|
272
|
+
**Issue:** The memory engine is initialized independently by each consumer
|
|
273
|
+
(run.sh via inline Python, MCP server via its own import, dashboard indirectly
|
|
274
|
+
via state files). There is no shared singleton across components, so memory
|
|
275
|
+
operations from different components may see inconsistent state.
|
|
276
|
+
|
|
277
|
+
### MCP Server to Dashboard + Memory
|
|
278
|
+
|
|
279
|
+
**Boundary:** MCP server uses StateManager for state access and direct memory
|
|
280
|
+
imports. It has no dependency on the dashboard.
|
|
281
|
+
|
|
282
|
+
**Issue:** MCP tools overlap significantly with dashboard API endpoints (task
|
|
283
|
+
management, memory retrieval, state queries). There is no deduplication or
|
|
284
|
+
shared implementation between `/api/tasks` and the MCP `loki_queue_*` tools.
|
|
285
|
+
|
|
286
|
+
### Event Bus to All Components
|
|
287
|
+
|
|
288
|
+
**Boundary:** File-based pub/sub via `.loki/events/pending/`.
|
|
289
|
+
|
|
290
|
+
**Issue:** As documented in BUG ARCH-003, the event bus is underutilized.
|
|
291
|
+
The main orchestrator (run.sh) emits to JSONL, not the event bus. The
|
|
292
|
+
dashboard does not consume events from either system for real-time updates;
|
|
293
|
+
it polls `dashboard-state.json` instead.
|
|
294
|
+
|
|
295
|
+
---
|
|
296
|
+
|
|
297
|
+
## Recommendations Summary (Prioritized)
|
|
298
|
+
|
|
299
|
+
| Priority | Issue | Effort | Impact |
|
|
300
|
+
|----------|-------|--------|--------|
|
|
301
|
+
| P0 | Fix non-atomic write in `set_phase()` | 15 min | Prevents state corruption |
|
|
302
|
+
| P0 | Unify LOKI_DIR resolution in dashboard package | 30 min | Prevents policy lookups from wrong directory |
|
|
303
|
+
| P1 | Consolidate dual event systems | 2-4 hours | Consistent event propagation |
|
|
304
|
+
| P1 | Convert `validate_provider()` to Pydantic field_validator | 10 min | Defense-in-depth validation |
|
|
305
|
+
| P2 | Split `autonomy/loki` into command modules | 1-2 days | Maintainability |
|
|
306
|
+
| P2 | Replace inline `python3 -c` with helper script | 4 hours | Performance improvement |
|
|
307
|
+
| P2 | Merge control.py routes into server.py | 2 hours | Eliminate duplicate FastAPI app |
|
|
308
|
+
| P3 | Adopt StateManager in dashboard | 1 day | Consistent state access |
|
|
309
|
+
| P3 | Connect event bus to dashboard WebSocket push | 4 hours | Real-time updates |
|
|
310
|
+
| P3 | Standardize CORS configuration | 30 min | Security consistency |
|
|
311
|
+
|
|
312
|
+
---
|
|
313
|
+
|
|
314
|
+
## Feedback Loop Verification
|
|
315
|
+
|
|
316
|
+
### Loop 1: Self-review
|
|
317
|
+
- All findings reference specific file paths and line numbers
|
|
318
|
+
- Severity ratings consider both likelihood and impact
|
|
319
|
+
- Recommendations are actionable with effort estimates
|
|
320
|
+
|
|
321
|
+
### Loop 2: Code verification
|
|
322
|
+
- BUG ARCH-001: Confirmed by reading `run.sh:3254-3261` -- no temp file + mv pattern
|
|
323
|
+
- BUG ARCH-002: Confirmed by reading `control.py:30`, `api_v2.py:68`, `server.py:1869`
|
|
324
|
+
- BUG ARCH-003: Confirmed by comparing `emit_event()` at line 893 vs `emit_event_pending()` at line 971
|
|
325
|
+
- File sizes confirmed via `wc -l` (loki=20,300, run.sh=10,869, server.py=5,244)
|
|
326
|
+
- `python3 -c` count confirmed via grep (79 occurrences)
|
|
327
|
+
|
|
328
|
+
### Loop 3: Priority validation
|
|
329
|
+
- P0 items are data integrity issues that can cause silent corruption
|
|
330
|
+
- P1 items are defense-in-depth security and consistency
|
|
331
|
+
- P2/P3 items are maintainability and performance improvements
|