loki-mode 6.0.0 → 6.2.0
- package/README.md +20 -0
- package/SKILL.md +2 -2
- package/VERSION +1 -1
- package/autonomy/bmad-adapter.py +776 -0
- package/autonomy/loki +393 -0
- package/autonomy/prd-analyzer.py +26 -4
- package/autonomy/run.sh +149 -4
- package/autonomy/sandbox.sh +181 -1
- package/dashboard/__init__.py +1 -1
- package/docs/INSTALLATION.md +1 -1
- package/docs/architecture/bmad-integration-epic.md +271 -0
- package/docs/architecture/bmad-integration-review.md +86 -0
- package/docs/architecture/bmad-integration-validation.md +249 -0
- package/docs/architecture/bmad-loki-voice-agent-council-analysis.md +61 -0
- package/mcp/__init__.py +1 -1
- package/mcp/requirements.txt +1 -0
- package/mcp/server.py +152 -0
- package/package.json +1 -1
- package/templates/clusters/README.md +21 -0
- package/templates/clusters/code-review.json +36 -0
- package/templates/clusters/performance-audit.json +29 -0
- package/templates/clusters/refactoring.json +29 -0
- package/templates/clusters/security-review.json +36 -0
@@ -0,0 +1,86 @@
+# BMAD Integration Adversarial Review
+
+**Date:** 2026-02-25
+**Methodology:** BMAD-style adversarial review (zero findings = failure, re-analyze)
+**Reviewers:** 3 blind Opus agents + Devil's Advocate pass
+
+## Review Process
+
+Three independent reviewers analyzed the full BMAD integration diff across 5 files.
+Combined: 48 unique findings (2 CRITICAL, 12 HIGH, 21 MEDIUM, 13 LOW).
+
+After triage: 1 CRITICAL and 5 HIGH findings were fixed before this report.
+
+## Findings Fixed (Pre-Merge)
+
+| # | Severity | File | Issue | Fix Applied |
+|---|----------|------|-------|-------------|
+| 1 | CRITICAL | bmad-adapter.py:136-140 | Path traversal via config.json outputDir | Added resolve() + project root boundary check |
+| 2 | HIGH | bmad-adapter.py:266,294,419 | Missing errors="replace" on read_text | Replaced all read_text with _safe_read() helper (size limit + encoding safety) |
+| 3 | HIGH | bmad-adapter.py:576-607 | Non-atomic file writes | Added _write_atomic() using tempfile + os.replace pattern |
+| 4 | HIGH | run.sh:7042-7050 | Unbounded BMAD content in prompt | Added head -c size limits (16K arch, 32K tasks, 8K validation) |
+| 5 | HIGH | run.sh:7066-7074 | BMAD context before human_directive in prompt | Moved bmad_context after human_directive and queue_tasks |
+| 6 | HIGH | test-bmad-integration.sh:37,126,163,248 | Trap quoting + inline Python injection | Fixed trap quoting, replaced open('$var') with sys.argv[1] |
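Fixes 1 and 3 in the table above name two standard-library patterns: a `resolve()` plus project-root boundary check, and a `tempfile` plus `os.replace` write. The sketch below is a minimal illustration under stated assumptions, not the adapter's actual code: `resolve_output_dir` and `write_atomic` are hypothetical names, and the real `_write_atomic()` in `bmad-adapter.py` is not shown in this diff and may differ.

```python
import os
import tempfile
from pathlib import Path


def resolve_output_dir(project_root: str, output_dir: str) -> Path:
    """Resolve output_dir against project_root and reject paths that escape it."""
    root = Path(project_root).resolve()
    candidate = (root / output_dir).resolve()
    try:
        # relative_to raises ValueError when candidate is not under root.
        candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"outputDir escapes project root: {output_dir!r}") from None
    return candidate


def write_atomic(path: Path, text: str) -> None:
    """Write to a temp file in the target directory, then rename it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file if the write or the rename failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Creating the temp file in the destination directory keeps both paths on the same filesystem, which is what makes the final rename atomic; readers then see either the old file or the complete new one.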
+
+## Findings Accepted (Known Limitations)
+
+| # | Severity | Issue | Rationale for Accepting |
+|---|----------|-------|------------------------|
+| 1 | MEDIUM | Regex YAML parser does not handle block-style lists | BMAD fixtures use flow-style. Block-style support is a future enhancement. Documented in code. |
+| 2 | MEDIUM | populate_bmad_queue() has no standalone test | Function tested indirectly through full integration. Standalone test is a future improvement. |
+| 3 | MEDIUM | prd-analyzer scope estimation includes architecture lines | Conservative choice -- slightly inflated feature count is better than missing features. |
+| 4 | MEDIUM | BMAD_PROJECT_PATH exported but unused by run.sh | Intentionally kept for future provider scripts. Added comment. |
+| 5 | LOW | Error messages show full filesystem paths | Acceptable for CLI tool aimed at developers. Not a production web service. |
+| 6 | LOW | mkdir uses default umask | Standard Python behavior, consistent with rest of codebase. |
+
+## Adversarial Scenarios Tested
+
+### What happens when BMAD output format changes in V7?
+
+The adapter uses loose pattern matching (regex on headings, not exact schema validation).
+Section headings like "## Functional Requirements" and "FR1:" patterns are generic enough
+to survive minor format changes. The frontmatter parser handles unknown keys gracefully.
+**Risk: LOW.** The adapter degrades gracefully -- fewer dimensions matched means lower
+score, not a crash.
+
+### What happens with malformed BMAD artifacts?
+
+Tested with incomplete fixture (partial workflow state). Adapter:
+- Reports workflow completion percentage
+- Warns about missing artifacts
+- Processes what exists without crashing
+- Uses errors="replace" for encoding safety
+- Has 10MB file size limit
+
+**Risk: LOW.** Graceful degradation verified.
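The last two bullets above (encoding safety and a size cap) can be sketched as a single defensive read helper. This is an illustration only: `safe_read` is a hypothetical stand-in for the adapter's `_safe_read()`, whose real body is not part of this diff.

```python
from pathlib import Path
from typing import Optional

MAX_BYTES = 10 * 1024 * 1024  # 10MB cap, matching the limit listed above


def safe_read(path: Path, max_bytes: int = MAX_BYTES) -> Optional[str]:
    """Return the file's text, or None if it is missing, unreadable, or too large."""
    try:
        if path.stat().st_size > max_bytes:
            return None
        # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
        return path.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return None
```

Returning `None` rather than raising lets the caller warn about one bad artifact and carry on with the rest, which is the graceful-degradation behavior the scenario describes.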
+
+### What happens when _bmad-output/ contains partial workflow state?
+
+Tested with `stepsCompleted: [init, discovery, vision]` (30% complete).
+Adapter reports completion percentage and warns. Does not block processing.
+**Risk: LOW.** User gets informed about incomplete state.
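A regex-based reading of a flow-style `stepsCompleted` list, and the percentage derived from it, might look like the sketch below. The total of 10 workflow steps is an assumption inferred from "3 steps = 30%" in the text above; `completion_percent` is a hypothetical helper, not the adapter's code.

```python
import re

# Assumption: the workflow has 10 steps (3 completed steps are reported as 30%).
TOTAL_STEPS = 10

# Matches a flow-style list on one line, e.g. "stepsCompleted: [a, b, c]".
FLOW_LIST = re.compile(r"^stepsCompleted:\s*\[(.*?)\]\s*$", re.MULTILINE)


def completion_percent(frontmatter: str, total_steps: int = TOTAL_STEPS) -> int:
    """Return workflow completion as a whole percentage (0 if no list is found)."""
    match = FLOW_LIST.search(frontmatter)
    if match is None:
        return 0
    steps = [s.strip().strip("'\"") for s in match.group(1).split(",") if s.strip()]
    return min(100, round(100 * len(steps) / total_steps))


print(completion_percent("stepsCompleted: [init, discovery, vision]\nworkflowType: prd"))
```

A block-style list (one `- item` per line) does not match this pattern and yields 0, which mirrors the known limitation recorded for the regex YAML parser.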
+
+### What happens if someone passes --bmad-project to a non-BMAD project?
+
+Clear error message and non-zero exit code:
+```
+ERROR: BMAD output directory not found: _bmad-output/planning-artifacts
+This does not appear to be a BMAD project.
+```
+**Risk: NONE.** Clean failure.
+
+### What about backward compatibility for non-BMAD projects?
+
+All BMAD code paths are guarded:
+- CLI: only activated by explicit --bmad-project flag
+- run.sh: checks for .loki/bmad-metadata.json existence
+- prd-analyzer.py: new patterns only ADD to existing ones, never replace
+
+Freeform PRD test confirms identical scoring (5.0/10) before and after changes.
+**Risk: NONE.** Verified by test.
+
+## Recommendation
+
+**PASS.** All CRITICAL and HIGH findings fixed. Remaining MEDIUM/LOW findings are
+acceptable known limitations with documented rationale. Backward compatibility verified.
+Integration is clean, additive, and well-guarded.
@@ -0,0 +1,249 @@
+# BMAD Integration Validation Report
+
+**Date:** 2026-02-25
+**Validator:** Automated analysis against Loki Mode v6.0.0 codebase
+**BMAD Version:** Latest main (cloned 2026-02-25)
+
+## Decision: GO (Phase 0 / Epic 1 only)
+
+Phase 0 (BMAD Artifact Pipeline) is low-risk, additive, and achievable with the current
+codebase. Phases 1-2 are deferred pending P0 value validation.
+
+---
+
+## 1. Compatibility Matrix
+
+| Integration Point | Status | Notes |
+|---|---|---|
+| PRD format parsing | Compatible (minor gaps) | 7/9 analyzer dimensions match BMAD headings directly |
+| Artifact chain discovery | New capability needed | Adapter must find `_bmad-output/planning-artifacts/` |
+| Agent personas | Complementary | BMAD pre-dev agents + Loki execution agents = full coverage |
+| Voice capabilities | Insufficient for P2 | voice.sh only does 4-section dictation, not structured dialogue |
+| Context budget | Safe | ~8-15K tokens per iteration (step files load one-at-a-time) |
+| License | MIT -- fully compatible | No restrictions on integration or redistribution |
+| Event bus integration | Ready | `.loki/events/pending/` accepts BMAD artifact events |
+| Memory system | Ready | BMAD artifacts can be stored as episodic memory |
+| CLI integration | Straightforward | `--bmad-project` flag pattern matches existing CLI architecture |
+| Dashboard | Deferred (P1) | Would need new "Elicitation" panel |
+
+## 2. PRD Format Gap Analysis
+
+### BMAD PRD Section Headings vs Loki Analyzer Dimensions
+
+| BMAD PRD Section | Loki Dimension | Match Type |
+|---|---|---|
+| `## Executive Summary` | -- | NO MATCH (no heading pattern covers "executive summary") |
+| `## Project Classification` | -- | NO MATCH |
+| `## Success Criteria` | `acceptance_criteria` | PARTIAL (heading pattern matches "criteria" keyword) |
+| `## Product Scope` | `feature_list` | MATCH (heading pattern matches "scope") |
+| `## User Journeys` | `user_stories` | MATCH (heading pattern matches "user.*journey") |
+| `## Domain-Specific Requirements` | -- | NO MATCH |
+| `## Innovation & Novel Patterns` | -- | NO MATCH |
+| `## [ProjectType] Specific Requirements` | `feature_list` | PARTIAL (matches "requirement") |
+| `## Project Scoping & Phased Development` | `feature_list` | PARTIAL (matches "scope") |
+| `## Functional Requirements` | `feature_list` | MATCH (matches "functional" and "requirement") |
+| `## Non-Functional Requirements` | Multiple | MATCH (content patterns for security, deployment, error handling) |
+
+### Content Pattern Matches
+
+| BMAD Content Pattern | Loki Content Pattern | Match? |
+|---|---|---|
+| `FR{N}: [Actor] can [capability]` | `user can/should/will` in `user_stories` | YES |
+| `Given/When/Then` (in epics) | `given.*when.*then` in `acceptance_criteria` | YES |
+| `As a {role}, I want {action}` (stories) | `as a \w+` in `user_stories` | YES |
+| Tech stack mentions in architecture.md | Tech keyword patterns in `tech_stack` | YES |
+| `### Performance`, `### Security` (NFRs) | Security/deployment heading patterns | YES |
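The first three rows of the table above can be tried out with a small matcher. The expressions below are a hypothetical re-creation based only on the pattern text quoted in the table; the real patterns live in `prd-analyzer.py`, are not shown in this diff, and may differ.

```python
import re

# Assumed re-creation of the content patterns quoted in the table above.
PATTERNS = {
    "user_stories": [
        re.compile(r"\buser (?:can|should|will)\b", re.IGNORECASE),
        re.compile(r"\bas a \w+", re.IGNORECASE),
    ],
    "acceptance_criteria": [
        # DOTALL lets Given/When/Then span several lines.
        re.compile(r"given.*when.*then", re.IGNORECASE | re.DOTALL),
    ],
}


def matched_dimensions(text: str) -> set[str]:
    """Return the dimensions for which at least one content pattern matches."""
    return {
        dimension
        for dimension, patterns in PATTERNS.items()
        if any(pattern.search(text) for pattern in patterns)
    }


print(matched_dimensions("FR1: User can reset their password"))
print(matched_dimensions("FR3: Admin can export reports"))
```

Under these assumed expressions, an `FR{N}` line matches `user_stories` only when the actor is literally "user"; a line such as "FR3: Admin can export reports" would not match, so the YES in the first row may depend on how BMAD PRDs name their actors.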
+
+### Gaps Requiring Adapter Work
+
+1. **Executive Summary** -- BMAD's most prominent section has no matching Loki dimension.
+   Adapter should map this to a "project_overview" meta-dimension.
+
+2. **Project Classification** -- BMAD includes project type, domain, complexity.
+   No Loki equivalent. Adapter should extract and pass as metadata.
+
+3. **Domain-Specific Requirements** -- Healthcare, fintech, govtech compliance sections.
+   No Loki dimension covers domain compliance. Consider adding as optional dimension.
+
+4. **Innovation & Novel Patterns** -- BMAD-specific section.
+   Not needed for scoring; pass through as supplementary context.
+
+5. **Frontmatter parsing** -- BMAD documents have YAML frontmatter with `stepsCompleted`,
+   `inputDocuments`, `workflowType`. Loki's prd-analyzer.py ignores frontmatter entirely.
+   Adapter must strip frontmatter before passing to analyzer OR extend analyzer.
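Gap 5 above calls for separating YAML frontmatter from the Markdown body before analysis. A minimal sketch of that split follows; `split_frontmatter` is a hypothetical helper that assumes frontmatter is delimited by `---` lines at the very top of the file.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter, body); frontmatter is '' when the document has none."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            return "".join(lines[1:index]), "".join(lines[index + 1:])
    # No closing delimiter: treat the whole document as body.
    return "", text


frontmatter, body = split_frontmatter("---\nworkflowType: prd\n---\n# PRD\n")
print(repr(frontmatter), repr(body))
```

Only a delimiter on the first line opens frontmatter, so a `---` horizontal rule further down a freeform PRD is left untouched.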
+
+### Scoring Impact
+
+A well-formed BMAD PRD would score approximately **7.5-8.5/10** on the current analyzer
+without any changes:
+- `feature_list`: HIGH (## Functional Requirements + bullet lists)
+- `user_stories`: HIGH (## User Journeys + "As a..." stories in epics)
+- `acceptance_criteria`: HIGH (Given/When/Then in epics)
+- `tech_stack`: PARTIAL-HIGH (architecture.md has tech details, PRD may not)
+- `security`: PARTIAL (## Non-Functional Requirements > ### Security)
+- `deployment`: PARTIAL (may be in architecture.md, not PRD)
+- `data_model`: NONE-PARTIAL (usually in architecture.md, not PRD)
+- `api_spec`: NONE-PARTIAL (usually in architecture.md, not PRD)
+- `error_handling`: PARTIAL (## Non-Functional Requirements may cover this)
+
+With an adapter that also feeds architecture.md into the analyzer, score would be **9-10/10**.
+
+## 3. Agent Overlap Analysis
+
+### BMAD Agents (8) vs Loki Agent Types (41)
+
+| BMAD Agent | Role | Loki Equivalent(s) | Relationship |
+|---|---|---|---|
+| Mary (Analyst) | Business analysis, research | `prod-pm` (partial) | Complementary -- BMAD analyst is pre-development |
+| John (PM) | PRD creation, validation | `prod-pm`, `orch-planner` | Overlapping -- Loki PM focuses on execution planning |
+| Winston (Architect) | Architecture design | `eng-infra`, `orch-planner` | Complementary -- BMAD architect is pre-code |
+| Sally (UX Designer) | UX specification | `prod-design` | Complementary -- BMAD UX is spec, Loki is implementation |
+| Amelia (Developer) | Code implementation | `eng-*` (8 agents) | Superseded -- Loki has specialized dev agents |
+| Bob (Scrum Master) | Sprint planning | `orch-coordinator` | Overlapping -- different abstraction level |
+| Quinn (QA) | E2E test generation | `eng-qa` | Overlapping -- both generate tests |
+| Barry (Quick Flow) | Solo rapid dev | No equivalent | Unique to BMAD |
+
+### Assessment
+
+- **Zero conflict.** BMAD agents operate in the pre-development space (requirements, planning,
+  architecture). Loki agents operate in the execution space (coding, testing, deploying).
+- **Clear handoff point:** BMAD produces artifacts (PRD, architecture, epics). Loki consumes
+  them. The adapter bridges the format gap.
+- **Party Mode** (multi-agent collaboration) is a BMAD concept that could enrich Loki's
+  council-based review in future phases.
+
+## 4. Voice Compatibility Assessment
+
+### Current Capabilities (voice.sh)
+
+| Capability | Status |
+|---|---|
+| STT (Speech-to-Text) | Whisper API, local Whisper, macOS dictation |
+| TTS (Text-to-Speech) | macOS `say`, Linux espeak/festival |
+| Audio recording | sox, ffmpeg, arecord |
+| Guided dictation | 4-section template only (name, overview, requirements, tech stack) |
+| Structured dialogue | NOT SUPPORTED |
+| Agent handoff | NOT SUPPORTED |
+| Session persistence | NOT SUPPORTED |
+| Technical term correction | NOT SUPPORTED |
+
+### What BMAD Voice Integration (P2) Would Require
+
+| Requirement | Effort | Description |
+|---|---|---|
+| Step-file-to-question mapper | Large | Convert BMAD step instructions to conversational prompts |
+| Multi-turn dialogue manager | Large | Track conversation state, handle clarifications, backtracking |
+| Agent persona injection | Medium | Voice TTS uses agent voice characteristics |
+| Technical term correction loop | Medium | Confirm jargon transcription ("Did you say React or Preact?") |
+| Session persistence | Medium | Resume BMAD workflows across voice sessions |
+| Dual-mode interface | Large | Voice for elicitation, visual for review |
+
+### Assessment
+
+Voice integration (P2) is the highest-risk phase. The current voice.sh is a thin wrapper
+around STT/TTS tools. Transforming it into a structured dialogue system requires:
+- New conversation state machine (not just record-transcribe-write)
+- BMAD step-file interpreter (convert markdown instructions to conversational flow)
+- Feedback loop for transcription accuracy (critical for technical terms)
+
+**Recommendation:** Defer P2 entirely. P0 (artifact pipeline) delivers value without voice.
+Voice integration can be revisited after P0 proves the BMAD artifact format is stable and useful.
+
+## 5. Context Budget Analysis
+
+### Per-Iteration Context Cost (P0 Only)
+
+| Component | Tokens | Source |
+|---|---|---|
+| Loki SKILL.md | ~2,750 | Always loaded |
+| RARV instructions | ~1,500 | build_prompt() |
+| SDLC phases + rules | ~1,000 | build_prompt() |
+| Memory context | ~2,000-5,000 | Memory retrieval |
+| PRD content | ~5,000-12,000 | BMAD PRD document |
+| PRD observations | ~500-1,000 | prd-analyzer output |
+| **BMAD adapter metadata** | **~500-1,000** | **Project classification, artifact chain info** |
+| **BMAD architecture summary** | **~2,000-4,000** | **Condensed architecture decisions** |
+| **BMAD epic summary** | **~1,000-3,000** | **Active epic/story context** |
+| Checklist status | ~500-1,000 | verification-results.json |
+| Queue tasks | ~500-2,000 | .loki/queue/ |
+| **TOTAL** | **~17,250-34,250** | **Well under 150K ceiling** |
+
+### Verdict
+
+In the worst case (~34K tokens), a full iteration prompt with BMAD integration uses ~23% of a 150K context window.
+This leaves ample room for code context, tool outputs, and conversation history.
+No context pressure risk.
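The totals in the table above can be re-derived by summing the per-component ranges. A small sketch follows, with the figures copied from the table; the dictionary layout and the `budget` helper are illustrative names, not part of the codebase.

```python
# (low, high) token estimates per component, copied from the table above.
COMPONENTS = {
    "Loki SKILL.md": (2_750, 2_750),
    "RARV instructions": (1_500, 1_500),
    "SDLC phases + rules": (1_000, 1_000),
    "Memory context": (2_000, 5_000),
    "PRD content": (5_000, 12_000),
    "PRD observations": (500, 1_000),
    "BMAD adapter metadata": (500, 1_000),
    "BMAD architecture summary": (2_000, 4_000),
    "BMAD epic summary": (1_000, 3_000),
    "Checklist status": (500, 1_000),
    "Queue tasks": (500, 2_000),
}
CONTEXT_WINDOW = 150_000


def budget(components: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum the low and high estimates across all components."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


low, high = budget(COMPONENTS)
bmad_only = budget({k: v for k, v in COMPONENTS.items() if k.startswith("BMAD")})

print(f"total per iteration: {low:,}-{high:,} tokens")
print(f"worst-case share of window: {high / CONTEXT_WINDOW:.1%}")
print(f"BMAD-specific rows only: {bmad_only[0]:,}-{bmad_only[1]:,} tokens")
```

Keeping the BMAD-specific rows as a separate subtotal makes it easy to see how much of the prompt the integration itself adds, as opposed to the components that are loaded on every iteration anyway.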
+
+## 6. Risk Register
+
+| Risk | Severity | Likelihood | Mitigation |
+|---|---|---|---|
+| BMAD output format changes in future versions | Medium | Medium | Adapter uses loose pattern matching, not exact schema; version-pin BMAD reference |
+| Malformed BMAD artifacts (partial workflow state) | Low | Medium | Adapter validates artifact completeness; falls back to freeform PRD path |
+| BMAD `_bmad-output/` not found | Low | Low | Clear error message; `--bmad-project` flag is explicit, not auto-detected |
+| prd-analyzer regression on freeform PRDs | High | Low | Test suite includes both BMAD and freeform PRD fixtures; backward compatibility gate |
+| Context budget exceeded with very large BMAD PRDs | Low | Low | PRD content is truncated at 12K tokens; architecture summary is condensed |
+| BMAD trademark concerns | Low | Low | MIT license permits code use; trademark applies to branding, not API integration |
+| Scope creep into P1/P2 during P0 implementation | Medium | Medium | Strict phase gates; P1/P2 deferred until P0 ships and proves value |
+
+## 7. Integration Architecture (P0)
+
+```
+User's BMAD project
+  _bmad-output/
+    planning-artifacts/
+      product-brief-*.md
+      prd-*.md              <-- Primary input
+      architecture.md       <-- Secondary input
+      epics.md              <-- Story/task source
+    implementation-artifacts/
+      sprint-status.yaml
+      *.story.md
+
+        |
+        v
+
+autonomy/bmad-adapter.py
+  - Discover _bmad-output/ structure
+  - Parse BMAD frontmatter (stepsCompleted, workflowType)
+  - Strip frontmatter, normalize headings
+  - Extract project classification metadata
+  - Feed normalized PRD to prd-analyzer.py
+  - Map epics to .loki/queue/ task format
+
+        |
+        v
+
+autonomy/prd-analyzer.py (enhanced)
+  - New heading patterns for BMAD sections
+  - Architecture.md scoring support
+  - BMAD quality bonus (structured methodology)
+  - Backward-compatible with freeform PRDs
+
+        |
+        v
+
+autonomy/loki --bmad-project <path>
+  - Discovery: find _bmad-output/ in project
+  - Load: run bmad-adapter.py, then prd-analyzer.py
+  - Inject: BMAD metadata into build_prompt()
+  - Queue: BMAD epics/stories into .loki/queue/
+
+        |
+        v
+
+autonomy/run.sh (build_prompt)
+  - BMAD context block injected alongside PRD
+  - Architecture decisions as supplementary context
+  - Epic/story tracking in checklist system
+```
+
+## 8. Recommendations
+
+1. **Proceed with P0 (Epic 1) only.** BMAD Artifact Pipeline.
+2. **Do not embed BMAD engine.** Read BMAD outputs, do not execute BMAD workflows.
+3. **Do not implement voice integration.** Defer to separate initiative.
+4. **Create adapter as standalone Python module.** `autonomy/bmad-adapter.py` -- stdlib only.
+5. **Enhance prd-analyzer.py conservatively.** Add patterns, do not restructure.
+6. **Test with real BMAD output fixtures.** Create test fixtures from BMAD's own templates.
+7. **Gate P1 on P0 success metrics:** At least 5 real projects use `--bmad-project` flag.
@@ -0,0 +1,61 @@
+# BMAD Method x Loki Mode Voice Agent -- Council Analysis
+
+**Date:** 2026-02-25
+**Council:** 3 Opus reviewers (blind review)
+**Verdict:** Unanimous YES with phased approach
+
+## Proposal Summary
+
+Integrate the BMAD Method (https://github.com/bmad-code-org/BMAD-METHOD) with Loki Mode
+to create a structured requirements elicitation pipeline. BMAD provides a 4-phase
+workflow (Analysis, Planning, Solutioning, Implementation) with agent personas, step-file
+architecture, and adversarial review -- complementing Loki Mode's autonomous execution engine.
+
+## Council Findings
+
+### Reviewer 1: Architecture Focus
+
+**Vote:** YES (phased)
+
+- BMAD's step-file architecture aligns with Loki's progressive disclosure model
+- BMAD artifacts (product-brief, PRD, architecture, epics) map cleanly to Loki's SDLC phases
+- Context budget is manageable: BMAD step files load one-at-a-time (~3K tokens each)
+- Integration surface is well-defined: `.loki/queue/`, `build_prompt()`, event bus
+- Risk: BMAD is an external dependency that may change format without notice
+
+### Reviewer 2: Integration Feasibility
+
+**Vote:** YES (phased)
+
+- BMAD PRD output sections (Functional Requirements, Success Criteria, User Journeys)
+  match most of prd-analyzer.py's existing dimension patterns
+- Agent overlap is complementary, not conflicting: BMAD covers pre-development, Loki covers execution
+- voice.sh needs significant extension for structured dialogue (currently only 4-section dictation)
+- MIT license is fully compatible with Loki Mode's distribution
+- Risk: Voice agent layer (P2, Epic 3) has highest uncertainty; STT reliability for technical terms
+
+### Reviewer 3: Risk and Quality
+
+**Vote:** YES (phased, with gates)
+
+- Backward compatibility is achievable: BMAD integration is additive (new flag, new adapter)
+- Existing non-BMAD projects are untouched unless `--bmad-project` is explicitly used
+- Quality gates apply to all new code: 3-reviewer blind review, anti-sycophancy, test coverage
+- BMAD's adversarial review methodology strengthens Loki's existing quality gate system
+- Risk: Scope creep from Epic 2 (engine embedding) and Epic 3 (voice agent) -- phase gates essential
+
+## Recommended Phased Approach
+
+| Phase | Epic | Priority | Risk |
+|-------|-------|----------|------|
+| P0 | BMAD Artifact Pipeline (parse, score, load) | Must-have | Low |
+| P1 | BMAD Engine Embedding (agent YAML parser, step processor) | Should-have | Medium |
+| P2 | Voice Agent Layer (structured dialogue, BMAD-to-voice) | Nice-to-have | High |
+
+## Key Constraints
+
+1. P0 must ship independently and prove value before P1/P2 begin
+2. No runtime dependency on BMAD repo -- adapter reads BMAD output artifacts only
+3. Zero regression on existing non-BMAD workflows
+4. All code must pass existing 9-gate quality system
+5. Context budget: BMAD additions must stay under 15K tokens per iteration
package/mcp/__init__.py
CHANGED
package/mcp/requirements.txt
CHANGED
package/mcp/server.py
CHANGED
@@ -1214,6 +1214,158 @@ async def loki_quality_report() -> str:
         return json.dumps({"error": str(e)})
 
 
+# ============================================================
+# CODE SEARCH - ChromaDB-backed semantic code search
+# ============================================================
+
+# ChromaDB connection (lazy-initialized)
+_chroma_client = None
+_chroma_collection = None
+
+CHROMA_HOST = os.environ.get("LOKI_CHROMA_HOST", "localhost")
+CHROMA_PORT = int(os.environ.get("LOKI_CHROMA_PORT", "8100"))
+CHROMA_COLLECTION = os.environ.get("LOKI_CHROMA_COLLECTION", "loki-codebase")
+
+
+def _get_chroma_collection():
+    """Get or create ChromaDB collection (lazy connection)."""
+    global _chroma_client, _chroma_collection
+    if _chroma_collection is not None:
+        return _chroma_collection
+    try:
+        import chromadb
+        _chroma_client = chromadb.HttpClient(host=CHROMA_HOST, port=int(CHROMA_PORT))
+        _chroma_collection = _chroma_client.get_collection(name=CHROMA_COLLECTION)
+        return _chroma_collection
+    except Exception as e:
+        logger.warning(f"ChromaDB not available: {e}")
+        return None
+
+
+@mcp.tool()
+async def loki_code_search(
+    query: str,
+    n_results: int = 10,
+    language: Optional[str] = None,
+    file_filter: Optional[str] = None,
+    type_filter: Optional[str] = None,
+) -> str:
+    """Search the loki-mode codebase semantically.
+
+    Finds functions, classes, and code sections by meaning, not just keywords.
+    Returns file paths, line numbers, and code snippets ranked by relevance.
+
+    Args:
+        query: Natural language search query (e.g., "rate limit detection",
+            "model selection for RARV tier", "how does the council vote")
+        n_results: Number of results to return (default 10, max 30)
+        language: Filter by language: "shell", "python", "markdown" (optional)
+        file_filter: Filter by file path substring (e.g., "autonomy/", "dashboard/") (optional)
+        type_filter: Filter by chunk type: "function", "class", "header", "section", "file" (optional)
+    """
+    _emit_tool_event_async('loki_code_search', 'start', query=query)
+
+    collection = _get_chroma_collection()
+    if collection is None:
+        return json.dumps({
+            "error": "ChromaDB not available. Start it with: docker start loki-chroma",
+            "hint": "Re-index with: python3.12 tools/index-codebase.py --reset"
+        })
+
+    n_results = min(max(1, n_results), 30)
+
+    # Build where filter
+    where_clauses = []
+    if language:
+        where_clauses.append({"language": language})
+    if type_filter:
+        where_clauses.append({"type": type_filter})
+
+    where = None
+    if len(where_clauses) == 1:
+        where = where_clauses[0]
+    elif len(where_clauses) > 1:
+        where = {"$and": where_clauses}
+
+    try:
+        results = collection.query(
+            query_texts=[query],
+            n_results=n_results,
+            where=where,
+            include=["documents", "metadatas", "distances"],
+        )
+
+        # Format results
+        output = []
+        for i in range(len(results["ids"][0])):
+            meta = results["metadatas"][0][i]
+            doc = results["documents"][0][i]
+            dist = results["distances"][0][i]
+
+            # Apply file_filter post-query (ChromaDB where doesn't support substring match)
+            if file_filter and file_filter not in meta.get("file", ""):
+                continue
+
+            # Truncate document for response
+            preview = doc[:500] + "..." if len(doc) > 500 else doc
+
+            output.append({
+                "file": meta.get("file", ""),
+                "line": meta.get("line", 0),
+                "name": meta.get("name", ""),
+                "type": meta.get("type", ""),
+                "language": meta.get("language", ""),
+                "relevance": round(1 - dist, 4),  # Convert distance to similarity
+                "preview": preview,
+            })
+
+        _emit_tool_event_async('loki_code_search', 'complete',
+                               result_status='success', result_count=len(output))
+        return json.dumps({"query": query, "results": output, "total": len(output)})
+
+    except Exception as e:
+        logger.error(f"Code search failed: {e}")
+        _emit_tool_event_async('loki_code_search', 'complete',
+                               result_status='error', error=str(e))
+        return json.dumps({"error": str(e)})
+
+
+@mcp.tool()
+async def loki_code_search_stats() -> str:
+    """Get statistics about the code search index.
+
+    Shows total chunks, files indexed, breakdown by language and type.
+    Useful for verifying the index is up to date.
+    """
+    collection = _get_chroma_collection()
+    if collection is None:
+        return json.dumps({"error": "ChromaDB not available"})
+
+    try:
+        count = collection.count()
+        results = collection.get(limit=count, include=["metadatas"])
+
+        langs = {}
+        types = {}
+        files = set()
+        for meta in results["metadatas"]:
+            lang = meta.get("language", "unknown")
+            typ = meta.get("type", "unknown")
+            langs[lang] = langs.get(lang, 0) + 1
+            types[typ] = types.get(typ, 0) + 1
+            files.add(meta.get("file", ""))
+
+        return json.dumps({
+            "total_chunks": count,
+            "unique_files": len(files),
+            "by_language": langs,
+            "by_type": types,
+            "reindex_command": "python3.12 tools/index-codebase.py --reset",
+        })
+    except Exception as e:
+        return json.dumps({"error": str(e)})
+
+
 # ============================================================
 # PROMPTS - Pre-built prompt templates
 # ============================================================
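The filter handling in `loki_code_search` can be exercised without a running ChromaDB server. The sketch below lifts that logic into two standalone helpers; `build_where` and `apply_file_filter` are hypothetical names that do not exist in `server.py`, and the sample metadata is invented for illustration.

```python
from typing import Any, Optional


def build_where(language: Optional[str] = None,
                type_filter: Optional[str] = None) -> Optional[dict[str, Any]]:
    """Combine optional metadata filters into a ChromaDB-style `where` dict."""
    clauses: list[dict[str, Any]] = []
    if language:
        clauses.append({"language": language})
    if type_filter:
        clauses.append({"type": type_filter})
    if not clauses:
        return None            # no metadata filtering
    if len(clauses) == 1:
        return clauses[0]      # a single clause needs no operator
    return {"$and": clauses}   # several clauses are wrapped in $and


def apply_file_filter(metadatas: list[dict[str, Any]],
                      file_filter: Optional[str]) -> list[dict[str, Any]]:
    """Substring match on the 'file' field, applied after the query returns."""
    if not file_filter:
        return metadatas
    return [m for m in metadatas if file_filter in m.get("file", "")]


# Invented sample metadata standing in for query results.
hits = [
    {"file": "autonomy/run.sh", "language": "shell"},
    {"file": "mcp/server.py", "language": "python"},
]
print(build_where("python", "function"))
print(apply_file_filter(hits, "autonomy/"))
```

Because the file filter runs after the query, a search can return fewer than `n_results` entries even when more matching chunks exist in the index; callers that rely on `file_filter` may want to request a larger `n_results`.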
package/package.json
CHANGED
@@ -0,0 +1,21 @@
+# Cluster Templates
+
+Pre-built workflow topologies for common development patterns.
+
+## Usage
+
+    loki cluster list                    # List available templates
+    loki cluster validate <template>     # Validate topology
+    loki cluster run <template> [args]   # Execute workflow
+
+## Template Format
+
+Each template is a JSON file defining agents, their pub/sub topics,
+and the workflow topology.
+
+## Available Templates
+
+- security-review: Multi-agent security audit pipeline
+- performance-audit: Performance analysis with profiling
+- refactoring: Structured refactoring with test preservation
+- code-review: Multi-reviewer code review process
@@ -0,0 +1,36 @@
+{
+  "name": "Code Review",
+  "description": "Multi-reviewer code review with conflict resolution",
+  "version": "1.0.0",
+  "topology": "fan-out-fan-in",
+  "agents": [
+    {
+      "id": "reviewer-arch",
+      "type": "review-code",
+      "role": "Review architecture and design patterns",
+      "subscribes": ["task.start"],
+      "publishes": ["review.architecture"]
+    },
+    {
+      "id": "reviewer-security",
+      "type": "review-security",
+      "role": "Review security implications",
+      "subscribes": ["task.start"],
+      "publishes": ["review.security"]
+    },
+    {
+      "id": "reviewer-tests",
+      "type": "eng-qa",
+      "role": "Review test coverage and quality",
+      "subscribes": ["task.start"],
+      "publishes": ["review.tests"]
+    },
+    {
+      "id": "resolver",
+      "type": "review-code",
+      "role": "Synthesize all review findings and resolve conflicts",
+      "subscribes": ["review.architecture", "review.security", "review.tests"],
+      "publishes": ["task.complete"]
+    }
+  ]
+}
@@ -0,0 +1,29 @@
+{
+  "name": "Performance Audit",
+  "description": "Performance analysis with profiling and optimization recommendations",
+  "version": "1.0.0",
+  "topology": "fan-out-fan-in",
+  "agents": [
+    {
+      "id": "profiler",
+      "type": "eng-perf",
+      "role": "Profile application and identify bottlenecks",
+      "subscribes": ["task.start"],
+      "publishes": ["profile.results"]
+    },
+    {
+      "id": "db-analyzer",
+      "type": "eng-database",
+      "role": "Analyze database queries and index usage",
+      "subscribes": ["task.start"],
+      "publishes": ["db.analysis"]
+    },
+    {
+      "id": "optimizer",
+      "type": "eng-perf",
+      "role": "Generate optimization recommendations from all analyses",
+      "subscribes": ["profile.results", "db.analysis"],
+      "publishes": ["task.complete"]
+    }
+  ]
+}
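The pub/sub wiring in these cluster templates lends itself to a simple consistency check. The sketch below is one plausible reading, not the behavior of `loki cluster validate`, which is not shown in this diff. It assumes `task.start` is supplied by the runner, that every other subscribed topic must be published by some agent, and that some agent must publish `task.complete`. The `check_template` helper is hypothetical.

```python
import json

# Shortened copy of the performance-audit template above ("role" fields omitted).
TEMPLATE = json.loads("""
{
  "name": "Performance Audit",
  "topology": "fan-out-fan-in",
  "agents": [
    {"id": "profiler", "subscribes": ["task.start"], "publishes": ["profile.results"]},
    {"id": "db-analyzer", "subscribes": ["task.start"], "publishes": ["db.analysis"]},
    {"id": "optimizer", "subscribes": ["profile.results", "db.analysis"],
     "publishes": ["task.complete"]}
  ]
}
""")


def check_template(template: dict) -> list[str]:
    """Return a list of wiring problems; an empty list means the topology is consistent."""
    problems: list[str] = []
    agents = template.get("agents", [])
    ids = [agent.get("id") for agent in agents]
    if len(ids) != len(set(ids)):
        problems.append("duplicate agent id")
    published = {topic for agent in agents for topic in agent.get("publishes", [])}
    for agent in agents:
        for topic in agent.get("subscribes", []):
            # Assumption: task.start comes from the runner, not from an agent.
            if topic != "task.start" and topic not in published:
                problems.append(f"{agent.get('id')} subscribes to unpublished topic {topic}")
    if "task.complete" not in published:
        problems.append("no agent publishes task.complete")
    return problems


print(check_template(TEMPLATE))
```

In a fan-out-fan-in topology, a check like this catches the most likely editing mistake: renaming a topic on the publishing side while the aggregating agent still subscribes to the old name.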