milens 0.6.3 → 0.6.5
- package/.agents/skills/adapters/SKILL.md +51 -0
- package/.agents/skills/analyzer/SKILL.md +77 -0
- package/.agents/skills/apps/SKILL.md +66 -0
- package/.agents/skills/docs/SKILL.md +73 -0
- package/.agents/skills/milens/SKILL.md +198 -0
- package/.agents/skills/milens-architect/SKILL.md +128 -0
- package/.agents/skills/milens-code-review/SKILL.md +186 -0
- package/.agents/skills/milens-debugger/SKILL.md +141 -0
- package/.agents/skills/milens-eval/SKILL.md +221 -0
- package/.agents/skills/milens-plan/SKILL.md +227 -0
- package/.agents/skills/milens-refactor-clean/SKILL.md +209 -0
- package/.agents/skills/milens-security-review/SKILL.md +224 -0
- package/.agents/skills/milens-tdd/SKILL.md +156 -0
- package/.agents/skills/orchestrator/SKILL.md +59 -0
- package/.agents/skills/parser/SKILL.md +81 -0
- package/.agents/skills/root/SKILL.md +86 -0
- package/.agents/skills/scripts/SKILL.md +45 -0
- package/.agents/skills/security/SKILL.md +65 -0
- package/.agents/skills/server/SKILL.md +72 -0
- package/.agents/skills/store/SKILL.md +75 -0
- package/.agents/skills/test/SKILL.md +121 -0
- package/LICENSE +21 -75
- package/README.md +356 -453
- package/adapters/README.md +107 -0
- package/adapters/claude-code/.claude/mcp.json +9 -0
- package/adapters/claude-code/CLAUDE.md +58 -0
- package/adapters/codex/.codex/codex.md +52 -0
- package/adapters/copilot/.github/copilot-instructions.md +62 -0
- package/adapters/cursor/.cursorrules +9 -0
- package/adapters/gemini/.gemini/context.md +58 -0
- package/adapters/opencode/.opencode/config.json +9 -0
- package/adapters/opencode/AGENTS.md +58 -0
- package/adapters/zed/.zed/settings.json +8 -0
- package/dist/agents-md.d.ts +3 -0
- package/dist/agents-md.d.ts.map +1 -0
- package/dist/agents-md.js +114 -0
- package/dist/agents-md.js.map +1 -0
- package/dist/analyzer/engine.d.ts +1 -0
- package/dist/analyzer/engine.d.ts.map +1 -1
- package/dist/analyzer/engine.js +37 -7
- package/dist/analyzer/engine.js.map +1 -1
- package/dist/cli.js +1472 -406
- package/dist/cli.js.map +1 -1
- package/dist/metrics.d.ts +51 -0
- package/dist/metrics.d.ts.map +1 -0
- package/dist/metrics.js +64 -0
- package/dist/metrics.js.map +1 -0
- package/dist/orchestrator/orchestrator.d.ts +65 -0
- package/dist/orchestrator/orchestrator.d.ts.map +1 -0
- package/dist/orchestrator/orchestrator.js +178 -0
- package/dist/orchestrator/orchestrator.js.map +1 -0
- package/dist/orchestrator/reporter.d.ts +15 -0
- package/dist/orchestrator/reporter.d.ts.map +1 -0
- package/dist/orchestrator/reporter.js +38 -0
- package/dist/orchestrator/reporter.js.map +1 -0
- package/dist/parser/lang-go.js +47 -47
- package/dist/parser/lang-java.js +29 -29
- package/dist/parser/lang-js.js +105 -105
- package/dist/parser/lang-php.js +38 -38
- package/dist/parser/lang-py.js +34 -34
- package/dist/parser/lang-ruby.js +14 -14
- package/dist/parser/lang-rust.js +30 -30
- package/dist/parser/lang-ts.js +191 -191
- package/dist/security/deps.d.ts +38 -0
- package/dist/security/deps.d.ts.map +1 -0
- package/dist/security/deps.js +685 -0
- package/dist/security/deps.js.map +1 -0
- package/dist/security/rules.d.ts +42 -0
- package/dist/security/rules.d.ts.map +1 -0
- package/dist/security/rules.js +943 -0
- package/dist/security/rules.js.map +1 -0
- package/dist/server/hooks.d.ts +29 -0
- package/dist/server/hooks.d.ts.map +1 -0
- package/dist/server/hooks.js +332 -0
- package/dist/server/hooks.js.map +1 -0
- package/dist/server/mcp-prompts.d.ts +277 -0
- package/dist/server/mcp-prompts.d.ts.map +1 -0
- package/dist/server/mcp-prompts.js +627 -0
- package/dist/server/mcp-prompts.js.map +1 -0
- package/dist/server/mcp.d.ts.map +1 -1
- package/dist/server/mcp.js +1030 -652
- package/dist/server/mcp.js.map +1 -1
- package/dist/server/test-plan.d.ts +20 -0
- package/dist/server/test-plan.d.ts.map +1 -0
- package/dist/server/test-plan.js +100 -0
- package/dist/server/test-plan.js.map +1 -0
- package/dist/server/watcher.d.ts +39 -0
- package/dist/server/watcher.d.ts.map +1 -0
- package/dist/server/watcher.js +134 -0
- package/dist/server/watcher.js.map +1 -0
- package/dist/skills.js +197 -153
- package/dist/skills.js.map +1 -1
- package/dist/store/annotations.d.ts +41 -0
- package/dist/store/annotations.d.ts.map +1 -0
- package/dist/store/annotations.js +195 -0
- package/dist/store/annotations.js.map +1 -0
- package/dist/store/confidence.d.ts +28 -0
- package/dist/store/confidence.d.ts.map +1 -0
- package/dist/store/confidence.js +109 -0
- package/dist/store/confidence.js.map +1 -0
- package/dist/store/db.d.ts +53 -14
- package/dist/store/db.d.ts.map +1 -1
- package/dist/store/db.js +447 -240
- package/dist/store/db.js.map +1 -1
- package/dist/store/schema.sql +143 -116
- package/dist/store/vectors.js +2 -2
- package/dist/types.d.ts +101 -0
- package/dist/types.d.ts.map +1 -1
- package/docs/README.md +22 -0
- package/package.json +81 -66
- package/dist/gateway/analyzer.d.ts +0 -6
- package/dist/gateway/analyzer.d.ts.map +0 -1
- package/dist/gateway/analyzer.js +0 -218
- package/dist/gateway/analyzer.js.map +0 -1
- package/dist/gateway/cache.d.ts +0 -35
- package/dist/gateway/cache.d.ts.map +0 -1
- package/dist/gateway/cache.js +0 -175
- package/dist/gateway/cache.js.map +0 -1
- package/dist/gateway/config.d.ts +0 -10
- package/dist/gateway/config.d.ts.map +0 -1
- package/dist/gateway/config.js +0 -167
- package/dist/gateway/config.js.map +0 -1
- package/dist/gateway/context-memory.d.ts +0 -68
- package/dist/gateway/context-memory.d.ts.map +0 -1
- package/dist/gateway/context-memory.js +0 -157
- package/dist/gateway/context-memory.js.map +0 -1
- package/dist/gateway/observability.d.ts +0 -83
- package/dist/gateway/observability.d.ts.map +0 -1
- package/dist/gateway/observability.js +0 -152
- package/dist/gateway/observability.js.map +0 -1
- package/dist/gateway/privacy.d.ts +0 -27
- package/dist/gateway/privacy.d.ts.map +0 -1
- package/dist/gateway/privacy.js +0 -139
- package/dist/gateway/privacy.js.map +0 -1
- package/dist/gateway/providers.d.ts +0 -66
- package/dist/gateway/providers.d.ts.map +0 -1
- package/dist/gateway/providers.js +0 -377
- package/dist/gateway/providers.js.map +0 -1
- package/dist/gateway/router.d.ts +0 -18
- package/dist/gateway/router.d.ts.map +0 -1
- package/dist/gateway/router.js +0 -102
- package/dist/gateway/router.js.map +0 -1
- package/dist/gateway/server.d.ts +0 -20
- package/dist/gateway/server.d.ts.map +0 -1
- package/dist/gateway/server.js +0 -387
- package/dist/gateway/server.js.map +0 -1
- package/dist/gateway/translator.d.ts +0 -19
- package/dist/gateway/translator.d.ts.map +0 -1
- package/dist/gateway/translator.js +0 -340
- package/dist/gateway/translator.js.map +0 -1
- package/dist/gateway/types.d.ts +0 -215
- package/dist/gateway/types.d.ts.map +0 -1
- package/dist/gateway/types.js +0 -3
- package/dist/gateway/types.js.map +0 -1
- package/dist/store/gateway-schema.sql +0 -53
@@ -0,0 +1,186 @@
+---
+name: milens-code-review
+description: Automated Code Review — PR analysis, symbol deep-dive, dead code detection, tech debt grep, and review report
+---
+
+# milens-code-review — Automated Code Review
+
+Analyze a PR or local changes with multi-layered review: risk scoring, symbol deep-dives, dead code audit, tech debt search, and a structured report.
+
+## Tools Required
+
+| Tool | Purpose |
+|---|---|
+| `mcp_milens_review_pr` | List changed files + affected symbols with risk scores |
+| `mcp_milens_review_symbol` | Deep-dive on a single symbol (role, heat, dependents, test status) |
+| `mcp_milens_context` | Incoming references + outgoing dependencies |
+| `mcp_milens_find_dead_code` | Find exported symbols with zero references |
+| `mcp_milens_grep` | Text search for tech debt markers and dangerous patterns |
+
+> **CRITICAL:** All milens MCP tool calls MUST include the `repo` parameter set to the **absolute path of the workspace root**.
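The `repo` requirement can be enforced before any call is made. The sketch below is only an illustration: `withRepo` and `isAbsolutePath` are hypothetical helpers, not part of milens, and the absolute-path check is a simple heuristic for POSIX and Windows drive-letter paths.

```typescript
// Hypothetical helpers, not part of milens: they only build the argument object.
type ToolArgs = Record<string, unknown>;

function isAbsolutePath(candidate: string): boolean {
  // POSIX absolute paths start with "/"; Windows ones with a drive letter such as "C:\".
  return candidate.charAt(0) === "/" || /^[A-Za-z]:[\\/]/.test(candidate);
}

// Merges the tool-specific arguments with a validated `repo` value.
function withRepo(workspaceRoot: string, args: ToolArgs = {}): ToolArgs {
  if (!isAbsolutePath(workspaceRoot)) {
    throw new Error(`repo must be an absolute path, got "${workspaceRoot}"`);
  }
  return { ...args, repo: workspaceRoot };
}

// Arguments for a symbol-level call, using the path from the example session.
const symbolArgs = withRepo("/home/user/project", { name: "authenticate" });
```

Passing a relative path such as `"./project"` throws instead of producing a call that omits or mangles `repo`.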
+
+## Workflow
+
+### Step 1: PR Overview
+
+Start with the full PR-level assessment.
+
+```
+mcp_milens_review_pr({repo: "<workspaceRoot>"})
+```
+
+Review the output for:
+- **Changed files** — what was touched
+- **Affected symbols** — each with a risk score (CRITICAL, HIGH, MEDIUM, LOW)
+- **Risk summary** — distribution of risk levels across the change
+
+### Step 2: Deep-Dive on High-Risk Symbols
+
+For every symbol rated CRITICAL or HIGH, perform a deep-dive.
+
+```
+mcp_milens_review_symbol({name: "<symbolName>", repo: "<workspaceRoot>"})
+```
+
+What you get:
+- **Role** — what the symbol does in the architecture
+- **Heat** — frequency of changes, complexity, fan-in/fan-out
+- **Dependents count** — how many things break if this is wrong
+- **Test status** — whether it has test coverage
+
+### Step 3: Context for High-Risk Symbols
+
+Still for CRITICAL/HIGH symbols, get 360° context.
+
+```
+mcp_milens_context({name: "<symbolName>", repo: "<workspaceRoot>"})
+```
+
+This reveals:
+- **Incoming references** — who depends on this symbol
+- **Outgoing calls** — what this symbol depends on
+- **Re-export chains** — if the symbol is re-exported through barrel files
+
+### Step 4: Dead Code Audit
+
+Check for exported symbols with zero references (potential dead code).
+
+```
+mcp_milens_find_dead_code({repo: "<workspaceRoot>", limit: 20})
+```
+
+Flag any dead code that is:
+- Newly introduced by this PR
+- In files touched by this PR
+- Adjacent to changed code (same module)
+
+### Step 5: Tech Debt Search
+
+Search for tech debt markers and debugging remnants.
+
+```
+mcp_milens_grep({pattern: "TODO|FIXME|HACK|console\\.(log|debug)", scope: "code", repo: "<workspaceRoot>"})
+```
+
+Also check for:
+- Commented-out code
+- `@ts-ignore` or `@ts-expect-error` (TypeScript)
+- `# noqa` or `# type: ignore` (Python)
+- Empty catch blocks
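To see what the Step 5 pattern matches, it can be tried locally on a few lines. This is a minimal sketch, assuming the call notation follows JavaScript/JSON string escaping, so that `\\.` in the string is the regular expression `\.` (a literal dot). `findTechDebtLines` and the sample source are made up for illustration and say nothing about how `mcp_milens_grep` is implemented.

```typescript
// "console\\.(log|debug)" in a string literal is the regex console\.(log|debug).
const techDebtPattern = new RegExp("TODO|FIXME|HACK|console\\.(log|debug)");

// Hypothetical helper: returns the 1-based numbers of the lines that match.
function findTechDebtLines(source: string): number[] {
  const hits: number[] = [];
  source.split("\n").forEach((line, index) => {
    if (techDebtPattern.test(line)) {
      hits.push(index + 1);
    }
  });
  return hits;
}

const sampleSource = [
  "export function login() {",
  "  // TODO: rotate the signing key",
  "  console.log('token issued');",
  "  consoleXlog();",
  "}",
].join("\n");

const techDebtHits = findTechDebtLines(sampleSource); // [2, 3]
```

Line 4 is not reported because the escaped dot only matches a literal `.`, which is the reason the backslash is needed in the pattern.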
+
+### Step 6: Produce Review Report
+
+Consolidate findings into a review report with these sections:
+
+1. **Summary** — files changed, risk distribution
+2. **Symbol Deep-Dives** — one entry per CRITICAL/HIGH symbol with findings from Steps 2-3
+3. **Dead Code** — symbols flagged from Step 4, with recommendations
+4. **Tech Debt** — markers found in Step 5, with severity
+5. **Recommendations** — prioritized list of actions (blocking vs. non-blocking)
+6. **Verdict** — APPROVED / NEEDS CHANGES / BLOCKED
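The six report sections can be captured in a small data structure so that an incomplete report is easy to spot. The field names and the `missingSections` helper below are illustrative assumptions; the skill itself only names the sections and the three verdict values.

```typescript
type ReviewVerdict = "APPROVED" | "NEEDS CHANGES" | "BLOCKED";

// Field names are illustrative; only the six section titles come from the skill.
interface ReviewReport {
  summary: string[];
  symbolDeepDives: string[];
  deadCode: string[];
  techDebt: string[];
  recommendations: string[];
  verdict: ReviewVerdict | null;
}

// Returns the titles of sections that are still empty.
function missingSections(report: ReviewReport): string[] {
  const listSections: Array<[string, string[]]> = [
    ["Summary", report.summary],
    ["Symbol Deep-Dives", report.symbolDeepDives],
    ["Dead Code", report.deadCode],
    ["Tech Debt", report.techDebt],
    ["Recommendations", report.recommendations],
  ];
  const missing = listSections
    .filter((pair) => pair[1].length === 0)
    .map((pair) => pair[0]);
  if (report.verdict === null) {
    missing.push("Verdict");
  }
  return missing;
}

const draftReport: ReviewReport = {
  summary: ["4 files changed; 1 CRITICAL, 1 HIGH, 1 MEDIUM, 1 LOW"],
  symbolDeepDives: ["authenticate: 15 dependents, no tests"],
  deadCode: ["legacyAuthCheck: 0 references, candidate for removal"],
  techDebt: [],
  recommendations: ["Add tests for authenticate (blocking)"],
  verdict: null,
};

const stillMissing = missingSections(draftReport); // ["Tech Debt", "Verdict"]
```

In this sketch an empty list counts as unfilled, so a section with nothing to report would need an explicit entry such as "none found".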
+
+## Example Session
+
+### Input
+
+```
+"review the PR for branch feature/auth-refactor"
+```
+
+### Tool Calls
+
+**Step 1 — PR overview:**
+```
+mcp_milens_review_pr({repo: "/home/user/project"})
+```
+
+**Output:**
+```
+Changed files (4):
+  src/auth/login.ts [modified]
+  src/auth/tokens.ts [new]
+  src/auth/types.ts [modified]
+  src/middleware/auth.ts [modified]
+
+Affected symbols:
+  authenticate [function] CRITICAL — 15 dependents, no tests
+  generateToken [function] HIGH — 8 dependents, 1 test
+  UserSession [interface] MEDIUM — 5 dependents
+  validateRole [function] LOW — 2 dependents
+```
+
+**Step 2 — Deep-dive on CRITICAL symbol:**
+```
+mcp_milens_review_symbol({name: "authenticate", repo: "/home/user/project"})
+```
+
+**Output:**
+```
+Symbol: authenticate [function] — CRITICAL
+Role: core auth entry point, all protected routes depend on it
+Heat: 23 commits in last 90 days, 87% modification rate
+Dependents: 15 (12 direct callers + 3 middleware wrappers)
+Test status: 0 tests, 0% coverage
+Complexity: high (cyclomatic 14, 5 branching paths)
+```
+
+**Step 3 — Context:**
+```
+mcp_milens_context({name: "authenticate", repo: "/home/user/project"})
+```
+
+**Output:** 15 incoming refs (routes, middleware), 4 outgoing deps (token validation, user lookup, logger, rate limiter)
+
+**Step 4 — Dead code:**
+```
+mcp_milens_find_dead_code({repo: "/home/user/project", limit: 20})
+```
+
+**Output:** `legacyAuthCheck` in the same module has 0 references — candidate for removal.
+
+**Step 5 — Tech debt:**
+```
+mcp_milens_grep({pattern: "TODO|FIXME|HACK|console\\.(log|debug)", scope: "code", repo: "/home/user/project"})
+```
+
+**Output:** 3 matches — 1 `console.log` in tokens.ts, 2 `TODO` comments in login.ts.
+
+**Step 6 — Report produced** (see report format above).
+
+## Best Practices
+
+1. **Don't skip the deep-dive.** CRITICAL symbols demand `review_symbol` + `context` — surface-level review misses cascading failures.
+2. **Dead code in the diff is a red flag.** If a new symbol has zero references, question whether it belongs in this PR.
+3. **Tech debt in new code is a blocker.** `console.log`, `FIXME`, or debug code should not ship. Flag it as blocking.
+4. **Risk score drives review depth.** HIGH/CRITICAL = full deep-dive. MEDIUM = quick context check. LOW = skip unless it's a new file.
+5. **Produce a written report.** Don't just run the tools — consolidate findings into a review document the team can reference.
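Best practice 4 amounts to a small decision rule, sketched below. The function is hypothetical. The skill also does not say how deep to go for a LOW symbol in a new file, so the "quick context check" chosen for that case is an assumption.

```typescript
type RiskLevel = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";
type ReviewDepth = "full deep-dive" | "quick context check" | "skip";

// Maps a symbol's risk score to the review depth named in best practice 4.
function reviewDepth(risk: RiskLevel, inNewFile: boolean): ReviewDepth {
  if (risk === "CRITICAL" || risk === "HIGH") {
    return "full deep-dive";
  }
  if (risk === "MEDIUM") {
    return "quick context check";
  }
  // LOW: skipped unless the symbol lives in a new file.
  // Assumption: a new-file LOW symbol gets the same treatment as MEDIUM.
  return inNewFile ? "quick context check" : "skip";
}

const depthForAuthenticate = reviewDepth("CRITICAL", false); // "full deep-dive"
const depthForValidateRole = reviewDepth("LOW", false);      // "skip"
```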
+
+## Quality Gate
+
+| Criteria | Pass | Fail |
+|---|---|---|
+| PR review tool | Review completes without errors | Tool fails or returns partial data |
+| Risk assessment | All CRITICAL/HIGH symbols deep-dived | Any CRITICAL symbol skipped |
+| Dead code check | No newly introduced dead code | New dead code found in changed files |
+| Tech debt grep | No debug code (`console.log`, etc.) | Debug/leftover code in diff |
+| Report completeness | All 6 report sections filled | Missing sections or incomplete findings |
@@ -0,0 +1,141 @@
+---
+name: milens-debugger
+description: Root Cause Analysis — execution trace, blast radius, dependency paths, deep context, and ranked hypotheses with fix suggestions
+---
+
+# milens-debugger — Root Cause Analysis
+
+Debug issues by tracing execution flow, analyzing blast radius, exploring dependency paths, and producing ranked root cause hypotheses with fix suggestions and regression risk assessment.
+
+## Tools Required
+
+| Tool | Purpose |
+|---|---|
+| `mcp_milens_trace` | Execution flow from entrypoints to target (or reverse) |
+| `mcp_milens_context` | 360° symbol view: incoming refs + outgoing deps |
+| `mcp_milens_impact` | Blast radius: what breaks if target changes |
+| `mcp_milens_explain_relationship` | Shortest path between two symbols |
+| `mcp_milens_smart_context` | Intent-aware context for debug intent |
+| `mcp_milens_overview` | Combined context + impact + grep in one call |
+| `mcp_milens_grep` | Text search across all files (templates, configs, error messages) |
+| `mcp_milens_review_symbol` | Deep-dive single symbol risk assessment |
+| `mcp_milens_get_type_hierarchy` | Class inheritance chain |
+
+> **CRITICAL:** All milens MCP tool calls MUST include the `repo` parameter set to the **absolute path of the workspace root**.
+
+## Workflow
+
+### Step 1: Understand the Target
+
+Get deep context on the problematic symbol.
+
+```
+mcp_milens_smart_context({name: "<targetSymbol>", intent: "debug", repo: "<workspaceRoot>"})
+```
+
+This returns execution paths, data flow, dependencies, and test coverage in one call.
+
+### Step 2: Trace Execution
+
+Map how code reaches the target from entrypoints.
+
+```
+mcp_milens_trace({name: "<targetSymbol>", direction: "to", repo: "<workspaceRoot>"})
+```
+
+For downstream impact, trace forward:
+
+```
+mcp_milens_trace({name: "<targetSymbol>", direction: "from", repo: "<workspaceRoot>"})
+```
+
+### Step 3: Blast Radius
+
+Understand what breaks if the target is modified.
+
+```
+mcp_milens_impact({target: "<targetSymbol>", depth: 3, repo: "<workspaceRoot>"})
+```
+
+Pay special attention to depth-1 dependents — these WILL break.
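A depth-limited blast radius can be pictured as a breadth-first walk over "who depends on whom" edges, grouped by distance from the target. The sketch below illustrates that idea only and is not how `mcp_milens_impact` is implemented. The edge list and every symbol name in it are invented.

```typescript
// Each pair reads [symbol, dependent]: "dependent" uses "symbol" directly.
type DependencyEdge = [string, string];

// Returns dependents grouped by depth: index 0 holds depth-1 dependents, and so on.
function blastRadius(edges: DependencyEdge[], target: string, maxDepth: number): string[][] {
  const visited: string[] = [target];
  const levels: string[][] = [];
  let frontier: string[] = [target];
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const [symbol, dependent] of edges) {
      if (frontier.indexOf(symbol) !== -1 && visited.indexOf(dependent) === -1) {
        visited.push(dependent);
        next.push(dependent);
      }
    }
    if (next.length > 0) {
      levels.push(next);
    }
    frontier = next;
  }
  return levels;
}

// Invented dependency edges for illustration.
const sampleEdges: DependencyEdge[] = [
  ["authenticate", "requireAuth"],
  ["authenticate", "loginRoute"],
  ["requireAuth", "adminRouter"],
  ["adminRouter", "app"],
  ["app", "server"],
];

const radius = blastRadius(sampleEdges, "authenticate", 3);
// [["requireAuth", "loginRoute"], ["adminRouter"], ["app"]]
```

The first group is the depth-1 set that the step singles out. `server` sits at depth 4 and is cut off by `maxDepth`.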
+
+### Step 4: Dependency Paths
+
+For each suspect dependency, trace the exact connection path.
+
+```
+mcp_milens_explain_relationship({from: "<suspect>", to: "<targetSymbol>", repo: "<workspaceRoot>"})
+```
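The tool table describes `mcp_milens_explain_relationship` as returning the shortest path between two symbols. A breadth-first search is one common way to compute such a path, sketched here under the assumption that edges are directed and followed from `from` towards `to`. The graph and symbol names are invented, and the real tool may treat direction differently.

```typescript
// Each pair reads [source, destination]; the names are made up for illustration.
type CallEdge = [string, string];

// Breadth-first search: the first path that reaches `to` is a shortest one.
function shortestPath(edges: CallEdge[], from: string, to: string): string[] | null {
  const queue: string[][] = [[from]];
  const visited: string[] = [from];
  while (queue.length > 0) {
    const currentPath = queue.shift() as string[];
    const last = currentPath[currentPath.length - 1];
    if (last === to) {
      return currentPath;
    }
    for (const [source, destination] of edges) {
      if (source === last && visited.indexOf(destination) === -1) {
        visited.push(destination);
        queue.push(currentPath.concat(destination));
      }
    }
  }
  return null; // no connection in this direction
}

const callEdges: CallEdge[] = [
  ["parseConfig", "loadSettings"],
  ["parseConfig", "initLogger"],
  ["loadSettings", "createClient"],
  ["createClient", "authenticate"],
  ["loadSettings", "authenticate"],
];

const connection = shortestPath(callEdges, "parseConfig", "authenticate");
// ["parseConfig", "loadSettings", "authenticate"]
```

The longer route through `createClient` also exists, but the search returns the two-hop path because it is reached first.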
+
+### Step 5: Full Context View
+
+Get a 360° view of the target and its immediate neighbors.
+
+```
+mcp_milens_context({name: "<targetSymbol>", repo: "<workspaceRoot>"})
+```
+
+Check both incoming (who calls this) and outgoing (what this calls).
+
+### Step 6: Text Search for Clues
+
+Search for error messages, TODOs, or related patterns.
+
+```
+mcp_milens_grep({pattern: "<errorMessage or keyword>", repo: "<workspaceRoot>"})
+```
+
+Use `include` to narrow scope (e.g., `"**/*.ts"`, `"**/*.md"`).
+
+### Step 7: Class Hierarchy
+
+If the target is a class, check its inheritance chain.
+
+```
+mcp_milens_get_type_hierarchy({name: "<ClassName>", repo: "<workspaceRoot>"})
+```
+
+### Step 8: Risk Assessment
+
+Deep-dive the target's risk profile.
+
+```
+mcp_milens_review_symbol({name: "<targetSymbol>", repo: "<workspaceRoot>"})
+```
+
+### Step 9: Root Cause Hypotheses
+
+Rank hypotheses by likelihood:
+
+| # | Hypothesis | Evidence | Confidence | Suggested Fix | Regression Risk |
+|---|---|---|---|---|---|
+| 1 | | | HIGH/MEDIUM/LOW | | |
+| 2 | | | | | |
+| 3 | | | | | |
+
+Each hypothesis MUST cite specific tools and their output as evidence.
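The hypothesis table maps naturally onto a record type, and the evidence rule can be checked before ranking. This is a sketch under assumptions: the field names mirror the table columns, while `rankHypotheses`, the sample hypotheses, and the evidence strings are invented.

```typescript
type Confidence = "HIGH" | "MEDIUM" | "LOW";

// One row of the hypothesis table; field names mirror its columns.
interface Hypothesis {
  hypothesis: string;
  evidence: string[]; // cited tool outputs
  confidence: Confidence;
  suggestedFix: string;
  regressionRisk: string;
}

const confidenceRank: Record<Confidence, number> = { HIGH: 0, MEDIUM: 1, LOW: 2 };

// Rejects hypotheses without evidence, then orders the rest from HIGH to LOW.
function rankHypotheses(candidates: Hypothesis[]): Hypothesis[] {
  for (const candidate of candidates) {
    if (candidate.evidence.length === 0) {
      throw new Error(`hypothesis "${candidate.hypothesis}" cites no tool output`);
    }
  }
  // slice() copies first so the caller's array is left untouched.
  return candidates
    .slice()
    .sort((a, b) => confidenceRank[a.confidence] - confidenceRank[b.confidence]);
}

// Invented rows for illustration.
const rankedHypotheses = rankHypotheses([
  {
    hypothesis: "Stale cache entry",
    evidence: ["mcp_milens_grep: cache key built without tenant id"],
    confidence: "LOW",
    suggestedFix: "Include tenant id in the key",
    regressionRisk: "LOW",
  },
  {
    hypothesis: "Token expiry compared in the wrong unit",
    evidence: ["mcp_milens_trace: loginRoute -> authenticate -> isExpired"],
    confidence: "HIGH",
    suggestedFix: "Compare in milliseconds",
    regressionRisk: "MEDIUM",
  },
  {
    hypothesis: "Middleware order changed",
    evidence: ["mcp_milens_context: 3 middleware wrappers call authenticate"],
    confidence: "MEDIUM",
    suggestedFix: "Restore registration order",
    regressionRisk: "LOW",
  },
]);
```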
+
+## Quality Gate
+
+### PASS if:
+- [x] Execution trace complete (both directions)
+- [x] Blast radius analyzed to depth 3
+- [x] At least 3 dependency paths traced
+- [x] Full 360° context captured
+- [x] Text search performed for error messages/keywords
+- [x] At least 3 ranked hypotheses with evidence citations
+- [x] Fix suggestions include regression risk assessment
+
+### FAIL if:
+- [ ] Any step skipped without documented reason
+- [ ] Hypotheses lack tool evidence citations
+- [ ] Fix suggestions have no regression risk assessment
+- [ ] Only one hypothesis considered (lack of thoroughness)
+
+## Never Skip
+
+1. Always run `mcp_milens_smart_context` first — it is the most efficient starting point for debugging
+2. Never skip blast radius analysis — a fix that introduces new breakage is not a fix
+3. Always cite specific tool outputs as evidence for each hypothesis
+4. Always include regression risk in fix suggestions
+5. If dead code is suspected, run `mcp_milens_find_dead_code` to check for unused symbols in the trace path
@@ -0,0 +1,221 @@
+---
+name: milens-eval
+description: Evaluation & Quality Gate — plan, implement, verify scope, review risk, run tests, check coverage, and pass/fail
+---
+
+# milens-eval — Evaluation & Quality Gate
+
+A structured quality gate for feature development: plan tests, implement changes, verify scope, review risks, run impacted tests, check coverage gaps, and produce a pass/fail verdict.
+
+## Tools Required
+
+| Tool | Purpose |
+|---|---|
+| `mcp_milens_test_plan` | Generate test strategy before implementation |
+| `mcp_milens_detect_changes` | Verify only expected files changed after implementation |
+| `mcp_milens_review_pr` | Risk assessment of the changes |
+| `mcp_milens_test_impact` | Identify and run affected test files |
+| `mcp_milens_test_coverage_gaps` | Verify no new untested critical symbols introduced |
+
+> **CRITICAL:** All milens MCP tool calls MUST include the `repo` parameter set to the **absolute path of the workspace root**.
+
+## Workflow
+
+### Step 1: Plan Tests (Before Implementation)
+
+Before writing code, plan the test strategy.
+
+```
+mcp_milens_test_plan({name: "<symbolName>", repo: "<workspaceRoot>"})
+```
+
+This provides:
+- **Mock strategy** — what dependencies to isolate
+- **Test scenarios** — happy path, error states, edge cases
+- **Affected test files** — existing test files that may need updates
+
+Review the plan to ensure it covers:
+- Core functionality (happy path)
+- Input validation (error handling)
+- Integration points (service boundaries)
+- Edge cases (null, empty, boundary values)
+
+### Step 2: Implement Changes
+
+Write the actual code changes. This is the implementation phase:
+1. Write or update the target symbol(s)
+2. Implement tests following the plan from Step 1
+3. Follow existing code conventions and patterns
+4. No `console.log`, no commented-out code, no debug artifacts
+
+### Step 3: Detect Changes
+
+Verify the scope of changes matches expectations.
+
+```
+mcp_milens_detect_changes({repo: "<workspaceRoot>"})
+```
+
+Check:
+- **Expected files present** — all intended modifications are listed
+- **No unexpected files** — no config drift, accidental staging, or side effects
+- **No missing files** — if a file you expected to change is absent, investigate
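The three checks reduce to comparing two file lists. The sketch below is a hypothetical helper that works on plain arrays of paths and makes no claim about the actual output format of `mcp_milens_detect_changes`. The file names are taken from the example session, plus an unplanned lockfile.

```typescript
interface ScopeCheck {
  unexpected: string[]; // changed, but not planned
  missing: string[];    // planned, but not changed
}

// Compares the planned file list with the files that actually changed.
function checkScope(expectedFiles: string[], changedFiles: string[]): ScopeCheck {
  return {
    unexpected: changedFiles.filter((file) => expectedFiles.indexOf(file) === -1),
    missing: expectedFiles.filter((file) => changedFiles.indexOf(file) === -1),
  };
}

const scope = checkScope(
  [
    "src/middleware/rateLimiter.ts",
    "src/__tests__/middleware.test.ts",
    "src/middleware/index.ts",
  ],
  [
    "src/middleware/rateLimiter.ts",
    "src/__tests__/middleware.test.ts",
    "src/middleware/index.ts",
    "package-lock.json",
  ],
);
// scope.unexpected is ["package-lock.json"]; scope.missing is []
```

Both lists being empty corresponds to the "only expected files" outcome. Anything in `unexpected` or `missing` is a prompt to investigate rather than an automatic failure.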
+
+### Step 4: Review Risk
+
+Assess the quality and risk of the changes.
+
+```
+mcp_milens_review_pr({repo: "<workspaceRoot>"})
+```
+
+Focus on:
+- **CRITICAL symbols** — any symbol rated CRITICAL needs thorough review
+- **Risk distribution** — how many HIGH/MEDIUM/LOW symbols changed
+- **Review suggestions** — automated findings to address
+
+### Step 5: Run Tests
+
+Identify and run all affected tests.
+
+```
+mcp_milens_test_impact({repo: "<workspaceRoot>"})
+```
+
+This returns the list of test files impacted by the changes. Run them all:
+- If any test fails: fix before proceeding
+- If a test you didn't expect is affected: investigate why
+- If a test you expected is missing: your test files may not be indexed
+
+### Step 6: Check Coverage Gaps
+
+Verify no new untested critical symbols were introduced.
+
+```
+mcp_milens_test_coverage_gaps({repo: "<workspaceRoot>", limit: 20})
+```
+
+Review the output:
+- **Newly introduced symbols** — should have test coverage
+- **Pre-existing gaps** — note them but don't fail the gate (they're not new)
+- **HIGH risk + no tests** — if any HIGH-risk symbol in the change has no tests, this is a failure
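The distinction between new and pre-existing gaps can be written as a filter. This is a minimal sketch with an assumed record shape: `CoverageGap` and its fields are invented, since the real output format of `mcp_milens_test_coverage_gaps` is not shown. Treating CRITICAL the same as HIGH is also an assumption, chosen as the stricter reading.

```typescript
// Assumed record shape; not the tool's actual output format.
interface CoverageGap {
  symbol: string;
  risk: "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";
  testCount: number;
  introducedByChange: boolean;
}

// Keeps only the gaps that should fail the gate: new, untested, HIGH risk or above.
function gateFailingGaps(gaps: CoverageGap[]): CoverageGap[] {
  return gaps.filter(
    (gap) =>
      gap.introducedByChange &&
      gap.testCount === 0 &&
      (gap.risk === "HIGH" || gap.risk === "CRITICAL"),
  );
}

// Invented gap records for illustration.
const failingGaps = gateFailingGaps([
  { symbol: "refreshToken", risk: "HIGH", testCount: 0, introducedByChange: true },
  { symbol: "legacyParser", risk: "HIGH", testCount: 0, introducedByChange: false },
  { symbol: "formatLabel", risk: "LOW", testCount: 0, introducedByChange: true },
  { symbol: "RateLimiter", risk: "MEDIUM", testCount: 4, introducedByChange: true },
]);
// Only refreshToken remains: it is new, HIGH risk, and untested.
```

`legacyParser` is left out even though it is HIGH risk and untested, because it predates the change and is only noted.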
+
+### Step 7: Quality Gate Verdict
+
+Apply the pass/fail criteria (see Quality Gate section below) and produce a verdict:
+
+**PASSED:**
+- All test files pass
+- No CRITICAL review_pr findings
+- No unexpected files in detect_changes
+- No new HIGH-risk uncovered symbols
+
+**CONDITIONAL PASS:**
+- All tests pass
+- CRITICAL review_pr findings are acknowledged and documented with remediation plan
+- Minor unexpected files that are harmless (lockfile updates)
+
+**FAILED:**
+- Any test fails
+- Hardcoded secrets found
+- Blast radius exceeds safe threshold without justification
+- New HIGH-risk symbols with zero test coverage
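The three verdicts can be read as a decision function, sketched below. The input field names are invented. Two readings are assumptions: undocumented CRITICAL findings and unexpected files that are not harmless both lead to FAILED, which the Step 7 lists do not state explicitly.

```typescript
type GateVerdict = "PASSED" | "CONDITIONAL PASS" | "FAILED";

// Field names are illustrative; they condense the Step 7 criteria.
interface GateInputs {
  allTestsPass: boolean;
  hardcodedSecretsFound: boolean;
  blastRadiusUnjustified: boolean;
  newHighRiskUncovered: boolean;
  criticalFindings: number;
  criticalFindingsDocumented: boolean; // acknowledged, with a remediation plan
  unexpectedFiles: number;
  unexpectedFilesHarmless: boolean; // for example, lockfile updates only
}

function gateVerdict(inputs: GateInputs): GateVerdict {
  // Hard failures from the FAILED list.
  if (
    !inputs.allTestsPass ||
    inputs.hardcodedSecretsFound ||
    inputs.blastRadiusUnjustified ||
    inputs.newHighRiskUncovered
  ) {
    return "FAILED";
  }
  // Assumption: open CRITICAL findings or non-harmless unexpected files also fail.
  const criticalOpen = inputs.criticalFindings > 0 && !inputs.criticalFindingsDocumented;
  const filesOpen = inputs.unexpectedFiles > 0 && !inputs.unexpectedFilesHarmless;
  if (criticalOpen || filesOpen) {
    return "FAILED";
  }
  // Documented findings or harmless extra files downgrade a pass to conditional.
  if (inputs.criticalFindings > 0 || inputs.unexpectedFiles > 0) {
    return "CONDITIONAL PASS";
  }
  return "PASSED";
}

const cleanRun: GateInputs = {
  allTestsPass: true,
  hardcodedSecretsFound: false,
  blastRadiusUnjustified: false,
  newHighRiskUncovered: false,
  criticalFindings: 0,
  criticalFindingsDocumented: false,
  unexpectedFiles: 0,
  unexpectedFilesHarmless: false,
};

const cleanVerdict = gateVerdict(cleanRun); // "PASSED"
```

With one harmless lockfile change added to an otherwise clean run, the same function returns "CONDITIONAL PASS".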
+
+## Example Session
+
+### Input
+
+```
+"evaluate the feature branch for the new rate limiter"
+```
+
+### Tool Calls
+
+**Step 1 — Test plan:**
+```
+mcp_milens_test_plan({name: "RateLimiter", repo: "/home/user/project"})
+```
+
+**Output:**
+```
+Test Plan for RateLimiter:
+  Mock strategy: stub Redis client, stub clock
+  Tests:
+    1. allows requests within limit
+    2. blocks requests exceeding limit
+    3. resets counter after window expires
+    4. handles Redis connection failure gracefully
+  Affected: src/__tests__/middleware.test.ts (new)
+```
+
+**Step 2 — Implement** rate limiter class and tests.
+
+**Step 3 — Detect changes:**
+```
+mcp_milens_detect_changes({repo: "/home/user/project"})
+```
+
+**Output:**
+```
+Changed files:
+  src/middleware/rateLimiter.ts [new]
+  src/__tests__/middleware.test.ts [new]
+  src/middleware/index.ts [modified] — barrel export
+```
+
+3 files — all expected.
+
+**Step 4 — Review risk:**
+```
+mcp_milens_review_pr({repo: "/home/user/project"})
+```
+
+**Output:**
+```
+Affected symbols:
+  RateLimiter [class] MEDIUM — 2 callers, 4 tests
+  applyRateLimit [function] LOW — 0 callers (internal helper)
+
+No CRITICAL findings. 1 LOW suggestion: add JSDoc to RateLimiter class.
+```
+
+**Step 5 — Test impact:**
+```
+mcp_milens_test_impact({repo: "/home/user/project"})
+```
+
+**Output:**
+```
+Affected test files:
+  src/__tests__/middleware.test.ts (new)
+```
+
+Run tests: `npm test -- middleware.test.ts` — all 4 pass.
+
+**Step 6 — Coverage gaps:**
+```
+mcp_milens_test_coverage_gaps({repo: "/home/user/project", limit: 20})
+```
+
+**Output:** No new HIGH-risk symbols introduced. 3 pre-existing LOW/medium gaps unrelated to this change.
+
+**Step 7 — Verdict: PASSED.**
+
+## Best Practices
+
+1. **Test plan before code, not after.** Step 1 exists to prevent "I'll test it later." Write the plan, review it, then implement.
+2. **detect_changes is your safety net.** If it shows a `package-lock.json` change you didn't intend, something went wrong. Investigate every unexpected file.
+3. **Treat CRITICAL review findings as blockers.** Don't rationalize them away. A CRITICAL finding means the change introduces significant risk.
+4. **Coverage gaps are relative to the change.** Don't fail the gate because of pre-existing gaps. Fail only if the change itself introduces new uncovered critical symbols.
+5. **Conditional passes need a deadline.** If you issue a "Conditional Pass," document exactly what must be fixed and by when. An open-ended conditional is a failed gate in disguise.
+
+## Quality Gate
+
+| Criteria | Pass | Fail |
+|---|---|---|
+| Test plan exists | Test plan generated with ≥3 scenarios | No test plan or < 3 scenarios |
+| Change scope | `detect_changes` shows only expected files | Unexpected files in diff (unless harmless) |
+| Review risk | No CRITICAL findings | Unresolved CRITICAL findings |
+| Test execution | All `test_impact` files pass | Any test file fails |
+| Coverage gaps | No new HIGH-risk symbols without tests | New HIGH-risk symbol has 0 test coverage |
+| Gate verdict | PASSED or CONDITIONAL PASS (with documented remediation) | FAILED — must fix and re-run evaluation |