maxsimcli 4.2.3 → 4.3.0
- package/dist/.tsbuildinfo +1 -1
- package/dist/assets/CHANGELOG.md +7 -0
- package/dist/assets/templates/agents/AGENTS.md +8 -6
- package/dist/assets/templates/agents/maxsim-code-reviewer.md +11 -0
- package/dist/assets/templates/agents/maxsim-executor.md +1 -0
- package/dist/assets/templates/agents/maxsim-planner.md +1 -0
- package/dist/assets/templates/agents/maxsim-project-researcher.md +96 -0
- package/dist/assets/templates/agents/maxsim-research-synthesizer.md +55 -3
- package/dist/assets/templates/agents/maxsim-roadmapper.md +11 -0
- package/dist/assets/templates/references/questioning.md +184 -33
- package/dist/assets/templates/skills/code-review/SKILL.md +5 -3
- package/dist/assets/templates/skills/{batch-worktree → maxsim-batch}/SKILL.md +3 -3
- package/dist/assets/templates/skills/{simplify → maxsim-simplify}/SKILL.md +7 -6
- package/dist/assets/templates/skills/using-maxsim/SKILL.md +4 -2
- package/dist/assets/templates/templates/conventions.md +138 -0
- package/dist/assets/templates/templates/no-gos.md +45 -4
- package/dist/assets/templates/templates/project.md +23 -0
- package/dist/assets/templates/workflows/batch.md +1 -1
- package/dist/assets/templates/workflows/init-existing.md +187 -0
- package/dist/assets/templates/workflows/new-project.md +195 -7
- package/dist/backend-server.cjs +13 -0
- package/dist/backend-server.cjs.map +1 -1
- package/dist/cli.cjs +350 -40
- package/dist/cli.cjs.map +1 -1
- package/dist/cli.js +7 -3
- package/dist/cli.js.map +1 -1
- package/dist/core/core.d.ts +2 -0
- package/dist/core/core.d.ts.map +1 -1
- package/dist/core/core.js +67 -6
- package/dist/core/core.js.map +1 -1
- package/dist/core/index.d.ts +4 -4
- package/dist/core/index.d.ts.map +1 -1
- package/dist/core/index.js +9 -3
- package/dist/core/index.js.map +1 -1
- package/dist/core/init.d.ts +2 -0
- package/dist/core/init.d.ts.map +1 -1
- package/dist/core/init.js +6 -0
- package/dist/core/init.js.map +1 -1
- package/dist/core/milestone.d.ts +1 -1
- package/dist/core/milestone.d.ts.map +1 -1
- package/dist/core/milestone.js +81 -37
- package/dist/core/milestone.js.map +1 -1
- package/dist/core/phase.d.ts +3 -0
- package/dist/core/phase.d.ts.map +1 -1
- package/dist/core/phase.js +213 -0
- package/dist/core/phase.js.map +1 -1
- package/dist/core/state.d.ts +6 -0
- package/dist/core/state.d.ts.map +1 -1
- package/dist/core/state.js +69 -0
- package/dist/core/state.js.map +1 -1
- package/dist/core/types.d.ts +11 -0
- package/dist/core/types.d.ts.map +1 -1
- package/dist/core-RRjCSt0G.cjs.map +1 -1
- package/dist/install/shared.d.ts +1 -1
- package/dist/install/shared.d.ts.map +1 -1
- package/dist/install/shared.js +1 -1
- package/dist/install/shared.js.map +1 -1
- package/dist/install.cjs +4 -2
- package/dist/install.cjs.map +1 -1
- package/dist/mcp-server.cjs +13 -0
- package/dist/mcp-server.cjs.map +1 -1
- package/dist/skills-MYlMkYNt.cjs.map +1 -1
- package/package.json +1 -1
package/dist/assets/CHANGELOG.md
CHANGED
@@ -1,3 +1,10 @@
+## [4.2.3](https://github.com/maystudios/maxsimcli/compare/v4.2.2...v4.2.3) (2026-03-03)
+
+
+### Bug Fixes
+
+* **test:** mock node:fs in adapters test to handle shared.ts side effects ([648176f](https://github.com/maystudios/maxsimcli/commit/648176f411b47d6fe83fada2fb240d83481dd171))
+
 ## [4.2.2](https://github.com/maystudios/maxsimcli/compare/v4.2.1...v4.2.2) (2026-03-03)
 
 
@@ -14,14 +14,14 @@ Skills with `alwaysApply: true` load automatically at conversation start:
 
 | Agent | Skills | Role |
 |-------|--------|------|
-| `maxsim-executor` | `tdd`, `verification-before-completion`, `using-maxsim` | Implements plan tasks with TDD
+| `maxsim-executor` | `tdd`, `verification-before-completion`, `using-maxsim`, `maxsim-simplify` | Implements plan tasks with TDD, verified completion, and simplification |
 | `maxsim-debugger` | `systematic-debugging`, `verification-before-completion` | Investigates bugs via reproduce-hypothesize-isolate-verify-fix cycle |
 | `maxsim-verifier` | `verification-before-completion` | Checks phase goal achievement with fresh evidence |
-| `maxsim-planner` | `using-maxsim` | Creates executable PLAN.md files for phases |
+| `maxsim-planner` | `using-maxsim`, `brainstorming` | Creates executable PLAN.md files for phases |
 | `maxsim-plan-checker` | `verification-before-completion` | Verifies plans achieve phase goal before execution |
-| `maxsim-code-reviewer` | `verification-before-completion` | Reviews implementation for code quality with evidence |
+| `maxsim-code-reviewer` | `verification-before-completion`, `code-review` | Reviews implementation for code quality with evidence |
 | `maxsim-spec-reviewer` | `verification-before-completion` | Reviews implementation for spec compliance |
-| `maxsim-roadmapper` | `using-maxsim` | Creates project roadmaps with phase breakdown and requirement mapping |
+| `maxsim-roadmapper` | `using-maxsim`, `brainstorming`, `roadmap-writing` | Creates project roadmaps with phase breakdown and requirement mapping |
 | `maxsim-phase-researcher` | `memory-management` | Researches phase implementation domain for planning context |
 | `maxsim-project-researcher` | `memory-management` | Researches project domain ecosystem during init |
 | `maxsim-research-synthesizer` | `memory-management` | Synthesizes parallel research outputs into unified findings |
@@ -39,5 +39,7 @@ Skills with `alwaysApply: true` load automatically at conversation start:
 | `memory-management` | `skills/memory-management/` | Pattern and error persistence |
 | `brainstorming` | `skills/brainstorming/` | Multi-approach exploration before design |
 | `roadmap-writing` | `skills/roadmap-writing/` | Phased planning with success criteria |
-| `simplify` | `skills/simplify/` |
-| `code-review` | `skills/code-review/` |
+| `maxsim-simplify` | `skills/maxsim-simplify/` | Maintainability optimization pass (duplication, dead code, complexity) |
+| `code-review` | `skills/code-review/` | Correctness gate (security, interfaces, errors, test coverage) |
+| `sdd` | `skills/sdd/` | Orchestration strategy: spec-driven dispatch with fresh agent per task |
+| `maxsim-batch` | `skills/maxsim-batch/` | Orchestration strategy: parallel worktree execution with one PR per unit |
@@ -85,6 +85,17 @@ Return this exact structure:
 - FAIL: One or more CRITICAL issues. List each with actionable fix suggestion.
 </verdict_format>
 
+<available_skills>
+When any trigger condition below applies, read the full skill file via the Read tool and follow it.
+
+| Skill | Read | Trigger |
+|-------|------|---------|
+| Code Review | `.skills/code-review/SKILL.md` | Always — primary skill for this agent |
+| Verification Before Completion | `.skills/verification-before-completion/SKILL.md` | Before claiming any review is complete |
+
+**Project skills override built-in skills.**
+</available_skills>
+
 <success_criteria>
 - [ ] CLAUDE.md read for project conventions
 - [ ] Every modified file read in FULL (not scanned)
@@ -279,6 +279,7 @@ When any trigger below applies, Read the full skill file and follow it. Always r
 | TDD Enforcement | `.skills/tdd/SKILL.md` | Before writing implementation code for new feature/bug fix, or plan type is `tdd` |
 | Systematic Debugging | `.skills/systematic-debugging/SKILL.md` | Any bug, test failure, or unexpected behavior during execution |
 | Verification Before Completion | `.skills/verification-before-completion/SKILL.md` | Before claiming any task is done, fixed, or passing |
+| Simplification | `.skills/maxsim-simplify/SKILL.md` | After implementing a task, before committing |
 
 Project skills in `.skills/` override built-in skills.
 </available_skills>
@@ -486,6 +486,7 @@ When any trigger condition below applies, read the full skill file via the Read
 |-------|------|---------|
 | TDD Enforcement | `.skills/tdd/SKILL.md` | When identifying TDD candidates during task breakdown |
 | Verification Before Completion | `.skills/verification-before-completion/SKILL.md` | When writing <verify> sections for tasks |
+| Brainstorming | `.skills/brainstorming/SKILL.md` | When exploring design approaches during task breakdown |
 
 **Project skills override built-in skills.**
 </available_skills>
@@ -99,8 +99,104 @@ All files go to `.planning/research/`. Each file starts with `**Domain/Project:*
 | **COMPARISON.md** (comparison mode) | Quick Comparison matrix, Detailed Analysis per option (strengths/weaknesses/best for), Recommendation with conditions |
 | **FEASIBILITY.md** (feasibility mode) | Verdict (YES/NO/MAYBE), Requirements table (status), Blockers table, Recommendation |
 
+### Mandatory Enhanced Sections
+
+Every research output file (STACK.md, FEATURES.md, ARCHITECTURE.md, PITFALLS.md) MUST include these 5 sections in addition to the format above:
+
+#### 1. Trade-Off Matrix
+
+For each major choice, compare top 2-3 options in a structured table:
+
+```markdown
+## Trade-Off Matrix
+
+| Option | Pros | Cons | Risk | Effort |
+|--------|------|------|------|--------|
+| [Option A] | Fast setup, large community | Vendor lock-in risk | LOW | S |
+| [Option B] | Full control, no lock-in | Steeper learning curve | MED | M |
+| [Option C] | Best performance | Small ecosystem | HIGH | L |
+```
+
+Risk levels: LOW / MED / HIGH. Effort sizes: S (hours) / M (days) / L (week) / XL (weeks).
+
+#### 2. Decision Rationale
+
+For your primary recommendation: explain WHY this over alternatives, and when to reconsider:
+
+```markdown
+## Decision Rationale
+
+**Recommendation:** [X] over [Y]
+**Why:** [Specific technical reasoning tied to this project's constraints]
+**When to reconsider:** [Conditions that would change this recommendation]
+```
+
+#### 3. Code Examples
+
+Concrete, copy-pasteable snippets for recommended technologies:
+
+```markdown
+## Code Examples
+
+### [Technology] Setup
+[Import statements, config snippets, middleware/pattern examples]
+```
+
+Include: import statements, basic configuration, one usage pattern per recommended technology. These help downstream agents write correct code from day one.
+
+#### 4. Integration Warnings
+
+Cross-cutting concerns between recommended technologies:
+
+```markdown
+## Integration Warnings
+
+- **[Tech A] + [Tech B]:** [What to watch out for, version compatibility, known conflicts]
+- **[Tech C] + [Tech D]:** [Configuration gotchas, ordering requirements]
+```
+
+Flag any combination that requires special attention. "If X + Y, watch out for Z" format.
+
+#### 5. Effort Estimates
+
+T-shirt size complexity estimate per recommendation:
+
+```markdown
+## Effort Estimates
+
+| Recommendation | Effort | Notes |
+|---------------|--------|-------|
+| [Tech/Pattern A] | S | Drop-in, well-documented |
+| [Tech/Pattern B] | M | Requires config + testing |
+| [Tech/Pattern C] | L | Migration path needed |
+| [Tech/Pattern D] | XL | Significant architecture work |
+```
+
+Sizes: S = hours, M = days, L = week, XL = weeks. Be honest about complexity.
+
 </output_formats>
 
+<web_verification>
+
+**MUST verify version numbers and library status via web search.** Use Context7, WebSearch, or WebFetch to confirm:
+
+- Current stable versions of recommended libraries
+- Whether recommended libraries are actively maintained (last release date, open issues)
+- Known breaking changes in recent versions
+- Deprecation notices or migration paths
+
+Assign confidence levels to every factual claim:
+
+| Level | Meaning | When to Use |
+|-------|---------|-------------|
+| **HIGH** | Verified via web (Context7/official docs) | Version numbers, API signatures, feature availability confirmed online |
+| **MEDIUM** | Known from training data, unverified via web | Well-known facts that could not be live-verified (e.g., tool unavailable) |
+| **LOW** | Uncertain or single unverified source | Anything you are not confident about -- flag prominently |
+
+Mark each technology recommendation with its confidence level. Flag anything unverifiable with `[CONFIDENCE: LOW - unverified]`.
+
+</web_verification>
+
 <execution_flow>
 
 1. **Receive scope** — Parse project name/description, research mode, specific questions from orchestrator.
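The confidence levels introduced in the `<web_verification>` block above can be pictured as a small tagged type. The following is a minimal sketch, not code from maxsimcli: the `Claim` shape and the `classify` and `formatClaim` helpers are hypothetical names chosen for illustration, and the `[CONFIDENCE: HIGH]` / `[CONFIDENCE: MEDIUM]` tag spelling is an assumption, since the template only spells out the LOW marker.

```typescript
// The three levels named in the template's confidence table.
type Confidence = "HIGH" | "MEDIUM" | "LOW";

// Hypothetical shape for one factual claim in a research file.
interface Claim {
  text: string;
  // True when the claim was confirmed against a live source (Context7, official docs).
  verifiedOnline: boolean;
  // True when the claim is well known but could not be live-verified.
  wellKnown: boolean;
}

// Map a claim to a level following the "When to Use" column.
function classify(claim: Claim): Confidence {
  if (claim.verifiedOnline) return "HIGH";
  return claim.wellKnown ? "MEDIUM" : "LOW";
}

// Append a confidence tag. Only the LOW marker text comes from the template;
// the HIGH and MEDIUM tag format is an assumption made for this sketch.
function formatClaim(claim: Claim): string {
  const level = classify(claim);
  const tag =
    level === "LOW" ? "[CONFIDENCE: LOW - unverified]" : `[CONFIDENCE: ${level}]`;
  return `${claim.text} ${tag}`;
}
```

Keeping the level a closed union means a research file cannot carry a level outside the three the template defines.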
@@ -58,18 +58,70 @@ Read all 4 files from `.planning/research/` and extract:
 
 Per area (Stack/Features/Architecture/Pitfalls): assign confidence level based on source quality. Identify gaps needing attention during planning.
 
-## Step 5:
+## Step 5: Produce Locked Decisions
+
+Extract the most impactful decisions from research into a **Locked Decisions** table. These flow to the planner as hard constraints.
+
+### Locked Decisions Format
+
+```markdown
+## Locked Decisions
+
+These decisions have been validated by research and approved by the user. They flow to the planner as constraints.
+
+| # | Decision | Rationale | Alternatives Rejected | Effort |
+|---|----------|-----------|----------------------|--------|
+| 1 | [e.g., Use PostgreSQL] | [Why this over alternatives] | [e.g., MongoDB (schema flexibility not needed)] | M |
+| 2 | [e.g., Use NextAuth] | [Why this over alternatives] | [e.g., Clerk (SaaS dependency)] | S |
+```
+
+### Rules for Locked Decisions
+
+- **Cross-reference PROJECT.md:** Read `.planning/PROJECT.md` Key Decisions section. Do NOT lock decisions that contradict what the user already decided during questioning. User decisions from questioning take precedence over research recommendations.
+- **Limit scope:** Only lock decisions that are architecturally significant (framework, database, auth, hosting, major patterns). Do NOT lock utility library choices.
+- **Include effort:** Every locked decision must have a T-shirt size effort estimate (S/M/L/XL).
+- **Rationale required:** Every locked decision must explain WHY, not just WHAT.
+
+## Step 5b: Approval Gate
+
+After producing locked decisions, the workflow MUST present them to the user before proceeding:
+
+**Approval gate instruction:** Present the Locked Decisions table to the user with the message: "These are the technology decisions from research. You can approve all, override specific decisions, or request changes." The user must explicitly approve before locked decisions flow to the planner. User can override any decision.
+
+Do NOT proceed to roadmap creation until the user has approved locked decisions.
+
+## Step 5c: Enrich PROJECT.md with Tech Stack Decisions
+
+After user approval, add a **Tech Stack Decisions** section to `.planning/PROJECT.md` containing the approved locked decisions. This makes PROJECT.md self-contained for downstream agents.
+
+```markdown
+## Tech Stack Decisions
+
+> Locked during research phase. Approved by user on {{date}}.
+
+| Category | Decision | Rationale |
+|----------|----------|-----------|
+| Database | PostgreSQL | Relational queries dominate; strong ecosystem |
+| Auth | NextAuth | Self-hosted, no vendor lock-in |
+| ... | ... | ... |
+```
+
+Cross-reference: These decisions appear in both PROJECT.md (for quick agent reference) and `.planning/research/SUMMARY.md` (with full rationale and alternatives).
+
+## Step 6: Write SUMMARY.md
 
 Use template: `~/.claude/maxsim/templates/research-project/SUMMARY.md`
 Write to `.planning/research/SUMMARY.md`
 
-
+Include the Locked Decisions table in SUMMARY.md as a dedicated section.
+
+## Step 7: Commit All Research
 
 ```bash
 node ~/.claude/maxsim/bin/maxsim-tools.cjs commit "docs: complete project research" --files .planning/research/
 ```
 
-## Step
+## Step 8: Return Summary
 
 </execution_flow>
 
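The Locked Decisions rules in Step 5 above (rationale required, effort required) are mechanical enough to check in code. The sketch below is one way to picture that, under stated assumptions: `LockedDecision`, `validateLockedDecision`, and `toTableRow` are hypothetical names, not part of maxsimcli, and the scope and PROJECT.md cross-reference rules are left out because they need human or agent judgment.

```typescript
// T-shirt sizes as defined in the template: S = hours, M = days, L = week, XL = weeks.
type Effort = "S" | "M" | "L" | "XL";

// Hypothetical in-memory form of one row of the Locked Decisions table.
interface LockedDecision {
  decision: string;
  rationale: string;
  alternativesRejected: string[];
  effort?: Effort;
}

// Return the rule violations for one decision; an empty array means
// the two mechanically checkable rules are satisfied.
function validateLockedDecision(d: LockedDecision): string[] {
  const problems: string[] = [];
  if (d.rationale.trim().length === 0) problems.push("rationale required");
  if (d.effort === undefined) problems.push("effort estimate required");
  return problems;
}

// Render one decision as a Markdown table row in the template's column order.
function toTableRow(index: number, d: LockedDecision): string {
  const alternatives = d.alternativesRejected.join(", ");
  return `| ${index} | ${d.decision} | ${d.rationale} | ${alternatives} | ${d.effort ?? ""} |`;
}
```

Running the validator before the Step 5b approval gate would keep incomplete rows from reaching the user.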
@@ -215,6 +215,17 @@ Use template from `~/.claude/maxsim/templates/state.md`. Key sections: Project R
 ```
 </structured_returns>
 
+<available_skills>
+When any trigger condition below applies, read the full skill file via the Read tool and follow it.
+
+| Skill | Read | Trigger |
+|-------|------|---------|
+| Brainstorming | `.skills/brainstorming/SKILL.md` | When exploring design approaches during phase identification |
+| Roadmap Writing | `.skills/roadmap-writing/SKILL.md` | When structuring phases, success criteria, and coverage validation |
+
+**Project skills override built-in skills.**
+</available_skills>
+
 <success_criteria>
 Roadmap is complete when:
 
@@ -1,6 +1,6 @@
 <questioning_guide>
 
-Project initialization is dream extraction, not requirements gathering. You're helping the user discover and articulate what they want to build. This isn't a contract negotiation
+Project initialization is dream extraction, not requirements gathering. You're helping the user discover and articulate what they want to build. This isn't a contract negotiation -- it's collaborative thinking.
 
 <philosophy>
 
@@ -36,9 +36,9 @@ A vague PROJECT.md forces every downstream phase to guess. The cost compounds.
 
 **Make the abstract concrete.** "Walk me through using this." "What does that actually look like?"
 
-**Clarify ambiguity.** "When you say Z, do you mean A or B?" "You mentioned X
+**Clarify ambiguity.** "When you say Z, do you mean A or B?" "You mentioned X -- tell me more."
 
-**Know when to stop.** When you understand what they want, why they want it, who it's for, and what done looks like
+**Know when to stop.** When you understand what they want, why they want it, who it's for, and what done looks like -- offer to proceed.
 
 </how_to_question>
 
@@ -46,21 +46,21 @@ A vague PROJECT.md forces every downstream phase to guess. The cost compounds.
 
 Use these as inspiration, not a checklist. Pick what's relevant to the thread.
 
-**Motivation
+**Motivation -- why this exists:**
 - "What prompted this?"
 - "What are you doing today that this replaces?"
 - "What would you do if this existed?"
 
-**Concreteness
+**Concreteness -- what it actually is:**
 - "Walk me through using this"
-- "You said X
+- "You said X -- what does that actually look like?"
 - "Give me an example"
 
-**Clarification
+**Clarification -- what they mean:**
 - "When you say Z, do you mean A or B?"
-- "You mentioned X
+- "You mentioned X -- tell me more about that"
 
-**Success
+**Success -- how you'll know it's working:**
 - "How will you know this is working?"
 - "What does done look like?"
 
@@ -79,51 +79,199 @@ Use AskUserQuestion to help users think by presenting concrete options to react
 - Generic categories ("Technical", "Business", "Other")
 - Leading options that presume an answer
 - Too many options (2-4 is ideal)
-- Headers longer than 12 characters (hard limit
+- Headers longer than 12 characters (hard limit -- validation will reject them)
 
-**Example
+**Example -- vague answer:**
 User says "it should be fast"
 
 - header: "Fast"
 - question: "Fast how?"
 - options: ["Sub-second response", "Handles large datasets", "Quick to build", "Let me explain"]
 
-**Example
+**Example -- following a thread:**
 User mentions "frustrated with current tools"
 
 - header: "Frustration"
 - question: "What specifically frustrates you?"
 - options: ["Too many clicks", "Missing features", "Unreliable", "Let me explain"]
 
-**Tip for users
+**Tip for users -- modifying an option:**
 Users who want a slightly modified version of an option can select "Other" and reference the option by number: `#1 but for finger joints only` or `#2 with pagination disabled`. This avoids retyping the full option text.
 
 </using_askuserquestion>
 
-<
+<domain_checklist>
 
-
+Track these domains silently as a background checklist. Mark each as COVERED, N/A, or UNCOVERED as conversation progresses. Do NOT show this checklist to the user. Do NOT switch to checklist mode. Do NOT fire rapid questions to cover domains.
 
-
-- [ ] Why it needs to exist (the problem or desire driving it)
-- [ ] Who it's for (even if just themselves)
-- [ ] What "done" looks like (observable outcomes)
+**Follow the user's thread first. Only weave checklist domains when natural pauses occur or when the user's response opens a related domain.**
 
-
+### Core
+- [ ] Auth approach (SSO, email/pass, OAuth, magic links, API keys, none)
+- [ ] Data model (relational, document, graph, key-value, file-based)
+- [ ] API style (REST, GraphQL, tRPC, gRPC, WebSocket, none/internal)
+- [ ] Deployment target (serverless, containers, VPS, edge, desktop, mobile)
+- [ ] Error handling strategy (exceptions, Result types, error boundaries, status codes)
+- [ ] Testing strategy (unit, integration, e2e, coverage targets, TDD)
 
-
+### Infrastructure
+- [ ] Caching strategy (Redis, in-memory, CDN, service worker, none)
+- [ ] Search (full-text, Elasticsearch, Algolia, vector search, none)
+- [ ] Monitoring/logging (structured logs, APM, error tracking, analytics)
+- [ ] CI/CD (GitHub Actions, GitLab CI, CircleCI, none yet)
+- [ ] Environments (dev/staging/prod, single env, preview deploys)
+
+### UX/Product
+- [ ] User roles/permissions (RBAC, ABAC, simple admin/user, single-user)
+- [ ] Notifications (email, push, in-app, WebSocket, none)
+- [ ] File uploads (images, documents, media, user-generated content, none)
+- [ ] Internationalization (i18n needed, English only, later)
+- [ ] Accessibility (WCAG compliance level, not applicable)
+
+### Scale/Ops
+- [ ] Performance targets (response time, throughput, concurrent users)
+- [ ] Concurrency model (single-threaded, worker pools, event-driven, actors)
+- [ ] Data migration (from existing system, fresh start, import needed)
+- [ ] Backup/recovery (RTO/RPO targets, disaster recovery plan, not applicable)
+- [ ] Rate limiting (API limits, abuse prevention, quotas, not applicable)
+
+### Tracking Rules
+
+1. **One round = one AskUserQuestion call.** Count these, not messages.
+2. **Mark domains generously as N/A.** Not all 21 domains apply to every project. A CLI tool does not need file uploads, internationalization, backup/recovery, notifications, etc.
+3. **N/A counts as covered** for the coverage calculation.
+4. **Never mention the checklist or domains to the user.** This is your internal tracking only.
+5. **Weave naturally, never interrogate.** If a domain is uncovered after several rounds, find a natural segue from what the user already said.
+
+### N/A Decision Tree (examples by project type)
+
+**CLI tool:** Mark as N/A: file uploads, internationalization, accessibility, notifications, user roles, caching, search, backup/recovery, rate limiting, environments (unless multi-env deployment). Focus on: deployment target, testing strategy, error handling, CI/CD.
+
+**SaaS web app:** Most domains apply. Mark as N/A only if explicitly irrelevant (e.g., data migration if greenfield).
+
+**API/backend service:** Mark as N/A: accessibility, internationalization (usually), file uploads (if not applicable). Focus on: API style, auth, rate limiting, caching, monitoring.
+
+**Mobile app:** Mark as N/A: search (usually), environments (different model), rate limiting (server-side). Focus on: deployment target, offline strategy, push notifications, auth.
+
+**Static site / marketing:** Mark as N/A: auth, data model, caching, search, monitoring, most Scale/Ops. Focus on: deployment target, CI/CD, accessibility, internationalization.
+
+</domain_checklist>
+
+<gate_logic>
+
+The "Ready?" decision gate has strict conditions. Do NOT show it prematurely.
+
+### Conditions (ALL must be true)
+
+1. **Minimum rounds:** round_count >= 10 (one round = one AskUserQuestion call)
+2. **Coverage threshold:** 80% of relevant domains must be covered. Formula: covered_count / (total_domains - na_count) >= 0.80
+- covered_count = domains with COVERED status
+- na_count = domains with N/A status
+- total_domains = 21
+3. **Core understanding:** You could write a clear PROJECT.md that a stranger would understand
+
+### Before showing "Ready?"
+
+Display a domain coverage summary by category (this is the ONE time coverage becomes visible):
+
+```
+I think I have a solid picture. Here's what we've covered:
+
+**Core:** Auth, data model, API style, deployment, error handling, testing
+**Infrastructure:** CI/CD, environments (caching: N/A, search: N/A, monitoring: N/A)
+**UX/Product:** User roles (notifications: N/A, uploads: N/A, i18n: N/A, accessibility: N/A)
+**Scale/Ops:** Performance targets (concurrency: N/A, migration: N/A, backup: N/A, rate limiting: N/A)
+
+Coverage: 10/12 relevant domains (83%)
+```
+
+Then present the decision gate:
+
+- header: "Ready?"
+- question: "I think I understand what you're after. Ready to create PROJECT.md?"
+- options:
+- "Create PROJECT.md" -- Let's move forward
+- "Keep exploring" -- I want to share more / ask me more
+
+If "Keep exploring" -- identify which uncovered domains might be relevant and weave them into conversation naturally. Loop until "Create PROJECT.md" selected.
+
+### Anti-Pattern Examples (interrogation prevention)
+
+**BAD:** "What about caching?" (out of nowhere, no connection to conversation)
+**GOOD:** "You mentioned handling 10K concurrent users -- have you thought about caching strategy?"
+
+**BAD:** "Let's talk about internationalization." (topic shift with no context)
+**GOOD:** "You said your users are spread across Europe -- will the interface need to support multiple languages?"
+
+**BAD:** "What's your monitoring approach?" (checklist-walking)
+**GOOD:** "With a distributed system like this, how will you know when something goes wrong in production?"
+
+**Explicit instruction:** Follow the user's thread first. Only weave checklist domains when natural pauses occur or when the user's response opens a related domain. If a domain is hard to weave naturally, it is probably N/A for this project -- mark it and move on.
+
+</gate_logic>
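The three "Ready?" conditions in `<gate_logic>` reduce to a short predicate. The sketch below follows the formula as written in the conditions list, covered_count / (total_domains - na_count) >= 0.80. The names `DomainStatus`, `GateInput`, `coverage`, and `readyGateOpen` are hypothetical and not taken from maxsimcli. Treating an all-N/A checklist as full coverage is an assumption made here to avoid dividing by zero.

```typescript
// Status values the checklist assigns to each of the 21 domains.
type DomainStatus = "COVERED" | "N/A" | "UNCOVERED";

const MIN_ROUNDS = 10;          // round_count >= 10
const COVERAGE_THRESHOLD = 0.8; // 80% of relevant domains

// Hypothetical input shape for the gate check.
interface GateInput {
  roundCount: number;          // number of AskUserQuestion calls so far
  statuses: DomainStatus[];    // one entry per domain (21 in the template)
  coreUnderstanding: boolean;  // "could write a clear PROJECT.md"
}

// covered_count / (total_domains - na_count)
function coverage(statuses: DomainStatus[]): number {
  const covered = statuses.filter((s) => s === "COVERED").length;
  const na = statuses.filter((s) => s === "N/A").length;
  const relevant = statuses.length - na;
  // Assumption: if every domain is N/A, nothing is left to cover.
  return relevant === 0 ? 1 : covered / relevant;
}

// All three conditions must hold before the gate is shown.
function readyGateOpen(input: GateInput): boolean {
  return (
    input.roundCount >= MIN_ROUNDS &&
    coverage(input.statuses) >= COVERAGE_THRESHOLD &&
    input.coreUnderstanding
  );
}
```

Applied to the sample summary in the template (10 domains covered, 11 marked N/A), this formula gives 10 of 10 relevant domains, not the 10/12 shown there, so the sample figure appears to be illustrative only.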
|
|
212
|
+
|
|
213
|
+
<nogos_tracking>
|
|
214
|
+
|
|
215
|
+
No-gos are captured as a side-channel during questioning. You accumulate them silently and present them for confirmation before writing NO-GOS.md.
|
|
216
|
+
|
|
217
|
+
### During Questioning: Watch for Rejection Signals
|
|
218
|
+
|
|
219
|
+
As the user talks, watch for these patterns and silently record them as candidate no-gos:
|
|
220
|
+
|
|
221
|
+
- **Explicit rejections:** "I don't want X", "Never use Y", "No way we're doing Z"
|
|
222
|
+
- **Past failures:** "Last time we tried X and it was terrible", "We burned on Y before"
|
|
223
|
+
- **Strong opinions:** "Absolutely not Z", "Over my dead body", "That's an anti-pattern"
|
|
224
|
+
- **Anti-patterns mentioned:** "The codebase already has too much of X", "I've seen Y fail everywhere"
|
|
225
|
+
|
|
226
|
+
**Do NOT confirm each no-go individually as it comes up.** Silently accumulate them. The confirmation step comes later.
|
|
227
|
+
|
|
228
|
+
### Challenge-Based Probing (after 5+ rounds, challenge-based elicitation)
|
|
229
|
+
|
|
230
|
+
Once rapport is established (5+ rounds), weave these probes naturally when the conversation allows:
|
|
231
|
+
|
|
232
|
+
- "What would make this project fail?"
|
|
233
|
+
- "What shortcuts are tempting but dangerous for a project like this?"
|
|
234
|
+
- "What did a previous version get wrong?" (if applicable)
|
|
235
|
+
- "What's the one decision you'd regret in 6 months?"
|
|
236
|
+
- "If a new developer joined, what mistakes would you warn them about?"
|
|
237
|
+
|
|
238
|
+
**Timing:** These are conversation contributions, not interrogation questions. Weave them when the user pauses, reflects, or mentions past experience. Never fire them in sequence.
|
|
239
|
+
|
|
240
|
+
### Domain-Aware Suggestions (anti-patterns to raise once the project type is understood)
|
|
241
|
+
|
|
242
|
+
Once you understand what they're building, suggest common anti-patterns for that domain. Present these as food for thought, not a checklist:
|
|
243
|
+
|
|
244
|
+
**SaaS:** shared-database multi-tenancy without isolation, storing secrets in code, vendor lock-in without abstraction layers, skipping audit logging, ignoring data residency requirements
|
|
245
|
+
|
|
246
|
+
**CLI tool:** global mutable state, implicit dependencies on environment, breaking flag/option changes between versions, silent failures with zero exit code, writing to stdout what should go to stderr
|
|
247
|
+
|
|
248
|
+
**API/backend:** N+1 queries, unbounded response sizes, missing rate limits, sync operations that should be async, tight coupling between services, missing idempotency keys
|
|
249
|
+
|
|
250
|
+
**Mobile app:** assuming always-online, blocking the main thread, ignoring battery/data impact, platform-specific assumptions, missing offline conflict resolution
|
|
251
|
+
|
|
252
|
+
**Real-time system:** assuming ordered delivery, ignoring backpressure, unbounded queues, missing heartbeat/reconnection, no graceful degradation
|
|
253
|
+
|
|
254
|
+
### Before Writing NO-GOS.md: Confirmation Step
|
|
255
|
+
|
|
256
|
+
After the user selects "Create PROJECT.md" but BEFORE writing documents, present ALL collected no-gos:
|
|
257
|
+
|
|
258
|
+
"Before I write the project documents, here are the no-gos I captured from our conversation -- things you want to explicitly avoid or forbid. Anything to add, remove, or adjust?"
|
|
259
|
+
|
|
260
|
+
Present them in a clear list with the source context (what the user said that triggered each one). User confirms or adjusts. Only confirmed no-gos go into NO-GOS.md.
|
|
261
|
+
|
|
262
|
+
</nogos_tracking>
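
The side-channel described in this block (accumulate silently, present everything once with source context, keep only what the user confirms) could be sketched roughly as follows. This is an illustration only: `NoGoCandidate`, `NoGoTracker`, and their methods are hypothetical names, not part of maxsimcli, and the template itself specifies no data structure.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class NoGoCandidate:
    rule: str           # what to avoid (hypothetical field name)
    source_quote: str   # what the user said that triggered it
    confirmed: bool = False


@dataclass
class NoGoTracker:
    candidates: List[NoGoCandidate] = field(default_factory=list)

    def record(self, rule: str, source_quote: str) -> None:
        """Silently accumulate a candidate; nothing is shown to the user yet."""
        self.candidates.append(NoGoCandidate(rule, source_quote))

    def confirmation_prompt(self) -> str:
        """Build the single list presented before any document is written."""
        lines = ["Here are the no-gos I captured from our conversation:"]
        for i, c in enumerate(self.candidates, start=1):
            lines.append(f'{i}. {c.rule} (you said: "{c.source_quote}")')
        lines.append("Anything to add, remove, or adjust?")
        return "\n".join(lines)

    def confirm(self, kept: Set[int]) -> List[NoGoCandidate]:
        """Mark the 1-based indices the user kept; only these would be written out."""
        for i, c in enumerate(self.candidates, start=1):
            c.confirmed = i in kept
        return [c for c in self.candidates if c.confirmed]
```

Keeping the triggering quote next to each rule is what makes the later confirmation step cheap: the user can see why each item was captured and drop the ones that were only offhand remarks.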
|
|
115
263
|
|
|
116
264
|
<decision_gate>
|
|
117
265
|
|
|
118
|
-
When you could write a clear PROJECT.md, offer to proceed:
|
|
266
|
+
When you could write a clear PROJECT.md AND the gate_logic conditions are met, offer to proceed:
|
|
119
267
|
|
|
120
268
|
- header: "Ready?"
|
|
121
269
|
- question: "I think I understand what you're after. Ready to create PROJECT.md?"
|
|
122
270
|
- options:
|
|
123
|
-
- "Create PROJECT.md"
|
|
124
|
-
- "Keep exploring"
|
|
271
|
+
- "Create PROJECT.md" -- Let's move forward
|
|
272
|
+
- "Keep exploring" -- I want to share more / ask me more
|
|
125
273
|
|
|
126
|
-
If "Keep exploring"
|
|
274
|
+
If "Keep exploring" -- ask what they want to add or identify gaps and probe naturally.
|
|
127
275
|
|
|
128
276
|
Loop until "Create PROJECT.md" selected.
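
As a rough sketch of the gate loop above, assuming the questioning flow is driven by three callbacks: `gate_conditions_met`, `ask`, and `explore` are hypothetical stand-ins for the gate_logic check, the option prompt, and a further round of questioning. The template describes behavior, not an implementation, so none of these names come from maxsimcli.

```python
from typing import Callable, List

CREATE = "Create PROJECT.md"
EXPLORE = "Keep exploring"


def run_decision_gate(
    gate_conditions_met: Callable[[], bool],
    ask: Callable[[str, List[str]], str],
    explore: Callable[[], None],
) -> int:
    """Loop until the user selects CREATE; return how often the gate was offered."""
    offers = 0
    while True:
        if not gate_conditions_met():
            explore()  # keep questioning; the gate is not offered yet
            continue
        offers += 1
        choice = ask(
            "I think I understand what you're after. Ready to create PROJECT.md?",
            [CREATE, EXPLORE],
        )
        if choice == CREATE:
            return offers
        explore()  # ask what they want to add, or probe gaps naturally
```

The gate is only offered once the conditions hold, and choosing "Keep exploring" always leads to another round of questioning before the offer is repeated.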
|
|
129
277
|
|
|
@@ -131,14 +279,17 @@ Loop until "Create PROJECT.md" selected.
|
|
|
131
279
|
|
|
132
280
|
<anti_patterns>
|
|
133
281
|
|
|
134
|
-
- **Checklist walking**
|
|
135
|
-
- **Canned questions**
|
|
136
|
-
- **Corporate speak**
|
|
137
|
-
- **Interrogation**
|
|
138
|
-
- **Rushing**
|
|
139
|
-
- **Shallow acceptance**
|
|
140
|
-
- **Premature constraints**
|
|
141
|
-
- **User skills**
|
|
282
|
+
- **Checklist walking** -- Going through domains regardless of what they said
|
|
283
|
+
- **Canned questions** -- "What's your core value?" "What's out of scope?" regardless of context
|
|
284
|
+
- **Corporate speak** -- "What are your success criteria?" "Who are your stakeholders?"
|
|
285
|
+
- **Interrogation** -- Firing questions without building on answers
|
|
286
|
+
- **Rushing** -- Minimizing questions to get to "the work"
|
|
287
|
+
- **Shallow acceptance** -- Taking vague answers without probing
|
|
288
|
+
- **Premature constraints** -- Asking about tech stack before understanding the idea
|
|
289
|
+
- **User skills** -- NEVER ask about user's technical experience. Claude builds.
|
|
290
|
+
- **Visible progress tracking** -- NEVER show domain coverage during questioning (only at the gate)
|
|
291
|
+
- **Rapid-fire domain questions** -- NEVER ask about multiple unrelated domains in one message
|
|
292
|
+
- **Forced relevance** -- Trying to make irrelevant domains seem relevant just to check them off
|
|
142
293
|
|
|
143
294
|
</anti_patterns>
|
|
144
295
|
|
|
@@ -1,9 +1,11 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: code-review
|
|
3
3
|
description: >-
|
|
4
|
-
|
|
5
|
-
error handling, test coverage, and quality
|
|
6
|
-
|
|
4
|
+
Correctness gate: reviews all changed code for security vulnerabilities,
|
|
5
|
+
interface correctness, error handling, test coverage, and quality. Answers
|
|
6
|
+
"Is this code correct and safe?" Use when completing a phase, reviewing
|
|
7
|
+
implementation, or before approving changes for merge. Not a maintainability
|
|
8
|
+
pass — that is maxsim-simplify's job.
|
|
7
9
|
---
|
|
8
10
|
|
|
9
11
|
# Code Review
|
|
@@ -1,9 +1,9 @@
|
|
|
1
1
|
---
|
|
2
|
-
name: batch
|
|
2
|
+
name: maxsim-batch
|
|
3
3
|
description: >-
|
|
4
4
|
Decomposes large tasks into independent units and executes each in an isolated
|
|
5
5
|
git worktree with its own branch and PR. Use when parallelizing work across
|
|
6
|
-
|
|
6
|
+
3-30 independent units or orchestrating worktree-based parallel execution. Not
|
|
7
7
|
for sequential dependencies or fewer than 3 units.
|
|
8
8
|
---
|
|
9
9
|
|
|
@@ -83,7 +83,7 @@ Before reporting completion, confirm:
|
|
|
83
83
|
|
|
84
84
|
## MAXSIM Integration
|
|
85
85
|
|
|
86
|
-
When a plan specifies `skill: "batch
|
|
86
|
+
When a plan specifies `skill: "maxsim-batch"`:
|
|
87
87
|
- The orchestrator decomposes the plan's tasks into independent units
|
|
88
88
|
- Each unit becomes a worktree agent with its own branch and PR
|
|
89
89
|
- The orchestrator tracks progress and reports the final PR list in SUMMARY.md
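
The applicability rule in this skill's description (3-30 independent units, not for sequential dependencies or fewer than 3 units) can be expressed as a small check. This is a minimal sketch: `should_use_batch` and its parameters are hypothetical, and treating 30 as a hard upper bound is an assumption, since the description only names the range.

```python
MIN_UNITS = 3    # "fewer than 3 units" is explicitly excluded
MAX_UNITS = 30   # assumed hard cap; the description says "3-30 independent units"


def should_use_batch(unit_count: int, has_sequential_dependencies: bool) -> bool:
    """Return True when a task fits the maxsim-batch description."""
    if has_sequential_dependencies:
        return False  # units must be independent to run in separate worktrees
    return MIN_UNITS <= unit_count <= MAX_UNITS
```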
|
|
@@ -1,10 +1,11 @@
|
|
|
1
1
|
---
|
|
2
|
-
name: simplify
|
|
2
|
+
name: maxsim-simplify
|
|
3
3
|
description: >-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
before committing, cleaning up implementations,
|
|
7
|
-
review.
|
|
4
|
+
Maintainability optimization pass: finds duplication, dead code, and
|
|
5
|
+
unnecessary complexity. Answers "Is this code as simple as it can be?"
|
|
6
|
+
Use when reviewing code before committing, cleaning up implementations,
|
|
7
|
+
or preparing changes for review. Not a correctness gate — that is
|
|
8
|
+
code-review's job.
|
|
8
9
|
---
|
|
9
10
|
|
|
10
11
|
# Simplify
|
|
@@ -134,7 +135,7 @@ Before reporting completion, confirm:
|
|
|
134
135
|
|
|
135
136
|
## MAXSIM Integration
|
|
136
137
|
|
|
137
|
-
When a plan specifies `skill: "simplify"`:
|
|
138
|
+
When a plan specifies `skill: "maxsim-simplify"`:
|
|
138
139
|
- The orchestrator collects changed files from the implementation step
|
|
139
140
|
- Three parallel reviewers (Reuse, Quality, Efficiency) are spawned
|
|
140
141
|
- Findings are consolidated and fixes applied
|
|
@@ -54,8 +54,10 @@ Skills are behavioral rules that activate automatically based on context:
|
|
|
54
54
|
| `memory-management` | Recurring patterns, errors, or decisions worth persisting |
|
|
55
55
|
| `brainstorming` | Before implementing any significant feature or design |
|
|
56
56
|
| `roadmap-writing` | When creating or restructuring a project roadmap |
|
|
57
|
-
| `simplify` |
|
|
58
|
-
| `code-review` |
|
|
57
|
+
| `maxsim-simplify` | Maintainability pass: reviewing code for duplication, dead code, and unnecessary complexity |
|
|
58
|
+
| `code-review` | Correctness gate: reviewing implementation for security, interfaces, errors, and test coverage |
|
|
59
|
+
| `sdd` | Executing sequential tasks where context rot is a concern (spec-driven dispatch) |
|
|
60
|
+
| `maxsim-batch` | Parallelizing work across 3-30 independent units in isolated worktrees |
|
|
59
61
|
|
|
60
62
|
### Available Agents
|
|
61
63
|
|