@devran-ai/kit 4.1.0
- package/.agent/CheatSheet.md +350 -0
- package/.agent/README.md +76 -0
- package/.agent/agents/README.md +155 -0
- package/.agent/agents/architect.md +185 -0
- package/.agent/agents/backend-specialist.md +276 -0
- package/.agent/agents/build-error-resolver.md +207 -0
- package/.agent/agents/code-reviewer.md +162 -0
- package/.agent/agents/database-architect.md +138 -0
- package/.agent/agents/devops-engineer.md +144 -0
- package/.agent/agents/doc-updater.md +229 -0
- package/.agent/agents/e2e-runner.md +145 -0
- package/.agent/agents/explorer-agent.md +143 -0
- package/.agent/agents/frontend-specialist.md +144 -0
- package/.agent/agents/go-reviewer.md +128 -0
- package/.agent/agents/knowledge-agent.md +197 -0
- package/.agent/agents/mobile-developer.md +150 -0
- package/.agent/agents/performance-optimizer.md +175 -0
- package/.agent/agents/planner.md +133 -0
- package/.agent/agents/pr-reviewer.md +148 -0
- package/.agent/agents/python-reviewer.md +123 -0
- package/.agent/agents/refactor-cleaner.md +201 -0
- package/.agent/agents/reliability-engineer.md +156 -0
- package/.agent/agents/security-reviewer.md +141 -0
- package/.agent/agents/sprint-orchestrator.md +124 -0
- package/.agent/agents/tdd-guide.md +179 -0
- package/.agent/agents/typescript-reviewer.md +110 -0
- package/.agent/checklists/README.md +102 -0
- package/.agent/checklists/pre-commit.md +93 -0
- package/.agent/checklists/session-end.md +99 -0
- package/.agent/checklists/session-start.md +102 -0
- package/.agent/checklists/task-complete.md +81 -0
- package/.agent/commands/README.md +130 -0
- package/.agent/commands/adr.md +29 -0
- package/.agent/commands/ask.md +28 -0
- package/.agent/commands/build.md +30 -0
- package/.agent/commands/changelog.md +40 -0
- package/.agent/commands/checkpoint.md +28 -0
- package/.agent/commands/code-review.md +65 -0
- package/.agent/commands/compact.md +28 -0
- package/.agent/commands/cook.md +30 -0
- package/.agent/commands/db.md +30 -0
- package/.agent/commands/debug.md +31 -0
- package/.agent/commands/deploy.md +37 -0
- package/.agent/commands/design.md +29 -0
- package/.agent/commands/doc.md +30 -0
- package/.agent/commands/eval.md +30 -0
- package/.agent/commands/fix.md +32 -0
- package/.agent/commands/git.md +32 -0
- package/.agent/commands/help.md +273 -0
- package/.agent/commands/implement.md +30 -0
- package/.agent/commands/integrate.md +32 -0
- package/.agent/commands/learn.md +29 -0
- package/.agent/commands/perf.md +31 -0
- package/.agent/commands/plan.md +56 -0
- package/.agent/commands/pr-describe.md +65 -0
- package/.agent/commands/pr-fix.md +45 -0
- package/.agent/commands/pr-merge.md +45 -0
- package/.agent/commands/pr-review.md +50 -0
- package/.agent/commands/pr-split.md +54 -0
- package/.agent/commands/pr-status.md +56 -0
- package/.agent/commands/pr.md +58 -0
- package/.agent/commands/refactor.md +32 -0
- package/.agent/commands/research.md +28 -0
- package/.agent/commands/scout.md +30 -0
- package/.agent/commands/security-scan.md +33 -0
- package/.agent/commands/setup.md +31 -0
- package/.agent/commands/status.md +59 -0
- package/.agent/commands/tdd.md +73 -0
- package/.agent/commands/verify.md +58 -0
- package/.agent/contexts/brainstorm.md +26 -0
- package/.agent/contexts/debug.md +28 -0
- package/.agent/contexts/implement.md +29 -0
- package/.agent/contexts/plan-quality-log.md +30 -0
- package/.agent/contexts/review.md +27 -0
- package/.agent/contexts/ship.md +28 -0
- package/.agent/decisions/001-trust-grade-governance.md +46 -0
- package/.agent/decisions/002-cross-ide-generation.md +15 -0
- package/.agent/engine/identity.json +4 -0
- package/.agent/engine/loading-rules.json +193 -0
- package/.agent/engine/marketplace-index.json +29 -0
- package/.agent/engine/mcp-servers/filesystem.json +9 -0
- package/.agent/engine/mcp-servers/github.json +11 -0
- package/.agent/engine/mcp-servers/postgres.json +11 -0
- package/.agent/engine/mcp-servers/supabase.json +11 -0
- package/.agent/engine/mcp-servers/vercel.json +11 -0
- package/.agent/engine/reliability-config.json +14 -0
- package/.agent/engine/sdlc-map.json +50 -0
- package/.agent/engine/workflow-state.json +167 -0
- package/.agent/hooks/README.md +101 -0
- package/.agent/hooks/hooks.json +104 -0
- package/.agent/hooks/templates/session-end.md +110 -0
- package/.agent/hooks/templates/session-start.md +95 -0
- package/.agent/manifest.json +466 -0
- package/.agent/rules/agent-upgrade-policy.md +56 -0
- package/.agent/rules/architecture.md +111 -0
- package/.agent/rules/coding-style.md +75 -0
- package/.agent/rules/documentation.md +74 -0
- package/.agent/rules/git-workflow.md +140 -0
- package/.agent/rules/quality-gate.md +117 -0
- package/.agent/rules/security.md +67 -0
- package/.agent/rules/sprint-tracking.md +103 -0
- package/.agent/rules/testing.md +80 -0
- package/.agent/rules/workflow-standards.md +30 -0
- package/.agent/rules.md +293 -0
- package/.agent/session-context.md +69 -0
- package/.agent/session-state.json +27 -0
- package/.agent/skills/README.md +135 -0
- package/.agent/skills/api-patterns/SKILL.md +117 -0
- package/.agent/skills/app-builder/SKILL.md +202 -0
- package/.agent/skills/architecture/SKILL.md +101 -0
- package/.agent/skills/behavioral-modes/SKILL.md +295 -0
- package/.agent/skills/brainstorming/SKILL.md +156 -0
- package/.agent/skills/clean-code/SKILL.md +142 -0
- package/.agent/skills/context-budget/SKILL.md +78 -0
- package/.agent/skills/continuous-learning/SKILL.md +145 -0
- package/.agent/skills/database-design/SKILL.md +303 -0
- package/.agent/skills/debugging-strategies/SKILL.md +158 -0
- package/.agent/skills/deployment-procedures/SKILL.md +191 -0
- package/.agent/skills/docker-patterns/SKILL.md +161 -0
- package/.agent/skills/eval-harness/SKILL.md +89 -0
- package/.agent/skills/frontend-patterns/SKILL.md +141 -0
- package/.agent/skills/git-workflow/SKILL.md +159 -0
- package/.agent/skills/i18n-localization/SKILL.md +191 -0
- package/.agent/skills/intelligent-routing/SKILL.md +180 -0
- package/.agent/skills/mcp-integration/SKILL.md +240 -0
- package/.agent/skills/mobile-design/SKILL.md +191 -0
- package/.agent/skills/nodejs-patterns/SKILL.md +164 -0
- package/.agent/skills/parallel-agents/SKILL.md +200 -0
- package/.agent/skills/performance-profiling/SKILL.md +134 -0
- package/.agent/skills/plan-validation/SKILL.md +192 -0
- package/.agent/skills/plan-writing/SKILL.md +183 -0
- package/.agent/skills/plan-writing/domain-enhancers.md +184 -0
- package/.agent/skills/plan-writing/plan-retrospective.md +116 -0
- package/.agent/skills/plan-writing/plan-schema.md +119 -0
- package/.agent/skills/pr-toolkit/SKILL.md +174 -0
- package/.agent/skills/production-readiness/SKILL.md +126 -0
- package/.agent/skills/security-practices/SKILL.md +109 -0
- package/.agent/skills/shell-conventions/SKILL.md +92 -0
- package/.agent/skills/strategic-compact/SKILL.md +62 -0
- package/.agent/skills/testing-patterns/SKILL.md +141 -0
- package/.agent/skills/typescript-expert/SKILL.md +160 -0
- package/.agent/skills/ui-ux-pro-max/SKILL.md +137 -0
- package/.agent/skills/ui-ux-pro-max/data/charts.csv +26 -0
- package/.agent/skills/ui-ux-pro-max/data/colors.csv +97 -0
- package/.agent/skills/ui-ux-pro-max/data/icons.csv +101 -0
- package/.agent/skills/ui-ux-pro-max/data/landing.csv +31 -0
- package/.agent/skills/ui-ux-pro-max/data/products.csv +97 -0
- package/.agent/skills/ui-ux-pro-max/data/react-performance.csv +45 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/astro.csv +54 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/flutter.csv +53 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/html-tailwind.csv +56 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/jetpack-compose.csv +53 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/nextjs.csv +53 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/nuxt-ui.csv +51 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/nuxtjs.csv +59 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/react-native.csv +52 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/react.csv +54 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/shadcn.csv +61 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/svelte.csv +54 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/swiftui.csv +51 -0
- package/.agent/skills/ui-ux-pro-max/data/stacks/vue.csv +50 -0
- package/.agent/skills/ui-ux-pro-max/data/styles.csv +68 -0
- package/.agent/skills/ui-ux-pro-max/data/typography.csv +58 -0
- package/.agent/skills/ui-ux-pro-max/data/ui-reasoning.csv +101 -0
- package/.agent/skills/ui-ux-pro-max/data/ux-guidelines.csv +100 -0
- package/.agent/skills/ui-ux-pro-max/data/web-interface.csv +31 -0
- package/.agent/skills/ui-ux-pro-max/scripts/core.py +253 -0
- package/.agent/skills/ui-ux-pro-max/scripts/design_system.py +1067 -0
- package/.agent/skills/ui-ux-pro-max/scripts/search.py +114 -0
- package/.agent/skills/verification-loop/SKILL.md +89 -0
- package/.agent/skills/webapp-testing/SKILL.md +175 -0
- package/.agent/templates/adr-template.md +32 -0
- package/.agent/templates/bug-report.md +37 -0
- package/.agent/templates/feature-request.md +32 -0
- package/.agent/workflows/README.md +101 -0
- package/.agent/workflows/brainstorm.md +86 -0
- package/.agent/workflows/create.md +85 -0
- package/.agent/workflows/debug.md +83 -0
- package/.agent/workflows/deploy.md +114 -0
- package/.agent/workflows/enhance.md +85 -0
- package/.agent/workflows/orchestrate.md +106 -0
- package/.agent/workflows/plan.md +105 -0
- package/.agent/workflows/pr-fix.md +163 -0
- package/.agent/workflows/pr-merge.md +117 -0
- package/.agent/workflows/pr-review.md +178 -0
- package/.agent/workflows/pr-split.md +118 -0
- package/.agent/workflows/pr.md +184 -0
- package/.agent/workflows/preflight.md +107 -0
- package/.agent/workflows/preview.md +95 -0
- package/.agent/workflows/quality-gate.md +103 -0
- package/.agent/workflows/retrospective.md +100 -0
- package/.agent/workflows/review.md +104 -0
- package/.agent/workflows/status.md +89 -0
- package/.agent/workflows/test.md +98 -0
- package/.agent/workflows/ui-ux-pro-max.md +93 -0
- package/.agent/workflows/upgrade.md +97 -0
- package/LICENSE +21 -0
- package/README.md +218 -0
- package/bin/kit.js +773 -0
- package/lib/agent-registry.js +228 -0
- package/lib/agent-reputation.js +343 -0
- package/lib/circuit-breaker.js +195 -0
- package/lib/cli-commands.js +322 -0
- package/lib/config-validator.js +274 -0
- package/lib/conflict-detector.js +252 -0
- package/lib/constants.js +47 -0
- package/lib/engineering-manager.js +336 -0
- package/lib/error-budget.js +370 -0
- package/lib/hook-system.js +256 -0
- package/lib/ide-generator.js +434 -0
- package/lib/identity.js +240 -0
- package/lib/io.js +146 -0
- package/lib/learning-engine.js +163 -0
- package/lib/loading-engine.js +421 -0
- package/lib/logger.js +118 -0
- package/lib/marketplace.js +321 -0
- package/lib/plugin-system.js +604 -0
- package/lib/plugin-verifier.js +197 -0
- package/lib/rate-limiter.js +113 -0
- package/lib/security-scanner.js +312 -0
- package/lib/self-healing.js +468 -0
- package/lib/session-manager.js +264 -0
- package/lib/skill-sandbox.js +244 -0
- package/lib/task-governance.js +522 -0
- package/lib/task-model.js +332 -0
- package/lib/updater.js +240 -0
- package/lib/verify.js +279 -0
- package/lib/workflow-engine.js +373 -0
- package/lib/workflow-events.js +166 -0
- package/lib/workflow-persistence.js +160 -0
- package/package.json +57 -0
@@ -0,0 +1,192 @@

---
name: plan-validation
description: Quality gate for implementation plans. Validates schema compliance, cross-cutting concerns, and completeness scoring before user presentation.
version: 1.0.0
triggers: [post-plan-creation]
allowed-tools: Read, Grep
---

# Plan Validation

> Quality gate ensuring every implementation plan meets enterprise standards
> before being presented to the user for approval.

---

## Overview

This skill is used by the planner agent as a self-validation checklist after creating a plan but BEFORE presenting it to the user. The planner applies the validation pipeline below to its own output, verifying against the quality schema (`plan-schema.md`), checking cross-cutting concerns, and calculating a completeness score. Plans that fail validation are revised before presentation.

**Invocation**: The planner runs this checklist during `/plan` workflow step 3.5. This is NOT a separate agent; the planner validates its own plan against these criteria.

---

## Validation Pipeline

### Step 1: Task Size Classification

Determine the task size from the plan content:

| Indicator | Classification |
|-----------|---------------|
| Plan references 1-2 files | **Trivial** |
| Plan references 3-10 files | **Medium** |
| Plan references 10+ files | **Large** |
| Estimated effort < 30 minutes | **Trivial** |
| Estimated effort 1-4 hours | **Medium** |
| Estimated effort > 4 hours | **Large** |

Use the HIGHER classification when indicators conflict.
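
A minimal sketch of the classification rule, assuming the plan's file count and effort estimate are already known. The names `TaskSize` and `classify_task` are hypothetical, and two boundary choices are assumptions of this sketch: exactly 10 files matches both the Medium and Large rows and is resolved to Large by the higher-classification rule, and the 30-60 minute range the table leaves unassigned is treated as Medium.

```python
from enum import IntEnum


class TaskSize(IntEnum):
    """Ordered so that max() picks the higher classification."""
    TRIVIAL = 1
    MEDIUM = 2
    LARGE = 3


def size_from_files(file_count: int) -> TaskSize:
    # 10 files matches both "3-10" and "10+"; the higher-classification
    # rule resolves that overlap to LARGE.
    if file_count >= 10:
        return TaskSize.LARGE
    if file_count >= 3:
        return TaskSize.MEDIUM
    return TaskSize.TRIVIAL


def size_from_effort(minutes: float) -> TaskSize:
    # Assumption: the table does not assign 30-60 minutes; this sketch
    # treats everything from 30 minutes up to 4 hours as MEDIUM.
    if minutes > 240:
        return TaskSize.LARGE
    if minutes >= 30:
        return TaskSize.MEDIUM
    return TaskSize.TRIVIAL


def classify_task(file_count: int, effort_minutes: float) -> TaskSize:
    # When the two indicators disagree, the higher classification wins.
    return max(size_from_files(file_count), size_from_effort(effort_minutes))
```

For example, a two-file change estimated at five hours is classified as Large, because the effort indicator outranks the file-count indicator.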

### Step 2: Schema Compliance

Verify all required sections are present and substantively populated:

**Tier 1 Sections (Always Required)**:

| # | Section | Check |
|---|---------|-------|
| 1 | Context & Problem Statement | Present and >= 2 sentences |
| 2 | Goals & Non-Goals | Both goals AND non-goals stated |
| 3 | Implementation Steps | Steps have file paths and verification criteria |
| 4 | Testing Strategy | Test types specified with coverage targets |
| 5 | Security Considerations | Substantive content or explicit "N/A — [reason]" |
| 6 | Risks & Mitigations | At least 1 risk with severity and mitigation |
| 7 | Success Criteria | Measurable, checkable outcomes |

**Tier 2 Sections (Required for Medium/Large)**:

| # | Section | Check |
|---|---------|-------|
| 8 | Architecture Impact | Components and files identified |
| 9 | API / Data Model Changes | Schemas defined (or N/A with reason) |
| 10 | Rollback Strategy | Concrete undo procedure |
| 11 | Observability | Logging/metrics plan |
| 12 | Performance Impact | Assessment provided |
| 13 | Documentation Updates | Specific docs identified |
| 14 | Dependencies | Blockers and dependents listed |
| 15 | Alternatives Considered | At least 1 rejected alternative with reasoning |

### Step 3: Cross-Cutting Verification

These sections MUST be non-empty regardless of task domain:

| Section | Acceptable Content |
|---------|-------------------|
| **Security Considerations** | Specific requirements from `rules/security.md` OR `N/A — [valid justification]` |
| **Testing Strategy** | At least unit test plan with coverage target OR `N/A — [valid justification]` |
| **Documentation Updates** | Specific docs listed OR `N/A — no docs affected` |

**Unacceptable**: Empty section, placeholder text, section completely missing.
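
The mechanical part of this check can be sketched as follows, assuming the plan has already been split into a mapping of section title to body text. The function name `check_cross_cutting` and the `PLACEHOLDERS` set are hypothetical (the skill does not enumerate placeholder strings), and whether a justification is actually *valid* remains a judgment call that this sketch does not attempt.

```python
import re

REQUIRED_SECTIONS = (
    "Security Considerations",
    "Testing Strategy",
    "Documentation Updates",
)
# Hypothetical placeholder markers; the skill does not list them.
PLACEHOLDERS = {"tbd", "todo", "...", "n/a", "none"}
# "N/A", then a dash (em, en, or hyphen), then a non-empty justification.
NA_WITH_REASON = re.compile(r"^N/A\s*[\u2014\u2013-]\s*\S.+", re.IGNORECASE)


def check_cross_cutting(sections: dict[str, str]) -> list[str]:
    """Return the required sections that are missing, empty, or placeholder-only."""
    failures = []
    for name in REQUIRED_SECTIONS:
        body = sections.get(name, "").strip()
        if not body or body.lower() in PLACEHOLDERS:
            failures.append(name)
        elif body.upper().startswith("N/A") and not NA_WITH_REASON.match(body):
            # "N/A" is only acceptable when a reason follows it.
            failures.append(name)
    return failures
```

An empty result means all three cross-cutting sections were addressed; any returned names can feed the "Missing: X, Y" cell of the validation results.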

### Step 4: Specificity Audit

Verify that implementation steps are actionable, not vague:

| Vague (FAIL) | Specific (PASS) |
|-------------|-----------------|
| "Update the component" | "Add `onSubmit` handler to `src/components/LoginForm.tsx`" |
| "Add tests" | "Create `tests/auth.test.js` with login success/failure cases" |
| "Fix the bug" | "Change line 42 of `lib/parser.js`: replace `==` with `===`" |
| "Style the UI" | "Add Tailwind classes `flex gap-4 p-6` to `Header.tsx`" |

**Rule**: Every implementation step MUST include a file path.
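
The file-path rule lends itself to a rough automated check. The sketch below is a heuristic, not part of the skill: `FILE_PATH` and `steps_missing_paths` are hypothetical names, and the pattern simply looks for a token ending in a short file extension, so it can be fooled by abbreviations such as "e.g".

```python
import re

# Heuristic (an assumption of this sketch): a file path is a token with a
# short extension, optionally preceded by directory segments.
FILE_PATH = re.compile(r"(?:[\w.-]+/)*[\w.-]+\.[A-Za-z][A-Za-z0-9]{0,7}\b")


def steps_missing_paths(steps: list[str]) -> list[str]:
    """Return the implementation steps that mention no file path."""
    return [step for step in steps if not FILE_PATH.search(step)]
```

Run against the table above, the vague entries are returned and the specific entries are not, which matches the "X steps lack paths" status used in the output format.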

### Step 5: Completeness Scoring

Calculate the score using the rubric from `plan-schema.md`:

**Tier 1 Scoring** (60 points max):

| Section | Points |
|---------|--------|
| Context & Problem Statement | 10 |
| Goals & Non-Goals | 10 |
| Implementation Steps | 10 |
| Testing Strategy | 10 |
| Security Considerations | 10 |
| Risks & Mitigations | 5 |
| Success Criteria | 5 |

**Tier 2 Scoring** (20 additional points):

| Section | Points |
|---------|--------|
| Architecture Impact | 4 |
| API / Data Model Changes | 3 |
| Rollback Strategy | 3 |
| Observability | 2 |
| Performance Impact | 2 |
| Documentation Updates | 2 |
| Dependencies | 2 |
| Alternatives Considered | 2 |

**Score Rules**:
- Section present and substantively populated = full points
- Section present but placeholder/minimal = half points
- Section missing = 0 points
- "N/A" with valid justification = full points

**Domain Enhancement Scoring** (bonus/penalty on top of tier score):
- For each domain in `matchedDomains` from the loading engine:
  - Domain enhancer section present and substantive = **+2 bonus points**
  - Domain matched but enhancer section missing = **-2 penalty points**
  - Domain matched with "N/A — [valid reason]" = no bonus, no penalty
- Maximum domain bonus: +6 points (3 domains × 2 points)
- Domain scoring does not change the pass threshold — it provides additional quality signal
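
The rubric above can be sketched as plain arithmetic. This is an illustration under stated assumptions, not the kit's implementation: `Status`, `tier_score`, and `domain_adjustment` are hypothetical names, each section is assumed to have been graded already, and because the text caps only the bonus, the penalty is left uncapped here.

```python
from enum import Enum

TIER1_POINTS = {
    "Context & Problem Statement": 10,
    "Goals & Non-Goals": 10,
    "Implementation Steps": 10,
    "Testing Strategy": 10,
    "Security Considerations": 10,
    "Risks & Mitigations": 5,
    "Success Criteria": 5,
}
TIER2_POINTS = {
    "Architecture Impact": 4,
    "API / Data Model Changes": 3,
    "Rollback Strategy": 3,
    "Observability": 2,
    "Performance Impact": 2,
    "Documentation Updates": 2,
    "Dependencies": 2,
    "Alternatives Considered": 2,
}


class Status(Enum):
    FULL = 1.0     # substantive, or "N/A" with a valid justification
    MINIMAL = 0.5  # present but placeholder/minimal
    MISSING = 0.0


def tier_score(statuses: dict[str, Status], include_tier2: bool) -> tuple[float, int]:
    """Return (earned points, maximum points) for the applicable tiers."""
    rubric = dict(TIER1_POINTS)
    if include_tier2:
        rubric.update(TIER2_POINTS)
    earned = sum(
        points * statuses.get(name, Status.MISSING).value
        for name, points in rubric.items()
    )
    return earned, sum(rubric.values())


def domain_adjustment(enhancers: dict[str, str]) -> int:
    """Map of domain -> "present" | "missing" | "na"; returns bonus minus penalty."""
    present = sum(1 for state in enhancers.values() if state == "present")
    missing = sum(1 for state in enhancers.values() if state == "missing")
    # Bonus is capped at +6; the penalty cap is unspecified, so none is applied.
    return min(2 * present, 6) - 2 * missing
```

A Trivial plan with a minimal Testing Strategy and no Risks & Mitigations section would earn 50 of 60 points, or about 83%.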

### Step 6: Verdict

| Condition | Verdict | Action |
|-----------|---------|--------|
| Score >= 70% of tier max | **PASS** | Present plan to user with score |
| Score < 70% of tier max | **REVISE** | Identify gaps, revise, re-validate |

**Revision Protocol**:
1. Identify the specific missing or weak sections
2. Provide targeted instructions to the planner for revision
3. Re-run validation after revision
4. Maximum 2 revision cycles — then present with warnings
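
The verdict and the bounded revision loop can be sketched as below. The `score` and `revise` callables are hypothetical stand-ins for the planner's own scoring and rewriting, and the verdict is computed from the tier score alone, on the assumption that domain bonuses and penalties are reported separately.

```python
from typing import Callable, Tuple, TypeVar

Plan = TypeVar("Plan")

PASS_THRESHOLD = 0.70
MAX_REVISION_CYCLES = 2


def verdict(earned: float, max_points: float) -> str:
    """PASS at or above 70% of the tier maximum, otherwise REVISE."""
    return "PASS" if earned / max_points >= PASS_THRESHOLD else "REVISE"


def validate_with_revisions(
    plan: Plan,
    score: Callable[[Plan], Tuple[float, float]],
    revise: Callable[[Plan], Plan],
) -> Tuple[Plan, str, int]:
    """Validate, revising at most twice; returns (plan, verdict, cycles used)."""
    for cycle in range(MAX_REVISION_CYCLES + 1):
        earned, max_points = score(plan)
        if verdict(earned, max_points) == "PASS":
            return plan, "PASS", cycle
        if cycle < MAX_REVISION_CYCLES:
            plan = revise(plan)
    # Still below threshold after two revisions: present with warnings.
    return plan, "REVISE", MAX_REVISION_CYCLES
```

A final verdict of REVISE after two cycles is the case where the plan is presented with warnings rather than blocked.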

---

## Output Format

After validation, append to the plan:

```markdown
## Plan Quality Assessment

**Task Size**: [Trivial/Medium/Large]
**Quality Score**: [X]/[max] ([percentage]%) [+N domain bonus / -N domain penalty]
**Verdict**: [PASS/REVISE]

### Validation Results

| Check | Status |
|-------|--------|
| Schema Compliance | [sections present]/[sections required] |
| Cross-Cutting Concerns | [All addressed / Missing: X, Y] |
| Specificity Audit | [All steps have file paths / X steps lack paths] |
| Domain Enhancement | [N domains matched, N enhancer sections present] |
| Rules Consulted | [list of rule files referenced] |
| Matched Domains | [list from loading engine] |
```

---

## Integration

- **Invoked by**: `/plan` workflow (step 3.5, between plan creation and user presentation)
- **Depends on**: `plan-schema.md` for scoring rubric, `domain-enhancers.md` for domain sections
- **Feeds into**: Plan quality score shown to user alongside the plan
- **Learning**: Quality scores are logged to `.agent/contexts/plan-quality-log.md` for adaptive improvement

---

## Principles

1. **Validate, don't block**: The goal is quality improvement, not gatekeeping. After 2 revision cycles, present the plan with warnings rather than blocking indefinitely.
2. **Score transparently**: The user sees the quality score and understands what was checked.
3. **Learn from outcomes**: Post-implementation retrospectives compare predicted vs. actual to calibrate future scoring.
4. **Cross-cutting is non-negotiable**: Security, testing, and documentation sections must ALWAYS be addressed. This is the single most impactful quality gate.

@@ -0,0 +1,183 @@

---
name: plan-writing
description: Structured task planning with clear breakdowns, dependencies, and verification criteria.
version: 1.0.0
allowed-tools: Read, Glob, Grep
---

# Plan Writing

> Small tasks, clear outcomes, verifiable results.

---

## Overview

Framework for breaking down work into clear, actionable tasks with verification criteria.

---

## Task Breakdown Principles

### 1. Small, Focused Tasks

- Each task: 2-5 minutes
- One clear outcome per task
- Independently verifiable

### 2. Clear Verification

- How do you know it's done?
- What can you check/test?
- What's the expected output?

### 3. Logical Ordering

- Dependencies identified
- Parallel work where possible
- Critical path highlighted
- **Verification is always LAST**

---

## Planning Principles

> 🔴 **NO fixed templates. Each plan's CONTENT is UNIQUE to the task.**
> ✅ **Every plan MUST satisfy the quality schema in `plan-schema.md`.**
> Dynamic content within a consistent structure = the standard.

### Principle 1: Right-Size to Task Tier

Plan length MUST match task complexity:

| Task Tier | Max Sections | Max Tasks | Guideline |
| --------- | ------------ | --------- | --------- |
| **Trivial** (1-2 files) | Tier 1 only (7 sections) | 5-8 tasks | ~1 page: concise, no specialist synthesis |
| **Medium** (3-10 files) | Tier 1 + Tier 2 (15 sections) | 8-15 tasks | 2-3 pages: includes specialist input |
| **Large** (10+ files) | Tier 1 + Tier 2 + domains (15+ sections) | 15-25 tasks | 3-5 pages: full multi-agent synthesis |

| ❌ Wrong | ✅ Right |
| -------- | -------- |
| 50 tasks with sub-sub-tasks | Right-sized task count per tier |
| Every micro-step listed | Only actionable items |
| Verbose descriptions | One-line per task |
| Large task crammed into 1 page | Large task gets full Tier 2 coverage |
| Trivial task with 15 sections | Trivial task uses Tier 1 only |

> **Rule:** Trivial tasks stay concise (~1 page). Medium/Large tasks expand to cover all required tier sections. Never sacrifice completeness for brevity on complex tasks.
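
As a rough illustration of the right-sizing table, the sketch below flags plans that exceed the upper bounds for their tier. The names `TierLimits`, `LIMITS`, and `right_size_issues` are hypothetical, and only the upper ends of the ranges are enforced; treating "15+ sections" for Large tasks as having no ceiling is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierLimits:
    max_sections: Optional[int]  # None = no fixed ceiling (domain sections may be added)
    max_tasks: int


# Upper bounds taken from the table above.
LIMITS = {
    "trivial": TierLimits(max_sections=7, max_tasks=8),
    "medium": TierLimits(max_sections=15, max_tasks=15),
    "large": TierLimits(max_sections=None, max_tasks=25),
}


def right_size_issues(tier: str, section_count: int, task_count: int) -> list[str]:
    """Return human-readable problems; an empty list means the plan is right-sized."""
    limits = LIMITS[tier]
    issues = []
    if limits.max_sections is not None and section_count > limits.max_sections:
        issues.append(f"{section_count} sections exceeds the {tier} limit of {limits.max_sections}")
    if task_count > limits.max_tasks:
        issues.append(f"{task_count} tasks exceeds the {tier} limit of {limits.max_tasks}")
    return issues
```

This covers the "Trivial task with 15 sections" and "50 tasks" rows of the wrong/right table; it does not detect a Large task that was under-specified.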

---

### Principle 2: Be SPECIFIC, Not Generic

| ❌ Wrong | ✅ Right |
| -------------------- | -------------------------------------------------------- |
| "Set up project" | "Run `npx create-next-app`" |
| "Add authentication" | "Install next-auth, create `/api/auth/[...nextauth].ts`" |
| "Style the UI" | "Add Tailwind classes to `Header.tsx`" |

---

### Principle 3: Dynamic Content Based on Context

**For NEW PROJECT:**

- What tech stack?
- What's the MVP?
- What's the file structure?

**For FEATURE ADDITION:**

- Which files are affected?
- What dependencies needed?
- How to verify it works?

**For BUG FIX:**

- What's the root cause?
- What file/line to change?
- How to test the fix?

---

### Principle 4: Verification is Simple

| ❌ Wrong | ✅ Right |
| ---------------------------- | -------------------------------------------- |
| "Verify the component works" | "Run `npm run dev`, click button, see toast" |
| "Test the API" | "curl localhost:3000/api/users returns 200" |
| "Check styles" | "Open browser, verify dark mode works" |

---

### Principle 5: Cross-Cutting Concerns Are Mandatory

Every plan MUST explicitly address:

1. **Security**: Reference `.agent/rules/security.md`. What security implications exist?
2. **Testing**: Reference `.agent/rules/testing.md`. What test types are needed? Coverage targets?
3. **Documentation**: Reference `.agent/rules/documentation.md`. Which docs need updating?

If a concern is genuinely not applicable, state `N/A — [one-line justification]`.

**NEVER silently omit these sections.** Silent omission is a plan defect.

---

### Principle 6: Schema Compliance

Every plan MUST satisfy the quality schema defined in `plan-schema.md`:

- **Tier 1** sections are ALWAYS required. Omitting any Tier 1 section is a plan defect.
- **Tier 2** sections are required for Medium and Large tasks (3+ files or 1+ hours).
- Before presenting a plan, validate it against the schema checklist.
- Plans scoring below 70% of their tier maximum must be revised before presentation.

See also: `domain-enhancers.md` for domain-specific plan sections.

---

## Plan Structure (Minimal)

```markdown
# [Task Name]

## Goal

One sentence: What are we building/fixing?

## Tasks

- [ ] Task 1: [Action] → Verify: [Check]
- [ ] Task 2: [Action] → Verify: [Check]
- [ ] Task 3: [Action] → Verify: [Check]

## Done When

- [ ] [Main success criteria]

## Notes

[Any important considerations]
```

> **That's it.** No phases, no sub-sections unless truly needed.
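
Because every task line in the minimal structure follows one shape (checkbox, action, arrow, `Verify:` clause), it can be read back mechanically. The sketch below is one way to do that; `Task` and `parse_tasks` are hypothetical names, and checklist lines without a `Verify:` clause (such as those under "Done When") are simply skipped.

```python
import re
from typing import NamedTuple


class Task(NamedTuple):
    done: bool
    action: str
    check: str


# "- [ ] <action> → Verify: <check>"; \u2192 is the arrow used in the template.
TASK_LINE = re.compile(
    r"^- \[(?P<mark>[ xX])\] (?P<action>.+?) \u2192 Verify: (?P<check>.+)$"
)


def parse_tasks(plan_markdown: str) -> list[Task]:
    """Extract checklist tasks that carry a verification clause."""
    tasks = []
    for line in plan_markdown.splitlines():
        match = TASK_LINE.match(line.strip())
        if match:
            tasks.append(
                Task(match["mark"].lower() == "x", match["action"], match["check"])
            )
    return tasks
```

Counting the parsed tasks against the checklist lines in the "Tasks" section gives a quick signal that some task is missing its verification step.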

---

## Best Practices

1. **Start with goal**: What are we building/fixing?
2. **Max 10 tasks**: If more, break into multiple plans
3. **Each task verifiable**: Clear "done" criteria
4. **Project-specific**: No copy-paste templates
5. **Update as you go**: Mark `[x]` when complete

---

## When to Use

- New project from scratch
- Adding a feature
- Fixing a bug (if complex)
- Refactoring multiple files

@@ -0,0 +1,184 @@

# Domain-Specific Plan Enhancers

> When the loading engine matches specific domains for a task, the planner
> MUST include the corresponding domain-specific sections below.
> These sections are additive to the base plan schema (Tier 1 + Tier 2).

---

## Frontend Domain

**Triggered when**: `frontend` domain matched (keywords: react, next.js, component, css, ui, ux, etc.)

Include in plan:

- **Accessibility (WCAG 2.1 AA)**: Identify components requiring ARIA labels, keyboard navigation, screen reader support, color contrast compliance (minimum 4.5:1 normal text, 3:1 large text)
- **Responsive Design**: Specify breakpoints to test (mobile 375px, tablet 768px, desktop 1280px), identify layout changes per breakpoint, verify touch targets (minimum 44x44px)
- **Bundle Size Impact**: Estimate size of new dependencies, identify tree-shaking opportunities, consider code splitting for new routes, set bundle budget (initial JS < 200KB gzipped)
- **Core Web Vitals**: Assess impact on LCP (< 2.5s), CLS (< 0.1), INP (< 200ms), identify render-blocking resources
- **Component Composition**: Specify component hierarchy, prop interfaces, state management approach (local vs. global), identify shared components for extraction
- **Rendering Strategy**: SSR vs CSR vs ISR decision for each route, hydration impact assessment, streaming SSR opportunities
- **Design System Compliance**: Verify alignment with existing design tokens (colors, spacing, typography), identify new tokens required
- **Error Boundaries**: Define error boundary placement, fallback UI for each failure mode, error reporting integration
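
The contrast thresholds in the accessibility item (4.5:1 for normal text, 3:1 for large text) can be checked with the WCAG 2.1 relative-luminance formula. The sketch below assumes 8-bit sRGB colors given as `(r, g, b)` tuples; the function names are illustrative.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.1 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    r, g, b = (_linearize(channel) for channel in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """Ratio of the lighter to the darker luminance, each offset by 0.05."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg: tuple[int, int, int], bg: tuple[int, int, int], large_text: bool = False) -> bool:
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)
```

Black on white yields the maximum ratio of 21:1, while mid-gray text on white sits close to the 4.5:1 boundary, which is why a plan should name the specific color pairs it introduces.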

---

## Backend Domain

**Triggered when**: `backend` domain matched (keywords: api, server, node, express, middleware, endpoint, etc.)

Include in plan:

- **API Contract**: Define request/response schemas (Zod validation), HTTP methods, status codes, error response format (RFC 9457 Problem Details, which obsoletes RFC 7807), versioning strategy
- **Error Handling**: Specify error response structure, error codes, client-facing messages vs. internal logging, error correlation IDs for tracing
- **Rate Limiting**: Identify endpoints requiring rate limits, specify limits (requests/minute/user), throttling strategy (sliding window vs. token bucket), response headers (X-RateLimit-*)
- **Middleware Chain**: Document new middleware additions, execution order, impact on existing middleware stack, short-circuit conditions
- **Database Interaction**: Query patterns (parameterized), transaction boundaries, connection pooling impact, N+1 query prevention
- **Input Validation**: Validation layer placement (controller vs. middleware), sanitization strategy, content-type enforcement, request size limits
- **Idempotency**: Identify non-idempotent operations, implement idempotency keys for critical mutations, retry safety assessment
- **Observability**: Structured logging format (JSON), request tracing headers (X-Request-ID propagation), health check endpoint specification
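
To make the rate-limiting item concrete, here is a minimal token-bucket sketch, one of the two throttling strategies named above. It is an in-memory, single-process illustration; `TokenBucket` and its parameters are hypothetical, and a production limiter would typically need one bucket per user and shared storage.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `refill_per_second`."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock  # injectable so the behavior can be tested
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return whether the request may proceed."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

Compared with a sliding window, a token bucket tolerates short bursts while still bounding the long-run rate, which is the trade-off a plan should state when it picks one.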

---

## Database Domain

**Triggered when**: `database` domain matched (keywords: database, sql, migration, schema, query, orm, etc.)

Include in plan:

- **Migration Rollback**: Write both up and down migrations, test rollback procedure before deploying, zero-downtime migration pattern (expand-contract for schema changes)
- **Index Impact Analysis**: Identify queries affected by schema changes, recommend index additions/removals, estimate query performance impact, verify composite index column order matches query patterns
- **Data Integrity**: Define constraints (foreign keys, unique, not null, check), cascade behavior for deletions, domain invariant enforcement at database level
- **Backup Verification**: Verify backup exists before destructive migrations, test restore procedure for critical tables, point-in-time recovery validation
- **Query Performance**: Benchmark key queries before and after changes (EXPLAIN ANALYZE), set acceptable latency thresholds (p50 < 10ms, p99 < 100ms for OLTP), identify sequential scan risks
- **Consistency Model**: Specify required consistency level (strong/eventual), transaction isolation level selection (Read Committed default, Serializable for financial), optimistic vs. pessimistic locking strategy
- **Data Classification**: Identify PII columns requiring encryption at rest, data retention policy compliance, audit trail requirements for sensitive data mutations
- **Connection Management**: Connection pool sizing for workload (pool_size = num_cores * 2 + disk_spindles), statement timeout configuration, idle connection cleanup
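
The pool sizing heuristic in the Connection Management item can be worked through as plain arithmetic. The sketch below only encodes the formula as stated; the function name `poolSize` and the input validation are assumptions, and the result is a starting point to validate under load rather than a guaranteed optimum.

```typescript
// Heuristic from the Connection Management item:
//   pool_size = num_cores * 2 + disk_spindles
function poolSize(numCores: number, diskSpindles: number): number {
  if (!Number.isInteger(numCores) || numCores < 1) {
    throw new RangeError("numCores must be a positive integer");
  }
  if (!Number.isInteger(diskSpindles) || diskSpindles < 0) {
    throw new RangeError("diskSpindles must be a non-negative integer");
  }
  return numCores * 2 + diskSpindles;
}

// A 4-core database host with one spindle: 4 * 2 + 1 = 9 connections.
const size = poolSize(4, 1);
```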

---

## DevOps Domain

**Triggered when**: `devops` domain matched (keywords: deploy, ci, cd, docker, kubernetes, pipeline, etc.)

Include in plan:

- **Infrastructure Changes**: Specify IaC modifications (Dockerfile, docker-compose, CI config), environment variable additions, 12-Factor App compliance check
- **Monitoring & Alerting**: Define new metrics to track, alerting thresholds (SLO-derived), dashboard updates, golden signals coverage (latency, traffic, errors, saturation)
- **Progressive Rollout**: Strategy for deployment (canary → staged → full), rollback triggers (error rate > 1%, latency p99 > 2x baseline), automated rollback criteria, health check endpoints
- **Runbook Updates**: Document operational procedures for the new functionality, incident response steps, escalation paths
- **Environment Parity**: Verify changes work across dev, staging, and production environments, configuration drift detection
- **GitOps Compliance**: Infrastructure changes committed to version control, declarative configuration (desired state, not imperative scripts), automated drift reconciliation
- **Container Security**: Base image selection (distroless/alpine preferred), multi-stage build optimization, no secrets in image layers, vulnerability scanning in CI
- **Observability Pipeline**: Log aggregation configuration, trace sampling strategy, metric cardinality assessment, correlation between logs/traces/metrics
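
The rollback triggers named under Progressive Rollout (error rate > 1%, latency p99 > 2x baseline) could be evaluated as below. This is an illustrative sketch: the `RolloutMetrics` shape and the `rollbackReasons` function are hypothetical, and where the metrics come from is left open.

```typescript
interface RolloutMetrics {
  errorRate: number;     // fraction of failed requests, e.g. 0.004 = 0.4%
  latencyP99Ms: number;  // p99 latency observed during the rollout
  baselineP99Ms: number; // p99 latency measured before the rollout
}

// Returns the triggers that fired; an empty array means the rollout may continue.
function rollbackReasons(m: RolloutMetrics): string[] {
  const reasons: string[] = [];
  if (m.errorRate > 0.01) reasons.push("error rate > 1%");
  if (m.latencyP99Ms > 2 * m.baselineP99Ms) reasons.push("latency p99 > 2x baseline");
  return reasons;
}

const healthy = rollbackReasons({ errorRate: 0.004, latencyP99Ms: 180, baselineP99Ms: 120 });
const failing = rollbackReasons({ errorRate: 0.02, latencyP99Ms: 300, baselineP99Ms: 120 });
```

Returning the reasons rather than a bare boolean keeps the rollback decision explainable in the runbook and in alerts.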

---

## Security Domain

**Triggered when**: `security` domain matched (keywords or implicit triggers: auth, login, signup, form, payment, etc.)

Include in plan (in addition to mandatory security considerations):

- **Threat Model (STRIDE)**: Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege — assess each for the change with severity rating
- **Authentication Flow Impact**: How the change affects login, session management, token lifecycle, OAuth 2.0 flow selection (Authorization Code + PKCE for SPAs, Client Credentials for M2M)
- **Data Classification**: Identify data sensitivity levels (public, internal, confidential, restricted), storage and transmission requirements per level
- **Compliance Requirements**: GDPR/CCPA implications (data minimization, consent, right to erasure, breach notification within 72 hours)
- **Secret Management**: New secrets required, rotation policy, storage mechanism (environment variables only), zero hardcoded credentials enforcement
- **Zero Trust Assessment**: Authentication at every boundary (never trust, always verify), least privilege access for new endpoints/services, micro-segmentation for new network paths
- **Supply Chain Security**: New dependency audit (license, maintainer, vulnerability scan), lockfile integrity verification, SRI hashes for CDN resources
- **Input Boundary Defense**: All external inputs validated and sanitized, output encoding for context (HTML/URL/JS), parameterized queries only (no string concatenation)
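
As one concrete instance of "output encoding for context", the sketch below covers the HTML text and attribute-value context only. The `escapeHtml` name is hypothetical, and URL and JavaScript contexts need their own encoders; in practice a templating layer that escapes by default would usually take this role.

```typescript
// Encode the five characters that are significant in HTML text and
// quoted attribute values.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(input: string): string {
  return input.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// Untrusted input is rendered as inert text instead of markup.
const safe = escapeHtml(`<img src=x onerror="alert('hi')">`);
```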

---

## Performance Domain

**Triggered when**: `performance` domain matched (keywords: slow, optimize, speed, bundle, lighthouse, cache, etc.)

Include in plan:

- **Performance Budget**: Define acceptable thresholds (LCP < 2.5s, FID < 100ms, page load < 3s, API p99 < 500ms, memory < 512MB per process)
- **Profiling Strategy**: Tools and methods to measure before/after (Lighthouse, Chrome DevTools, load testing with k6/Artillery), baseline measurement requirements
- **Caching Architecture**: Cache layers (browser → CDN → application → database), TTL values per layer, invalidation strategy (time-based, event-driven, version-key), cache stampede prevention (stale-while-revalidate, locking)
- **Lazy Loading**: Identify resources for deferred loading, intersection observer patterns, dynamic imports for route-level code splitting, image loading strategy (responsive images, next-gen formats)
- **Benchmarking**: Define benchmark suite, baseline measurements, regression detection thresholds, automated performance gates in CI
- **Database Query Optimization**: EXPLAIN ANALYZE for new/modified queries, index coverage verification, N+1 detection, read replica routing for heavy reads
- **Concurrency Model**: Event loop impact assessment, worker thread candidates for CPU-intensive operations, connection pool saturation risk
- **CDN Strategy**: Edge caching rules for static assets, cache-control header specification, origin shield configuration, geographic distribution assessment

---

## Mobile Domain

**Triggered when**: `mobile` domain matched (keywords: mobile, react native, expo, ios, android, etc.)

Include in plan:

- **Platform Parity**: Identify iOS vs. Android differences in behavior, UI, or API access, platform-specific code paths (#ifdef equivalent)
- **Offline Support**: Define offline behavior, data sync strategy (optimistic vs. pessimistic), conflict resolution (last-write-wins, CRDT, manual merge), network-aware queries
- **App Store Guidelines**: Compliance with Apple HIG and Material Design 3, review guideline risks, in-app purchase requirements
- **Native Modules**: Bridge requirements, native module dependencies, build configuration changes (Podfile/build.gradle)
- **Device Testing**: Target device matrix, screen size variations, OS version compatibility (minimum iOS 15 / Android API 26)
- **Navigation Architecture**: Navigation pattern selection (stack, tab, drawer), deep linking support, back navigation handling per platform
- **Mobile Performance Budget**: App startup time < 2s, frame rate 60fps minimum, memory usage < 150MB, APK/IPA size budget
- **State Persistence**: Local storage strategy (AsyncStorage, SQLite, MMKV), state rehydration on app resume, background task handling
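
Of the conflict resolution options listed under Offline Support, last-write-wins is the simplest to sketch. The `VersionedRecord` shape and the `resolveLww` name below are assumptions for illustration, as is the tie-breaking rule.

```typescript
interface VersionedRecord<T> {
  value: T;
  updatedAt: number; // epoch milliseconds of the last write
}

// Last-write-wins: keep whichever copy was written most recently.
// Ties go to the remote copy (an assumed rule) so every client converges.
function resolveLww<T>(
  local: VersionedRecord<T>,
  remote: VersionedRecord<T>,
): VersionedRecord<T> {
  return local.updatedAt > remote.updatedAt ? local : remote;
}

const local = { value: "draft edited offline", updatedAt: 1700000005000 };
const remote = { value: "server copy", updatedAt: 1700000001000 };
const winner = resolveLww(local, remote);
```

The trade-off is that the losing write is discarded silently and the outcome depends on device clocks, which is why the plan should say when CRDTs or a manual merge are required instead.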

---

## Reliability Domain

**Triggered when**: `reliability` domain matched (keywords: reliability, uptime, monitoring, sre, sla, slo, sli, etc.)

Include in plan:

- **SLO Definition**: Define Service Level Objectives for affected services (availability target, latency targets at p50/p95/p99, error rate budget)
- **SLI Instrumentation**: Specify Service Level Indicators to measure (request success rate, request latency, system throughput), measurement method and data source
- **Error Budget Impact**: Assess how the change affects existing error budgets, define acceptable error budget consumption for rollout
- **Golden Signals**: Monitoring for all four golden signals (latency, traffic, errors, saturation) for new/modified services
- **Resilience Patterns**: Circuit breaker placement, retry policy (exponential backoff with jitter), timeout configuration, bulkhead isolation for critical paths
- **Incident Preparedness**: Runbook for new failure modes, alerting rules (page vs. ticket), escalation matrix, blast radius assessment
- **Chaos Engineering**: Identify failure injection points for validation, steady-state hypothesis, abort conditions for chaos experiments
- **Capacity Planning**: Resource requirements (CPU, memory, network, storage), scaling triggers (auto-scale thresholds), load testing validation for expected traffic growth
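
The retry policy named under Resilience Patterns (exponential backoff with jitter) could look like the sketch below. It uses the "full jitter" variant, where the delay is drawn uniformly below an exponentially growing ceiling; the function name, the default base and cap, and the injectable random source are all assumptions.

```typescript
// Full jitter: delay is uniform in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(
  attempt: number,                    // 0 for the first retry
  baseMs: number = 100,               // assumed default
  capMs: number = 10000,              // assumed default
  random: () => number = Math.random, // injectable for deterministic tests
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// With a random source pinned near 1, the delays trace the growing ceiling.
const ceilings = [0, 1, 2, 3].map((n) => backoffDelayMs(n, 100, 10000, () => 0.999999));
```

The cap keeps late attempts from waiting unboundedly, and the jitter spreads retries from many clients so they do not arrive in synchronized waves.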

---

## Observability Domain

**Triggered when**: `observability` domain matched (keywords: logging, tracing, metrics, monitoring, alerting, opentelemetry, etc.)

Include in plan:

- **Three Pillars Coverage**: Specify logging additions (structured JSON), metrics (counters, histograms, gauges), traces (span creation, context propagation)
- **OpenTelemetry Integration**: SDK initialization, auto-instrumentation scope, manual span creation for business-critical paths, sampling strategy (head-based vs. tail-based)
- **Log Architecture**: Log levels and when to use each (ERROR: actionable failures, WARN: degradation, INFO: business events, DEBUG: development only), structured fields, correlation ID propagation
- **Alerting Strategy**: Alert conditions derived from SLOs, notification channels (PagerDuty/Slack), alert fatigue prevention (multi-window burn rate), silence/snooze policies
- **Dashboard Design**: Key metrics visualization, RED method (Rate, Errors, Duration) per service, drill-down capability from overview to detail
- **Cost Management**: Metric cardinality assessment, log volume projection, trace sampling rate optimization, retention policy per signal type
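
The multi-window burn rate mentioned under Alerting Strategy reduces to a small calculation: burn rate is the observed error ratio divided by the error budget (1 minus the SLO target), and an alert fires only when both a long and a short window exceed the threshold. The function names and the threshold used in the example are assumptions; 14.4 corresponds to spending 2% of a 30-day budget in one hour (0.02 × 720 h).

```typescript
// Burn rate = observed error ratio / error budget, where budget = 1 - SLO target.
function burnRate(errorRatio: number, sloTarget: number): number {
  return errorRatio / (1 - sloTarget);
}

// Page only when both windows exceed the threshold: the long window shows the
// burn is significant, the short window shows it is still happening.
function shouldPage(
  longWindowErrorRatio: number,
  shortWindowErrorRatio: number,
  sloTarget: number,
  threshold: number,
): boolean {
  return (
    burnRate(longWindowErrorRatio, sloTarget) >= threshold &&
    burnRate(shortWindowErrorRatio, sloTarget) >= threshold
  );
}

// 99.9% SLO, 2% errors in both windows: burn rate is about 20, so page.
const ongoing = shouldPage(0.02, 0.02, 0.999, 14.4);
// Same long window, but the short window has recovered: do not page.
const recovered = shouldPage(0.02, 0.001, 0.999, 14.4);
```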

---

## Distributed Systems Domain

**Triggered when**: `architecture` domain matched AND task involves multiple services, message queues, or event-driven patterns

Include in plan:

- **Consistency Strategy**: CAP theorem trade-off for the specific use case, consistency model selection (strong, eventual, causal), Saga pattern for distributed transactions (choreography vs. orchestration)
- **Communication Pattern**: Synchronous (REST/gRPC) vs. asynchronous (message queue/event stream) decision per interaction, protocol selection criteria
- **Fault Tolerance**: Failure mode analysis for each service interaction, fallback behavior, partial failure handling, data loss prevention
- **Event-Driven Design**: Event schema definition (CloudEvents format), event ordering guarantees, idempotent consumers, dead letter queue strategy
- **Service Discovery**: Registration mechanism, health check protocol, load balancing strategy (client-side vs. server-side), circuit breaker integration
- **Data Sovereignty**: Which service owns which data, cross-service data access patterns (API calls, not shared databases), eventual consistency reconciliation
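
The "idempotent consumers" requirement under Event-Driven Design can be illustrated with a deduplicating wrapper. This is a sketch under stated assumptions: the `DomainEvent` shape is a simplified stand-in for a CloudEvents envelope, `idempotent` is a hypothetical helper, and the in-memory `Set` stands in for a durable store of processed IDs.

```typescript
interface DomainEvent {
  id: string;   // unique event ID (CloudEvents also requires an `id` attribute)
  type: string;
  data: unknown;
}

// Wraps a handler so that redelivered events (same id) are applied at most once.
// A real consumer would persist processed IDs, ideally in the same transaction
// as the side effect; the in-memory Set here is for illustration only.
function idempotent(handler: (e: DomainEvent) => void): (e: DomainEvent) => boolean {
  const seen = new Set<string>();
  return (e) => {
    if (seen.has(e.id)) return false; // duplicate delivery, skip
    handler(e);
    seen.add(e.id); // mark as processed only after the handler succeeds
    return true;
  };
}

let total = 0;
const consume = idempotent((e) => {
  total += (e.data as { amount: number }).amount;
});
const evt: DomainEvent = { id: "evt-1", type: "payment.captured", data: { amount: 50 } };
const first = consume(evt);  // applied
const second = consume(evt); // redelivery, ignored
```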

---

## Usage

The planner reads this file when domain-specific sections are needed:

1. Loading engine returns `matchedDomains` array
2. For each matched domain, include the corresponding enhancer section
3. Domain sections are added AFTER the base plan schema sections
4. Multiple domains can be active simultaneously (e.g., frontend + backend for a full-stack feature)
5. Each domain section contributes to the plan quality score (+2 bonus per matched domain section present, -2 penalty per missing)
6. Domain enhancers leverage the specialized knowledge of their corresponding elevated agents (e.g., reliability domain draws from reliability-engineer's SRE Golden Signals framework)
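
The scoring rule in step 5 can be sketched as below. Only the `matchedDomains` name and the +2/-2 values come from the steps above; the `domainBonus` function and the `sectionsPresent` parameter are hypothetical, and how the kit actually detects a section in a plan is not shown here.

```typescript
// Step 5: +2 for each matched domain whose section is present in the plan,
// -2 for each matched domain whose section is missing.
function domainBonus(matchedDomains: string[], sectionsPresent: string[]): number {
  const present = new Set(sectionsPresent);
  let score = 0;
  for (const domain of matchedDomains) {
    score += present.has(domain) ? 2 : -2;
  }
  return score;
}

// Three domains matched, the security section is missing: 2 + 2 - 2 = 2.
const bonus = domainBonus(["frontend", "backend", "security"], ["frontend", "backend"]);
```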
@@ -0,0 +1,116 @@
# Plan Retrospective

> Post-implementation review protocol for measuring plan accuracy
> and feeding learnings back into future plan generation.

---

## Overview

After a planned task reaches the VERIFY phase (all implementation complete, tests running), this retrospective compares the original plan against actual implementation to identify accuracy gaps and improve future planning.

---

## When to Run

- **Primary Trigger**: The `plan-complete` hook in `.agent/hooks/hooks.json` fires when workflow state transitions to VERIFY phase
- **Manual Trigger**: User runs `/retrospective` on a completed plan
- **Data Flow**: The hook reads the original plan file (`docs/PLAN-{slug}.md`), compares against `git diff --name-only` from the plan's creation timestamp, then appends results to `.agent/contexts/plan-quality-log.md`
- **Frequency**: After every planned task completes implementation
- **Blocking**: No — this is a learning activity, not a quality gate (severity: medium, onFailure: log)
- **Planner Integration**: The planner reads `plan-quality-log.md` during Requirements Analysis (Step 1) to adjust estimates and predictions for future plans

---

## Retrospective Dimensions

### 1. File Prediction Accuracy

Compare files listed in the plan vs. files actually modified:

| Metric | Measurement |
|--------|-------------|
| **Files Predicted** | Count of unique file paths in the plan |
| **Files Actually Modified** | Count from `git diff --name-only` against plan start |
| **Prediction Accuracy** | Predicted / Actual (percentage) |
| **Surprise Files** | Files modified that were NOT in the plan |
| **Unused Predictions** | Files in the plan that were NOT modified |
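
The metrics in this table can be derived from two file lists. The sketch below is one possible reading, assuming the plan's file paths have already been extracted and the modified list is the output of `git diff --name-only`; the `filePredictionReport` name, the report shape, and the rounding are hypothetical.

```typescript
interface FilePredictionReport {
  predicted: number;           // unique file paths in the plan
  actual: number;              // unique files actually modified
  accuracyPct: number;         // Predicted / Actual, as a rounded percentage
  surpriseFiles: string[];     // modified but not in the plan
  unusedPredictions: string[]; // in the plan but not modified
}

function filePredictionReport(planFiles: string[], modifiedFiles: string[]): FilePredictionReport {
  const planned = new Set(planFiles);
  const modified = new Set(modifiedFiles);
  return {
    predicted: planned.size,
    actual: modified.size,
    accuracyPct: modified.size === 0 ? 0 : Math.round((planned.size / modified.size) * 100),
    surpriseFiles: Array.from(modified).filter((f) => !planned.has(f)),
    unusedPredictions: Array.from(planned).filter((f) => !modified.has(f)),
  };
}

const report = filePredictionReport(
  ["src/auth.ts", "src/routes.ts"],
  ["src/auth.ts", "src/routes.ts", "src/middleware.ts"],
);
```

Because the ratio is Predicted / Actual, it exceeds 100% when the plan lists more files than were touched, so it is best read alongside the surprise and unused lists.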

### 2. Task Completeness

| Metric | Measurement |
|--------|-------------|
| **Tasks Planned** | Count of implementation steps in original plan |
| **Tasks Completed** | Steps that matched actual work |
| **Surprise Tasks** | Work done that wasn't in the plan |
| **Dropped Tasks** | Planned tasks that turned out unnecessary |
| **Completeness Score** | (Completed - Surprise) / Planned |

### 3. Estimate Accuracy

| Metric | Measurement |
|--------|-------------|
| **Estimated Effort** | Total hours from plan |
| **Actual Effort** | Approximate actual time spent |
| **Drift** | Actual / Estimated (ratio; 1.0 = perfect) |
| **Drift Direction** | Over-estimated / Under-estimated / Accurate |
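
Drift and its direction follow directly from the two effort figures. In the sketch below, the `estimateDrift` name and the 10% tolerance band that counts as "Accurate" are assumptions, since the table does not define how close to 1.0 is close enough.

```typescript
type DriftDirection = "Over-estimated" | "Under-estimated" | "Accurate";

// Drift = Actual / Estimated (1.0 = perfect).
// The tolerance band for "Accurate" is an assumed value, not from the source.
function estimateDrift(
  estimatedHours: number,
  actualHours: number,
  tolerance: number = 0.1,
): { drift: number; direction: DriftDirection } {
  if (estimatedHours <= 0) throw new RangeError("estimatedHours must be positive");
  const drift = actualHours / estimatedHours;
  let direction: DriftDirection = "Accurate";
  if (drift > 1 + tolerance) direction = "Under-estimated"; // took longer than planned
  else if (drift < 1 - tolerance) direction = "Over-estimated"; // took less than planned
  return { drift, direction };
}

// Planned 10 hours, spent 14: drift 1.4, the plan under-estimated the work.
const result = estimateDrift(10, 14);
```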

### 4. Risk Prediction

| Metric | Measurement |
|--------|-------------|
| **Risks Identified** | Count of risks in plan |
| **Risks Materialized** | Planned risks that actually occurred |
| **Surprise Risks** | Unplanned risks that emerged |
| **Risk Prediction Rate** | Materialized / (Materialized + Surprise) |

### 5. Specialist Contribution Value

| Specialist | Contribution Accurate? | Key Insight That Helped |
|-----------|----------------------|------------------------|
| Architect | Yes/No/Partial | [what was most useful] |
| Security-Reviewer | Yes/No/Partial | [what was most useful] |
| TDD-Guide | Yes/No/Partial | [what was most useful] |

---

## Output Format

Append one row to `.agent/contexts/plan-quality-log.md`:

```markdown
| [date] | [plan name] | [quality score] | [files predicted] | [files actual] | [surprise count] | [estimate drift] | [key learning] |
```

### Key Learning Format

Capture the single most important learning in one sentence:

**Good examples**:
- "Auth tasks consistently require middleware changes not predicted in plans"
- "Database migration effort was 2x underestimated due to index rebuilding"
- "Frontend plans should always include accessibility testing as a task"

**Bad examples**:
- "The plan was good" (not actionable)
- "Everything went as expected" (no learning value)

---

## Adaptive Feedback

The planner agent reads `plan-quality-log.md` at the start of each planning session to:

1. **Adjust estimates**: If historical drift is consistently 1.5x, multiply estimates by 1.5
2. **Predict surprise files**: If auth tasks consistently miss middleware, proactively include middleware files
3. **Weight risks**: If certain risk categories historically materialize, elevate their severity
4. **Improve domain sections**: If specific domain enhancer sections are consistently unhelpful, deprioritize them
5. **Value specialists**: If security-reviewer contributions are consistently accurate, weight their input more heavily
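
The first adjustment in this list can be made concrete as follows. This is a sketch only: it assumes the drift ratios have already been parsed out of `plan-quality-log.md`, and it uses the median as the measure of "consistent" drift, which is an assumption chosen so that one outlier plan does not dominate the correction. The `adjustEstimate` name is hypothetical.

```typescript
// Scale a raw estimate by the median of historical drift ratios.
// Using the median (rather than the mean) is an assumption, not from the source.
function adjustEstimate(rawHours: number, historicalDrift: number[]): number {
  if (historicalDrift.length === 0) return rawHours; // no history yet
  const sorted = historicalDrift.slice().sort((a, b) => a - b);
  // For an odd count both indices coincide; for an even count they straddle the middle.
  const lower = sorted[Math.floor((sorted.length - 1) / 2)] ?? 1;
  const upper = sorted[Math.ceil((sorted.length - 1) / 2)] ?? 1;
  return rawHours * ((lower + upper) / 2);
}

// History of 1.4x, 1.5x and 1.6x drift: an 8-hour estimate becomes 12 hours.
const adjusted = adjustEstimate(8, [1.4, 1.5, 1.6]);
```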

---

## Example Retrospective Entry

```
| 2026-03-16 | PLAN-user-auth | 72/80 | 8 | 11 | 3 (middleware, session config, error handler) | 1.4x | Auth plans should include middleware and session store files by default |
```