tribunal-kit 1.0.0 → 2.4.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (127)
  1. package/.agent/.shared/ui-ux-pro-max/README.md +3 -3
  2. package/.agent/ARCHITECTURE.md +205 -10
  3. package/.agent/GEMINI.md +37 -7
  4. package/.agent/agents/accessibility-reviewer.md +134 -0
  5. package/.agent/agents/ai-code-reviewer.md +129 -0
  6. package/.agent/agents/frontend-specialist.md +3 -0
  7. package/.agent/agents/game-developer.md +21 -21
  8. package/.agent/agents/logic-reviewer.md +12 -0
  9. package/.agent/agents/mobile-reviewer.md +79 -0
  10. package/.agent/agents/orchestrator.md +56 -26
  11. package/.agent/agents/performance-reviewer.md +36 -0
  12. package/.agent/agents/supervisor-agent.md +156 -0
  13. package/.agent/agents/swarm-worker-contracts.md +166 -0
  14. package/.agent/agents/swarm-worker-registry.md +92 -0
  15. package/.agent/rules/GEMINI.md +134 -5
  16. package/.agent/scripts/bundle_analyzer.py +259 -0
  17. package/.agent/scripts/dependency_analyzer.py +247 -0
  18. package/.agent/scripts/lint_runner.py +188 -0
  19. package/.agent/scripts/patch_skills_meta.py +177 -0
  20. package/.agent/scripts/patch_skills_output.py +285 -0
  21. package/.agent/scripts/schema_validator.py +279 -0
  22. package/.agent/scripts/security_scan.py +224 -0
  23. package/.agent/scripts/session_manager.py +144 -3
  24. package/.agent/scripts/skill_integrator.py +234 -0
  25. package/.agent/scripts/strengthen_skills.py +220 -0
  26. package/.agent/scripts/swarm_dispatcher.py +317 -0
  27. package/.agent/scripts/test_runner.py +192 -0
  28. package/.agent/scripts/test_swarm_dispatcher.py +163 -0
  29. package/.agent/skills/agent-organizer/SKILL.md +132 -0
  30. package/.agent/skills/agentic-patterns/SKILL.md +335 -0
  31. package/.agent/skills/api-patterns/SKILL.md +226 -50
  32. package/.agent/skills/app-builder/SKILL.md +215 -52
  33. package/.agent/skills/architecture/SKILL.md +176 -31
  34. package/.agent/skills/bash-linux/SKILL.md +150 -134
  35. package/.agent/skills/behavioral-modes/SKILL.md +152 -160
  36. package/.agent/skills/brainstorming/SKILL.md +148 -101
  37. package/.agent/skills/brainstorming/dynamic-questioning.md +10 -0
  38. package/.agent/skills/clean-code/SKILL.md +139 -134
  39. package/.agent/skills/code-review-checklist/SKILL.md +177 -80
  40. package/.agent/skills/config-validator/SKILL.md +165 -0
  41. package/.agent/skills/csharp-developer/SKILL.md +107 -0
  42. package/.agent/skills/database-design/SKILL.md +252 -29
  43. package/.agent/skills/deployment-procedures/SKILL.md +122 -175
  44. package/.agent/skills/devops-engineer/SKILL.md +134 -0
  45. package/.agent/skills/devops-incident-responder/SKILL.md +98 -0
  46. package/.agent/skills/documentation-templates/SKILL.md +175 -121
  47. package/.agent/skills/dotnet-core-expert/SKILL.md +103 -0
  48. package/.agent/skills/edge-computing/SKILL.md +213 -0
  49. package/.agent/skills/frontend-design/SKILL.md +76 -0
  50. package/.agent/skills/frontend-design/color-system.md +18 -0
  51. package/.agent/skills/frontend-design/typography-system.md +18 -0
  52. package/.agent/skills/game-development/SKILL.md +69 -0
  53. package/.agent/skills/geo-fundamentals/SKILL.md +158 -99
  54. package/.agent/skills/github-operations/SKILL.md +354 -0
  55. package/.agent/skills/i18n-localization/SKILL.md +158 -96
  56. package/.agent/skills/intelligent-routing/SKILL.md +89 -285
  57. package/.agent/skills/intelligent-routing/router-manifest.md +65 -0
  58. package/.agent/skills/lint-and-validate/SKILL.md +229 -27
  59. package/.agent/skills/llm-engineering/SKILL.md +258 -0
  60. package/.agent/skills/local-first/SKILL.md +203 -0
  61. package/.agent/skills/mcp-builder/SKILL.md +159 -111
  62. package/.agent/skills/mobile-design/SKILL.md +102 -282
  63. package/.agent/skills/nextjs-react-expert/SKILL.md +143 -227
  64. package/.agent/skills/nodejs-best-practices/SKILL.md +201 -254
  65. package/.agent/skills/observability/SKILL.md +285 -0
  66. package/.agent/skills/parallel-agents/SKILL.md +124 -118
  67. package/.agent/skills/performance-profiling/SKILL.md +143 -89
  68. package/.agent/skills/plan-writing/SKILL.md +133 -97
  69. package/.agent/skills/platform-engineer/SKILL.md +135 -0
  70. package/.agent/skills/powershell-windows/SKILL.md +167 -104
  71. package/.agent/skills/python-patterns/SKILL.md +149 -361
  72. package/.agent/skills/python-pro/SKILL.md +114 -0
  73. package/.agent/skills/react-specialist/SKILL.md +107 -0
  74. package/.agent/skills/readme-builder/SKILL.md +270 -0
  75. package/.agent/skills/realtime-patterns/SKILL.md +296 -0
  76. package/.agent/skills/red-team-tactics/SKILL.md +136 -134
  77. package/.agent/skills/rust-pro/SKILL.md +237 -173
  78. package/.agent/skills/seo-fundamentals/SKILL.md +134 -82
  79. package/.agent/skills/server-management/SKILL.md +155 -104
  80. package/.agent/skills/sql-pro/SKILL.md +104 -0
  81. package/.agent/skills/systematic-debugging/SKILL.md +156 -79
  82. package/.agent/skills/tailwind-patterns/SKILL.md +163 -205
  83. package/.agent/skills/tdd-workflow/SKILL.md +148 -88
  84. package/.agent/skills/test-result-analyzer/SKILL.md +299 -0
  85. package/.agent/skills/testing-patterns/SKILL.md +141 -114
  86. package/.agent/skills/trend-researcher/SKILL.md +228 -0
  87. package/.agent/skills/ui-ux-pro-max/SKILL.md +107 -0
  88. package/.agent/skills/ui-ux-researcher/SKILL.md +234 -0
  89. package/.agent/skills/vue-expert/SKILL.md +118 -0
  90. package/.agent/skills/vulnerability-scanner/SKILL.md +228 -188
  91. package/.agent/skills/web-design-guidelines/SKILL.md +148 -33
  92. package/.agent/skills/webapp-testing/SKILL.md +171 -122
  93. package/.agent/skills/whimsy-injector/SKILL.md +349 -0
  94. package/.agent/skills/workflow-optimizer/SKILL.md +219 -0
  95. package/.agent/workflows/api-tester.md +279 -0
  96. package/.agent/workflows/audit.md +168 -0
  97. package/.agent/workflows/brainstorm.md +65 -19
  98. package/.agent/workflows/changelog.md +144 -0
  99. package/.agent/workflows/create.md +67 -14
  100. package/.agent/workflows/debug.md +122 -30
  101. package/.agent/workflows/deploy.md +82 -31
  102. package/.agent/workflows/enhance.md +59 -27
  103. package/.agent/workflows/fix.md +143 -0
  104. package/.agent/workflows/generate.md +84 -20
  105. package/.agent/workflows/migrate.md +163 -0
  106. package/.agent/workflows/orchestrate.md +66 -17
  107. package/.agent/workflows/performance-benchmarker.md +305 -0
  108. package/.agent/workflows/plan.md +76 -33
  109. package/.agent/workflows/preview.md +73 -17
  110. package/.agent/workflows/refactor.md +153 -0
  111. package/.agent/workflows/review-ai.md +140 -0
  112. package/.agent/workflows/review.md +83 -16
  113. package/.agent/workflows/session.md +154 -0
  114. package/.agent/workflows/status.md +74 -18
  115. package/.agent/workflows/strengthen-skills.md +99 -0
  116. package/.agent/workflows/swarm.md +194 -0
  117. package/.agent/workflows/test.md +80 -31
  118. package/.agent/workflows/tribunal-backend.md +55 -13
  119. package/.agent/workflows/tribunal-database.md +62 -18
  120. package/.agent/workflows/tribunal-frontend.md +58 -12
  121. package/.agent/workflows/tribunal-full.md +70 -11
  122. package/.agent/workflows/tribunal-mobile.md +123 -0
  123. package/.agent/workflows/tribunal-performance.md +152 -0
  124. package/.agent/workflows/ui-ux-pro-max.md +100 -82
  125. package/README.md +117 -62
  126. package/bin/tribunal-kit.js +542 -288
  127. package/package.json +10 -6
package/.agent/workflows/orchestrate.md
@@ -8,7 +8,18 @@ $ARGUMENTS
 
  ---
 
- This command coordinates multiple specialists to solve a problem that requires more than one domain. One agent is not orchestration.
+ This command coordinates multiple specialists to solve a problem that requires more than one domain. **One agent is not orchestration.**
+
+ ---
+
+ ## When to Use /orchestrate vs Other Commands
+
+ | Use `/orchestrate` when... | Use something else when... |
+ |---|---|
+ | Task requires 3+ domain specialists | Single domain → use the right `/tribunal-*` |
+ | Sequential work with review gates between waves | Parallel, independent tasks → `/swarm` |
+ | Existing codebase with complex dependencies | Greenfield project → `/create` |
+ | Human gates required between every wave | Maximum parallel output → `/swarm` |
 
  ---
 
@@ -30,6 +41,7 @@ This command coordinates multiple specialists to solve a problem that requires m
  | Complete product | `project-planner` + `frontend-specialist` + `backend-specialist` + `devops-engineer` |
  | Security investigation | `security-auditor` + `penetration-tester` + `devops-engineer` |
  | Complex bug | `debugger` + `explorer-agent` + `test-engineer` |
+ | New codebase or unknown repo | `explorer-agent` + relevant specialists |
 
  ---
 
@@ -41,7 +53,7 @@ Only two agents are allowed during planning:
 
  ```
  project-planner → writes docs/PLAN-{slug}.md
- explorer-agent → (if working in existing code) maps the codebase
+ explorer-agent → (if working in existing code) maps the codebase structure
  ```
 
  No other agent runs. No code is produced.
@@ -56,40 +68,76 @@ Approve to start implementation? (Y / N)
 
  **Phase B does NOT start without a Y.**
 
- ### Phase B — Implementation (Parallel)
+ ### Phase B — Implementation (Manager & Micro-Workers)
 
- After approval, specialists activate:
+ After approval, the Orchestrator acts as Manager and dispatches Micro-Workers using **isolated JSON payloads**.
 
  ```
- Foundation tier: database-architect + security-auditor (run first)
- Core tier: backend-specialist + frontend-specialist (after foundation)
- Quality tier: test-engineer + qa-automation-engineer (after core)
+ Wave 1: database-architect + security-auditor (JSON dispatch #1)
+
+ [Wait for completion & Tribunal review]
+
+ Wave 2: backend-specialist + frontend-specialist (JSON dispatch #2)
+
+ [Wait for completion & Tribunal review]
+
+ Wave 3: test-engineer (JSON dispatch #3)
+
+ [Wait for completion & Human Gate]
  ```
 
- Each tier's output goes through its Tribunal gate before the next tier begins.
+ Workers execute in parallel **within** their wave, but waves execute **sequentially**. Each wave waits for the previous wave's Tribunal gate before proceeding.
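
The wave discipline described above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the package: `run_worker` and `review_gate` are hypothetical stand-ins for worker dispatch and the Tribunal review step.

```python
from concurrent.futures import ThreadPoolExecutor

def run_waves(waves, run_worker, review_gate):
    """Run each wave's workers in parallel; waves themselves run sequentially.

    run_worker(task) dispatches one worker; review_gate(results) models the
    Tribunal gate that must pass before the next wave starts.
    """
    all_results = []
    for wave in waves:
        with ThreadPoolExecutor(max_workers=len(wave)) as pool:
            wave_results = list(pool.map(run_worker, wave))
        if not review_gate(wave_results):
            # a failed gate halts the whole build, matching the workflow above
            raise RuntimeError("Tribunal gate failed; halting before next wave")
        all_results.append(wave_results)
    return all_results
```

A wave only begins after the previous wave's gate returns success, so parallelism never crosses a review boundary.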
 
  ---
 
- ## Cross-Agent Context Handoff
+ ## Hierarchical Context Pruning
+
+ When dispatching workers, the Orchestrator MUST use the `dispatch_micro_workers` JSON format.
+
+ **Context discipline is strictly enforced:**
+
+ ```
+ ❌ Never pass full chat histories to workers
+ ❌ Never attach every file — attach only files the worker will actually read
+ ✅ The context_summary injected by the Orchestrator is the ONLY shared context
+ ✅ Files attached are strictly limited to what's needed for that specific task
+ ```
+
+ **Per-worker context limit:** Excerpt only the relevant function or schema section — never the entire file.
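
A minimal sketch of building such a payload, assuming an illustrative shape for the `dispatch_micro_workers` format (the field names `worker`, `task`, `context_summary`, `files` and the file cap are assumptions for illustration; the real schema is defined by the Orchestrator):

```python
def build_dispatch(worker, task, context_summary, files):
    """Assemble one pruned micro-worker dispatch entry (illustrative schema)."""
    if len(files) > 3:
        # hypothetical cap enforcing "only files the worker will actually read"
        raise ValueError("attach only the files this worker will actually read")
    return {
        "dispatch_micro_workers": [{
            "worker": worker,
            "task": task,
            "context_summary": context_summary,  # the ONLY shared context
            "files": files,                      # excerpts, never whole repos
        }]
    }
```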
 
- When one agent's output feeds the next:
+ ---
+
+ ## Retry Protocol
 
  ```
- The context passed to Agent B must include:
- "Agent A produced: [result]
- Build on this. Do not re-derive it."
+ Attempt 1 → Worker runs with original parameters
+ Attempt 2 → Worker runs with stricter constraints + failure feedback
+ Attempt 3 → Worker runs with max constraints + full context dump
+ Attempt 4 → HALT. Report to human with full failure history.
  ```
 
- Never let Agent B re-invent what Agent A already established.
+ Hard limit: **3 retries per worker**. After 3 failures, escalate — do not silently proceed.
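
The retry ladder above can be sketched as a small loop (a sketch only; `worker(task, strictness, feedback)` is a hypothetical dispatch callable returning a dict with `ok` and `failure` keys):

```python
def run_with_retries(worker, task, max_retries=3):
    """Escalating retry loop; halts and reports after max_retries failures."""
    feedback = None
    for attempt in range(1, max_retries + 1):
        # strictness grows with each attempt, mirroring Attempts 1-3 above
        result = worker(task, strictness=attempt, feedback=feedback)
        if result.get("ok"):
            return result
        feedback = result.get("failure")
    # Attempt 4 equivalent: never silently proceed
    raise RuntimeError(f"{max_retries} failures on {task!r}; escalating to human")
```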
 
  ---
 
  ## Hallucination Guard
 
  - Every agent's output goes through Tribunal before it reaches the user
- - The Human Gate fires before any file is written — the user sees the diff and approves
- - Retry limit: 3 Maker revisions per agent. After 3 failures, stop and report to the user.
- - Per-agent scope is enforced `frontend-specialist` never writes DB migrations
+ - The Human Gate fires before any file is written — user sees the diff and approves
+ - Per-agent scope is enforced: `frontend-specialist` **never** writes DB migrations
+ - Retry limit: 3 Maker revisions per agent; after 3 failures → stop and report
+ - `context_summary` is the only mechanism for sharing context across agents — no full dumps
+
+ ---
+
+ ## Cross-Workflow Navigation
+
+ | When /orchestrate reveals... | Go to |
+ |---|---|
+ | Worker keeps failing after 3 retries | `/debug` the isolated worker task |
+ | Plan needed before orchestrating | `/plan` first, then run `/orchestrate` against it |
+ | Fully parallel independent sub-tasks | `/swarm` is more efficient |
+ | Single domain needs specialist audit | Use the domain-specific `/tribunal-*` |
 
  ---
 
@@ -99,4 +147,5 @@ Never let Agent B re-invent what Agent A already established.
  /orchestrate build a complete auth system with JWT and refresh tokens
  /orchestrate review the entire API layer for security issues
  /orchestrate build a multi-tenant SaaS onboarding flow
+ /orchestrate analyze this repo and implement all security findings
  ```
package/.agent/workflows/performance-benchmarker.md (new file)
@@ -0,0 +1,305 @@
+ ---
+ description: Run standardized performance benchmarks including Lighthouse, bundle analysis, and latency checks.
+ ---
+
+ # /performance-benchmarker — Automated Performance Audit
+
+ $ARGUMENTS
+
+ ---
+
+ This command runs a comprehensive suite of performance benchmarks against your project and generates a structured report with numerical scores, regression detection, and prioritized actionable fixes.
+
+ ---
+
+ ## When to Use
+
+ - Before any `/deploy` to catch performance regressions.
+ - After adding new dependencies or large features.
+ - When user reports "it feels slow" or asks to "check performance".
+ - When triggered by `benchmark`, `lighthouse`, `bundle size`, or `latency` keywords.
+
+ ---
+
+ ## Pipeline Flow
+
+ ```
+ Request (scope: full / web-vitals / bundle / api)
+ ↓
+ Environment detection — framework, build tool, package manager
+ ↓
+ Tool availability check — lighthouse? build script? dev server?
+ ↓
+ Benchmark execution — run selected checks
+ ↓
+ Score calculation — weighted composite
+ ↓
+ Regression detection — compare against previous baselines (if available)
+ ↓
+ Report — scores, pass/fail, recommendations, fix priority
+ ```
+
+ ---
+
+ ## Benchmark Suite
+
+ ### 1. Web Vitals (Frontend Performance)
+
+ | Metric | Good | Needs Work | Poor | Measurement |
+ |---|---|---|---|---|
+ | LCP (Largest Contentful Paint) | < 2.5s | 2.5-4.0s | > 4.0s | Lighthouse or `web-vitals` library |
+ | INP (Interaction to Next Paint) | < 200ms | 200-500ms | > 500ms | Lab approximation via TBT |
+ | CLS (Cumulative Layout Shift) | < 0.1 | 0.1-0.25 | > 0.25 | Layout shift detection |
+ | TTFB (Time to First Byte) | < 800ms | 800-1800ms | > 1800ms | Server response timing |
+ | FCP (First Contentful Paint) | < 1.8s | 1.8-3.0s | > 3.0s | Lighthouse |
+ | Speed Index | < 3.4s | 3.4-5.8s | > 5.8s | Lighthouse |
+
+ **How to Run:**
+ ```bash
+ # If lighthouse is available
+ npx lighthouse http://localhost:3000 --output json --chrome-flags="--headless"
+
+ # If web-vitals is installed, inject into page and measure
+ # VERIFY: check if lighthouse-cli is available before running
+ ```
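
If the Lighthouse JSON report exists, the headline metrics can be pulled out with a short script. A sketch only: the audit IDs (`largest-contentful-paint`, etc.) follow Lighthouse's report schema, so verify them against your installed Lighthouse version.

```python
import json

def extract_vitals(report_path):
    """Pull core metric values (in ms, CLS unitless) from a Lighthouse JSON report."""
    with open(report_path) as f:
        report = json.load(f)
    audits = report["audits"]
    return {
        "lcp_ms": audits["largest-contentful-paint"]["numericValue"],
        "cls": audits["cumulative-layout-shift"]["numericValue"],
        "fcp_ms": audits["first-contentful-paint"]["numericValue"],
    }
```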
+
+ **Common Fixes by Metric:**
+
+ | Metric | Fix | Impact |
+ |---|---|---|
+ | LCP slow | Preload hero image, use `fetchpriority="high"` | High |
+ | LCP slow | Eliminate render-blocking CSS/JS | High |
+ | INP slow | Break long tasks > 50ms into smaller chunks | High |
+ | INP slow | Use `requestIdleCallback` for non-critical work | Medium |
+ | CLS high | Set explicit `width`/`height` on images/embeds | High |
+ | CLS high | Use `font-display: swap` + font preload | Medium |
+ | TTFB slow | Add caching headers, use CDN | High |
+ | TTFB slow | Optimize database queries, add indexes | High |
+ | FCP slow | Inline critical CSS, defer non-critical | High |
+
+ ### 2. Bundle Analysis (JavaScript/CSS)
+
+ | Check | Target | Warning | Fail | Tool |
+ |---|---|---|---|---|
+ | Total JS (gzipped) | < 100KB | 100-200KB | > 200KB | Build output |
+ | Largest chunk (gzipped) | < 50KB | 50-100KB | > 100KB | Build output |
+ | CSS total | < 50KB | 50-100KB | > 100KB | Build output |
+ | Unused CSS | < 5% | 5-15% | > 15% | PurgeCSS |
+ | Duplicate packages | 0 | 1-2 | > 2 | Bundle analyzer |
+ | Tree-shaking | No side-effect barrel exports | — | Side-effect imports found | Manual analysis |
+
+ **How to Run:**
+ ```bash
+ # Build and analyze
+ npm run build -- --stats
+ # VERIFY: check if the build script supports --stats flag
+
+ # Alternative: analyze existing build output
+ npx source-map-explorer dist/**/*.js
+ # VERIFY: check if source-map-explorer is available
+ ```
+
+ **Common Fixes:**
+
+ | Issue | Fix | Savings |
+ |---|---|---|
+ | Large lodash import | `import debounce from 'lodash/debounce'` not `import { debounce } from 'lodash'` | 50-80KB |
+ | Moment.js | Replace with `dayjs` or `date-fns` | 60-70KB |
+ | Full icon library | Use tree-shakeable imports or individual icon files | 20-100KB |
+ | Uncompressed images | Use WebP/AVIF, add `loading="lazy"` | 50-500KB |
+ | CSS framework unused | PurgeCSS or `content` config in Tailwind | 30-90KB |
+
+ ### 3. API Latency (Backend Performance)
+
+ | Check | Target | Warning | Fail | Method |
+ |---|---|---|---|---|
+ | Avg response (simple GET) | < 100ms | 100-300ms | > 300ms | 10 sequential requests |
+ | Avg response (complex query) | < 300ms | 300-800ms | > 800ms | 10 sequential requests |
+ | P95 response | < 500ms | 500-1000ms | > 1000ms | Sort, pick 95th percentile |
+ | P99 response | < 1000ms | 1-3s | > 3s | Sort, pick 99th percentile |
+ | Cold start | < 1s | 1-3s | > 3s | First request after 30s idle |
+ | Concurrent handling | Linear scaling up to 10 req | — | Exponential degradation | 10 parallel requests |
+
+ **How to Run:**
+ ```bash
+ # Using curl timing
+ curl -o /dev/null -s -w "time_total: %{time_total}s\n" http://localhost:3000/api/health
+
+ # Loop for average
+ for i in $(seq 1 10); do
+   curl -o /dev/null -s -w "%{time_total}\n" http://localhost:3000/api/endpoint
+ done
+ ```
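
The avg/P95/P99 numbers from a timing loop like the one above reduce to the table's "sort, pick the percentile" method. A minimal helper, assuming response times collected into a list:

```python
def latency_stats(timings_ms):
    """Average, P95, and P99 from a list of response times (nearest-rank style)."""
    ordered = sorted(timings_ms)

    def pct(p):
        # index of the p-th percentile in the sorted list
        idx = min(len(ordered) - 1, int(len(ordered) * p / 100))
        return ordered[idx]

    return {
        "avg": sum(ordered) / len(ordered),
        "p95": pct(95),
        "p99": pct(99),
    }
```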
+
+ **Common Fixes:**
+
+ | Symptom | Likely Cause | Fix |
+ |---|---|---|
+ | Slow first request | Cold start, no connection pool | Pre-warm, use connection pooling |
+ | Slow list endpoints | N+1 queries | Add eager loading / `include` |
+ | Slow under load | No caching | Add Redis/in-memory cache for hot paths |
+ | Inconsistent P95 | GC pauses | Optimize memory allocation, reduce object churn |
+
+ ### 4. Build Performance (DX)
+
+ | Check | Target | Warning | Fail |
+ |---|---|---|---|
+ | Dev server cold start | < 3s | 3-8s | > 8s |
+ | Hot reload (HMR) | < 200ms | 200-500ms | > 500ms |
+ | Full production build | < 30s | 30-60s | > 60s |
+ | TypeScript type-check | < 15s | 15-30s | > 30s |
+
+ ---
+
+ ## Composite Score
+
+ ```
+ Performance Score = (
+   Web_Vitals_Score × 0.35 +
+   Bundle_Score × 0.25 +
+   API_Score × 0.25 +
+   Build_Score × 0.15
+ ) × 100
+
+ Grade:
+ 90-100 → A (Ship with confidence)
+ 75-89 → B (Minor optimizations available)
+ 60-74 → C (Notable performance issues)
+ 40-59 → D (Significant problems — fix before deploy)
+ < 40 → F (Critical — likely impacts user retention)
+ ```
+
+ Each sub-score is `checks_passed / total_checks`, with each check weighted by its result: target met (1.0), warning (0.6), fail (0.0).
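
The scoring arithmetic above can be sketched directly (a sketch; check outcomes are represented as `"pass"` / `"warn"` / `"fail"` strings for illustration):

```python
def sub_score(results):
    """Weighted fraction of checks passed for one category (0.0 - 1.0)."""
    weights = {"pass": 1.0, "warn": 0.6, "fail": 0.0}
    return sum(weights[r] for r in results) / len(results)

def composite(web_vitals, bundle, api, build):
    """Weighted composite score (0-100) and letter grade per the table above."""
    score = (web_vitals * 0.35 + bundle * 0.25 + api * 0.25 + build * 0.15) * 100
    for floor, grade in ((90, "A"), (75, "B"), (60, "C"), (40, "D")):
        if score >= floor:
            return score, grade
    return score, "F"
```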
+
+ ---
+
+ ## Output Format
+
+ ```
+ ━━━ Performance Benchmark Report ━━━━━━━━━
+
+ Project: [name]
+ Date: [timestamp]
+ Score: [0-100] / 100 → Grade [A-F]
+
+ ━━━ Web Vitals ━━━━━━━━━━━━━━━━━━━━━━━━━
+
+ LCP: 1.8s ✅ Good (target: < 2.5s)
+ INP: 95ms ✅ Good (target: < 200ms)
+ CLS: 0.05 ✅ Good (target: < 0.1)
+ TTFB: 420ms ✅ Good (target: < 800ms)
+ FCP: 1.2s ✅ Good (target: < 1.8s)
+ Score: 92/100
+
+ ━━━ Bundle ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+
+ Total JS: 156KB gzipped 🟡 Warning (target: < 100KB)
+ Largest chunk: 82KB gzipped 🟡 Warning (target: < 50KB)
+ CSS total: 28KB gzipped ✅ Good
+ Unused CSS: 4.2% ✅ Good
+ Duplicates: 0 ✅ Good
+ Score: 72/100
+
+ ━━━ API Latency ━━━━━━━━━━━━━━━━━━━━━━━━
+
+ GET /api/users: avg 89ms ✅ | p95 142ms ✅
+ POST /api/auth: avg 210ms 🟡 | p95 480ms 🟡
+ GET /api/dashboard: avg 340ms ❌ | p95 820ms ❌
+ Cold start: 680ms ✅
+ Score: 58/100
+
+ ━━━ Build ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+
+ Dev cold start: 2.1s ✅
+ HMR: 89ms ✅
+ Production build: 18s ✅
+ Type-check: 12s ✅
+ Score: 100/100
+
+ ━━━ Fix Priority (by impact) ━━━━━━━━━━━
+
+ 1. 🔴 GET /api/dashboard avg 340ms
+    → Add database index on dashboard query joins
+    → Expected: < 100ms (70% improvement)
+
+ 2. 🟡 Total JS 156KB
+    → Lazy-load chart library (80KB)
+    → Expected: < 80KB initial (50% reduction)
+
+ 3. 🟡 POST /api/auth avg 210ms
+    → Cache user lookup in auth flow
+    → Expected: < 100ms (50% improvement)
+
+ ━━━ Trend (if baseline available) ━━━━━━
+
+ LCP: 1.8s → 1.8s (no change)
+ Bundle: 140KB → 156KB ↑ (+11%) ⚠️ Regression
+ API p95: 400ms → 480ms ↑ (+20%) ⚠️ Regression
+ ```
+
+ ---
+
+ ## Regression Detection
+
+ If a previous benchmark baseline exists (stored in `perf-baseline.json` or similar):
+
+ | Change | Classification | Status |
+ |---|---|---|
+ | < 5% increase | No change | ✅ Stable |
+ | 5-15% increase | Minor regression | 🟡 Flag |
+ | > 15% increase | Significant regression | 🔴 Block deploy |
+ | Any decrease | Improvement | 🎉 Celebrate |
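
These thresholds translate directly into a small classifier (a sketch; the bucket names are illustrative labels for the table's statuses):

```python
def classify_regression(baseline, current):
    """Map a metric's change vs. baseline to the regression buckets above."""
    change = (current - baseline) / baseline * 100  # percent increase
    if change < 0:
        return "improvement"
    if change < 5:
        return "stable"
    if change <= 15:
        return "flag"            # minor regression
    return "block-deploy"        # significant regression
```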
+
+ ---
+
+ ## Baseline Management
+
+ After a successful benchmark, save a baseline to detect future regressions:
+
+ ```bash
+ # Save current benchmark as baseline
+ python .agent/scripts/bundle_analyzer.py . --save-baseline
+ ```
+
+ The baseline file is `perf-baseline.json` in the project root. Check it into version control so regressions are caught in CI.
+
+ ---
+
+ ## Cross-Workflow Navigation
+
+ | After /performance-benchmarker shows... | Go to |
+ |---|---|
+ | Grade D or F | `/tribunal-performance` on the slowest code paths |
+ | Bundle regression (+15%) | `/audit` for dependency analysis, then `/fix` |
+ | API latency P95 > 500ms | `/debug` to identify the slow query or operation |
+ | Web vitals LCP > 4s | `/enhance` to add image preloading and critical CSS |
+ | Grade A or B, ready for deploy | `/deploy` following pre-flight checklist |
+
+ ---
+
+ ## Hallucination Guard
+
+ - **Only run benchmarks with installed tools** — check with `which` or `npx --dry-run` first.
+ - **Never fabricate benchmark numbers** — report "SKIPPED: [tool] not installed" if unavailable.
+ - **Flag anomalies**: `// NOTE: unusually fast — may be cached` or `// NOTE: first run, cold start included`.
+ - **Mark tool availability**: `// VERIFY: lighthouse-cli not detected, using fallback estimation`.
+ - **Don't guess fixes** — only recommend fixes for issues that have measured evidence.
+
+ ---
+
+ ## Usage
+
+ ```
+ /performance-benchmarker full audit
+ /performance-benchmarker web vitals only
+ /performance-benchmarker bundle analysis
+ /performance-benchmarker api latency for /api/users /api/posts
+ /performance-benchmarker build performance
+ /performance-benchmarker compare with baseline
+ ```
package/.agent/workflows/plan.md
@@ -1,5 +1,5 @@
  ---
- description: Create project plan using project-planner agent. No code writing - only plan file generation.
+ description: Create project plan using project-planner agent. No code writing, only plan file generation.
  ---
 
  # /plan — Write the Plan First
@@ -8,7 +8,18 @@ $ARGUMENTS
 
  ---
 
- This command produces one thing: a structured plan file. Nothing is implemented. No code is written. The plan is the output.
+ This command produces one thing: **a structured plan file**. Nothing is implemented. No code is written. The plan is the output.
+
+ ---
+
+ ## When to Use /plan vs Other Commands
+
+ | Use `/plan` when... | Use something else when... |
+ |---|---|
+ | Requirements are unclear or large | You already know what to build → `/create` |
+ | Multi-agent work needs coordination | Single function needed → `/generate` |
+ | You want written scope agreement before coding | Ready to build immediately → `/create` |
+ | Stakeholder review is needed before work starts | Just a quick discussion → ask directly |
 
  ---
 
@@ -23,38 +34,34 @@ This command produces one thing: a structured plan file. Nothing is implemented.
 
  ### Gate: Clarify Before You Plan
 
- The `project-planner` agent asks:
+ The `project-planner` agent asks — and gets answers to — these four questions before writing a single line of the plan:
 
  ```
- What outcome needs to exist that doesn't exist today?
- What are the hard constraints? (stack, existing code, deadline)
- What's explicitly not being built in this version?
- How will we confirm it's done?
+ 1. What outcome needs to exist that doesn't exist today?
+ 2. What are the hard constraints? (stack, existing code, deadline)
+ 3. What's explicitly not being built in this version?
+ 4. How will we confirm it's done? (observable done condition)
  ```
 
- If any answer is "I don't know" — those are clarified before the plan is written, not after.
-
- ### Plan File Creation
+ If any answer is "I don't know" — those are clarified **before** the plan is written, not after.
 
- ```
- Location: docs/PLAN-{task-slug}.md
+ > ⚠️ An unclear "done condition" is the most common cause of scope creep. It must be specific and observable.
 
- Slug naming:
- Pull 2–3 key words from the request
- Lowercase + hyphens
- Max 30 characters
- "build auth with JWT" → PLAN-auth-jwt.md
- "shopping cart checkout" → PLAN-cart-checkout.md
- ```
+ ---
 
- ### After the File is Written
+ ### Plan File Creation
 
  ```
- ✅ Plan written: docs/PLAN-{slug}.md
+ Location: docs/PLAN-{task-slug}.md
 
- Review it, then:
- Run /create to begin implementation
- Or edit the file to refine scope first
+ Slug naming rules:
+ - Pull 2–3 key words from the request
+ - Lowercase + hyphens
+ - Max 30 characters
+ Examples:
+ "build auth with JWT" → PLAN-auth-jwt.md
+ "shopping cart checkout" → PLAN-cart-checkout.md
+ "multi-tenant user roles" → PLAN-user-roles.md
  ```
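
The slug rules above reduce to a few lines of Python. A sketch only: the stop-word list is an assumption for illustration, since the real keyword selection is left to the planner agent.

```python
import re

# Hypothetical stop words; the real keyword choice is the planner's judgment.
STOP_WORDS = {"a", "an", "the", "build", "with", "for", "page", "system"}

def plan_slug(request, max_len=30, max_words=3):
    """Lowercase, hyphenated, max 30 chars, from 2-3 key words of the request."""
    words = re.findall(r"[a-z0-9]+", request.lower())
    keywords = [w for w in words if w not in STOP_WORDS][:max_words]
    return "-".join(keywords)[:max_len]

def plan_filename(request):
    return f"docs/PLAN-{plan_slug(request)}.md"
```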
 
  ---
@@ -65,37 +72,71 @@ Review it, then:
  # Plan: [Feature Name]
 
  ## What Done Looks Like
- [Observable outcome — one sentence]
+ [Observable outcome — one specific, testable sentence]
 
  ## Won't Include in This Version
- - [Explicit exclusion]
+ - [Explicit exclusion 1]
+ - [Explicit exclusion 2]
 
  ## Unresolved Questions
- - [Thing that needs external confirmation: VERIFY]
+ - [Item needing external confirmation: VERIFY]
 
  ## Estimates (Ranges + Confidence)
- All time estimates include: optimistic / realistic / pessimistic + confidence level
+ | Task | Optimistic | Realistic | Pessimistic | Confidence |
+ |------|-----------|-----------|-------------|------------|
+ | DB schema | 30min | 1h | 2h | High |
+ | API layer | 2h | 4h | 8h | Medium |
+ | Frontend | 3h | 6h | 12h | Low |
 
  ## Task Table
  | # | Task | Agent | Depends on | Done when |
- |---|------|-------|-----------|-----------|
- | 1 | ... | database-architect | none | migration runs |
- | 2 | ... | backend-specialist | #1 | returns 201 |
+ |---|------|-------|-----------|-----------|
+ | 1 | DB schema | database-architect | none | migration runs |
+ | 2 | API routes | backend-specialist | #1 | returns 201 |
+ | 3 | Frontend component | frontend-specialist | #2 | renders without errors |
+ | 4 | Tests | test-engineer | #2 | all specs pass |
 
  ## Review Gates
  | Task | Tribunal |
  |---|---|
  | #1 schema | /tribunal-database |
  | #2 API | /tribunal-backend |
+ | #3 UI | /tribunal-frontend |
+ | #4 tests | test-coverage-reviewer |
+ ```
+
+ ---
+
+ ### After the File is Written
+
+ ```
+ ✅ Plan written: docs/PLAN-{slug}.md
+
+ Review it, then:
+ /create → Begin full implementation (uses this plan)
+ /generate → Implement a single task from the table
+ /orchestrate → Coordinate all agents across the full plan
  ```
 
  ---
 
  ## Hallucination Guard
 
- - Every tool/library mentioned in the plan must be real and verified
- - All time estimates are ranges with a confidence label — never single-point guarantees
+ - Every tool, library, or API mentioned in the plan must be **real and verified** before being listed
+ - Time estimates are **ranges with confidence labels** — never single-point guarantees
  - External dependencies that aren't confirmed get a `[VERIFY: check this exists]` tag
+ - The done condition is **observable and specific** — "it works" is not a done condition
+
+ ---
+
+ ## Cross-Workflow Navigation
+
+ | After /plan produces the file... | Go to |
+ |---|---|
+ | Ready to build the full plan | `/create` reads the plan and starts building |
+ | Need a single task implemented | `/generate [task description]` |
+ | Multi-agent coordination needed | `/orchestrate` to run the plan as a managed build |
+ | Need to review existing code first | `explorer-agent` before committing to the plan |
 
  ---
 
@@ -105,4 +146,6 @@ All time estimates include: optimistic / realistic / pessimistic + confidence le
  /plan REST API with user auth
  /plan dark mode toggle for the settings page
  /plan multi-tenant account switching
+ /plan event-driven notification system with queues
+ /plan admin dashboard with user management and analytics
  ```