tribunal-kit 1.0.0 → 2.4.2
- package/.agent/.shared/ui-ux-pro-max/README.md +3 -3
- package/.agent/ARCHITECTURE.md +205 -10
- package/.agent/GEMINI.md +37 -7
- package/.agent/agents/accessibility-reviewer.md +134 -0
- package/.agent/agents/ai-code-reviewer.md +129 -0
- package/.agent/agents/frontend-specialist.md +3 -0
- package/.agent/agents/game-developer.md +21 -21
- package/.agent/agents/logic-reviewer.md +12 -0
- package/.agent/agents/mobile-reviewer.md +79 -0
- package/.agent/agents/orchestrator.md +56 -26
- package/.agent/agents/performance-reviewer.md +36 -0
- package/.agent/agents/supervisor-agent.md +156 -0
- package/.agent/agents/swarm-worker-contracts.md +166 -0
- package/.agent/agents/swarm-worker-registry.md +92 -0
- package/.agent/rules/GEMINI.md +134 -5
- package/.agent/scripts/bundle_analyzer.py +259 -0
- package/.agent/scripts/dependency_analyzer.py +247 -0
- package/.agent/scripts/lint_runner.py +188 -0
- package/.agent/scripts/patch_skills_meta.py +177 -0
- package/.agent/scripts/patch_skills_output.py +285 -0
- package/.agent/scripts/schema_validator.py +279 -0
- package/.agent/scripts/security_scan.py +224 -0
- package/.agent/scripts/session_manager.py +144 -3
- package/.agent/scripts/skill_integrator.py +234 -0
- package/.agent/scripts/strengthen_skills.py +220 -0
- package/.agent/scripts/swarm_dispatcher.py +317 -0
- package/.agent/scripts/test_runner.py +192 -0
- package/.agent/scripts/test_swarm_dispatcher.py +163 -0
- package/.agent/skills/agent-organizer/SKILL.md +132 -0
- package/.agent/skills/agentic-patterns/SKILL.md +335 -0
- package/.agent/skills/api-patterns/SKILL.md +226 -50
- package/.agent/skills/app-builder/SKILL.md +215 -52
- package/.agent/skills/architecture/SKILL.md +176 -31
- package/.agent/skills/bash-linux/SKILL.md +150 -134
- package/.agent/skills/behavioral-modes/SKILL.md +152 -160
- package/.agent/skills/brainstorming/SKILL.md +148 -101
- package/.agent/skills/brainstorming/dynamic-questioning.md +10 -0
- package/.agent/skills/clean-code/SKILL.md +139 -134
- package/.agent/skills/code-review-checklist/SKILL.md +177 -80
- package/.agent/skills/config-validator/SKILL.md +165 -0
- package/.agent/skills/csharp-developer/SKILL.md +107 -0
- package/.agent/skills/database-design/SKILL.md +252 -29
- package/.agent/skills/deployment-procedures/SKILL.md +122 -175
- package/.agent/skills/devops-engineer/SKILL.md +134 -0
- package/.agent/skills/devops-incident-responder/SKILL.md +98 -0
- package/.agent/skills/documentation-templates/SKILL.md +175 -121
- package/.agent/skills/dotnet-core-expert/SKILL.md +103 -0
- package/.agent/skills/edge-computing/SKILL.md +213 -0
- package/.agent/skills/frontend-design/SKILL.md +76 -0
- package/.agent/skills/frontend-design/color-system.md +18 -0
- package/.agent/skills/frontend-design/typography-system.md +18 -0
- package/.agent/skills/game-development/SKILL.md +69 -0
- package/.agent/skills/geo-fundamentals/SKILL.md +158 -99
- package/.agent/skills/github-operations/SKILL.md +354 -0
- package/.agent/skills/i18n-localization/SKILL.md +158 -96
- package/.agent/skills/intelligent-routing/SKILL.md +89 -285
- package/.agent/skills/intelligent-routing/router-manifest.md +65 -0
- package/.agent/skills/lint-and-validate/SKILL.md +229 -27
- package/.agent/skills/llm-engineering/SKILL.md +258 -0
- package/.agent/skills/local-first/SKILL.md +203 -0
- package/.agent/skills/mcp-builder/SKILL.md +159 -111
- package/.agent/skills/mobile-design/SKILL.md +102 -282
- package/.agent/skills/nextjs-react-expert/SKILL.md +143 -227
- package/.agent/skills/nodejs-best-practices/SKILL.md +201 -254
- package/.agent/skills/observability/SKILL.md +285 -0
- package/.agent/skills/parallel-agents/SKILL.md +124 -118
- package/.agent/skills/performance-profiling/SKILL.md +143 -89
- package/.agent/skills/plan-writing/SKILL.md +133 -97
- package/.agent/skills/platform-engineer/SKILL.md +135 -0
- package/.agent/skills/powershell-windows/SKILL.md +167 -104
- package/.agent/skills/python-patterns/SKILL.md +149 -361
- package/.agent/skills/python-pro/SKILL.md +114 -0
- package/.agent/skills/react-specialist/SKILL.md +107 -0
- package/.agent/skills/readme-builder/SKILL.md +270 -0
- package/.agent/skills/realtime-patterns/SKILL.md +296 -0
- package/.agent/skills/red-team-tactics/SKILL.md +136 -134
- package/.agent/skills/rust-pro/SKILL.md +237 -173
- package/.agent/skills/seo-fundamentals/SKILL.md +134 -82
- package/.agent/skills/server-management/SKILL.md +155 -104
- package/.agent/skills/sql-pro/SKILL.md +104 -0
- package/.agent/skills/systematic-debugging/SKILL.md +156 -79
- package/.agent/skills/tailwind-patterns/SKILL.md +163 -205
- package/.agent/skills/tdd-workflow/SKILL.md +148 -88
- package/.agent/skills/test-result-analyzer/SKILL.md +299 -0
- package/.agent/skills/testing-patterns/SKILL.md +141 -114
- package/.agent/skills/trend-researcher/SKILL.md +228 -0
- package/.agent/skills/ui-ux-pro-max/SKILL.md +107 -0
- package/.agent/skills/ui-ux-researcher/SKILL.md +234 -0
- package/.agent/skills/vue-expert/SKILL.md +118 -0
- package/.agent/skills/vulnerability-scanner/SKILL.md +228 -188
- package/.agent/skills/web-design-guidelines/SKILL.md +148 -33
- package/.agent/skills/webapp-testing/SKILL.md +171 -122
- package/.agent/skills/whimsy-injector/SKILL.md +349 -0
- package/.agent/skills/workflow-optimizer/SKILL.md +219 -0
- package/.agent/workflows/api-tester.md +279 -0
- package/.agent/workflows/audit.md +168 -0
- package/.agent/workflows/brainstorm.md +65 -19
- package/.agent/workflows/changelog.md +144 -0
- package/.agent/workflows/create.md +67 -14
- package/.agent/workflows/debug.md +122 -30
- package/.agent/workflows/deploy.md +82 -31
- package/.agent/workflows/enhance.md +59 -27
- package/.agent/workflows/fix.md +143 -0
- package/.agent/workflows/generate.md +84 -20
- package/.agent/workflows/migrate.md +163 -0
- package/.agent/workflows/orchestrate.md +66 -17
- package/.agent/workflows/performance-benchmarker.md +305 -0
- package/.agent/workflows/plan.md +76 -33
- package/.agent/workflows/preview.md +73 -17
- package/.agent/workflows/refactor.md +153 -0
- package/.agent/workflows/review-ai.md +140 -0
- package/.agent/workflows/review.md +83 -16
- package/.agent/workflows/session.md +154 -0
- package/.agent/workflows/status.md +74 -18
- package/.agent/workflows/strengthen-skills.md +99 -0
- package/.agent/workflows/swarm.md +194 -0
- package/.agent/workflows/test.md +80 -31
- package/.agent/workflows/tribunal-backend.md +55 -13
- package/.agent/workflows/tribunal-database.md +62 -18
- package/.agent/workflows/tribunal-frontend.md +58 -12
- package/.agent/workflows/tribunal-full.md +70 -11
- package/.agent/workflows/tribunal-mobile.md +123 -0
- package/.agent/workflows/tribunal-performance.md +152 -0
- package/.agent/workflows/ui-ux-pro-max.md +100 -82
- package/README.md +117 -62
- package/bin/tribunal-kit.js +542 -288
- package/package.json +10 -6
@@ -8,7 +8,18 @@ $ARGUMENTS
 
 ---
 
-This command coordinates multiple specialists to solve a problem that requires more than one domain. One agent is not orchestration
+This command coordinates multiple specialists to solve a problem that requires more than one domain. **One agent is not orchestration.**
+
+---
+
+## When to Use /orchestrate vs Other Commands
+
+| Use `/orchestrate` when... | Use something else when... |
+|---|---|
+| Task requires 3+ domain specialists | Single domain → use the right `/tribunal-*` |
+| Sequential work with review gates between waves | Parallel, independent tasks → `/swarm` |
+| Existing codebase with complex dependencies | Greenfield project → `/create` |
+| Human gates required between every wave | Maximum parallel output → `/swarm` |
 
 ---
 
@@ -30,6 +41,7 @@ This command coordinates multiple specialists to solve a problem that requires m
 | Complete product | `project-planner` + `frontend-specialist` + `backend-specialist` + `devops-engineer` |
 | Security investigation | `security-auditor` + `penetration-tester` + `devops-engineer` |
 | Complex bug | `debugger` + `explorer-agent` + `test-engineer` |
+| New codebase or unknown repo | `explorer-agent` + relevant specialists |
 
 ---
 
@@ -41,7 +53,7 @@ Only two agents are allowed during planning:
 
 ```
 project-planner → writes docs/PLAN-{slug}.md
-explorer-agent → (if working in existing code) maps the codebase
+explorer-agent → (if working in existing code) maps the codebase structure
 ```
 
 No other agent runs. No code is produced.
@@ -56,40 +68,76 @@ Approve to start implementation? (Y / N)
 
 **Phase B does NOT start without a Y.**
 
-### Phase B — Implementation (
+### Phase B — Implementation (Manager & Micro-Workers)
 
-After approval,
+After approval, the Orchestrator acts as Manager and dispatches Micro-Workers using **isolated JSON payloads**.
 
 ```
-
-
-
+Wave 1: database-architect + security-auditor (JSON dispatch #1)
+↓
+[Wait for completion & Tribunal review]
+↓
+Wave 2: backend-specialist + frontend-specialist (JSON dispatch #2)
+↓
+[Wait for completion & Tribunal review]
+↓
+Wave 3: test-engineer (JSON dispatch #3)
+↓
+[Wait for completion & Human Gate]
 ```
 
-Each
+Workers execute in parallel **within** their wave, but waves execute **sequentially**. Each wave waits for the previous wave's Tribunal gate before proceeding.
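
Reviewer note: the wave rule added above (parallel inside a wave, sequential between waves, a gate after each wave) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_worker` and `review_gate` are hypothetical stand-ins, not functions from tribunal-kit, and the agent names are simply copied from the example waves.

```python
from concurrent.futures import ThreadPoolExecutor

# Wave plan copied from the example in the diff; everything else is hypothetical.
WAVES = [
    ["database-architect", "security-auditor"],
    ["backend-specialist", "frontend-specialist"],
    ["test-engineer"],
]


def run_worker(agent: str) -> str:
    """Stand-in for dispatching one Micro-Worker (a real dispatcher would send a JSON payload)."""
    return f"{agent}: done"


def review_gate(wave_number: int, results: list[str]) -> bool:
    """Stand-in for the Tribunal review or Human Gate that closes a wave."""
    return all(result.endswith("done") for result in results)


def run_waves(waves: list[list[str]]) -> list[list[str]]:
    completed = []
    for number, agents in enumerate(waves, start=1):
        # Workers inside one wave run in parallel.
        with ThreadPoolExecutor(max_workers=len(agents)) as pool:
            results = list(pool.map(run_worker, agents))
        # The next wave starts only after this wave's gate passes.
        if not review_gate(number, results):
            raise RuntimeError(f"Wave {number} rejected at its review gate")
        completed.append(results)
    return completed
```

Calling `run_waves(WAVES)` returns one result list per wave, in wave order; a rejected gate stops the run before any later wave is dispatched.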
 
 ---
 
-##
+## Hierarchical Context Pruning
+
+When dispatching workers, the Orchestrator MUST use the `dispatch_micro_workers` JSON format.
+
+**Context discipline is strictly enforced:**
+
+```
+❌ Never pass full chat histories to workers
+❌ Never attach every file — attach only files the worker will actually read
+✅ The context_summary injected by the Orchestrator is the ONLY shared context
+✅ Files attached are strictly limited to what's needed for that specific task
+```
+
+**Per-worker context limit:** Excerpt only the relevant function or schema section — never the entire file.
 
-
+---
+
+## Retry Protocol
 
 ```
-
-
-
+Attempt 1 → Worker runs with original parameters
+Attempt 2 → Worker runs with stricter constraints + failure feedback
+Attempt 3 → Worker runs with max constraints + full context dump
+Attempt 4 → HALT. Report to human with full failure history.
 ```
 
-
+Hard limit: **3 retries per worker**. After 3 failures, escalate — do not silently proceed.
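
Reviewer note: a minimal sketch of the retry protocol above, assuming that the limit means three attempts in total and that the fourth step is the halt. `run_with_retries`, `WorkerHalted`, and the worker signature are hypothetical names chosen for illustration; the package's own dispatcher may look quite different.

```python
from typing import Callable

MAX_ATTEMPTS = 3  # assumption: the "3 failures" limit is read as three attempts in total


class WorkerHalted(Exception):
    """Raised when every attempt failed; carries the full failure history for the human."""

    def __init__(self, history: list[str]):
        super().__init__(f"halted after {len(history)} failed attempts")
        self.history = history


def run_with_retries(worker: Callable[[int, list[str]], str]) -> str:
    history: list[str] = []
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            # The attempt number lets the worker tighten its constraints;
            # the history is the failure feedback from earlier attempts.
            return worker(attempt, list(history))
        except Exception as exc:
            history.append(f"attempt {attempt}: {exc}")
    # No silent fallthrough: escalate with everything that went wrong.
    raise WorkerHalted(history)
```

The design choice worth noting is that the loop never returns a partial result: a worker either succeeds within the limit or the caller receives `WorkerHalted` with the recorded history.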
 
 ---
 
 ## Hallucination Guard
 
 - Every agent's output goes through Tribunal before it reaches the user
-- The Human Gate fires before any file is written —
--
--
+- The Human Gate fires before any file is written — user sees the diff and approves
+- Per-agent scope is enforced — `frontend-specialist` **never** writes DB migrations
+- Retry limit: 3 Maker revisions per agent; after 3 failures → stop and report
+- `context_summary` is the only mechanism for sharing context across agents — no full dumps
+
+---
+
+## Cross-Workflow Navigation
+
+| When /orchestrate reveals... | Go to |
+|---|---|
+| Worker keeps failing after 3 retries | `/debug` the isolated worker task |
+| Plan needed before orchestrating | `/plan` first, then run `/orchestrate` against it |
+| Fully parallel independent sub-tasks | `/swarm` is more efficient |
+| Single domain needs specialist audit | Use the domain-specific `/tribunal-*` |
 
 ---
 
@@ -99,4 +147,5 @@ Never let Agent B re-invent what Agent A already established.
 /orchestrate build a complete auth system with JWT and refresh tokens
 /orchestrate review the entire API layer for security issues
 /orchestrate build a multi-tenant SaaS onboarding flow
+/orchestrate analyze this repo and implement all security findings
 ```
@@ -0,0 +1,305 @@
+---
+description: Run standardized performance benchmarks including Lighthouse, bundle analysis, and latency checks.
+---
+
+# /performance-benchmarker — Automated Performance Audit
+
+$ARGUMENTS
+
+---
+
+This command runs a comprehensive suite of performance benchmarks against your project and generates a structured report with numerical scores, regression detection, and prioritized actionable fixes.
+
+---
+
+## When to Use
+
+- Before any `/deploy` to catch performance regressions.
+- After adding new dependencies or large features.
+- When user reports "it feels slow" or asks to "check performance".
+- When triggered by `benchmark`, `lighthouse`, `bundle size`, or `latency` keywords.
+
+---
+
+## Pipeline Flow
+
+```
+Request (scope: full / web-vitals / bundle / api)
+│
+▼
+Environment detection — framework, build tool, package manager
+│
+▼
+Tool availability check — lighthouse? build script? dev server?
+│
+▼
+Benchmark execution — run selected checks
+│
+▼
+Score calculation — weighted composite
+│
+▼
+Regression detection — compare against previous baselines (if available)
+│
+▼
+Report — scores, pass/fail, recommendations, fix priority
+```
+
+---
+
+## Benchmark Suite
+
+### 1. Web Vitals (Frontend Performance)
+
+| Metric | Good | Needs Work | Poor | Measurement |
+|---|---|---|---|---|
+| LCP (Largest Contentful Paint) | < 2.5s | 2.5-4.0s | > 4.0s | Lighthouse or `web-vitals` library |
+| INP (Interaction to Next Paint) | < 200ms | 200-500ms | > 500ms | Lab approximation via TBT |
+| CLS (Cumulative Layout Shift) | < 0.1 | 0.1-0.25 | > 0.25 | Layout shift detection |
+| TTFB (Time to First Byte) | < 800ms | 800-1800ms | > 1800ms | Server response timing |
+| FCP (First Contentful Paint) | < 1.8s | 1.8-3.0s | > 3.0s | Lighthouse |
+| Speed Index | < 3.4s | 3.4-5.8s | > 5.8s | Lighthouse |
+
+**How to Run:**
+```bash
+# If lighthouse is available
+npx lighthouse http://localhost:3000 --output json --chrome-flags="--headless"
+
+# If web-vitals is installed, inject into page and measure
+# VERIFY: check if lighthouse-cli is available before running
+```
+
+**Common Fixes by Metric:**
+
+| Metric | Fix | Impact |
+|---|---|---|
+| LCP slow | Preload hero image, use `fetchpriority="high"` | High |
+| LCP slow | Eliminate render-blocking CSS/JS | High |
+| INP slow | Break long tasks > 50ms into smaller chunks | High |
+| INP slow | Use `requestIdleCallback` for non-critical work | Medium |
+| CLS high | Set explicit `width`/`height` on images/embeds | High |
+| CLS high | Use `font-display: swap` + font preload | Medium |
+| TTFB slow | Add caching headers, use CDN | High |
+| TTFB slow | Optimize database queries, add indexes | High |
+| FCP slow | Inline critical CSS, defer non-critical | High |
+
+### 2. Bundle Analysis (JavaScript/CSS)
+
+| Check | Target | Warning | Fail | Tool |
+|---|---|---|---|---|
+| Total JS (gzipped) | < 100KB | 100-200KB | > 200KB | Build output |
+| Largest chunk (gzipped) | < 50KB | 50-100KB | > 100KB | Build output |
+| CSS total | < 50KB | 50-100KB | > 100KB | Build output |
+| Unused CSS | < 5% | 5-15% | > 15% | PurgeCSS |
+| Duplicate packages | 0 | 1-2 | > 2 | Bundle analyzer |
+| Tree-shaking | No side-effect barrel exports | — | Side-effect imports found | Manual analysis |
+
+**How to Run:**
+```bash
+# Build and analyze
+npm run build -- --stats
+# VERIFY: check if the build script supports --stats flag
+
+# Alternative: analyze existing build output
+npx source-map-explorer 'dist/**/*.js'
+# VERIFY: check if source-map-explorer is available
+```
+
+**Common Fixes:**
+
+| Issue | Fix | Savings |
+|---|---|---|
+| Large lodash import | `import debounce from 'lodash/debounce'` not `import { debounce } from 'lodash'` | 50-80KB |
+| Moment.js | Replace with `dayjs` or `date-fns` | 60-70KB |
+| Full icon library | Use tree-shakeable imports or individual icon files | 20-100KB |
+| Uncompressed images | Use WebP/AVIF, add `loading="lazy"` | 50-500KB |
+| CSS framework unused | PurgeCSS or `content` config in Tailwind | 30-90KB |
+
+### 3. API Latency (Backend Performance)
+
+| Check | Target | Warning | Fail | Method |
+|---|---|---|---|---|
+| Avg response (simple GET) | < 100ms | 100-300ms | > 300ms | 10 sequential requests |
+| Avg response (complex query) | < 300ms | 300-800ms | > 800ms | 10 sequential requests |
+| P95 response | < 500ms | 500-1000ms | > 1000ms | Sort, pick 95th percentile |
+| P99 response | < 1000ms | 1-3s | > 3s | Sort, pick 99th percentile |
+| Cold start | < 1s | 1-3s | > 3s | First request after 30s idle |
+| Concurrent handling | Linear scaling up to 10 req | — | Exponential degradation | 10 parallel requests |
+
+**How to Run:**
+```bash
+# Using curl timing
+curl -o /dev/null -s -w "time_total: %{time_total}s\n" http://localhost:3000/api/health
+
+# Loop for average
+for i in $(seq 1 10); do
+curl -o /dev/null -s -w "%{time_total}\n" http://localhost:3000/api/endpoint
+done
+```
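
Reviewer note: the curl loop prints one `time_total` value per request, but the latency table's "Sort, pick 95th percentile" step is left implicit. The sketch below shows one way to reduce those samples, assuming the nearest-rank percentile definition (the workflow does not say which definition it intends). The sample timings are made up for illustration.

```python
import math


def average(samples: list[float]) -> float:
    return sum(samples) / len(samples)


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: sort, then take the value at rank ceil(pct * n / 100)."""
    if not samples:
        raise ValueError("no samples collected")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Illustrative timings in seconds, as printed by ten curl runs (invented values).
timings = [0.089, 0.092, 0.101, 0.095, 0.310, 0.088, 0.097, 0.093, 0.090, 0.099]

avg_seconds = average(timings)
p95_seconds = percentile(timings, 95)
p99_seconds = percentile(timings, 99)
```

With only ten samples, the nearest-rank P95 and P99 both land on the single slowest request, so a larger sample is needed before the two thresholds in the table measure different things.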
+
+**Common Fixes:**
+
+| Symptom | Likely Cause | Fix |
+|---|---|---|
+| Slow first request | Cold start, no connection pool | Pre-warm, use connection pooling |
+| Slow list endpoints | N+1 queries | Add eager loading / `include` |
+| Slow under load | No caching | Add Redis/in-memory cache for hot paths |
+| Inconsistent P95 | GC pauses | Optimize memory allocation, reduce object churn |
+
+### 4. Build Performance (DX)
+
+| Check | Target | Warning | Fail |
+|---|---|---|---|
+| Dev server cold start | < 3s | 3-8s | > 8s |
+| Hot reload (HMR) | < 200ms | 200-500ms | > 500ms |
+| Full production build | < 30s | 30-60s | > 60s |
+| TypeScript type-check | < 15s | 15-30s | > 30s |
+
+---
+
+## Composite Score
+
+```
+Performance Score = (
+Web_Vitals_Score × 0.35 +
+Bundle_Score × 0.25 +
+API_Score × 0.25 +
+Build_Score × 0.15
+) × 100
+
+Grade:
+90-100 → A (Ship with confidence)
+75-89 → B (Minor optimizations available)
+60-74 → C (Notable performance issues)
+40-59 → D (Significant problems — fix before deploy)
+< 40 → F (Critical — likely impacts user retention)
+```
+
+Each sub-score is the average of its checks on a 0-1 scale, where a check that meets its target counts as 1.0, a warning counts as 0.6, and a fail counts as 0.0.
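
Reviewer note: a worked sketch of the composite score, under one reading of the rule above: each check contributes 1.0, 0.6, or 0.0, a sub-score is the mean of its checks, and the four sub-scores are combined with the listed weights. The dictionary keys, function names, and the example outcomes are hypothetical; only the weights, check values, and grade bands come from the diff.

```python
# Category weights and per-check values as stated in the workflow file.
WEIGHTS = {"web_vitals": 0.35, "bundle": 0.25, "api": 0.25, "build": 0.15}
CHECK_VALUE = {"target": 1.0, "warning": 0.6, "fail": 0.0}


def sub_score(outcomes: list[str]) -> float:
    """Mean check value on a 0-1 scale (assumed reading of the sub-score rule)."""
    return sum(CHECK_VALUE[outcome] for outcome in outcomes) / len(outcomes)


def composite(sub_scores: dict[str, float]) -> float:
    """Weighted sum of the four sub-scores, scaled to 0-100."""
    return 100 * sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)


def grade(score: float) -> str:
    if score >= 90:
        return "A"
    if score >= 75:
        return "B"
    if score >= 60:
        return "C"
    if score >= 40:
        return "D"
    return "F"


# Invented outcomes for illustration only.
example = {
    "web_vitals": sub_score(["target"] * 5),
    "bundle": sub_score(["target", "target", "target", "warning", "warning"]),
    "api": sub_score(["target", "warning", "fail", "target"]),
    "build": sub_score(["target"] * 4),
}
example_score = composite(example)
example_grade = grade(example_score)
```

Because the weights sum to 1.0, a project that meets every target scores exactly 100, and a single failing API check costs more than a failing build check.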
+
+---
+
+## Output Format
+
+```
+━━━ Performance Benchmark Report ━━━━━━━━━
+
+Project: [name]
+Date: [timestamp]
+Score: [0-100] / 100 → Grade [A-F]
+
+━━━ Web Vitals ━━━━━━━━━━━━━━━━━━━━━━━━━
+
+LCP: 1.8s ✅ Good (target: < 2.5s)
+INP: 95ms ✅ Good (target: < 200ms)
+CLS: 0.05 ✅ Good (target: < 0.1)
+TTFB: 420ms ✅ Good (target: < 800ms)
+FCP: 1.2s ✅ Good (target: < 1.8s)
+Score: 92/100
+
+━━━ Bundle ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+
+Total JS: 156KB gzipped 🟡 Warning (target: < 100KB)
+Largest chunk: 82KB gzipped 🟡 Warning (target: < 50KB)
+CSS total: 28KB gzipped ✅ Good
+Unused CSS: 4.2% ✅ Good
+Duplicates: 0 ✅ Good
+Score: 72/100
+
+━━━ API Latency ━━━━━━━━━━━━━━━━━━━━━━━━
+
+GET /api/users: avg 89ms ✅ | p95 142ms ✅
+POST /api/auth: avg 210ms 🟡 | p95 480ms 🟡
+GET /api/dashboard: avg 340ms ❌ | p95 820ms ❌
+Cold start: 680ms ✅
+Score: 58/100
+
+━━━ Build ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+
+Dev cold start: 2.1s ✅
+HMR: 89ms ✅
+Production build: 18s ✅
+Type-check: 12s ✅
+Score: 100/100
+
+━━━ Fix Priority (by impact) ━━━━━━━━━━━
+
+1. 🔴 GET /api/dashboard avg 340ms
+→ Add database index on dashboard query joins
+→ Expected: < 100ms (70% improvement)
+
+2. 🟡 Total JS 156KB
+→ Lazy-load chart library (80KB)
+→ Expected: < 80KB initial (50% reduction)
+
+3. 🟡 POST /api/auth avg 210ms
+→ Cache user lookup in auth flow
+→ Expected: < 100ms (50% improvement)
+
+━━━ Trend (if baseline available) ━━━━━━
+
+LCP: 1.8s → 1.8s → (no change)
+Bundle: 140KB → 156KB ↑ (+11%) ⚠️ Regression
+API p95: 400ms → 480ms ↑ (+20%) ⚠️ Regression
+```
+
+---
+
+## Regression Detection
+
+If a previous benchmark baseline exists (stored in `perf-baseline.json` or similar):
+
+| Change vs. baseline | Interpretation | Status |
+|---|---|---|
+| < 5% increase | No change | ✅ Stable |
+| 5-15% increase | Minor regression | 🟡 Flag |
+| > 15% increase | Significant regression | 🔴 Block deploy |
+| Any decrease | Improvement | 🎉 Celebrate |
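
Reviewer note: the thresholds in the regression table translate directly into a small classifier. The sketch below assumes every tracked metric is "lower is better" (latency, bundle size), which holds for the metrics in the sample report but is not stated as a general rule in the file. `classify_change` is a hypothetical name.

```python
def classify_change(baseline: float, current: float) -> str:
    """Classify a metric against its baseline using the table's thresholds."""
    if baseline <= 0:
        raise ValueError("baseline must be positive to compute a percentage change")
    change_pct = (current - baseline) / baseline * 100
    if change_pct < 0:
        return "improvement"
    if change_pct < 5:
        return "stable"
    if change_pct <= 15:
        return "minor regression"
    return "significant regression"


# Values taken from the Trend section of the sample report.
bundle_status = classify_change(140, 156)    # about +11%
api_p95_status = classify_change(400, 480)   # +20%
lcp_status = classify_change(1.8, 1.8)       # unchanged
```

Applied to the sample report, the bundle growth falls in the "flag" band, while the API p95 increase is large enough to block a deploy.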
+
+---
+
+## Baseline Management
+
+After a successful benchmark, save a baseline to detect future regressions:
+
+```bash
+# Save current benchmark as baseline
+python .agent/scripts/bundle_analyzer.py . --save-baseline
+```
+
+The baseline file is `perf-baseline.json` in the project root. Check it into version control so regressions are caught in CI.
+
+---
+
+## Cross-Workflow Navigation
+
+| After /performance-benchmarker shows... | Go to |
+|---|---|
+| Grade D or F | `/tribunal-performance` on the slowest code paths |
+| Bundle regression (+15%) | `/audit` for dependency analysis, then `/fix` |
+| API latency P95 > 500ms | `/debug` to identify the slow query or operation |
+| Web vitals LCP > 4s | `/enhance` to add image preloading and critical CSS |
+| Grade A or B, ready for deploy | `/deploy` following pre-flight checklist |
+
+---
+
+## Hallucination Guard
+
+- **Only run benchmarks with installed tools** — check with `which` or `npx --dry-run` first.
+- **Never fabricate benchmark numbers** — report "SKIPPED: [tool] not installed" if unavailable.
+- **Flag anomalies**: `// NOTE: unusually fast — may be cached` or `// NOTE: first run, cold start included`.
+- **Mark tool availability**: `// VERIFY: lighthouse-cli not detected, using fallback estimation`.
+- **Don't guess fixes** — only recommend fixes for issues that have measured evidence.
+
+---
+
+## Usage
+
+```
+/performance-benchmarker full audit
+/performance-benchmarker web vitals only
+/performance-benchmarker bundle analysis
+/performance-benchmarker api latency for /api/users /api/posts
+/performance-benchmarker build performance
+/performance-benchmarker compare with baseline
+```
package/.agent/workflows/plan.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-description: Create project plan using project-planner agent. No code writing
+description: Create project plan using project-planner agent. No code writing — only plan file generation.
 ---
 
 # /plan — Write the Plan First
@@ -8,7 +8,18 @@ $ARGUMENTS
 
 ---
 
-This command produces one thing: a structured plan file
+This command produces one thing: **a structured plan file**. Nothing is implemented. No code is written. The plan is the output.
+
+---
+
+## When to Use /plan vs Other Commands
+
+| Use `/plan` when... | Use something else when... |
+|---|---|
+| Requirements are unclear or large | You already know what to build → `/create` |
+| Multi-agent work needs coordination | Single function needed → `/generate` |
+| You want written scope agreement before coding | Ready to build immediately → `/create` |
+| Stakeholder review is needed before work starts | Just a quick discussion → ask directly |
 
 ---
 
@@ -23,38 +34,34 @@ This command produces one thing: a structured plan file. Nothing is implemented.
 
 ### Gate: Clarify Before You Plan
 
-The `project-planner` agent asks:
+The `project-planner` agent asks — and gets answers to — these four questions before writing a single line of the plan:
 
 ```
-What outcome needs to exist that doesn't exist today?
-What are the hard constraints? (stack, existing code, deadline)
-What's explicitly not being built in this version?
-How will we confirm it's done?
+1. What outcome needs to exist that doesn't exist today?
+2. What are the hard constraints? (stack, existing code, deadline)
+3. What's explicitly not being built in this version?
+4. How will we confirm it's done? (observable done condition)
 ```
 
-If any answer is "I don't know" — those are clarified before the plan is written, not after.
-
-### Plan File Creation
+If any answer is "I don't know" — those are clarified **before** the plan is written, not after.
 
-
-Location: docs/PLAN-{task-slug}.md
+> ⚠️ An unclear "done condition" is the most common cause of scope creep. It must be specific and observable.
 
-
-Pull 2–3 key words from the request
-Lowercase + hyphens
-Max 30 characters
-"build auth with JWT" → PLAN-auth-jwt.md
-"shopping cart checkout" → PLAN-cart-checkout.md
-```
+---
 
-###
+### Plan File Creation
 
 ```
-
+Location: docs/PLAN-{task-slug}.md
 
-
-
-
+Slug naming rules:
+- Pull 2–3 key words from the request
+- Lowercase + hyphens
+- Max 30 characters
+Examples:
+"build auth with JWT" → PLAN-auth-jwt.md
+"shopping cart checkout" → PLAN-cart-checkout.md
+"multi-tenant user roles" → PLAN-user-roles.md
 ```
 
 ---
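
Reviewer note: the slug naming rules in the hunk above are mechanical once the two or three key words are chosen, so they can be sketched as a small helper. Two assumptions are baked in and are not confirmed by the file: the 30-character limit applies to the slug rather than the whole filename, and choosing the key words stays a judgment call made by the caller. `plan_path` is a hypothetical name.

```python
import re

MAX_SLUG_LENGTH = 30  # assumption: the limit applies to the slug, not the full filename


def plan_path(keywords: list[str]) -> str:
    """Build docs/PLAN-{slug}.md from key words the caller has already picked."""
    words = []
    for keyword in keywords:
        # Lowercase, and collapse anything that is not a letter or digit into a hyphen.
        cleaned = re.sub(r"[^a-z0-9]+", "-", keyword.lower()).strip("-")
        if cleaned:
            words.append(cleaned)
    # Truncate to the limit and avoid ending on a dangling hyphen.
    slug = "-".join(words)[:MAX_SLUG_LENGTH].rstrip("-")
    return f"docs/PLAN-{slug}.md"
```

For the examples listed in the diff, `plan_path(["auth", "JWT"])` gives `docs/PLAN-auth-jwt.md` and `plan_path(["cart", "checkout"])` gives `docs/PLAN-cart-checkout.md`.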
@@ -65,37 +72,71 @@ Review it, then:
 # Plan: [Feature Name]
 
 ## What Done Looks Like
-[Observable outcome — one sentence]
+[Observable outcome — one specific, testable sentence]
 
 ## Won't Include in This Version
-- [Explicit exclusion]
+- [Explicit exclusion 1]
+- [Explicit exclusion 2]
 
 ## Unresolved Questions
-- [
+- [Item needing external confirmation: VERIFY]
 
 ## Estimates (Ranges + Confidence)
-
+| Task | Optimistic | Realistic | Pessimistic | Confidence |
+|------|-----------|-----------|-------------|------------|
+| DB schema | 30min | 1h | 2h | High |
+| API layer | 2h | 4h | 8h | Medium |
+| Frontend | 3h | 6h | 12h | Low |
 
 ## Task Table
 | # | Task | Agent | Depends on | Done when |
-|---|------|-------|-----------|-----------|
-| 1 |
-| 2 |
+|---|------|-------|-----------|-----------|
+| 1 | DB schema | database-architect | none | migration runs |
+| 2 | API routes | backend-specialist | #1 | returns 201 |
+| 3 | Frontend component | frontend-specialist | #2 | renders without errors |
+| 4 | Tests | test-engineer | #2 | all specs pass |
 
 ## Review Gates
 | Task | Tribunal |
 |---|---|
 | #1 schema | /tribunal-database |
 | #2 API | /tribunal-backend |
+| #3 UI | /tribunal-frontend |
+| #4 tests | test-coverage-reviewer |
+```
+
+---
+
+### After the File is Written
+
+```
+✅ Plan written: docs/PLAN-{slug}.md
+
+Review it, then:
+/create → Begin full implementation (uses this plan)
+/generate → Implement a single task from the table
+/orchestrate → Coordinate all agents across the full plan
 ```
 
 ---
 
 ## Hallucination Guard
 
-- Every tool
--
+- Every tool, library, or API mentioned in the plan must be **real and verified** before being listed
+- Time estimates are **ranges with confidence labels** — never single-point guarantees
 - External dependencies that aren't confirmed get a `[VERIFY: check this exists]` tag
+- The done condition is **observable and specific** — "it works" is not a done condition
+
+---
+
+## Cross-Workflow Navigation
+
+| After /plan produces the file... | Go to |
+|---|---|
+| Ready to build the full plan | `/create` reads the plan and starts building |
+| Need a single task implemented | `/generate [task description]` |
+| Multi-agent coordination needed | `/orchestrate` to run the plan as a managed build |
+| Need to review existing code first | `explorer-agent` before committing to the plan |
 
 ---
 
@@ -105,4 +146,6 @@ All time estimates include: optimistic / realistic / pessimistic + confidence le
 /plan REST API with user auth
 /plan dark mode toggle for the settings page
 /plan multi-tenant account switching
+/plan event-driven notification system with queues
+/plan admin dashboard with user management and analytics
 ```