tribunal-kit 1.0.0 → 2.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (125)
  1. package/.agent/.shared/ui-ux-pro-max/README.md +3 -3
  2. package/.agent/ARCHITECTURE.md +205 -10
  3. package/.agent/GEMINI.md +37 -7
  4. package/.agent/agents/accessibility-reviewer.md +134 -0
  5. package/.agent/agents/ai-code-reviewer.md +129 -0
  6. package/.agent/agents/frontend-specialist.md +3 -0
  7. package/.agent/agents/game-developer.md +21 -21
  8. package/.agent/agents/logic-reviewer.md +12 -0
  9. package/.agent/agents/mobile-reviewer.md +79 -0
  10. package/.agent/agents/orchestrator.md +56 -26
  11. package/.agent/agents/performance-reviewer.md +36 -0
  12. package/.agent/agents/supervisor-agent.md +156 -0
  13. package/.agent/agents/swarm-worker-contracts.md +166 -0
  14. package/.agent/agents/swarm-worker-registry.md +92 -0
  15. package/.agent/rules/GEMINI.md +134 -5
  16. package/.agent/scripts/bundle_analyzer.py +259 -0
  17. package/.agent/scripts/dependency_analyzer.py +247 -0
  18. package/.agent/scripts/lint_runner.py +188 -0
  19. package/.agent/scripts/patch_skills_meta.py +177 -0
  20. package/.agent/scripts/patch_skills_output.py +285 -0
  21. package/.agent/scripts/schema_validator.py +279 -0
  22. package/.agent/scripts/security_scan.py +224 -0
  23. package/.agent/scripts/session_manager.py +144 -3
  24. package/.agent/scripts/skill_integrator.py +234 -0
  25. package/.agent/scripts/strengthen_skills.py +220 -0
  26. package/.agent/scripts/swarm_dispatcher.py +317 -0
  27. package/.agent/scripts/test_runner.py +192 -0
  28. package/.agent/scripts/test_swarm_dispatcher.py +163 -0
  29. package/.agent/skills/agent-organizer/SKILL.md +132 -0
  30. package/.agent/skills/agentic-patterns/SKILL.md +335 -0
  31. package/.agent/skills/api-patterns/SKILL.md +226 -50
  32. package/.agent/skills/app-builder/SKILL.md +215 -52
  33. package/.agent/skills/architecture/SKILL.md +176 -31
  34. package/.agent/skills/bash-linux/SKILL.md +150 -134
  35. package/.agent/skills/behavioral-modes/SKILL.md +152 -160
  36. package/.agent/skills/brainstorming/SKILL.md +148 -101
  37. package/.agent/skills/brainstorming/dynamic-questioning.md +10 -0
  38. package/.agent/skills/clean-code/SKILL.md +139 -134
  39. package/.agent/skills/code-review-checklist/SKILL.md +177 -80
  40. package/.agent/skills/config-validator/SKILL.md +165 -0
  41. package/.agent/skills/csharp-developer/SKILL.md +107 -0
  42. package/.agent/skills/database-design/SKILL.md +252 -29
  43. package/.agent/skills/deployment-procedures/SKILL.md +122 -175
  44. package/.agent/skills/devops-engineer/SKILL.md +134 -0
  45. package/.agent/skills/devops-incident-responder/SKILL.md +98 -0
  46. package/.agent/skills/documentation-templates/SKILL.md +175 -121
  47. package/.agent/skills/dotnet-core-expert/SKILL.md +103 -0
  48. package/.agent/skills/edge-computing/SKILL.md +213 -0
  49. package/.agent/skills/frontend-design/SKILL.md +76 -0
  50. package/.agent/skills/frontend-design/color-system.md +18 -0
  51. package/.agent/skills/frontend-design/typography-system.md +18 -0
  52. package/.agent/skills/game-development/SKILL.md +69 -0
  53. package/.agent/skills/geo-fundamentals/SKILL.md +158 -99
  54. package/.agent/skills/i18n-localization/SKILL.md +158 -96
  55. package/.agent/skills/intelligent-routing/SKILL.md +89 -285
  56. package/.agent/skills/intelligent-routing/router-manifest.md +65 -0
  57. package/.agent/skills/lint-and-validate/SKILL.md +229 -27
  58. package/.agent/skills/llm-engineering/SKILL.md +258 -0
  59. package/.agent/skills/local-first/SKILL.md +203 -0
  60. package/.agent/skills/mcp-builder/SKILL.md +159 -111
  61. package/.agent/skills/mobile-design/SKILL.md +102 -282
  62. package/.agent/skills/nextjs-react-expert/SKILL.md +143 -227
  63. package/.agent/skills/nodejs-best-practices/SKILL.md +201 -254
  64. package/.agent/skills/observability/SKILL.md +285 -0
  65. package/.agent/skills/parallel-agents/SKILL.md +124 -118
  66. package/.agent/skills/performance-profiling/SKILL.md +143 -89
  67. package/.agent/skills/plan-writing/SKILL.md +133 -97
  68. package/.agent/skills/platform-engineer/SKILL.md +135 -0
  69. package/.agent/skills/powershell-windows/SKILL.md +167 -104
  70. package/.agent/skills/python-patterns/SKILL.md +149 -361
  71. package/.agent/skills/python-pro/SKILL.md +114 -0
  72. package/.agent/skills/react-specialist/SKILL.md +107 -0
  73. package/.agent/skills/realtime-patterns/SKILL.md +296 -0
  74. package/.agent/skills/red-team-tactics/SKILL.md +136 -134
  75. package/.agent/skills/rust-pro/SKILL.md +237 -173
  76. package/.agent/skills/seo-fundamentals/SKILL.md +134 -82
  77. package/.agent/skills/server-management/SKILL.md +155 -104
  78. package/.agent/skills/sql-pro/SKILL.md +104 -0
  79. package/.agent/skills/systematic-debugging/SKILL.md +156 -79
  80. package/.agent/skills/tailwind-patterns/SKILL.md +163 -205
  81. package/.agent/skills/tdd-workflow/SKILL.md +148 -88
  82. package/.agent/skills/test-result-analyzer/SKILL.md +299 -0
  83. package/.agent/skills/testing-patterns/SKILL.md +141 -114
  84. package/.agent/skills/trend-researcher/SKILL.md +228 -0
  85. package/.agent/skills/ui-ux-pro-max/SKILL.md +107 -0
  86. package/.agent/skills/ui-ux-researcher/SKILL.md +234 -0
  87. package/.agent/skills/vue-expert/SKILL.md +118 -0
  88. package/.agent/skills/vulnerability-scanner/SKILL.md +228 -188
  89. package/.agent/skills/web-design-guidelines/SKILL.md +148 -33
  90. package/.agent/skills/webapp-testing/SKILL.md +171 -122
  91. package/.agent/skills/whimsy-injector/SKILL.md +349 -0
  92. package/.agent/skills/workflow-optimizer/SKILL.md +219 -0
  93. package/.agent/workflows/api-tester.md +279 -0
  94. package/.agent/workflows/audit.md +168 -0
  95. package/.agent/workflows/brainstorm.md +65 -19
  96. package/.agent/workflows/changelog.md +144 -0
  97. package/.agent/workflows/create.md +67 -14
  98. package/.agent/workflows/debug.md +122 -30
  99. package/.agent/workflows/deploy.md +82 -31
  100. package/.agent/workflows/enhance.md +59 -27
  101. package/.agent/workflows/fix.md +143 -0
  102. package/.agent/workflows/generate.md +84 -20
  103. package/.agent/workflows/migrate.md +163 -0
  104. package/.agent/workflows/orchestrate.md +66 -17
  105. package/.agent/workflows/performance-benchmarker.md +305 -0
  106. package/.agent/workflows/plan.md +76 -33
  107. package/.agent/workflows/preview.md +73 -17
  108. package/.agent/workflows/refactor.md +153 -0
  109. package/.agent/workflows/review-ai.md +140 -0
  110. package/.agent/workflows/review.md +83 -16
  111. package/.agent/workflows/session.md +154 -0
  112. package/.agent/workflows/status.md +74 -18
  113. package/.agent/workflows/strengthen-skills.md +99 -0
  114. package/.agent/workflows/swarm.md +194 -0
  115. package/.agent/workflows/test.md +80 -31
  116. package/.agent/workflows/tribunal-backend.md +55 -13
  117. package/.agent/workflows/tribunal-database.md +62 -18
  118. package/.agent/workflows/tribunal-frontend.md +58 -12
  119. package/.agent/workflows/tribunal-full.md +70 -11
  120. package/.agent/workflows/tribunal-mobile.md +123 -0
  121. package/.agent/workflows/tribunal-performance.md +152 -0
  122. package/.agent/workflows/ui-ux-pro-max.md +100 -82
  123. package/README.md +117 -62
  124. package/bin/tribunal-kit.js +329 -75
  125. package/package.json +10 -6
@@ -1,143 +1,197 @@
1
1
  ---
2
2
  name: performance-profiling
3
3
  description: Performance profiling principles. Measurement, analysis, and optimization techniques.
4
- allowed-tools: Read, Glob, Grep, Bash
4
+ allowed-tools: Read, Write, Edit, Glob, Grep
5
+ version: 1.0.0
6
+ last-updated: 2026-03-12
7
+ applies-to-model: gemini-2.5-pro, claude-3-7-sonnet
5
8
  ---
6
9
 
7
- # Performance Profiling
10
+ # Performance Profiling Principles
8
11
 
9
- > Measure, analyze, optimize - in that order.
12
+ > Never optimize code you haven't measured.
13
+ > The bottleneck is almost never where you expect it to be.
10
14
 
11
- ## 🔧 Runtime Scripts
15
+ ---
16
+
17
+ ## The Measurement-First Rule
12
18
 
13
- **Execute these for automated profiling:**
19
+ Every performance investigation follows the same sequence:
14
20
 
15
- | Script | Purpose | Usage |
16
- |--------|---------|-------|
17
- | `scripts/lighthouse_audit.py` | Lighthouse performance audit | `python scripts/lighthouse_audit.py https://example.com` |
21
+ ```
22
+ Measure → Identify hotspot → Form hypothesis → Change one thing → Measure again
23
+ ```
24
+
25
+ Breaking this sequence by jumping straight to a "fix" wastes time and creates new problems.
18
26
 
19
27
  ---
20
28
 
21
- ## 1. Core Web Vitals
29
+ ## What to Measure
22
30
 
23
- ### Targets
31
+ ### Backend
24
32
 
25
- | Metric | Good | Poor | Measures |
26
- |--------|------|------|----------|
27
- | **LCP** | < 2.5s | > 4.0s | Loading |
28
- | **INP** | < 200ms | > 500ms | Interactivity |
29
- | **CLS** | < 0.1 | > 0.25 | Stability |
33
+ | Metric | Tool | Target |
34
+ |---|---|---|
35
+ | Request throughput | ab, k6, wrk | Baseline + stress test |
36
+ | P50/P95/P99 latency | DataDog, Grafana, k6 | P99 < SLA threshold |
37
+ | Memory usage | `process.memoryUsage()`, heap snapshot | Stable under load (no growth) |
38
+ | CPU usage | clinic.js flame chart | Identify blocking operations |
39
+ | Database query time | Query logs, pg_stat_statements | No query > 100ms without index |
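As a rough illustration of the P50/P95/P99 latency row, the sketch below computes nearest-rank percentiles from a list of latency samples and compares P99 against an SLA threshold. The `percentile` helper, the sample values, and the 500 ms SLA are assumptions made up for this example; they are not part of the package.

```ts
// Hypothetical helper: nearest-rank percentile over latency samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = samples.slice().sort((a, b) => a - b); // copy, then numeric sort
  const rank = Math.ceil((p / 100) * sorted.length);    // 1-based nearest rank
  return sorted[rank - 1];
}

// Illustrative data only, not measurements from a real service.
const latenciesMs = [12, 15, 11, 240, 14, 13, 16, 18, 12, 900];
const slaMs = 500; // assumed SLA threshold

const p50 = percentile(latenciesMs, 50);
const p95 = percentile(latenciesMs, 95);
const p99 = percentile(latenciesMs, 99);
console.log({ p50, p95, p99, withinSla: p99 < slaMs });
```

With only ten samples, P95 and P99 both land on the slowest request, which is why tail percentiles need a large sample to be meaningful.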
30
40
 
31
- ### When to Measure
41
+ ### Frontend
32
42
 
33
- | Stage | Tool |
34
- |-------|------|
35
- | Development | Local Lighthouse |
36
- | CI/CD | Lighthouse CI |
37
- | Production | RUM (Real User Monitoring) |
43
+ | Metric | Tool | Target (2025 Core Web Vitals) |
44
+ |---|---|---|
45
+ | LCP (Largest Contentful Paint) | Lighthouse, CrUX | < 2.5s |
46
+ | INP (Interaction to Next Paint) | Lighthouse, Web Vitals | < 200ms |
47
+ | CLS (Cumulative Layout Shift) | Lighthouse | < 0.1 |
48
+ | Bundle size (JS) | `npm run build` + analyzer | < 200kB initial JS |
38
49
 
39
50
  ---
40
51
 
41
- ## 2. Profiling Workflow
52
+ ## Common Backend Bottlenecks
53
+
54
+ ### N+1 Queries (most common)
42
55
 
43
- ### The 4-Step Process
56
+ ```ts
57
+ // ❌ 1 + N queries
58
+ const posts = await db.post.findMany();
59
+ for (const post of posts) {
60
+ post.author = await db.user.findUnique({ where: { id: post.authorId } });
61
+ }
44
62
 
63
+ // ✅ 2 queries total
64
+ const posts = await db.post.findMany({ include: { author: true } });
45
65
  ```
46
- 1. BASELINE → Measure current state
47
- 2. IDENTIFY → Find the bottleneck
48
- 3. FIX → Make targeted change
49
- 4. VALIDATE → Confirm improvement
66
+
67
+ **Detection:** Enable query logging. Repeated identical queries differing only by ID = N+1.
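One way the detection rule could be automated is sketched below: collapse literals in each logged query so that queries differing only by ID share a single shape, then flag shapes that repeat. The `normalizeQuery` and `findRepeatedQueries` helpers, the threshold of 3, and the sample log are hypothetical; real query logs will need a normalizer suited to their format.

```ts
// Hypothetical helper: replace literals so queries differing only by ID look identical.
function normalizeQuery(sql: string): string {
  return sql
    .replace(/'[^']*'/g, "?")  // string literals
    .replace(/\b\d+\b/g, "?")  // standalone numeric literals
    .replace(/\s+/g, " ")      // collapse whitespace
    .trim();
}

// Returns the query shapes that occur at least `threshold` times in one request's log.
function findRepeatedQueries(log: string[], threshold = 3): Map<string, number> {
  const counts = new Map<string, number>();
  for (const sql of log) {
    const shape = normalizeQuery(sql);
    counts.set(shape, (counts.get(shape) || 0) + 1);
  }
  const repeated = new Map<string, number>();
  counts.forEach((n, shape) => {
    if (n >= threshold) repeated.set(shape, n);
  });
  return repeated;
}

// Illustrative log for a single request (assumed data).
const requestLog = [
  "SELECT * FROM posts",
  "SELECT * FROM users WHERE id = 1",
  "SELECT * FROM users WHERE id = 2",
  "SELECT * FROM users WHERE id = 3",
];
const suspects = findRepeatedQueries(requestLog);
suspects.forEach((n, shape) => console.log(`${n}x ${shape}`));
```

A repeated shape is only a hint: a loop that legitimately issues a few lookups will also trip it, so the threshold should be tuned per application.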
68
+
69
+ ### Missing Database Indexes
70
+
71
+ ```sql
72
+ -- EXPLAIN ANALYZE tells you if a query is doing a sequential scan
73
+ EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = $1;
74
+
75
+ -- Sequential scan on large table → add index
76
+ CREATE INDEX idx_orders_user_id ON orders(user_id);
50
77
  ```
51
78
 
52
- ### Profiling Tool Selection
79
+ ### Blocking the Event Loop (Node.js)
53
80
 
54
- | Problem | Tool |
55
- |---------|------|
56
- | Page load | Lighthouse |
57
- | Bundle size | Bundle analyzer |
58
- | Runtime | DevTools Performance |
59
- | Memory | DevTools Memory |
60
- | Network | DevTools Network |
81
+ ```ts
82
+ // ❌ Synchronous CPU work blocks all requests
83
+ const result = JSON.parse(fs.readFileSync('huge.json', 'utf8'));
84
+
85
+ // ✅ Non-blocking
86
+ const content = await fs.promises.readFile('huge.json', 'utf8');
87
+ const result = JSON.parse(content); // still sync but no disk I/O blocking
88
+ ```
61
89
 
62
90
  ---
63
91
 
64
- ## 3. Bundle Analysis
92
+ ## Common Frontend Bottlenecks
93
+
94
+ ### Bundle Size
65
95
 
66
- ### What to Look For
96
+ - Identify large packages with `npx vite-bundle-visualizer` or `@next/bundle-analyzer`
97
+ - Replace heavy packages with lighter alternatives (e.g., `date-fns` instead of `moment`)
98
+ - Code-split routes - don't ship all JavaScript on first load
67
99
 
68
- | Issue | Indicator |
69
- |-------|-----------|
70
- | Large dependencies | Top of bundle |
71
- | Duplicate code | Multiple chunks |
72
- | Unused code | Low coverage |
73
- | Missing splits | Single large chunk |
100
+ ### Render Performance
74
101
 
75
- ### Optimization Actions
102
+ ```ts
103
+ // ❌ Recalculates on every render (and `sort` mutates the `items` prop in place)
104
+ function ExpensiveList({ items }) {
105
+ const sorted = items.sort((a, b) => a.name.localeCompare(b.name));
106
+ return sorted.map(item => <Item key={item.id} item={item} />);
107
+ }
76
108
 
77
- | Finding | Action |
78
- |---------|--------|
79
- | Big library | Import specific modules |
80
- | Duplicate deps | Dedupe, update versions |
81
- | Route in main | Code split |
82
- | Unused exports | Tree shake |
109
+ // ✅ Recalculates only when items change
110
+ function ExpensiveList({ items }) {
111
+ const sorted = useMemo(
112
+ () => [...items].sort((a, b) => a.name.localeCompare(b.name)),
113
+ [items]
114
+ );
115
+ return sorted.map(item => <Item key={item.id} item={item} />);
116
+ }
117
+ ```
83
118
 
84
119
  ---
85
120
 
86
- ## 4. Runtime Profiling
121
+ ## Profiling Tools
87
122
 
88
- ### Performance Tab Analysis
123
+ | Tool | Platform | Best For |
124
+ |---|---|---|
125
+ | `clinic.js` (`clinic doctor`) | Node.js | CPU flame charts, memory leaks |
126
+ | Chrome DevTools → Performance | Browser | JS execution, paint, layout |
127
+ | `EXPLAIN ANALYZE` | PostgreSQL | Query plan analysis |
128
+ | Lighthouse | Web | Full Core Web Vitals audit |
129
+ | `k6` | Backend load testing | Throughput and latency under load |
89
130
 
90
- | Pattern | Meaning |
91
- |---------|---------|
92
- | Long tasks (>50ms) | UI blocking |
93
- | Many small tasks | Possible batching opportunity |
94
- | Layout/paint | Rendering bottleneck |
95
- | Script | JavaScript execution |
131
+ ---
96
132
 
97
- ### Memory Tab Analysis
133
+ ## Scripts
98
134
 
99
- | Pattern | Meaning |
100
- |---------|---------|
101
- | Growing heap | Possible leak |
102
- | Large retained | Check references |
103
- | Detached DOM | Not cleaned up |
135
+ | Script | Purpose | Run With |
136
+ |---|---|---|
137
+ | `scripts/lighthouse_audit.py` | Lighthouse performance audit | `python scripts/lighthouse_audit.py <url>` |
104
138
 
105
139
  ---
106
140
 
107
- ## 5. Common Bottlenecks
141
+ ## Output Format
142
+
143
+ When this skill produces a recommendation or design decision, structure your output as:
144
+
145
+ ```
146
+ ━━━ Performance Profiling Recommendation ━━━━━━━━━━━━━━━━
147
+ Decision: [what was chosen / proposed]
148
+ Rationale: [why - one concise line]
149
+ Trade-offs: [what is consciously accepted]
150
+ Next action: [concrete next step for the user]
151
+ ────────────────────────────────────────────────
152
+ Pre-Flight: ✅ All checks passed
153
+ or ❌ [blocking item that must be resolved first]
154
+ ```
108
155
 
109
- ### By Symptom
110
156
 
111
- | Symptom | Likely Cause |
112
- |---------|--------------|
113
- | Slow initial load | Large JS, render blocking |
114
- | Slow interactions | Heavy event handlers |
115
- | Jank during scroll | Layout thrashing |
116
- | Growing memory | Leaks, retained refs |
117
157
 
118
158
  ---
119
159
 
120
- ## 6. Quick Win Priorities
160
+ ## 🤖 LLM-Specific Traps
161
+
162
+ AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
121
163
 
122
- | Priority | Action | Impact |
123
- |----------|--------|--------|
124
- | 1 | Enable compression | High |
125
- | 2 | Lazy load images | High |
126
- | 3 | Code split routes | High |
127
- | 4 | Cache static assets | Medium |
128
- | 5 | Optimize images | Medium |
164
+ 1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
165
+ 2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
166
+ 3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
167
+ 4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
168
+ 5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
129
169
 
130
170
  ---
131
171
 
132
- ## 7. Anti-Patterns
172
+ ## 🏛️ Tribunal Integration (Anti-Hallucination)
133
173
 
134
- | ❌ Don't | ✅ Do |
135
- |----------|-------|
136
- | Guess at problems | Profile first |
137
- | Micro-optimize | Fix biggest issue |
138
- | Optimize early | Optimize when needed |
139
- | Ignore real users | Use RUM data |
174
+ **Slash command: `/review` or `/tribunal-full`**
175
+ **Active reviewers: `logic-reviewer` · `security-auditor`**
140
176
 
141
- ---
177
+ ### ❌ Forbidden AI Tropes
178
+
179
+ 1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
180
+ 2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
181
+ 3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
182
+
183
+ ### ✅ Pre-Flight Self-Audit
184
+
185
+ Review these questions before confirming output:
186
+ ```
187
+ ✅ Did I rely ONLY on real, verified tools and methods?
188
+ ✅ Is this solution appropriately scoped to the user's constraints?
189
+ ✅ Did I handle potential failure modes and edge cases?
190
+ ✅ Have I avoided generic boilerplate that doesn't add value?
191
+ ```
192
+
193
+ ### 🛑 Verification-Before-Completion (VBC) Protocol
142
194
 
143
- > **Remember:** The fastest code is code that doesn't run. Remove before optimizing.
195
+ **CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
196
+ - ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
197
+ - ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.
@@ -1,152 +1,188 @@
1
1
  ---
2
2
  name: plan-writing
3
3
  description: Structured task planning with clear breakdowns, dependencies, and verification criteria. Use when implementing features, refactoring, or any multi-step work.
4
- allowed-tools: Read, Glob, Grep
4
+ allowed-tools: Read, Write, Edit, Glob, Grep
5
+ version: 1.0.0
6
+ last-updated: 2026-03-12
7
+ applies-to-model: gemini-2.5-pro, claude-3-7-sonnet
5
8
  ---
6
9
 
7
- # Plan Writing
10
+ # Task Planning Standards
8
11
 
9
- > Source: obra/superpowers
12
+ > A plan is not a promise. It is a map.
13
+ > Maps get updated when the terrain doesn't match them.
10
14
 
11
- ## Overview
12
- This skill provides a framework for breaking down work into clear, actionable tasks with verification criteria.
15
+ ---
13
16
 
14
- ## Task Breakdown Principles
17
+ ## When to Write a Plan
15
18
 
16
- ### 1. Small, Focused Tasks
17
- - Each task should take 2-5 minutes
18
- - One clear outcome per task
19
- - Independently verifiable
19
+ Write a plan before implementation when:
20
+ - The task touches more than 2 files in non-trivial ways
21
+ - The task has dependencies (thing B can't start until thing A is done)
22
+ - The task involves a risky operation (migration, data transformation, breaking change)
23
+ - The team needs to review the approach before time is spent implementing it
20
24
 
21
- ### 2. Clear Verification
22
- - How do you know it's done?
23
- - What can you check/test?
24
- - What's the expected output?
25
+ Skip the formal plan for: single-function fixes, typo corrections, config tweaks.
25
26
 
26
- ### 3. Logical Ordering
27
- - Dependencies identified
28
- - Parallel work where possible
29
- - Critical path highlighted
30
- - **Phase X: Verification is always LAST**
27
+ ---
31
28
 
32
- ### 4. Dynamic Naming in Project Root
33
- - Plan files are saved as `{task-slug}.md` in the PROJECT ROOT
34
- - Name derived from task (e.g., "add auth" → `auth-feature.md`)
35
- - **NEVER** inside `.claude/`, `docs/`, or temp folders
29
+ ## Plan Structure
36
30
 
37
- ## Planning Principles (NOT Templates!)
31
+ ```markdown
32
+ # Plan: [Feature or Task Name]
38
33
 
39
- > 🔴 **NO fixed templates. Each plan is UNIQUE to the task.**
34
+ ## Goal
35
+ One sentence: what outcome does this achieve?
40
36
 
41
- ### Principle 1: Keep It SHORT
37
+ ## Context
38
+ - Why is this being done?
39
+ - What problem does it solve or what requirement does it satisfy?
40
+ - What exists today that this changes?
42
41
 
43
- | ❌ Wrong | ✅ Right |
44
- |----------|----------|
45
- | 50 tasks with sub-sub-tasks | 5-10 clear tasks max |
46
- | Every micro-step listed | Only actionable items |
47
- | Verbose descriptions | One-line per task |
42
+ ## Approach
43
+ High-level strategy. Enough detail for someone unfamiliar with the code to understand the direction.
44
+ Not implementation details - those go in the tasks.
48
45
 
49
- > **Rule:** If plan is longer than 1 page, it's too long. Simplify.
46
+ ## Tasks
50
47
 
51
- ---
48
+ ### Phase 1 - [Name] (prerequisite for Phase 2)
49
+ - [ ] Task 1.1: Description
50
+ - [ ] Task 1.2: Description (depends on 1.1)
52
51
 
53
- ### Principle 2: Be SPECIFIC, Not Generic
52
+ ### Phase 2 - [Name] (can run after Phase 1 is complete)
53
+ - [ ] Task 2.1: Description
54
+ - [ ] Task 2.2: Description
54
55
 
55
- | ❌ Wrong | ✅ Right |
56
- |----------|----------|
57
- | "Set up project" | "Run `npx create-next-app`" |
58
- | "Add authentication" | "Install next-auth, create `/api/auth/[...nextauth].ts`" |
59
- | "Style the UI" | "Add Tailwind classes to `Header.tsx`" |
56
+ ## Verification
57
+ How will we know this is done and working?
58
+ - [ ] Specific behavior that can be tested
59
+ - [ ] Metric or log line that confirms success
60
+ - [ ] Edge case that must not regress
60
61
 
61
- > **Rule:** Each task should have a clear, verifiable outcome.
62
+ ## Risks and Open Questions
63
+ - [Risk]: What might go wrong, and what's the mitigation?
64
+ - [Open]: What decision hasn't been made yet that could change this plan?
65
+
66
+ ## Files That Will Change
67
+ - `path/to/file.ts` - what changes
68
+ - `path/to/schema.sql` - what changes
69
+ ```
62
70
 
63
71
  ---
64
72
 
65
- ### Principle 3: Dynamic Content Based on Project Type
73
+ ## Dependency Notation
66
74
 
67
- **For NEW PROJECT:**
68
- - What tech stack? (decide first)
69
- - What's the MVP? (minimal features)
70
- - What's the file structure?
75
+ When tasks have a strict order, mark it:
71
76
 
72
- **For FEATURE ADDITION:**
73
- - Which files are affected?
74
- - What dependencies needed?
75
- - How to verify it works?
77
+ ```
78
+ Task A - (no dependencies, do first)
79
+ Task B - (requires A complete)
80
+ Task C - (can run parallel with B)
81
+ Task D - (requires B and C complete)
82
+ ```
76
83
 
77
- **For BUG FIX:**
78
- - What's the root cause?
79
- - What file/line to change?
80
- - How to test the fix?
84
+ This prevents teams from working on D while B is still broken.
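The notation above can be checked mechanically. The following is a minimal sketch, not something shipped in the package: it groups tasks into "waves" with a layered topological sort, so every task in a wave has all of its prerequisites completed in earlier waves. The `Task` type and `planWaves` function are hypothetical names, and the example assumes Task C also requires A, which the notation leaves unstated.

```ts
// Hypothetical task shape: an id plus the ids it must wait for.
type Task = { id: string; requires: string[] };

// Layered topological sort: each wave holds tasks whose prerequisites are all done.
function planWaves(tasks: Task[]): string[][] {
  const done = new Set<string>();
  let pending = tasks.slice();
  const waves: string[][] = [];
  while (pending.length > 0) {
    const ready = pending.filter(t => t.requires.every(r => done.has(r)));
    if (ready.length === 0) {
      // Nothing can start: the remaining tasks form a cycle or name an unknown task.
      throw new Error("Cycle or missing dependency among: " + pending.map(t => t.id).join(", "));
    }
    waves.push(ready.map(t => t.id));
    ready.forEach(t => done.add(t.id));
    pending = pending.filter(t => !done.has(t.id));
  }
  return waves;
}

const waves = planWaves([
  { id: "A", requires: [] },
  { id: "B", requires: ["A"] },
  { id: "C", requires: ["A"] }, // assumption: C can run alongside B once A is done
  { id: "D", requires: ["B", "C"] },
]);
console.log(waves);
```

Tasks that share a wave are the ones that can be worked on in parallel; a thrown error signals that the plan's dependencies cannot be satisfied as written.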
81
85
 
82
86
  ---
83
87
 
84
- ### Principle 4: Scripts Are Project-Specific
88
+ ## Task Granularity
85
89
 
86
- > 🔴 **DO NOT copy-paste script commands. Choose based on project type.**
90
+ Each task should be:
91
+ - Completable in one session by one person
92
+ - Independently reviewable (a PR could represent one task)
93
+ - Testable: there is a concrete way to know if it's done
87
94
 
88
- | Project Type | Relevant Scripts |
89
- |--------------|------------------|
90
- | Frontend/React | `ux_audit.py`, `accessibility_checker.py` |
91
- | Backend/API | `api_validator.py`, `security_scan.py` |
92
- | Mobile | `mobile_audit.py` |
93
- | Database | `schema_validator.py` |
94
- | Full-stack | Mix of above based on what you touched |
95
-
96
- **Wrong:** Adding all scripts to every plan
97
- **Right:** Only scripts relevant to THIS task
95
+ **Too vague:** "Implement the auth system"
96
+ **Right size:** "Add `POST /api/auth/login` endpoint with JWT issuance and Zod validation"
98
97
 
99
98
  ---
100
99
 
101
- ### Principle 5: Verification is Simple
100
+ ## Updating the Plan
101
+
102
+ Plans are living documents:
102
103
 
103
- | ❌ Wrong | ✅ Right |
104
- |----------|----------|
105
- | "Verify the component works correctly" | "Run `npm run dev`, click button, see toast" |
106
- | "Test the API" | "curl localhost:3000/api/users returns 200" |
107
- | "Check styles" | "Open browser, verify dark mode toggle works" |
104
+ - Mark tasks `[x]` when complete, not when started
105
+ - Add `[!]` to blocked tasks with a note on what is blocking
106
+ - When an assumption proves wrong, update the approach section; don't silently deviate from the plan
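To show how the `[x]` and `[!]` markers could be tallied, here is a small sketch that scans a plan's markdown and counts open, done, and blocked tasks. The `summarizePlan` function, the `TaskStatus` type, and the sample plan text are assumptions for illustration only.

```ts
type TaskStatus = "open" | "done" | "blocked";

// Hypothetical helper: count checklist items by their marker ([ ], [x], [!]).
function summarizePlan(markdown: string): Record<TaskStatus, number> {
  const summary: Record<TaskStatus, number> = { open: 0, done: 0, blocked: 0 };
  for (const line of markdown.split("\n")) {
    const match = /^\s*- \[( |x|!)\]/.exec(line);
    if (!match) continue; // not a checklist line
    if (match[1] === "x") summary.done++;
    else if (match[1] === "!") summary.blocked++;
    else summary.open++;
  }
  return summary;
}

// Illustrative plan fragment (assumed content).
const planText = [
  "### Phase 1",
  "- [x] Task 1.1: Add schema migration",
  "- [!] Task 1.2: Backfill data (blocked: waiting on DBA review)",
  "- [ ] Task 1.3: Switch reads to new column",
].join("\n");

const planSummary = summarizePlan(planText);
console.log(planSummary);
```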
108
107
 
109
108
  ---
110
109
 
111
- ## Plan Structure (Flexible, Not Fixed!)
110
+ ## Verification Criteria Rules
111
+
112
+ Verification criteria are not optional. For each task:
113
+
114
+ - At least one must be **observable** (you can see it, not just believe it)
115
+ - At least one must cover a **failure mode** (what should NOT happen)
112
116
 
113
117
  ```
114
- # [Task Name]
118
+ ✅ Observable: `POST /api/users` returns 201 with a user ID in the response body
119
+ ✅ Failure mode: `POST /api/users` with a duplicate email returns 409, not 500
120
+ ```
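The two criteria above translate directly into executable checks. The sketch below uses an in-memory stand-in for the endpoint rather than a real HTTP server; `createUserHandler`, `ApiResponse`, and the sample email are hypothetical, and a real plan would run the same assertions against the actual `POST /api/users` route.

```ts
type ApiResponse = { status: number; body: { id?: number; error?: string } };

// Hypothetical in-memory stand-in for the `POST /api/users` handler.
function createUserHandler(): (email: string) => ApiResponse {
  const emails = new Set<string>();
  let nextId = 1;
  return (email: string): ApiResponse => {
    if (emails.has(email)) {
      return { status: 409, body: { error: "email already registered" } };
    }
    emails.add(email);
    return { status: 201, body: { id: nextId++ } };
  };
}

const postUser = createUserHandler();
const created = postUser("ada@example.com");
const duplicate = postUser("ada@example.com");

// Observable: creation returns 201 with a user ID in the body.
console.log(created.status, created.body.id);
// Failure mode: a duplicate email returns 409, not 500.
console.log(duplicate.status);
```

Captured output from checks like these is the kind of concrete evidence a task's verification step can point to.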
115
121
 
116
- ## Goal
117
- One sentence: What are we building/fixing?
122
+ ---
118
123
 
119
- ## Tasks
120
- - [ ] Task 1: [Specific action] → Verify: [How to check]
121
- - [ ] Task 2: [Specific action] → Verify: [How to check]
122
- - [ ] Task 3: [Specific action] → Verify: [How to check]
124
+ ## 🛑 Verification-Before-Completion (VBC) Protocol
123
125
 
124
- ## Done When
125
- - [ ] [Main success criteria]
126
- ```
126
+ **CRITICAL:** Every plan must integrate a strict "evidence-based closeout" state machine for its tasks.
127
+ - ❌ **Forbidden:** Writing vague verification steps like "Check that it looks right," "Ensure the code makes sense," or "Verify the logic."
128
+ - ✅ **Required:** Verification criteria MUST demand **concrete terminal/compiler evidence** (e.g., test success logs, CLI execution outputs, compiler success states, or network trace results). Explicitly state that an agent CANNOT consider the task complete until it captures this hard evidence.
129
+
130
+ ---
131
+
132
+ ## Output Format
127
133
 
128
- > **That's it.** No phases, no sub-sections unless truly needed.
129
- > Keep it minimal. Add complexity only when required.
134
+ When this skill produces a recommendation or design decision, structure your output as:
130
135
 
131
- ## Notes
132
- [Any important considerations]
133
136
  ```
137
+ ━━━ Plan Writing Recommendation ━━━━━━━━━━━━━━━━
138
+ Decision: [what was chosen / proposed]
139
+ Rationale: [why - one concise line]
140
+ Trade-offs: [what is consciously accepted]
141
+ Next action: [concrete next step for the user]
142
+ ────────────────────────────────────────────────
143
+ Pre-Flight: ✅ All checks passed
144
+ or ❌ [blocking item that must be resolved first]
145
+ ```
146
+
147
+
134
148
 
135
149
  ---
136
150
 
137
- ## Best Practices (Quick Reference)
151
+ ## 🤖 LLM-Specific Traps
152
+
153
+ AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
138
154
 
139
- 1. **Start with goal** - What are we building/fixing?
140
- 2. **Max 10 tasks** - If more, break into multiple plans
141
- 3. **Each task verifiable** - Clear "done" criteria
142
- 4. **Project-specific** - No copy-paste templates
143
- 5. **Update as you go** - Mark `[x]` when complete
155
+ 1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
156
+ 2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
157
+ 3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
158
+ 4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
159
+ 5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
144
160
 
145
161
  ---
146
162
 
147
- ## When to Use
163
+ ## 🏛️ Tribunal Integration (Anti-Hallucination)
164
+
165
+ **Slash command: `/review` or `/tribunal-full`**
166
+ **Active reviewers: `logic-reviewer` · `security-auditor`**
167
+
168
+ ### ❌ Forbidden AI Tropes
169
+
170
+ 1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
171
+ 2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
172
+ 3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
173
+
174
+ ### ✅ Pre-Flight Self-Audit
175
+
176
+ Review these questions before confirming output:
177
+ ```
178
+ ✅ Did I rely ONLY on real, verified tools and methods?
179
+ ✅ Is this solution appropriately scoped to the user's constraints?
180
+ ✅ Did I handle potential failure modes and edge cases?
181
+ ✅ Have I avoided generic boilerplate that doesn't add value?
182
+ ```
183
+
184
+ ### 🛑 Verification-Before-Completion (VBC) Protocol
148
185
 
149
- - New project from scratch
150
- - Adding a feature
151
- - Fixing a bug (if complex)
152
- - Refactoring multiple files
186
+ **CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
187
+ - ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
188
+ - ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.