npm - tribunal-kit - Versions diffs - 1.0.0 → 2.4.0 - Mend

tribunal-kit 1.0.0 → 2.4.0

Files changed (125)

package/.agent/.shared/ui-ux-pro-max/README.md +3 -3
package/.agent/ARCHITECTURE.md +205 -10
package/.agent/GEMINI.md +37 -7
package/.agent/agents/accessibility-reviewer.md +134 -0
package/.agent/agents/ai-code-reviewer.md +129 -0
package/.agent/agents/frontend-specialist.md +3 -0
package/.agent/agents/game-developer.md +21 -21
package/.agent/agents/logic-reviewer.md +12 -0
package/.agent/agents/mobile-reviewer.md +79 -0
package/.agent/agents/orchestrator.md +56 -26
package/.agent/agents/performance-reviewer.md +36 -0
package/.agent/agents/supervisor-agent.md +156 -0
package/.agent/agents/swarm-worker-contracts.md +166 -0
package/.agent/agents/swarm-worker-registry.md +92 -0
package/.agent/rules/GEMINI.md +134 -5
package/.agent/scripts/bundle_analyzer.py +259 -0
package/.agent/scripts/dependency_analyzer.py +247 -0
package/.agent/scripts/lint_runner.py +188 -0
package/.agent/scripts/patch_skills_meta.py +177 -0
package/.agent/scripts/patch_skills_output.py +285 -0
package/.agent/scripts/schema_validator.py +279 -0
package/.agent/scripts/security_scan.py +224 -0
package/.agent/scripts/session_manager.py +144 -3
package/.agent/scripts/skill_integrator.py +234 -0
package/.agent/scripts/strengthen_skills.py +220 -0
package/.agent/scripts/swarm_dispatcher.py +317 -0
package/.agent/scripts/test_runner.py +192 -0
package/.agent/scripts/test_swarm_dispatcher.py +163 -0
package/.agent/skills/agent-organizer/SKILL.md +132 -0
package/.agent/skills/agentic-patterns/SKILL.md +335 -0
package/.agent/skills/api-patterns/SKILL.md +226 -50
package/.agent/skills/app-builder/SKILL.md +215 -52
package/.agent/skills/architecture/SKILL.md +176 -31
package/.agent/skills/bash-linux/SKILL.md +150 -134
package/.agent/skills/behavioral-modes/SKILL.md +152 -160
package/.agent/skills/brainstorming/SKILL.md +148 -101
package/.agent/skills/brainstorming/dynamic-questioning.md +10 -0
package/.agent/skills/clean-code/SKILL.md +139 -134
package/.agent/skills/code-review-checklist/SKILL.md +177 -80
package/.agent/skills/config-validator/SKILL.md +165 -0
package/.agent/skills/csharp-developer/SKILL.md +107 -0
package/.agent/skills/database-design/SKILL.md +252 -29
package/.agent/skills/deployment-procedures/SKILL.md +122 -175
package/.agent/skills/devops-engineer/SKILL.md +134 -0
package/.agent/skills/devops-incident-responder/SKILL.md +98 -0
package/.agent/skills/documentation-templates/SKILL.md +175 -121
package/.agent/skills/dotnet-core-expert/SKILL.md +103 -0
package/.agent/skills/edge-computing/SKILL.md +213 -0
package/.agent/skills/frontend-design/SKILL.md +76 -0
package/.agent/skills/frontend-design/color-system.md +18 -0
package/.agent/skills/frontend-design/typography-system.md +18 -0
package/.agent/skills/game-development/SKILL.md +69 -0
package/.agent/skills/geo-fundamentals/SKILL.md +158 -99
package/.agent/skills/i18n-localization/SKILL.md +158 -96
package/.agent/skills/intelligent-routing/SKILL.md +89 -285
package/.agent/skills/intelligent-routing/router-manifest.md +65 -0
package/.agent/skills/lint-and-validate/SKILL.md +229 -27
package/.agent/skills/llm-engineering/SKILL.md +258 -0
package/.agent/skills/local-first/SKILL.md +203 -0
package/.agent/skills/mcp-builder/SKILL.md +159 -111
package/.agent/skills/mobile-design/SKILL.md +102 -282
package/.agent/skills/nextjs-react-expert/SKILL.md +143 -227
package/.agent/skills/nodejs-best-practices/SKILL.md +201 -254
package/.agent/skills/observability/SKILL.md +285 -0
package/.agent/skills/parallel-agents/SKILL.md +124 -118
package/.agent/skills/performance-profiling/SKILL.md +143 -89
package/.agent/skills/plan-writing/SKILL.md +133 -97
package/.agent/skills/platform-engineer/SKILL.md +135 -0
package/.agent/skills/powershell-windows/SKILL.md +167 -104
package/.agent/skills/python-patterns/SKILL.md +149 -361
package/.agent/skills/python-pro/SKILL.md +114 -0
package/.agent/skills/react-specialist/SKILL.md +107 -0
package/.agent/skills/realtime-patterns/SKILL.md +296 -0
package/.agent/skills/red-team-tactics/SKILL.md +136 -134
package/.agent/skills/rust-pro/SKILL.md +237 -173
package/.agent/skills/seo-fundamentals/SKILL.md +134 -82
package/.agent/skills/server-management/SKILL.md +155 -104
package/.agent/skills/sql-pro/SKILL.md +104 -0
package/.agent/skills/systematic-debugging/SKILL.md +156 -79
package/.agent/skills/tailwind-patterns/SKILL.md +163 -205
package/.agent/skills/tdd-workflow/SKILL.md +148 -88
package/.agent/skills/test-result-analyzer/SKILL.md +299 -0
package/.agent/skills/testing-patterns/SKILL.md +141 -114
package/.agent/skills/trend-researcher/SKILL.md +228 -0
package/.agent/skills/ui-ux-pro-max/SKILL.md +107 -0
package/.agent/skills/ui-ux-researcher/SKILL.md +234 -0
package/.agent/skills/vue-expert/SKILL.md +118 -0
package/.agent/skills/vulnerability-scanner/SKILL.md +228 -188
package/.agent/skills/web-design-guidelines/SKILL.md +148 -33
package/.agent/skills/webapp-testing/SKILL.md +171 -122
package/.agent/skills/whimsy-injector/SKILL.md +349 -0
package/.agent/skills/workflow-optimizer/SKILL.md +219 -0
package/.agent/workflows/api-tester.md +279 -0
package/.agent/workflows/audit.md +168 -0
package/.agent/workflows/brainstorm.md +65 -19
package/.agent/workflows/changelog.md +144 -0
package/.agent/workflows/create.md +67 -14
package/.agent/workflows/debug.md +122 -30
package/.agent/workflows/deploy.md +82 -31
package/.agent/workflows/enhance.md +59 -27
package/.agent/workflows/fix.md +143 -0
package/.agent/workflows/generate.md +84 -20
package/.agent/workflows/migrate.md +163 -0
package/.agent/workflows/orchestrate.md +66 -17
package/.agent/workflows/performance-benchmarker.md +305 -0
package/.agent/workflows/plan.md +76 -33
package/.agent/workflows/preview.md +73 -17
package/.agent/workflows/refactor.md +153 -0
package/.agent/workflows/review-ai.md +140 -0
package/.agent/workflows/review.md +83 -16
package/.agent/workflows/session.md +154 -0
package/.agent/workflows/status.md +74 -18
package/.agent/workflows/strengthen-skills.md +99 -0
package/.agent/workflows/swarm.md +194 -0
package/.agent/workflows/test.md +80 -31
package/.agent/workflows/tribunal-backend.md +55 -13
package/.agent/workflows/tribunal-database.md +62 -18
package/.agent/workflows/tribunal-frontend.md +58 -12
package/.agent/workflows/tribunal-full.md +70 -11
package/.agent/workflows/tribunal-mobile.md +123 -0
package/.agent/workflows/tribunal-performance.md +152 -0
package/.agent/workflows/ui-ux-pro-max.md +100 -82
package/README.md +117 -62
package/bin/tribunal-kit.js +329 -75
package/package.json +10 -6

package/.agent/skills/performance-profiling/SKILL.md CHANGED

@@ -1,143 +1,197 @@
 ---
 name: performance-profiling
 description: Performance profiling principles. Measurement, analysis, and optimization techniques.
-allowed-tools: Read, Glob, Grep, Bash
+allowed-tools: Read, Write, Edit, Glob, Grep
+version: 1.0.0
+last-updated: 2026-03-12
+applies-to-model: gemini-2.5-pro, claude-3-7-sonnet
 ---
-# Performance Profiling
+# Performance Profiling Principles
-> Measure, analyze, optimize - in that order.
+> Never optimize code you haven't measured.
+> The bottleneck is almost never where you expect it to be.
-## 🔧 Runtime Scripts
+---
+## The Measurement-First Rule
-**Execute these for automated profiling:**
+Every performance investigation follows the same sequence:
-| Script | Purpose | Usage |
-|--------|---------|-------|
-| `scripts/lighthouse_audit.py` | Lighthouse performance audit | `python scripts/lighthouse_audit.py https://example.com` |
+```
+Measure → Identify hotspot → Form hypothesis → Change one thing → Measure again
+```
+Breaking this sequence — jumping straight to "fix" — wastes time and creates new problems.
 ---
-## 1. Core Web Vitals
+## What to Measure
-### Targets
+### Backend
-| Metric | Good | Poor | Measures |
-|--------|------|------|----------|
-| **LCP** | < 2.5s | > 4.0s | Loading |
-| **INP** | < 200ms | > 500ms | Interactivity |
-| **CLS** | < 0.1 | > 0.25 | Stability |
+| Metric | Tool | Target |
+|---|---|---|
+| Request throughput | ab, k6, wrk | Baseline + stress test |
+| P50/P95/P99 latency | DataDog, Grafana, k6 | P99 < SLA threshold |
+| Memory usage | `process.memoryUsage()`, heap snapshot | Stable under load (no growth) |
+| CPU usage | clinic.js flame chart | Identify blocking operations |
+| Database query time | Query logs, pg_stat_statements | No query > 100ms without index |
-### When to Measure
+### Frontend
-| Stage | Tool |
-|-------|------|
-| Development | Local Lighthouse |
-| CI/CD | Lighthouse CI |
-| Production | RUM (Real User Monitoring) |
+| Metric | Tool | Target (2025 Core Web Vitals) |
+|---|---|---|
+| LCP (Largest Contentful Paint) | Lighthouse, CrUX | < 2.5s |
+| INP (Interaction to Next Paint) | Lighthouse, Web Vitals | < 200ms |
+| CLS (Cumulative Layout Shift) | Lighthouse | < 0.1 |
+| Bundle size (JS) | `npm run build` + analyzer | < 200kB initial JS |
 ---
-## 2. Profiling Workflow
+## Common Backend Bottlenecks
+### N+1 Queries (most common)
-### The 4-Step Process
+```ts
+// ❌ 1 + N queries
+const posts = await db.post.findMany();
+for (const post of posts) {
+  post.author = await db.user.findUnique({ where: { id: post.authorId } });
+}
+// ✅ 2 queries total
+const posts = await db.post.findMany({ include: { author: true } });
 ```
-1. BASELINE → Measure current state
-2. IDENTIFY → Find the bottleneck
-3. FIX → Make targeted change
-4. VALIDATE → Confirm improvement
+**Detection:** Enable query logging. Repeated identical queries differing only by ID = N+1.
+### Missing Database Indexes
+```sql
+-- EXPLAIN ANALYZE tells you if a query is doing a sequential scan
+EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = $1;
+-- Sequential scan on large table → add index
+CREATE INDEX idx_orders_user_id ON orders(user_id);
 ```
-### Profiling Tool Selection
+### Blocking the Event Loop (Node.js)
-| Problem | Tool |
-|---------|------|
-| Page load | Lighthouse |
-| Bundle size | Bundle analyzer |
-| Runtime | DevTools Performance |
-| Memory | DevTools Memory |
-| Network | DevTools Network |
+```ts
+// ❌ Synchronous CPU work blocks all requests
+const result = JSON.parse(fs.readFileSync('huge.json', 'utf8'));
+// ✅ Non-blocking
+const content = await fs.promises.readFile('huge.json', 'utf8');
+const result = JSON.parse(content);  // still sync but no disk I/O blocking
+```
 ---
-## 3. Bundle Analysis
+## Common Frontend Bottlenecks
+### Bundle Size
-### What to Look For
+- Identify large packages with `npx vite-bundle-visualizer` or `@next/bundle-analyzer`
+- Replace heavy packages with lighter alternatives (e.g., `date-fns` instead of `moment`)
+- Code-split routes — don't ship all JavaScript on first load
-| Issue | Indicator |
-|-------|-----------|
-| Large dependencies | Top of bundle |
-| Duplicate code | Multiple chunks |
-| Unused code | Low coverage |
-| Missing splits | Single large chunk |
+### Render Performance
-### Optimization Actions
+```ts
+// ❌ Recalculates on every render
+function ExpensiveList({ items }) {
+  const sorted = items.sort((a, b) => a.name.localeCompare(b.name));
+  return sorted.map(item => <Item key={item.id} item={item} />);
+}
-| Finding | Action |
-|---------|--------|
-| Big library | Import specific modules |
-| Duplicate deps | Dedupe, update versions |
-| Route in main | Code split |
-| Unused exports | Tree shake |
+// ✅ Recalculates only when items change
+function ExpensiveList({ items }) {
+  const sorted = useMemo(
+    () => [...items].sort((a, b) => a.name.localeCompare(b.name)),
+    [items]
+  );
+  return sorted.map(item => <Item key={item.id} item={item} />);
+}
+```
 ---
-## 4. Runtime Profiling
+## Profiling Tools
-### Performance Tab Analysis
+| Tool | Platform | Best For |
+|---|---|---|
+| `clinic.js` (`clinic doctor`) | Node.js | CPU flame charts, memory leaks |
+| Chrome DevTools → Performance | Browser | JS execution, paint, layout |
+| `EXPLAIN ANALYZE` | PostgreSQL | Query plan analysis |
+| Lighthouse | Web | Full Core Web Vitals audit |
+| `k6` | Backend load testing | Throughput and latency under load |
-| Pattern | Meaning |
-|---------|---------|
-| Long tasks (>50ms) | UI blocking |
-| Many small tasks | Possible batching opportunity |
-| Layout/paint | Rendering bottleneck |
-| Script | JavaScript execution |
+---
-### Memory Tab Analysis
+## Scripts
-| Pattern | Meaning |
-|---------|---------|
-| Growing heap | Possible leak |
-| Large retained | Check references |
-| Detached DOM | Not cleaned up |
+| Script | Purpose | Run With |
+|---|---|---|
+| `scripts/lighthouse_audit.py` | Lighthouse performance audit | `python scripts/lighthouse_audit.py <url>` |
 ---
-## 5. Common Bottlenecks
+## Output Format
+When this skill produces a recommendation or design decision, structure your output as:
+```
+━━━ Performance Profiling Recommendation ━━━━━━━━━━━━━━━━
+Decision:    [what was chosen / proposed]
+Rationale:   [why — one concise line]
+Trade-offs:  [what is consciously accepted]
+Next action: [concrete next step for the user]
+─────────────────────────────────────────────────
+Pre-Flight:  ✅ All checks passed
+             or ❌ [blocking item that must be resolved first]
+```
-### By Symptom
-| Symptom | Likely Cause |
-|---------|--------------|
-| Slow initial load | Large JS, render blocking |
-| Slow interactions | Heavy event handlers |
-| Jank during scroll | Layout thrashing |
-| Growing memory | Leaks, retained refs |
 ---
-## 6. Quick Win Priorities
+## 🤖 LLM-Specific Traps
+AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
-| Priority | Action | Impact |
-|----------|--------|--------|
-| 1 | Enable compression | High |
-| 2 | Lazy load images | High |
-| 3 | Code split routes | High |
-| 4 | Cache static assets | Medium |
-| 5 | Optimize images | Medium |
+1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
 ---
-## 7. Anti-Patterns
+## 🏛️ Tribunal Integration (Anti-Hallucination)
-| ❌ Don't | ✅ Do |
-|----------|-------|
-| Guess at problems | Profile first |
-| Micro-optimize | Fix biggest issue |
-| Optimize early | Optimize when needed |
-| Ignore real users | Use RUM data |
+**Slash command: `/review` or `/tribunal-full`**
+**Active reviewers: `logic-reviewer` · `security-auditor`**
----
+### ❌ Forbidden AI Tropes
+1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+### ✅ Pre-Flight Self-Audit
+Review these questions before confirming output:
+```
+✅ Did I rely ONLY on real, verified tools and methods?
+✅ Is this solution appropriately scoped to the user's constraints?
+✅ Did I handle potential failure modes and edge cases?
+✅ Have I avoided generic boilerplate that doesn't add value?
+```
+### 🛑 Verification-Before-Completion (VBC) Protocol
-> **Remember:** The fastest code is code that doesn't run. Remove before optimizing.
+**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+- ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.
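
Reviewer note on the hunk above: the new Backend table sets a "P99 < SLA threshold" target but does not show how the percentiles are derived from raw samples. The following is a minimal sketch, not part of the package, assuming the nearest-rank percentile definition and latency samples in milliseconds. The names `percentile`, `summarizeLatency`, and `meetsSla` are hypothetical and do not appear in tribunal-kit.

```typescript
// Hypothetical helpers (not from tribunal-kit): nearest-rank percentiles
// over raw latency samples, assumed to be in milliseconds.

interface LatencySummary {
  p50: number;
  p95: number;
  p99: number;
}

// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error("percentile requires at least one sample");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in the range (0, 100]");
  }
  // Copy before sorting so the caller's array is not mutated.
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  // Multiply before dividing so the rank stays exact for integer p.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

function summarizeLatency(samplesMs: number[]): LatencySummary {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}

// Mirrors the table's target: P99 must be strictly below the SLA threshold.
function meetsSla(summary: LatencySummary, slaMs: number): boolean {
  return summary.p99 < slaMs;
}
```

Running `summarizeLatency` on samples taken before and after a single change gives the two measurements that the skill's "Measure → ... → Measure again" sequence asks for. Other percentile definitions (for example, linear interpolation) would give slightly different values on small sample sets.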

package/.agent/skills/plan-writing/SKILL.md CHANGED

@@ -1,152 +1,188 @@
 ---
 name: plan-writing
 description: Structured task planning with clear breakdowns, dependencies, and verification criteria. Use when implementing features, refactoring, or any multi-step work.
-allowed-tools: Read, Glob, Grep
+allowed-tools: Read, Write, Edit, Glob, Grep
+version: 1.0.0
+last-updated: 2026-03-12
+applies-to-model: gemini-2.5-pro, claude-3-7-sonnet
 ---
-# Plan Writing
+# Task Planning Standards
-> Source: obra/superpowers
+> A plan is not a promise. It is a map.
+> Maps get updated when the terrain doesn't match them.
-## Overview
-This skill provides a framework for breaking down work into clear, actionable tasks with verification criteria.
+---
-## Task Breakdown Principles
+## When to Write a Plan
-### 1. Small, Focused Tasks
-- Each task should take 2-5 minutes
-- One clear outcome per task
-- Independently verifiable
+Write a plan before implementation when:
+- The task touches more than 2 files in non-trivial ways
+- The task has dependencies (thing B can't start until thing A is done)
+- The task involves a risky operation (migration, data transformation, breaking change)
+- The team needs to review the approach before time is spent implementing it
-### 2. Clear Verification
-- How do you know it's done?
-- What can you check/test?
-- What's the expected output?
+Skip the formal plan for: single-function fixes, typo corrections, config tweaks.
-### 3. Logical Ordering
-- Dependencies identified
-- Parallel work where possible
-- Critical path highlighted
-- **Phase X: Verification is always LAST**
+---
-### 4. Dynamic Naming in Project Root
-- Plan files are saved as `{task-slug}.md` in the PROJECT ROOT
-- Name derived from task (e.g., "add auth" → `auth-feature.md`)
-- **NEVER** inside `.claude/`, `docs/`, or temp folders
+## Plan Structure
-## Planning Principles (NOT Templates!)
+```markdown
+# Plan: [Feature or Task Name]
-> 🔴 **NO fixed templates. Each plan is UNIQUE to the task.**
+## Goal
+One sentence: what outcome does this achieve?
-### Principle 1: Keep It SHORT
+## Context
+- Why is this being done?
+- What problem does it solve or what requirement does it satisfy?
+- What exists today that this changes?
-| ❌ Wrong | ✅ Right |
-|----------|----------|
-| 50 tasks with sub-sub-tasks | 5-10 clear tasks max |
-| Every micro-step listed | Only actionable items |
-| Verbose descriptions | One-line per task |
+## Approach
+High-level strategy. Enough detail for someone unfamiliar with the code to understand the direction.
+Not implementation details — those go in the tasks.
-> **Rule:** If plan is longer than 1 page, it's too long. Simplify.
+## Tasks
----
+### Phase 1 — [Name] (prerequisite for Phase 2)
+- [ ] Task 1.1: Description
+- [ ] Task 1.2: Description (depends on 1.1)
-### Principle 2: Be SPECIFIC, Not Generic
+### Phase 2 — [Name] (can run after Phase 1 is complete)
+- [ ] Task 2.1: Description
+- [ ] Task 2.2: Description
-| ❌ Wrong | ✅ Right |
-|----------|----------|
-| "Set up project" | "Run `npx create-next-app`" |
-| "Add authentication" | "Install next-auth, create `/api/auth/[...nextauth].ts`" |
-| "Style the UI" | "Add Tailwind classes to `Header.tsx`" |
+## Verification
+How will we know this is done and working?
+- [ ] Specific behavior that can be tested
+- [ ] Metric or log line that confirms success
+- [ ] Edge case that must not regress
-> **Rule:** Each task should have a clear, verifiable outcome.
+## Risks and Open Questions
+- [Risk]: What might go wrong, and what's the mitigation?
+- [Open]: What decision hasn't been made yet that could change this plan?
+## Files That Will Change
+- `path/to/file.ts` — what changes
+- `path/to/schema.sql` — what changes
+```
 ---
-### Principle 3: Dynamic Content Based on Project Type
+## Dependency Notation
-**For NEW PROJECT:**
-- What tech stack? (decide first)
-- What's the MVP? (minimal features)
-- What's the file structure?
+When tasks have a strict order, mark it:
-**For FEATURE ADDITION:**
-- Which files are affected?
-- What dependencies needed?
-- How to verify it works?
+```
+Task A — (no dependencies, do first)
+Task B — (requires A complete)
+Task C — (can run parallel with B)
+Task D — (requires B and C complete)
+```
-**For BUG FIX:**
-- What's the root cause?
-- What file/line to change?
-- How to test the fix?
+This prevents teams from working on D while B is still broken.
 ---
-### Principle 4: Scripts Are Project-Specific
+## Task Granularity
-> 🔴 **DO NOT copy-paste script commands. Choose based on project type.**
+Each task should be:
+- Completable in one session by one person
+- Independently reviewable (a PR could represent one task)
+- Testable: there is a concrete way to know if it's done
-| Project Type | Relevant Scripts |
-|--------------|------------------|
-| Frontend/React | `ux_audit.py`, `accessibility_checker.py` |
-| Backend/API | `api_validator.py`, `security_scan.py` |
-| Mobile | `mobile_audit.py` |
-| Database | `schema_validator.py` |
-| Full-stack | Mix of above based on what you touched |
-**Wrong:** Adding all scripts to every plan
-**Right:** Only scripts relevant to THIS task
+**Too vague:** "Implement the auth system"
+**Right size:** "Add `POST /api/auth/login` endpoint with JWT issuance and Zod validation"
 ---
-### Principle 5: Verification is Simple
+## Updating the Plan
+Plans are living documents:
-| ❌ Wrong | ✅ Right |
-|----------|----------|
-| "Verify the component works correctly" | "Run `npm run dev`, click button, see toast" |
-| "Test the API" | "curl localhost:3000/api/users returns 200" |
-| "Check styles" | "Open browser, verify dark mode toggle works" |
+- Mark tasks `[x]` when complete, not when started
+- Add `[!]` to blocked tasks with a note on what is blocking
+- When an assumption proves wrong, update the approach section — don't silently deviate from the plan
 ---
-## Plan Structure (Flexible, Not Fixed!)
+## Verification Criteria Rules
+Verification criteria are not optional. For each task:
+- At least one must be **observable** (you can see it, not just believe it)
+- At least one must cover a **failure mode** (what should NOT happen)
 ```
-# [Task Name]
+✅ Observable: `POST /api/users` returns 201 with a user ID in the response body
+✅ Failure mode: `POST /api/users` with a duplicate email returns 409, not 500
+```
-## Goal
-One sentence: What are we building/fixing?
+---
-## Tasks
-- [ ] Task 1: [Specific action] → Verify: [How to check]
-- [ ] Task 2: [Specific action] → Verify: [How to check]
-- [ ] Task 3: [Specific action] → Verify: [How to check]
+## 🛑 Verification-Before-Completion (VBC) Protocol
-## Done When
-- [ ] [Main success criteria]
-```
+**CRITICAL:** Every plan must integrate a strict "evidence-based closeout" state machine for its tasks.
+- ❌ **Forbidden:** Writing vague verification steps like "Check that it looks right," "Ensure the code makes sense," or "Verify the logic."
+- ✅ **Required:** Verification criteria MUST demand **concrete terminal/compiler evidence** (e.g., test success logs, CLI execution outputs, compiler success states, or network trace results). Explicitly state that an agent CANNOT consider the task complete until it captures this hard evidence.
+---
+## Output Format
-> **That's it.** No phases, no sub-sections unless truly needed.
-> Keep it minimal. Add complexity only when required.
+When this skill produces a recommendation or design decision, structure your output as:
-## Notes
-[Any important considerations]
 ```
+━━━ Plan Writing Recommendation ━━━━━━━━━━━━━━━━
+Decision:    [what was chosen / proposed]
+Rationale:   [why — one concise line]
+Trade-offs:  [what is consciously accepted]
+Next action: [concrete next step for the user]
+─────────────────────────────────────────────────
+Pre-Flight:  ✅ All checks passed
+             or ❌ [blocking item that must be resolved first]
+```
 ---
-## Best Practices (Quick Reference)
+## 🤖 LLM-Specific Traps
+AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
-1. **Start with goal** - What are we building/fixing?
-2. **Max 10 tasks** - If more, break into multiple plans
-3. **Each task verifiable** - Clear "done" criteria
-4. **Project-specific** - No copy-paste templates
-5. **Update as you go** - Mark `[x]` when complete
+1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
 ---
-## When to Use
+## 🏛️ Tribunal Integration (Anti-Hallucination)
+**Slash command: `/review` or `/tribunal-full`**
+**Active reviewers: `logic-reviewer` · `security-auditor`**
+### ❌ Forbidden AI Tropes
+1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+### ✅ Pre-Flight Self-Audit
+Review these questions before confirming output:
+```
+✅ Did I rely ONLY on real, verified tools and methods?
+✅ Is this solution appropriately scoped to the user's constraints?
+✅ Did I handle potential failure modes and edge cases?
+✅ Have I avoided generic boilerplate that doesn't add value?
+```
+### 🛑 Verification-Before-Completion (VBC) Protocol
-- New project from scratch
-- Adding a feature
-- Fixing a bug (if complex)
-- Refactoring multiple files
+**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+- ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.
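
Reviewer note on the hunk above: the new "Dependency Notation" section lists tasks A to D with their ordering constraints but leaves the resulting execution order implicit. The sketch below, which is not part of the package, groups tasks into waves that can run in parallel. It assumes that Task C also requires Task A (the diff only says C "can run parallel with B"), and the names `DependencyGraph` and `planOrder` are hypothetical.

```typescript
// Hypothetical helper (not from tribunal-kit): turn "requires X complete"
// annotations into ordered waves of tasks that may run in parallel.

// Maps each task name to the names of the tasks it requires.
type DependencyGraph = Record<string, string[]>;

function planOrder(deps: DependencyGraph): string[][] {
  const tasks = Object.keys(deps).sort();

  // Reject references to tasks that are not declared in the graph.
  tasks.forEach((task) => {
    deps[task].forEach((required) => {
      if (!Object.prototype.hasOwnProperty.call(deps, required)) {
        throw new Error(task + " requires unknown task " + required);
      }
    });
  });

  const done: Record<string, boolean> = {};
  const waves: string[][] = [];
  let pending = tasks;

  while (pending.length > 0) {
    // A task is ready once every task it requires finished in an earlier wave.
    const ready = pending.filter((task) =>
      deps[task].every((required) => done[required] === true)
    );
    if (ready.length === 0) {
      throw new Error("Dependency cycle among: " + pending.join(", "));
    }
    ready.forEach((task) => {
      done[task] = true;
    });
    waves.push(ready);
    pending = pending.filter((task) => done[task] !== true);
  }
  return waves;
}

// The A-D example from the diff, with the assumption that C requires A.
const exampleWaves = planOrder({
  A: [],
  B: ["A"],
  C: ["A"],
  D: ["B", "C"],
});
```

Under that assumption `exampleWaves` places A alone in the first wave, B and C together in the second, and D last, which matches the skill's point that nobody should start D while B is still unfinished.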