npm - tribunal-kit - Versions diffs - 1.0.0 → 2.4.0 - Mend

tribunal-kit 1.0.0 → 2.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (125)

package/.agent/.shared/ui-ux-pro-max/README.md +3 -3
package/.agent/ARCHITECTURE.md +205 -10
package/.agent/GEMINI.md +37 -7
package/.agent/agents/accessibility-reviewer.md +134 -0
package/.agent/agents/ai-code-reviewer.md +129 -0
package/.agent/agents/frontend-specialist.md +3 -0
package/.agent/agents/game-developer.md +21 -21
package/.agent/agents/logic-reviewer.md +12 -0
package/.agent/agents/mobile-reviewer.md +79 -0
package/.agent/agents/orchestrator.md +56 -26
package/.agent/agents/performance-reviewer.md +36 -0
package/.agent/agents/supervisor-agent.md +156 -0
package/.agent/agents/swarm-worker-contracts.md +166 -0
package/.agent/agents/swarm-worker-registry.md +92 -0
package/.agent/rules/GEMINI.md +134 -5
package/.agent/scripts/bundle_analyzer.py +259 -0
package/.agent/scripts/dependency_analyzer.py +247 -0
package/.agent/scripts/lint_runner.py +188 -0
package/.agent/scripts/patch_skills_meta.py +177 -0
package/.agent/scripts/patch_skills_output.py +285 -0
package/.agent/scripts/schema_validator.py +279 -0
package/.agent/scripts/security_scan.py +224 -0
package/.agent/scripts/session_manager.py +144 -3
package/.agent/scripts/skill_integrator.py +234 -0
package/.agent/scripts/strengthen_skills.py +220 -0
package/.agent/scripts/swarm_dispatcher.py +317 -0
package/.agent/scripts/test_runner.py +192 -0
package/.agent/scripts/test_swarm_dispatcher.py +163 -0
package/.agent/skills/agent-organizer/SKILL.md +132 -0
package/.agent/skills/agentic-patterns/SKILL.md +335 -0
package/.agent/skills/api-patterns/SKILL.md +226 -50
package/.agent/skills/app-builder/SKILL.md +215 -52
package/.agent/skills/architecture/SKILL.md +176 -31
package/.agent/skills/bash-linux/SKILL.md +150 -134
package/.agent/skills/behavioral-modes/SKILL.md +152 -160
package/.agent/skills/brainstorming/SKILL.md +148 -101
package/.agent/skills/brainstorming/dynamic-questioning.md +10 -0
package/.agent/skills/clean-code/SKILL.md +139 -134
package/.agent/skills/code-review-checklist/SKILL.md +177 -80
package/.agent/skills/config-validator/SKILL.md +165 -0
package/.agent/skills/csharp-developer/SKILL.md +107 -0
package/.agent/skills/database-design/SKILL.md +252 -29
package/.agent/skills/deployment-procedures/SKILL.md +122 -175
package/.agent/skills/devops-engineer/SKILL.md +134 -0
package/.agent/skills/devops-incident-responder/SKILL.md +98 -0
package/.agent/skills/documentation-templates/SKILL.md +175 -121
package/.agent/skills/dotnet-core-expert/SKILL.md +103 -0
package/.agent/skills/edge-computing/SKILL.md +213 -0
package/.agent/skills/frontend-design/SKILL.md +76 -0
package/.agent/skills/frontend-design/color-system.md +18 -0
package/.agent/skills/frontend-design/typography-system.md +18 -0
package/.agent/skills/game-development/SKILL.md +69 -0
package/.agent/skills/geo-fundamentals/SKILL.md +158 -99
package/.agent/skills/i18n-localization/SKILL.md +158 -96
package/.agent/skills/intelligent-routing/SKILL.md +89 -285
package/.agent/skills/intelligent-routing/router-manifest.md +65 -0
package/.agent/skills/lint-and-validate/SKILL.md +229 -27
package/.agent/skills/llm-engineering/SKILL.md +258 -0
package/.agent/skills/local-first/SKILL.md +203 -0
package/.agent/skills/mcp-builder/SKILL.md +159 -111
package/.agent/skills/mobile-design/SKILL.md +102 -282
package/.agent/skills/nextjs-react-expert/SKILL.md +143 -227
package/.agent/skills/nodejs-best-practices/SKILL.md +201 -254
package/.agent/skills/observability/SKILL.md +285 -0
package/.agent/skills/parallel-agents/SKILL.md +124 -118
package/.agent/skills/performance-profiling/SKILL.md +143 -89
package/.agent/skills/plan-writing/SKILL.md +133 -97
package/.agent/skills/platform-engineer/SKILL.md +135 -0
package/.agent/skills/powershell-windows/SKILL.md +167 -104
package/.agent/skills/python-patterns/SKILL.md +149 -361
package/.agent/skills/python-pro/SKILL.md +114 -0
package/.agent/skills/react-specialist/SKILL.md +107 -0
package/.agent/skills/realtime-patterns/SKILL.md +296 -0
package/.agent/skills/red-team-tactics/SKILL.md +136 -134
package/.agent/skills/rust-pro/SKILL.md +237 -173
package/.agent/skills/seo-fundamentals/SKILL.md +134 -82
package/.agent/skills/server-management/SKILL.md +155 -104
package/.agent/skills/sql-pro/SKILL.md +104 -0
package/.agent/skills/systematic-debugging/SKILL.md +156 -79
package/.agent/skills/tailwind-patterns/SKILL.md +163 -205
package/.agent/skills/tdd-workflow/SKILL.md +148 -88
package/.agent/skills/test-result-analyzer/SKILL.md +299 -0
package/.agent/skills/testing-patterns/SKILL.md +141 -114
package/.agent/skills/trend-researcher/SKILL.md +228 -0
package/.agent/skills/ui-ux-pro-max/SKILL.md +107 -0
package/.agent/skills/ui-ux-researcher/SKILL.md +234 -0
package/.agent/skills/vue-expert/SKILL.md +118 -0
package/.agent/skills/vulnerability-scanner/SKILL.md +228 -188
package/.agent/skills/web-design-guidelines/SKILL.md +148 -33
package/.agent/skills/webapp-testing/SKILL.md +171 -122
package/.agent/skills/whimsy-injector/SKILL.md +349 -0
package/.agent/skills/workflow-optimizer/SKILL.md +219 -0
package/.agent/workflows/api-tester.md +279 -0
package/.agent/workflows/audit.md +168 -0
package/.agent/workflows/brainstorm.md +65 -19
package/.agent/workflows/changelog.md +144 -0
package/.agent/workflows/create.md +67 -14
package/.agent/workflows/debug.md +122 -30
package/.agent/workflows/deploy.md +82 -31
package/.agent/workflows/enhance.md +59 -27
package/.agent/workflows/fix.md +143 -0
package/.agent/workflows/generate.md +84 -20
package/.agent/workflows/migrate.md +163 -0
package/.agent/workflows/orchestrate.md +66 -17
package/.agent/workflows/performance-benchmarker.md +305 -0
package/.agent/workflows/plan.md +76 -33
package/.agent/workflows/preview.md +73 -17
package/.agent/workflows/refactor.md +153 -0
package/.agent/workflows/review-ai.md +140 -0
package/.agent/workflows/review.md +83 -16
package/.agent/workflows/session.md +154 -0
package/.agent/workflows/status.md +74 -18
package/.agent/workflows/strengthen-skills.md +99 -0
package/.agent/workflows/swarm.md +194 -0
package/.agent/workflows/test.md +80 -31
package/.agent/workflows/tribunal-backend.md +55 -13
package/.agent/workflows/tribunal-database.md +62 -18
package/.agent/workflows/tribunal-frontend.md +58 -12
package/.agent/workflows/tribunal-full.md +70 -11
package/.agent/workflows/tribunal-mobile.md +123 -0
package/.agent/workflows/tribunal-performance.md +152 -0
package/.agent/workflows/ui-ux-pro-max.md +100 -82
package/README.md +117 -62
package/bin/tribunal-kit.js +329 -75
package/package.json +10 -6

package/.agent/skills/observability/SKILL.md ADDED

@@ -0,0 +1,285 @@
+---
+name: observability
+description: Production observability principles. OpenTelemetry traces, structured logs, metrics, SLOs/SLIs/error budgets, and AI observability. Use when setting up monitoring, debugging production issues, or designing observable distributed systems.
+allowed-tools: Read, Write, Edit, Glob, Grep
+version: 1.0.0
+last-updated: 2026-03-12
+applies-to-model: gemini-2.5-pro, claude-3-7-sonnet
+---
+# Observability Principles
+> Monitoring tells you when something is broken.
+> Observability tells you why.
+---
+## The Three Pillars
+```
+TRACES    → The journey of a single request across services
+            "Why was THIS request slow?"
+LOGS      → Discrete events with context
+            "What exactly happened at 14:23:07?"
+METRICS   → Aggregated measurements over time
+            "What is our error rate over the last hour?"
+```
+Use all three. They answer different questions. None replaces the others.
+---
+## OpenTelemetry: The Standard
+OpenTelemetry (OTel) is the vendor-neutral standard for instrumentation. Use it and you can swap backends (Jaeger, Grafana Tempo, Honeycomb, Datadog) without changing application code.
+```ts
+// src/instrumentation.ts — initialize OTel once, before app code
+import { NodeSDK } from '@opentelemetry/sdk-node';
+import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
+import { Resource } from '@opentelemetry/resources';
+import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';
+const sdk = new NodeSDK({
+  resource: new Resource({
+    [SemanticResourceAttributes.SERVICE_NAME]: 'my-api',
+    [SemanticResourceAttributes.SERVICE_VERSION]: '1.0.0',
+  }),
+  traceExporter: new OTLPTraceExporter({
+    url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT,
+  }),
+});
+sdk.start();
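+// Flush pending spans on shutdown; log failures instead of leaving the promise unhandled
+process.on('SIGTERM', () => {
+  sdk.shutdown().catch((err) => console.error('OTel shutdown failed', err));
+});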
+```
+---
+## Distributed Tracing
+Traces connect the dots across microservice boundaries:
+```ts
+import { trace, SpanStatusCode } from '@opentelemetry/api';
+const tracer = trace.getTracer('payment-service');
+async function processPayment(orderId: string, amount: number) {
+  return tracer.startActiveSpan('payment.process', async (span) => {
+    try {
+      // Add business context to the span
+      span.setAttributes({
+        'order.id': orderId,
+        'payment.amount': amount,
+        'payment.currency': 'USD',
+      });
+      const result = await chargeCard(orderId, amount);
+      span.setStatus({ code: SpanStatusCode.OK });
+      return result;
+    } catch (err) {
+      // Record the error with full context
+      span.recordException(err as Error);
+      span.setStatus({ code: SpanStatusCode.ERROR, message: (err as Error).message });
+      throw err;
+    } finally {
+      span.end();
+    }
+  });
+}
+```
+---
+## Structured Logging
+Logs must be machine-parseable:
+```ts
+// ❌ Unstructured — impossible to query, filter, or alert on
+console.log(`User ${userId} failed to login at ${new Date()}`);
+// ✅ Structured — every field is queryable
+logger.warn({
+  event: 'auth.login_failed',
+  userId,
+  reason: 'invalid_password',
+  attemptCount: 3,
+  ip: req.ip,
+  timestamp: new Date().toISOString(),
+});
+```
+### What to Always Log
+| Always | Never |
+|---|---|
+| Request ID / trace ID | Passwords or password hashes |
+| User ID (not PII) | Credit card numbers |
+| Error type + message | API keys or tokens |
+| Duration (ms) | Full request bodies (may contain PII) |
+| HTTP status code | |
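
One way to enforce the Always/Never table above is an allowlist applied before anything reaches the logger. The following is a minimal sketch, not part of the package: the field names (`requestId`, `durationMs`, and so on) and the helper `pickSafeFields` are hypothetical, and it only inspects top-level keys.

```ts
// Hypothetical allowlist: only these keys may reach the logger.
const ALLOWED_LOG_FIELDS = [
  'event', 'requestId', 'traceId', 'userId',
  'errorType', 'errorMessage', 'durationMs', 'statusCode',
] as const;

type LogFields = Record<string, unknown>;

// Copy allowlisted keys only; anything else (passwords, tokens, raw bodies) is dropped.
function pickSafeFields(raw: LogFields): LogFields {
  const safe: LogFields = {};
  for (const key of ALLOWED_LOG_FIELDS) {
    if (key in raw) {
      safe[key] = raw[key];
    }
  }
  return safe;
}

const safeEntry = pickSafeFields({
  event: 'auth.login_failed',
  userId: 'u_123',
  statusCode: 401,
  password: 'hunter2',       // not on the allowlist, dropped
  authorization: 'Bearer x', // not on the allowlist, dropped
});
```

An allowlist fails closed: a newly added sensitive field stays out of the logs until someone deliberately permits it, which a denylist cannot guarantee.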
+---
+## Metrics: What to Measure
+The four golden signals (Google SRE):
+```
+1. LATENCY       — How long does serving a request take?
+                   Track p50, p95, p99 — not just average
+                   Average hides the worst-case user experience
+2. TRAFFIC       — How much demand is there?
+                   requests/sec, messages/sec, bytes/sec
+3. ERRORS        — What fraction of requests are failing?
+                   HTTP 5xx rate, exception rate, timeout rate
+4. SATURATION    — How "full" is your service?
+                   CPU %, memory %, queue depth
+```
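
To illustrate why the latency signal above asks for percentiles rather than an average, here is a small sketch using the nearest-rank percentile definition. The sample latencies are made up for illustration, and `percentile` and `mean` are hypothetical helpers; a real system would normally read percentiles from its metrics backend.

```ts
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

function mean(samples: number[]): number {
  if (samples.length === 0) throw new Error('no samples');
  return samples.reduce((sum, value) => sum + value, 0) / samples.length;
}

// Hypothetical data: 98 requests at 100 ms, 2 requests at 5000 ms.
const latenciesMs: number[] = [
  ...Array.from({ length: 98 }, () => 100),
  5000,
  5000,
];

const avgMs = mean(latenciesMs);           // 198: looks acceptable
const p50Ms = percentile(latenciesMs, 50); // 100: the typical request
const p99Ms = percentile(latenciesMs, 99); // 5000: the tail the average hides
```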
+---
+## SLOs / SLIs / Error Budgets
+The framework that connects technical work to business reliability:
+```
+SLI (Service Level Indicator) — a specific, measurable signal:
+  "HTTP 200 responses as % of all responses to /api/checkout"
+SLO (Service Level Objective) — your reliability promise:
+  "99.9% of checkout requests succeed over a 30-day window"
+Error Budget — how much unreliability you can afford:
+  "30 days × 0.1% error tolerance = 43.2 minutes of downtime allowed"
+Error Budget Policy:
+  Budget healthy  → ship new features freely
+  Budget depleted → freeze releases, focus only on reliability
+```
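
The error budget arithmetic above (30 days at a 99.9% target) can be checked with a few lines. This is a sketch under simple assumptions: the budget is expressed as minutes of full downtime, and `errorBudgetMinutes` and `budgetPolicy` are hypothetical helper names, not part of the package.

```ts
// Allowed downtime in minutes for an availability SLO over a rolling window.
function errorBudgetMinutes(sloTarget: number, windowDays: number): number {
  if (sloTarget <= 0 || sloTarget >= 1) {
    throw new Error('sloTarget must be strictly between 0 and 1');
  }
  const windowMinutes = windowDays * 24 * 60;
  return windowMinutes * (1 - sloTarget);
}

type BudgetPolicy = 'ship-features' | 'freeze-releases';

// Apply the two-state policy: a depleted budget freezes releases.
function budgetPolicy(budgetMinutes: number, downtimeMinutes: number): BudgetPolicy {
  return downtimeMinutes >= budgetMinutes ? 'freeze-releases' : 'ship-features';
}

// 30 days = 43,200 minutes; 0.1% of that is about 43.2 minutes.
const checkoutBudget = errorBudgetMinutes(0.999, 30);

// Hypothetical: 12 minutes of downtime so far this window.
const policyNow = budgetPolicy(checkoutBudget, 12);
```

The result is a floating-point value, so compare it with a tolerance rather than with strict equality.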
+---
+## AI Observability
+Standard metrics don't cover AI systems. Add these:
+```ts
+// Track every AI call with these dimensions
+logger.info({
+  event: 'ai.completion',
+  model: 'gpt-4o',
+  prompt_tokens: response.usage.prompt_tokens,
+  completion_tokens: response.usage.completion_tokens,
+  total_tokens: response.usage.total_tokens,
+  latency_ms: duration,
+  cost_usd: calculateCost('gpt-4o', response.usage), // calculateCost: your own pricing helper
+  trace_id: currentTraceId(),
+  // Eval scores (from async evaluation pipeline)
+  eval_faithfulness: 0.92,    // Did output match sources?
+  eval_relevance: 0.88,       // Did output answer the question?
+});
+```
+### AI-Specific Alerts
+```
+🚨 TOKEN COST SPIKE     → cost per request > 2x trailing average → alert
+🚨 LATENCY DEGRADATION  → p95 LLM latency > 5s → alert
+🚨 EVAL SCORE DECLINE   → faithfulness drops below 0.8 (model drift?) → alert
+🚨 ERROR RATE SPIKE     → 429s or context_length errors > 5% → alert
+```
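
The four alert conditions above can be expressed as a pure function over one aggregation window. This is a sketch only: the `AiWindowStats` shape and the alert names are assumptions made for illustration, and the thresholds are copied from the list above. In practice these rules would more likely live in the alerting backend than in application code.

```ts
// Hypothetical per-window aggregate of AI call metrics.
interface AiWindowStats {
  costPerRequestUsd: number;
  trailingAvgCostUsd: number;
  p95LatencyMs: number;
  faithfulness: number; // eval score in [0, 1]
  errorRate: number;    // fraction of calls failing with 429 or context-length errors
}

// Return the names of every alert whose threshold is crossed.
function evaluateAiAlerts(stats: AiWindowStats): string[] {
  const alerts: string[] = [];
  if (stats.costPerRequestUsd > 2 * stats.trailingAvgCostUsd) alerts.push('TOKEN_COST_SPIKE');
  if (stats.p95LatencyMs > 5000) alerts.push('LATENCY_DEGRADATION');
  if (stats.faithfulness < 0.8) alerts.push('EVAL_SCORE_DECLINE');
  if (stats.errorRate > 0.05) alerts.push('ERROR_RATE_SPIKE');
  return alerts;
}

// Hypothetical window: cost has more than doubled and faithfulness has dipped.
const firingAlerts = evaluateAiAlerts({
  costPerRequestUsd: 0.09,
  trailingAvgCostUsd: 0.04,
  p95LatencyMs: 3200,
  faithfulness: 0.75,
  errorRate: 0.01,
});
```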
+---
+## Output Format
+When this skill produces a recommendation or design decision, structure your output as:
+```
+━━━ Observability Recommendation ━━━━━━━━━━━━━━━━
+Decision:    [what was chosen / proposed]
+Rationale:   [why — one concise line]
+Trade-offs:  [what is consciously accepted]
+Next action: [concrete next step for the user]
+─────────────────────────────────────────────────
+Pre-Flight:  ✅ All checks passed
+             or ❌ [blocking item that must be resolved first]
+```
+---
+## 🏛️ Tribunal Integration (Anti-Hallucination)
+**Slash command: `/tribunal-backend`**
+**Active reviewers: `logic` · `security` · `performance`**
+### ❌ Forbidden AI Tropes in Observability
+1. **Logging sensitive data** — never log request bodies wholesale — they contain passwords, tokens, PII. Log only specific, safe fields.
+2. **Tracking averages only** — `avg(latency)` hides the 1% of users who get 10x worse experience. Always use percentiles (p95, p99).
+3. **Near-100% SLO targets** — `99.999%` SLOs are wrong for most services. They leave almost no error budget (about 26 seconds per 30 days) and paralyze product velocity.
+4. **Inventing OTel packages** — only use `@opentelemetry/{sdk-node,api,exporter-*}` from the official `@opentelemetry` npm org.
+### ✅ Pre-Flight Self-Audit
+```
+✅ Are logs structured JSON (not string-interpolated messages)?
+✅ Is no PII or credential data being logged?
+✅ Are latency measurements tracking percentiles (p95/p99), not just averages?
+✅ Does every async operation have a trace span with error recording?
+✅ Are AI calls instrumented with token count + cost + latency tracking?
+✅ Is there an SLO defined with an explicit error budget policy?
+```
+---
+## 🤖 LLM-Specific Traps
+AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
+1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
+---
+## 🏛️ Tribunal Integration (Anti-Hallucination)
+**Slash command: `/review` or `/tribunal-full`**
+**Active reviewers: `logic-reviewer` · `security-auditor`**
+### ❌ Forbidden AI Tropes
+1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+### ✅ Pre-Flight Self-Audit
+Review these questions before confirming output:
+```
+✅ Did I rely ONLY on real, verified tools and methods?
+✅ Is this solution appropriately scoped to the user's constraints?
+✅ Did I handle potential failure modes and edge cases?
+✅ Have I avoided generic boilerplate that doesn't add value?
+```
+### 🛑 Verification-Before-Completion (VBC) Protocol
+**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+- ✅ **Required:** Before finalizing any task, provide **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.

package/.agent/skills/parallel-agents/SKILL.md CHANGED

@@ -1,175 +1,181 @@
 ---
 name: parallel-agents
 description: Multi-agent orchestration patterns. Use when multiple independent tasks can run with different domain expertise or when comprehensive analysis requires multiple perspectives.
-allowed-tools: Read, Glob, Grep
+allowed-tools: Read, Write, Edit, Glob, Grep
+version: 1.0.0
+last-updated: 2026-03-12
+applies-to-model: gemini-2.5-pro, claude-3-7-sonnet
 ---
-# Native Parallel Agents
+# Multi-Agent Orchestration
-> Orchestration through Antigravity's built-in Agent Tool
+> Parallel agents are faster. They are also harder to keep consistent.
+> Coordinate them — don't just fire them simultaneously and hope for compatible outputs.
-## Overview
-This skill enables coordinating multiple specialized agents through Antigravity's native agent system. Unlike external scripts, this approach keeps all orchestration within Antigravity's control.
+---
-## When to Use Orchestration
+## When to Use Parallel Agents
-✅ **Good for:**
-- Complex tasks requiring multiple expertise domains
-- Code analysis from security, performance, and quality perspectives
-- Comprehensive reviews (architecture + security + testing)
-- Feature implementation needing backend + frontend + database work
+Use multiple agents when:
+- Tasks are genuinely **independent** (output of A doesn't feed input of B)
+- Different tasks require **different domain expertise**
+- Comprehensive **review** needs multiple specialist perspectives simultaneously
+- Speed matters and tasks can be assigned and awaited independently
-❌ **Not for:**
-- Simple, single-domain tasks
-- Quick fixes or small changes
-- Tasks where one agent suffices
+Do **not** use parallel agents when:
+- Tasks have sequential dependencies (you need the result to start the next)
+- The overhead of coordination exceeds the time saved
 ---
-## Native Agent Invocation
+## Orchestration Patterns
-### Single Agent
-```
-Use the security-auditor agent to review authentication
-```
+### Pattern 1 — Parallel Review (Tribunal)
-### Sequential Chain
-```
-First, use the explorer-agent to discover project structure.
-Then, use the backend-specialist to review API endpoints.
-Finally, use the test-engineer to identify test gaps.
-```
+Multiple reviewers look at the same code simultaneously, each from a different angle.
-### With Context Passing
-```
-Use the frontend-specialist to analyze React components.
-Based on those findings, have the test-engineer generate component tests.
 ```
+Code (input)
+    ├── → logic-reviewer      → finds logic errors
+    ├── → security-auditor    → finds vulnerabilities
+    ├── → type-safety-reviewer → finds type-unsafe code
+    └── → performance-reviewer → finds bottlenecks
-### Resume Previous Work
-```
-Resume agent [agentId] and continue with additional requirements.
+All verdicts → synthesize → Human Gate (approve/reject/revise)
 ```
----
+**When:** `/tribunal-*` commands, code review before merge
-## Orchestration Patterns
+### Pattern 2 — Domain Specialization
+Different specialists handle different parts of the same task simultaneously.
-### Pattern 1: Comprehensive Analysis
 ```
-Agents: explorer-agent → [domain-agents] → synthesis
-1. explorer-agent: Map codebase structure
-2. security-auditor: Security posture
-3. backend-specialist: API quality
-4. frontend-specialist: UI/UX patterns
-5. test-engineer: Test coverage
-6. Synthesize all findings
+"Build a user auth system" (input)
+    ├── → backend-specialist    → API routes + JWT logic
+    ├── → frontend-specialist   → Login/register UI
+    └── → database-architect    → User schema + sessions table
+All outputs → orchestrator synthesizes into coherent system
+(ensures API contract matches what frontend calls,
+ and DB schema matches what backend queries)
 ```
-### Pattern 2: Feature Review
-```
-Agents: affected-domain-agents → test-engineer
+**When:** Full-stack feature builds via `/orchestrate`
-1. Identify affected domains (backend? frontend? both?)
-2. Invoke relevant domain agents
-3. test-engineer verifies changes
-4. Synthesize recommendations
-```
+### Pattern 3 — Sequential with Parallel Phases
+Some tasks are inherently sequential at the macro level but can parallelize within each phase.
-### Pattern 3: Security Audit
 ```
-Agents: security-auditor → penetration-tester → synthesis
+Phase 1 (sequential):
+  database-architect → schema design
-1. security-auditor: Configuration and code review
-2. penetration-tester: Active vulnerability testing
-3. Synthesize with prioritized remediation
+Phase 2 (parallel, after Phase 1):
+  2a: backend-specialist  → API uses schema from Phase 1
+  2b: frontend-specialist → UI uses the estimated API contract from 2a
+Phase 3 (sequential, after Phase 2):
+  test-engineer → E2E tests with real API + UI
 ```
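
The phase structure above maps naturally onto `await` for the sequential steps and `Promise.all` for the parallel one. The sketch below is illustrative only: `makeAgent` is a hypothetical stand-in that echoes its input, not the package's dispatcher, and the frontend agent receives the Phase 1 output as a placeholder for the estimated API contract.

```ts
// Hypothetical agent: takes a text input, resolves to a text output.
type Agent = (input: string) => Promise<string>;

// Stand-in factory; a real agent would call a model or a tool here.
const makeAgent = (agentName: string): Agent =>
  async (input) => `${agentName}(${input})`;

const databaseArchitect = makeAgent('database-architect');
const backendSpecialist = makeAgent('backend-specialist');
const frontendSpecialist = makeAgent('frontend-specialist');
const testEngineer = makeAgent('test-engineer');

async function runPhases(task: string): Promise<string> {
  // Phase 1 (sequential): the schema comes first because later phases depend on it.
  const schema = await databaseArchitect(task);

  // Phase 2 (parallel): both agents start from the same Phase 1 output.
  const [api, ui] = await Promise.all([
    backendSpecialist(schema),
    frontendSpecialist(schema),
  ]);

  // Phase 3 (sequential): tests need both Phase 2 outputs.
  return testEngineer(`${api} + ${ui}`);
}
```

`Promise.all` rejects as soon as either Phase 2 agent fails, so Phase 3 never runs against a partial result.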
 ---
-## Available Agents
-| Agent | Expertise | Trigger Phrases |
-|-------|-----------|-----------------|
-| `orchestrator` | Coordination | "comprehensive", "multi-perspective" |
-| `security-auditor` | Security | "security", "auth", "vulnerabilities" |
-| `penetration-tester` | Security Testing | "pentest", "red team", "exploit" |
-| `backend-specialist` | Backend | "API", "server", "Node.js", "Express" |
-| `frontend-specialist` | Frontend | "React", "UI", "components", "Next.js" |
-| `test-engineer` | Testing | "tests", "coverage", "TDD" |
-| `devops-engineer` | DevOps | "deploy", "CI/CD", "infrastructure" |
-| `database-architect` | Database | "schema", "Prisma", "migrations" |
-| `mobile-developer` | Mobile | "React Native", "Flutter", "mobile" |
-| `api-designer` | API Design | "REST", "GraphQL", "OpenAPI" |
-| `debugger` | Debugging | "bug", "error", "not working" |
-| `explorer-agent` | Discovery | "explore", "map", "structure" |
-| `documentation-writer` | Documentation | "write docs", "create README", "generate API docs" |
-| `performance-optimizer` | Performance | "slow", "optimize", "profiling" |
-| `project-planner` | Planning | "plan", "roadmap", "milestones" |
-| `seo-specialist` | SEO | "SEO", "meta tags", "search ranking" |
-| `game-developer` | Game Development | "game", "Unity", "Godot", "Phaser" |
+## Orchestrator Responsibilities
+The orchestrator coordinates agents. It:
+1. **Assigns scope** — each agent gets exactly what it needs, nothing more
+2. **Manages state** — passes the right outputs from each agent to the next that needs them
+3. **Resolves conflicts** — when two agents propose incompatible solutions, the orchestrator decides or asks the user
+4. **Verifies consistency** — ensures that the API contract the backend builds matches what the frontend calls
 ---
-## Antigravity Built-in Agents
+## Consistency Rules for Multi-Agent Output
-These work alongside custom agents:
+The biggest failure in parallel agent work is **inconsistency at boundaries**:
-| Agent | Model | Purpose |
-|-------|-------|---------|
-| **Explore** | Haiku | Fast read-only codebase search |
-| **Plan** | Sonnet | Research during plan mode |
-| **General-purpose** | Sonnet | Complex multi-step modifications |
+- Backend generates `userId` but frontend calls it `user_id`
+- Database schema has `user_email` but backend queries `email`
+- Agent A designs one error shape; Agent B assumes a different one
-Use **Explore** for quick searches, **custom agents** for domain expertise.
+**Prevention:**
+- Establish contracts (types, schemas, API shapes) **before** parallel work begins
+- Each agent receives the shared contract as context
+- Orchestrator reviews all outputs for boundary consistency before presenting to user
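
The boundary review described above could be partly mechanized by comparing each agent's field names against the shared contract. A minimal sketch, assuming a hypothetical flat `Shape` (field name to type name) and a hypothetical helper `findBoundaryMismatches`; real contracts would more likely be TypeScript types or JSON Schema.

```ts
// Hypothetical flat shape: field name -> type name.
type Shape = Record<string, string>;

// Report fields that are missing, mistyped, or not in the contract.
function findBoundaryMismatches(contract: Shape, produced: Shape, agent: string): string[] {
  const issues: string[] = [];
  for (const [field, expectedType] of Object.entries(contract)) {
    if (!(field in produced)) {
      issues.push(`${agent}: missing field "${field}"`);
    } else if (produced[field] !== expectedType) {
      issues.push(`${agent}: field "${field}" is ${produced[field]}, contract says ${expectedType}`);
    }
  }
  for (const field of Object.keys(produced)) {
    if (!(field in contract)) {
      issues.push(`${agent}: unexpected field "${field}"`);
    }
  }
  return issues;
}

// The userId / user_id mismatch from the list above.
const userContract: Shape = { userId: 'string', email: 'string' };
const frontendUses: Shape = { user_id: 'string', email: 'string' };

const boundaryIssues = findBoundaryMismatches(userContract, frontendUses, 'frontend-specialist');
```

Here `boundaryIssues` holds two entries: one for the missing `userId` and one for the unexpected `user_id`.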
 ---
-## Synthesis Protocol
+## Communication Format Between Agents
-After all agents complete, synthesize:
+When one agent's output feeds another:
-```markdown
-## Orchestration Synthesis
+```
+[AGENT: backend-specialist OUTPUT]
+API Contract:
+  POST /api/users → { id: string, email: string, createdAt: string }
+  POST /api/auth/login → { token: string, expiresAt: string }
-### Task Summary
-[What was accomplished]
+[AGENT: frontend-specialist RECEIVES]
+Use the above API contract. Build the UI to match these exact request/response shapes.
+```
+---
-### Agent Contributions
-| Agent | Finding |
-|-------|---------|
-| security-auditor | Found X |
-| backend-specialist | Identified Y |
+## Output Format
-### Consolidated Recommendations
-1. **Critical**: [Issue from Agent A]
-2. **Important**: [Issue from Agent B]
-3. **Nice-to-have**: [Enhancement from Agent C]
+When this skill completes a task, structure your output as:
-### Action Items
-- [ ] Fix critical security issue
-- [ ] Refactor API endpoint
-- [ ] Add missing tests
 ```
+━━━ Parallel Agents Output ━━━━━━━━━━━━━━━━━━━━━━━━
+Task:        [what was performed]
+Result:      [outcome summary — one line]
+─────────────────────────────────────────────────
+Checks:      ✅ [N passed] · ⚠️  [N warnings] · ❌ [N blocked]
+VBC status:  PENDING → VERIFIED
+Evidence:    [link to terminal output, test result, or file diff]
+```
 ---
-## Best Practices
+## 🤖 LLM-Specific Traps
+AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
-1. **Available agents** - 17 specialized agents can be orchestrated
-2. **Logical order** - Discovery → Analysis → Implementation → Testing
-3. **Share context** - Pass relevant findings to subsequent agents
-4. **Single synthesis** - One unified report, not separate outputs
-5. **Verify changes** - Always include test-engineer for code modifications
+1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
 ---
-## Key Benefits
+## 🏛️ Tribunal Integration (Anti-Hallucination)
+**Slash command: `/review` or `/tribunal-full`**
+**Active reviewers: `logic-reviewer` · `security-auditor`**
+### ❌ Forbidden AI Tropes
+1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+### ✅ Pre-Flight Self-Audit
+Review these questions before confirming output:
+```
+✅ Did I rely ONLY on real, verified tools and methods?
+✅ Is this solution appropriately scoped to the user's constraints?
+✅ Did I handle potential failure modes and edge cases?
+✅ Have I avoided generic boilerplate that doesn't add value?
+```
+### 🛑 Verification-Before-Completion (VBC) Protocol
-- ✅ **Single session** - All agents share context
-- ✅ **AI-controlled** - Claude orchestrates autonomously
-- ✅ **Native integration** - Works with built-in Explore, Plan agents
-- ✅ **Resume support** - Can continue previous agent work
-- ✅ **Context passing** - Findings flow between agents
+**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+- ✅ **Required:** Before finalizing any task, provide **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.