npm - tribunal-kit - Versions diffs - 1.0.0 → 2.4.2 - Mend

tribunal-kit 1.0.0 → 2.4.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (127)

package/.agent/.shared/ui-ux-pro-max/README.md +3 -3
package/.agent/ARCHITECTURE.md +205 -10
package/.agent/GEMINI.md +37 -7
package/.agent/agents/accessibility-reviewer.md +134 -0
package/.agent/agents/ai-code-reviewer.md +129 -0
package/.agent/agents/frontend-specialist.md +3 -0
package/.agent/agents/game-developer.md +21 -21
package/.agent/agents/logic-reviewer.md +12 -0
package/.agent/agents/mobile-reviewer.md +79 -0
package/.agent/agents/orchestrator.md +56 -26
package/.agent/agents/performance-reviewer.md +36 -0
package/.agent/agents/supervisor-agent.md +156 -0
package/.agent/agents/swarm-worker-contracts.md +166 -0
package/.agent/agents/swarm-worker-registry.md +92 -0
package/.agent/rules/GEMINI.md +134 -5
package/.agent/scripts/bundle_analyzer.py +259 -0
package/.agent/scripts/dependency_analyzer.py +247 -0
package/.agent/scripts/lint_runner.py +188 -0
package/.agent/scripts/patch_skills_meta.py +177 -0
package/.agent/scripts/patch_skills_output.py +285 -0
package/.agent/scripts/schema_validator.py +279 -0
package/.agent/scripts/security_scan.py +224 -0
package/.agent/scripts/session_manager.py +144 -3
package/.agent/scripts/skill_integrator.py +234 -0
package/.agent/scripts/strengthen_skills.py +220 -0
package/.agent/scripts/swarm_dispatcher.py +317 -0
package/.agent/scripts/test_runner.py +192 -0
package/.agent/scripts/test_swarm_dispatcher.py +163 -0
package/.agent/skills/agent-organizer/SKILL.md +132 -0
package/.agent/skills/agentic-patterns/SKILL.md +335 -0
package/.agent/skills/api-patterns/SKILL.md +226 -50
package/.agent/skills/app-builder/SKILL.md +215 -52
package/.agent/skills/architecture/SKILL.md +176 -31
package/.agent/skills/bash-linux/SKILL.md +150 -134
package/.agent/skills/behavioral-modes/SKILL.md +152 -160
package/.agent/skills/brainstorming/SKILL.md +148 -101
package/.agent/skills/brainstorming/dynamic-questioning.md +10 -0
package/.agent/skills/clean-code/SKILL.md +139 -134
package/.agent/skills/code-review-checklist/SKILL.md +177 -80
package/.agent/skills/config-validator/SKILL.md +165 -0
package/.agent/skills/csharp-developer/SKILL.md +107 -0
package/.agent/skills/database-design/SKILL.md +252 -29
package/.agent/skills/deployment-procedures/SKILL.md +122 -175
package/.agent/skills/devops-engineer/SKILL.md +134 -0
package/.agent/skills/devops-incident-responder/SKILL.md +98 -0
package/.agent/skills/documentation-templates/SKILL.md +175 -121
package/.agent/skills/dotnet-core-expert/SKILL.md +103 -0
package/.agent/skills/edge-computing/SKILL.md +213 -0
package/.agent/skills/frontend-design/SKILL.md +76 -0
package/.agent/skills/frontend-design/color-system.md +18 -0
package/.agent/skills/frontend-design/typography-system.md +18 -0
package/.agent/skills/game-development/SKILL.md +69 -0
package/.agent/skills/geo-fundamentals/SKILL.md +158 -99
package/.agent/skills/github-operations/SKILL.md +354 -0
package/.agent/skills/i18n-localization/SKILL.md +158 -96
package/.agent/skills/intelligent-routing/SKILL.md +89 -285
package/.agent/skills/intelligent-routing/router-manifest.md +65 -0
package/.agent/skills/lint-and-validate/SKILL.md +229 -27
package/.agent/skills/llm-engineering/SKILL.md +258 -0
package/.agent/skills/local-first/SKILL.md +203 -0
package/.agent/skills/mcp-builder/SKILL.md +159 -111
package/.agent/skills/mobile-design/SKILL.md +102 -282
package/.agent/skills/nextjs-react-expert/SKILL.md +143 -227
package/.agent/skills/nodejs-best-practices/SKILL.md +201 -254
package/.agent/skills/observability/SKILL.md +285 -0
package/.agent/skills/parallel-agents/SKILL.md +124 -118
package/.agent/skills/performance-profiling/SKILL.md +143 -89
package/.agent/skills/plan-writing/SKILL.md +133 -97
package/.agent/skills/platform-engineer/SKILL.md +135 -0
package/.agent/skills/powershell-windows/SKILL.md +167 -104
package/.agent/skills/python-patterns/SKILL.md +149 -361
package/.agent/skills/python-pro/SKILL.md +114 -0
package/.agent/skills/react-specialist/SKILL.md +107 -0
package/.agent/skills/readme-builder/SKILL.md +270 -0
package/.agent/skills/realtime-patterns/SKILL.md +296 -0
package/.agent/skills/red-team-tactics/SKILL.md +136 -134
package/.agent/skills/rust-pro/SKILL.md +237 -173
package/.agent/skills/seo-fundamentals/SKILL.md +134 -82
package/.agent/skills/server-management/SKILL.md +155 -104
package/.agent/skills/sql-pro/SKILL.md +104 -0
package/.agent/skills/systematic-debugging/SKILL.md +156 -79
package/.agent/skills/tailwind-patterns/SKILL.md +163 -205
package/.agent/skills/tdd-workflow/SKILL.md +148 -88
package/.agent/skills/test-result-analyzer/SKILL.md +299 -0
package/.agent/skills/testing-patterns/SKILL.md +141 -114
package/.agent/skills/trend-researcher/SKILL.md +228 -0
package/.agent/skills/ui-ux-pro-max/SKILL.md +107 -0
package/.agent/skills/ui-ux-researcher/SKILL.md +234 -0
package/.agent/skills/vue-expert/SKILL.md +118 -0
package/.agent/skills/vulnerability-scanner/SKILL.md +228 -188
package/.agent/skills/web-design-guidelines/SKILL.md +148 -33
package/.agent/skills/webapp-testing/SKILL.md +171 -122
package/.agent/skills/whimsy-injector/SKILL.md +349 -0
package/.agent/skills/workflow-optimizer/SKILL.md +219 -0
package/.agent/workflows/api-tester.md +279 -0
package/.agent/workflows/audit.md +168 -0
package/.agent/workflows/brainstorm.md +65 -19
package/.agent/workflows/changelog.md +144 -0
package/.agent/workflows/create.md +67 -14
package/.agent/workflows/debug.md +122 -30
package/.agent/workflows/deploy.md +82 -31
package/.agent/workflows/enhance.md +59 -27
package/.agent/workflows/fix.md +143 -0
package/.agent/workflows/generate.md +84 -20
package/.agent/workflows/migrate.md +163 -0
package/.agent/workflows/orchestrate.md +66 -17
package/.agent/workflows/performance-benchmarker.md +305 -0
package/.agent/workflows/plan.md +76 -33
package/.agent/workflows/preview.md +73 -17
package/.agent/workflows/refactor.md +153 -0
package/.agent/workflows/review-ai.md +140 -0
package/.agent/workflows/review.md +83 -16
package/.agent/workflows/session.md +154 -0
package/.agent/workflows/status.md +74 -18
package/.agent/workflows/strengthen-skills.md +99 -0
package/.agent/workflows/swarm.md +194 -0
package/.agent/workflows/test.md +80 -31
package/.agent/workflows/tribunal-backend.md +55 -13
package/.agent/workflows/tribunal-database.md +62 -18
package/.agent/workflows/tribunal-frontend.md +58 -12
package/.agent/workflows/tribunal-full.md +70 -11
package/.agent/workflows/tribunal-mobile.md +123 -0
package/.agent/workflows/tribunal-performance.md +152 -0
package/.agent/workflows/ui-ux-pro-max.md +100 -82
package/README.md +117 -62
package/bin/tribunal-kit.js +542 -288
package/package.json +10 -6

package/.agent/skills/web-design-guidelines/SKILL.md CHANGED

@@ -1,57 +1,172 @@
 ---
 name: web-design-guidelines
-description: Review UI code for Web Interface Guidelines compliance. Use when asked to "review my UI", "check accessibility", "audit design", "review UX", or "check my site against best practices".
-metadata:
-  author: vercel
-  version: "1.0.0"
-  argument-hint: <file-or-pattern>
+description: Review UI code for Next-Generation Web Interface Guidelines compliance. Use when asked to "review my UI", "check accessibility", "audit design", "review UX", or "check my site against best practices".
+allowed-tools: Read, Write, Edit, Glob, Grep
+version: 1.0.0
+last-updated: 2026-03-12
+applies-to-model: gemini-2.5-pro, claude-3-7-sonnet
 ---
-# Web Interface Guidelines
+# Next-Gen Web Interface Review Guidelines (Pro-Max Level)
-Review files for compliance with Web Interface Guidelines.
+> Good UI is invisible. Users think about their task, not about the interface.
+> Great UI anticipates the task and reacts at the speed of thought.
-## How It Works
+---
+## Review Trigger
+Load this skill when asked to:
+- Review or audit a UI
+- Check accessibility compliance (WCAG 3.0 / APCA)
+- Improve UX & Cognitive Safety
+- Check a site against best practices
+---
+## Extreme Review Categories
+### 1. Neuro-Inclusivity & Accessibility (WCAG 3.0 APCA Base)
+Non-negotiable baseline for any public interface in 2026+:
+| Check | How to Verify |
+|---|---|
+| **APCA Contrast** | Ensure Lc (Lightness Contrast) is > 75 for body text, > 60 for large text. (Do not rely solely on old WCAG 2.1 4.5:1 math). |
+| **Cognitive Safety** | Check if `prefers-reduced-motion` is respected. No infinite spinning loaders. |
+| **Keyboard Fluidity** | Tab order must follow visual order. Focus states cannot be just a 1px dotted line; use `outline: 2px solid var(--focus-color); outline-offset: 2px;`. |
+| **Semantic AI Context** | Images must have `alt` attributes, but complex charts need full `<details>` breakdowns for screen readers and AI agents crawling the site. |
+| **Interaction Buffers** | Are touch targets mathematically ≥48px (Fitts' Law)? |
+### 2. Extreme Core Web Vitals (CWV)
+Drop the old 2022 standards. The new baseline for premium web:
-1. Fetch the latest guidelines from the source URL below
-2. Read the specified files (or prompt user for files/pattern)
-3. Check against all rules in the fetched guidelines
-4. Output findings in the terse `file:line` format
+| Metric | Premium Target | Common Failures |
+|---|---|---|
+| **LCP** (Largest Contentful Paint) | **< 1.5s** | Missing `fetchpriority="high"` on hero images. Heavy client-side React rendering blocking the paint. |
+| **INP** (Interaction to Next Paint) | **< 100ms** | Main thread blocked by React hydration. Use `startTransition` or Web Workers for heavy JS. |
+| **CLS** (Cumulative Layout Shift) | **0.00** | Missing `width` and `height` on images. Late-loading web fonts (use `font-display: optional`). |
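The targets in the table above can be turned into a small budget check. The following is a minimal sketch, not part of the package: the `VitalsSample` shape, the `failedVitals` name, and the choice of milliseconds as the unit are assumptions made for illustration.

```typescript
// Hypothetical sample shape; units are an assumption
// (milliseconds for LCP and INP, unitless score for CLS).
interface VitalsSample {
  lcpMs: number;
  inpMs: number;
  cls: number;
}

type VitalName = 'LCP' | 'INP' | 'CLS';

// Returns the names of the metrics that miss the "Premium Target" column.
function failedVitals(sample: VitalsSample): VitalName[] {
  const failures: VitalName[] = [];
  if (sample.lcpMs >= 1500) failures.push('LCP'); // target: < 1.5s
  if (sample.inpMs >= 100) failures.push('INP');  // target: < 100ms
  if (sample.cls > 0) failures.push('CLS');       // target: 0.00
  return failures;
}
```

A review could call `failedVitals` on field or lab measurements and report each returned name as a finding.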
-## Guidelines Source
+### 3. Energy Efficiency & Sustainability
-Fetch fresh guidelines before each review:
+Code has a carbon footprint and a battery cost.
+- **OLED Pure Black:** Does the dark mode use `#000` or `#010101` to physically turn off pixels?
+- **Animation Tax:** Are animations using CPU-heavy properties (`margin`, `width`) instead of GPU-accelerated ones (`transform`, `opacity`)?
+- **Asset Weight:** Are images AVIF/WebP? Are fonts subsetted Variable Fonts?
+### 4. Visual Design Quality (Pro-Max)
+Evaluate these brutally honestly:
+- **Mathematical Spacing:** Does spacing follow a strict scale (e.g., 4, 8, 16, 24, 32) or is it a mess of arbitrary pixels?
+- **Fluid Typography:** Is text using `clamp()` to scale, or does it awkwardly jump at breakpoints?
+- **The "Purple Ban":** Is the site heavily relying on deep purple/violet as a primary color? (Flag this as an overused AI-generated cliché).
+- **Z-Axis Depth:** Are shadows realistic (multi-layered CSS shadows) or flat and cheap (`box-shadow: 0 4px 6px #000`)?
+- **Micro-Interactions:** Do buttons scale down slightly on `:active`? Do elements use spring-physics easing (`cubic-bezier(0.34, 1.56, 0.64, 1)`)?
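For the fluid typography check above, a `clamp()` value can be derived by linear interpolation between two viewport widths. This is a minimal sketch under stated assumptions: the `fluidClamp` helper and its parameters are hypothetical, and a 16px root font size is assumed.

```typescript
// Builds a CSS clamp() expression that scales linearly from minRem at
// minViewportPx to maxRem at maxViewportPx (hypothetical helper).
function fluidClamp(
  minRem: number,
  maxRem: number,
  minViewportPx: number,
  maxViewportPx: number,
  rootPx = 16, // assumed root font size
): string {
  const minViewportRem = minViewportPx / rootPx;
  const maxViewportRem = maxViewportPx / rootPx;
  // Slope of the line through (minViewport, minRem) and (maxViewport, maxRem).
  const slope = (maxRem - minRem) / (maxViewportRem - minViewportRem);
  // Value of that line at a viewport width of zero.
  const interceptRem = minRem - slope * minViewportRem;
  const round = (n: number): number => Number(n.toFixed(4));
  return `clamp(${minRem}rem, ${round(interceptRem)}rem + ${round(slope * 100)}vw, ${maxRem}rem)`;
}
```

For example, `fluidClamp(1, 2, 320, 1280)` returns `clamp(1rem, 0.6667rem + 1.6667vw, 2rem)`, which evaluates to 1rem at a 320px viewport and 2rem at 1280px.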
+### 5. AI & Streaming UX
+- **Zero-Wait States:** If the app is waiting for an LLM/Server response, does it show a static spinner (BAD) or stream the skeleton/content (GOOD)?
+- **Optimistic UI:** Do likes/saves update the UI *instantly* before the server confirms?
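The optimistic UI check above can be illustrated with a small sketch. It is an assumption-laden example, not code from the package: `LikeState`, `toggleLikeOptimistically`, `render`, and `saveToServer` are hypothetical names. The UI is updated before the request resolves and rolled back if it fails, and the error is re-thrown so it is not silently swallowed.

```typescript
interface LikeState {
  liked: boolean;
  count: number;
}

// Applies the like toggle to the UI immediately, then confirms with the server.
async function toggleLikeOptimistically(
  state: LikeState,
  render: (next: LikeState) => void,               // hypothetical UI update callback
  saveToServer: (liked: boolean) => Promise<void>, // hypothetical network call
): Promise<LikeState> {
  const optimistic: LikeState = {
    liked: !state.liked,
    count: state.count + (state.liked ? -1 : 1),
  };
  render(optimistic); // update the UI before the server confirms
  try {
    await saveToServer(optimistic.liked);
    return optimistic;
  } catch (error) {
    render(state); // roll back to the last confirmed state
    throw error;   // let the caller log or surface the failure
  }
}
```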
+---
+## Common Review Findings
+| Finding | Severity | Fix |
+|---|---|---|
+| Missing/Weak focus styles | High | Add visible `:focus-visible` with offset |
+| `margin`/`padding` animated | High | Change to `transform: translate()` |
+| Touch targets under 48px | High | Increase padding/min-height |
+| Layout shifts on load (CLS > 0) | High | Pre-allocate space for async content/images |
+| Linear CSS transitions | Medium | Upgrade to spring-based `cubic-bezier` curves |
+| Purple as primary color | Low/Brand | Rethink palette — overused AI design cliché |
+---
+## Audit Format Template
+When reporting a UI review, use this exact brutal structure:
+```markdown
+## UI/UX Pro-Max Review: [Component/Page Name]
+### ♿ Neuro-Inclusivity & A11y
+- [BLOCKER] [Finding with specific element and fix]
+- [WARN] [Finding]
+### ⚡ Performance & Energy (CWV)
+- [Finding]
+### 🎨 Visual & Spatial Quality
+- [Finding]
+### 🛠️ Interaction & Physics
+- [Finding]
+### Summary
+X blockers, Y warnings, Z suggestions.
+Recommended action before shipping: [specific steps]
 ```
-https://raw.githubusercontent.com/vercel-labs/web-interface-guidelines/main/command.md
+---
+## Output Format
+When this skill produces a recommendation or design decision, structure your output as:
+```
+━━━ Web Design Guidelines Recommendation ━━━━━━━━━━━━━━━━
+Decision:    [what was chosen / proposed]
+Rationale:   [why — one concise line]
+Trade-offs:  [what is consciously accepted]
+Next action: [concrete next step for the user]
+─────────────────────────────────────────────────
+Pre-Flight:  ✅ All checks passed
+             or ❌ [blocking item that must be resolved first]
 ```
-Use WebFetch to retrieve the latest rules. The fetched content contains all the rules and output format instructions.
-## Usage
-When a user provides a file or pattern argument:
-1. Fetch guidelines from the source URL above
-2. Read the specified files
-3. Apply all rules from the fetched guidelines
-4. Output findings using the format specified in the guidelines
+---
+## 🤖 LLM-Specific Traps
+AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
-If no files specified, ask the user which files to review.
+1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
 ---
-## Related Skills
+## 🏛️ Tribunal Integration (Anti-Hallucination)
+**Slash command: `/review` or `/tribunal-full`**
+**Active reviewers: `logic-reviewer` · `security-auditor`**
+### ❌ Forbidden AI Tropes
-| Skill | When to Use |
-|-------|-------------|
-| **[frontend-design](../frontend-design/SKILL.md)** | Before coding - Learn design principles (color, typography, UX psychology) |
-| **web-design-guidelines** (this) | After coding - Audit for accessibility, performance, and best practices |
+1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
-## Design Workflow
+### ✅ Pre-Flight Self-Audit
+Review these questions before confirming output:
 ```
-1. DESIGN   → Read frontend-design principles
-2. CODE     → Implement the design
-3. AUDIT    → Run web-design-guidelines review ← YOU ARE HERE
-4. FIX      → Address findings from audit
+✅ Did I rely ONLY on real, verified tools and methods?
+✅ Is this solution appropriately scoped to the user's constraints?
+✅ Did I handle potential failure modes and edge cases?
+✅ Have I avoided generic boilerplate that doesn't add value?
 ```
+### 🛑 Verification-Before-Completion (VBC) Protocol
+**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+- ✅ **Required:** Provide **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended before finalizing any task.

package/.agent/skills/webapp-testing/SKILL.md CHANGED

@@ -1,187 +1,236 @@
 ---
 name: webapp-testing
 description: Web application testing principles. E2E, Playwright, deep audit strategies.
-allowed-tools: Read, Write, Edit, Glob, Grep, Bash
+allowed-tools: Read, Write, Edit, Glob, Grep
+version: 1.0.0
+last-updated: 2026-03-12
+applies-to-model: gemini-2.5-pro, claude-3-7-sonnet
 ---
-# Web App Testing
+# Web Application Testing
-> Discover and test everything. Leave no route untested.
+> E2E tests are the most expensive tests to write and maintain.
+> Write them for the flows that would wake someone up at 2am if they broke.
-## 🔧 Runtime Scripts
+---
-**Execute these for automated browser testing:**
+## What Belongs in E2E Tests
-| Script | Purpose | Usage |
-|--------|---------|-------|
-| `scripts/playwright_runner.py` | Basic browser test | `python scripts/playwright_runner.py https://example.com` |
-| | With screenshot | `python scripts/playwright_runner.py <url> --screenshot` |
-| | Accessibility check | `python scripts/playwright_runner.py <url> --a11y` |
+E2E tests simulate a real user in a real browser. Use them selectively:
-**Requires:** `pip install playwright && playwright install chromium`
+**Should be E2E:**
+- User can register and log in
+- User can complete a purchase / checkout flow
+- Critical form submission that triggers business logic
+- OAuth login flows
+- File upload and processing
----
+**Should NOT be E2E:**
+- Individual UI component appearance (use unit/visual tests)
+- API data validation (use API/integration tests)
+- Error message text (too brittle, too low value)
+- Every edge case (test edge cases at the service/unit level)
-## 1. Deep Audit Approach
+---
-### Discovery First
+## Playwright Patterns
-| Target | How to Find |
-|--------|-------------|
-| Routes | Scan app/, pages/, router files |
-| API endpoints | Grep for HTTP methods |
-| Components | Find component directories |
-| Features | Read documentation |
+### Page Object Model
-### Systematic Testing
+Encapsulate page interactions to keep tests maintainable:
-1. **Map** - List all routes/APIs
-2. **Scan** - Verify they respond
-3. **Test** - Cover critical paths
+```ts
+// page-objects/LoginPage.ts
+import type { Page } from '@playwright/test';
+export class LoginPage {
+  constructor(private page: Page) {}
----
+  get emailInput() { return this.page.getByLabel('Email'); }
+  get passwordInput() { return this.page.getByLabel('Password'); }
+  get submitButton() { return this.page.getByRole('button', { name: 'Sign in' }); }
-## 2. Testing Pyramid for Web
+  async login(email: string, password: string) {
+    await this.emailInput.fill(email);
+    await this.passwordInput.fill(password);
+    await this.submitButton.click();
+  }
+}
-```
-        /\          E2E (Few)
-       /  \         Critical user flows
-      /----\
-     /      \       Integration (Some)
-    /--------\      API, data flow
-   /          \
-  /------------\    Component (Many)
-                    Individual UI pieces
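+// tests/auth.spec.ts
+import { test, expect } from '@playwright/test';
+import { LoginPage } from '../page-objects/LoginPage';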
+test('user can log in with valid credentials', async ({ page }) => {
+  const loginPage = new LoginPage(page);
+  await page.goto('/login');
+  await loginPage.login('user@test.com', 'password123');
+  await expect(page).toHaveURL('/dashboard');
+});
 ```
----
+### Locator Strategy (Priority Order)
-## 3. E2E Test Principles
+Prefer locators that reflect how the user thinks about the element:
-### What to Test
+```ts
+// 1. Role (most semantic, most resilient)
+page.getByRole('button', { name: 'Submit' })
+page.getByRole('textbox', { name: 'Email' })
-| Priority | Tests |
-|----------|-------|
-| 1 | Happy path user flows |
-| 2 | Authentication flows |
-| 3 | Critical business actions |
-| 4 | Error handling |
+// 2. Label (tied to accessibility — good signal)
+page.getByLabel('Email address')
-### E2E Best Practices
+// 3. Text (works but can be fragile if copy changes)
+page.getByText('Welcome back')
-| Practice | Why |
-|----------|-----|
-| Use data-testid | Stable selectors |
-| Wait for elements | Avoid flaky tests |
-| Clean state | Independent tests |
-| Avoid implementation details | Test user behavior |
+// 4. Test ID (last resort — doesn't break on copy/layout changes)
+page.getByTestId('submit-button')
----
+// ❌ Never (fragile — breaks on any CSS refactor)
+page.locator('.btn.btn-primary.submit')
+page.locator('#form > div:nth-child(2) > input')
+```
-## 4. Playwright Principles
+### Waiting for State
-### Core Concepts
+```ts
+// ⚠️ Network idle works, but Playwright's docs discourage it for tests; prefer assertion-based waiting
+await page.waitForLoadState('networkidle');
-| Concept | Use |
-|---------|-----|
-| Page Object Model | Encapsulate page logic |
-| Fixtures | Reusable test setup |
-| Assertions | Built-in auto-wait |
-| Trace Viewer | Debug failures |
+// ✅ Wait for a specific element
+await page.waitForSelector('[data-testid="results"]');
-### Configuration
+// ✅ Assertion-based waiting (Playwright retries automatically)
+await expect(page.getByText('Order confirmed')).toBeVisible();
-| Setting | Recommendation |
-|---------|----------------|
-| Retries | 2 on CI |
-| Trace | on-first-retry |
-| Screenshots | on-failure |
-| Video | retain-on-failure |
+// ❌ Fixed sleep (brittle — too short in CI, too slow locally)
+await page.waitForTimeout(2000);
+```
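The idea behind assertion-based waiting, as opposed to a fixed sleep, is to re-check a condition until it holds or a deadline passes. The sketch below illustrates that idea only; it is not Playwright's implementation, and `waitUntil` with its default timings is a hypothetical helper.

```typescript
// Polls `condition` until it returns true or `timeoutMs` elapses.
async function waitUntil(
  condition: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 100,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await condition()) return; // succeed as soon as the condition holds
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs}ms`);
    }
    // Pause briefly before the next check instead of sleeping a fixed total time.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Unlike a fixed `waitForTimeout`, this returns as soon as the condition is satisfied and only fails once the full timeout has been spent.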
 ---
-## 5. Visual Testing
+## Test Data Management
-### When to Use
+Keep test data predictable and isolated:
-| Scenario | Value |
-|----------|-------|
-| Design system | High |
-| Marketing pages | High |
-| Component library | Medium |
-| Dynamic content | Lower |
+```ts
+// Seed database before tests that need specific data
+test.beforeEach(async ({ request }) => {
+  await request.post('/api/test/seed', {
+    data: { users: [testUser], products: [testProduct] }
+  });
+});
-### Strategy
+// Clean up after
+test.afterEach(async ({ request }) => {
+  await request.delete('/api/test/cleanup');
+});
+```
-- Baseline screenshots
-- Compare on changes
-- Review visual diffs
-- Update intentional changes
+**Rules:**
+- Each test owns its data and cleans up after itself
+- Tests don't share state through the database
+- Test accounts are distinguishable from real accounts (prefix: `test_`)
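The `test_` prefix rule above could be enforced with two small helpers. This is a sketch under assumptions: `makeTestUsername` and `isTestAccount` are hypothetical names, and the `label_suffix` layout after the prefix is an illustrative choice.

```typescript
const TEST_PREFIX = 'test_';

// Builds a username that is recognizable as a test account.
// `uniqueSuffix` (for example a run ID) keeps parallel tests from colliding.
function makeTestUsername(label: string, uniqueSuffix: string): string {
  return `${TEST_PREFIX}${label}_${uniqueSuffix}`;
}

// Guards cleanup code so it only ever touches test accounts.
function isTestAccount(username: string): boolean {
  return username.startsWith(TEST_PREFIX);
}
```

A cleanup routine can filter accounts with `isTestAccount` before deleting anything, so real accounts are never removed by a test run.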
 ---
-## 6. API Testing Principles
+## CI/CD Integration
+```yaml
+# GitHub Actions example
+playwright-tests:
+  runs-on: ubuntu-latest
+  steps:
+    - uses: actions/checkout@v4
+    - uses: actions/setup-node@v4
+    - run: npm ci
+    - run: npx playwright install --with-deps
+    - run: npm run test:e2e
+    - uses: actions/upload-artifact@v4
+      if: failure()
+      with:
+        name: playwright-report
+        path: playwright-report/
+```
+**Key configurations:**
+```ts
+// playwright.config.ts
+import { defineConfig } from '@playwright/test';
+export default defineConfig({
+  testDir: './tests/e2e',
+  fullyParallel: true,
+  retries: process.env.CI ? 2 : 0,  // retry only in CI
+  workers: process.env.CI ? 4 : 1,
+  reporter: [['html'], ['github']],
+  use: {
+    baseURL: process.env.BASE_URL || 'http://localhost:3000',
+    screenshot: 'only-on-failure',
+    video: 'retain-on-failure',
+    trace: 'on-first-retry',
+  },
+});
+```
+---
-### Coverage Areas
+## Scripts
-| Area | Tests |
-|------|-------|
-| Status codes | 200, 400, 404, 500 |
-| Response shape | Matches schema |
-| Error messages | User-friendly |
-| Edge cases | Empty, large, special chars |
+| Script | Purpose | Run With |
+|---|---|---|
+| `scripts/playwright_runner.py` | Runs Playwright test suite and reports | `python scripts/playwright_runner.py <project_path>` |
 ---
-## 7. Test Organization
+## Output Format
-### File Structure
+When this skill completes a task, structure your output as:
 ```
-tests/
-├── e2e/           # Full user flows
-├── integration/   # API, data
-├── component/     # UI units
-└── fixtures/      # Shared data
+━━━ Webapp Testing Output ━━━━━━━━━━━━━━━━━━━━━━━━
+Task:        [what was performed]
+Result:      [outcome summary — one line]
+─────────────────────────────────────────────────
+Checks:      ✅ [N passed] · ⚠️  [N warnings] · ❌ [N blocked]
+VBC status:  PENDING → VERIFIED
+Evidence:    [link to terminal output, test result, or file diff]
 ```
-### Naming Convention
-| Pattern | Example |
-|---------|---------|
-| Feature-based | `login.spec.ts` |
-| Descriptive | `user-can-checkout.spec.ts` |
 ---
-## 8. CI Integration
+## 🤖 LLM-Specific Traps
-### Pipeline Steps
+AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
-1. Install dependencies
-2. Install browsers
-3. Run tests
-4. Upload artifacts (traces, screenshots)
+1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
-### Parallelization
+---
-| Strategy | Use |
-|----------|-----|
-| Per file | Playwright default |
-| Sharding | Large suites |
-| Workers | Multiple browsers |
+## 🏛️ Tribunal Integration (Anti-Hallucination)
----
+**Slash command: `/review` or `/tribunal-full`**
+**Active reviewers: `logic-reviewer` · `security-auditor`**
-## 9. Anti-Patterns
+### ❌ Forbidden AI Tropes
-| ❌ Don't | ✅ Do |
-|----------|-------|
-| Test implementation | Test behavior |
-| Hardcode waits | Use auto-wait |
-| Skip cleanup | Isolate tests |
-| Ignore flaky tests | Fix root cause |
+1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
----
+### ✅ Pre-Flight Self-Audit
+Review these questions before confirming output:
+```
+✅ Did I rely ONLY on real, verified tools and methods?
+✅ Is this solution appropriately scoped to the user's constraints?
+✅ Did I handle potential failure modes and edge cases?
+✅ Have I avoided generic boilerplate that doesn't add value?
+```
+### 🛑 Verification-Before-Completion (VBC) Protocol
-> **Remember:** E2E tests are expensive. Use them for critical paths only.
+**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+- ✅ **Required:** Provide **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended before finalizing any task.