npm - tribunal-kit - Versions diffs - 2.4.6 → 3.1.0 - Mend

tribunal-kit 2.4.6 → 3.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (250)

package/.agent/ARCHITECTURE.md +99 -99
package/.agent/GEMINI.md +52 -52
package/.agent/agents/accessibility-reviewer.md +139 -86
package/.agent/agents/ai-code-reviewer.md +160 -90
package/.agent/agents/backend-specialist.md +164 -127
package/.agent/agents/code-archaeologist.md +115 -73
package/.agent/agents/database-architect.md +130 -110
package/.agent/agents/debugger.md +137 -97
package/.agent/agents/dependency-reviewer.md +78 -30
package/.agent/agents/devops-engineer.md +161 -118
package/.agent/agents/documentation-writer.md +151 -87
package/.agent/agents/explorer-agent.md +117 -99
package/.agent/agents/frontend-reviewer.md +127 -47
package/.agent/agents/frontend-specialist.md +169 -109
package/.agent/agents/game-developer.md +28 -164
package/.agent/agents/logic-reviewer.md +87 -49
package/.agent/agents/mobile-developer.md +151 -103
package/.agent/agents/mobile-reviewer.md +133 -50
package/.agent/agents/orchestrator.md +121 -110
package/.agent/agents/penetration-tester.md +103 -77
package/.agent/agents/performance-optimizer.md +136 -92
package/.agent/agents/performance-reviewer.md +139 -69
package/.agent/agents/product-manager.md +104 -70
package/.agent/agents/product-owner.md +6 -25
package/.agent/agents/project-planner.md +95 -95
package/.agent/agents/qa-automation-engineer.md +174 -87
package/.agent/agents/security-auditor.md +133 -129
package/.agent/agents/seo-specialist.md +160 -99
package/.agent/agents/sql-reviewer.md +132 -44
package/.agent/agents/supervisor-agent.md +137 -109
package/.agent/agents/swarm-worker-contracts.md +17 -17
package/.agent/agents/swarm-worker-registry.md +46 -46
package/.agent/agents/test-coverage-reviewer.md +132 -53
package/.agent/agents/test-engineer.md +0 -21
package/.agent/agents/type-safety-reviewer.md +143 -33
package/.agent/patterns/generator.md +9 -9
package/.agent/patterns/inversion.md +12 -12
package/.agent/patterns/pipeline.md +9 -9
package/.agent/patterns/reviewer.md +13 -13
package/.agent/patterns/tool-wrapper.md +9 -9
package/.agent/rules/GEMINI.md +63 -63
package/.agent/scripts/__pycache__/auto_preview.cpython-311.pyc +0 -0
package/.agent/scripts/__pycache__/bundle_analyzer.cpython-311.pyc +0 -0
package/.agent/scripts/__pycache__/checklist.cpython-311.pyc +0 -0
package/.agent/scripts/__pycache__/dependency_analyzer.cpython-311.pyc +0 -0
package/.agent/scripts/__pycache__/security_scan.cpython-311.pyc +0 -0
package/.agent/scripts/__pycache__/session_manager.cpython-311.pyc +0 -0
package/.agent/scripts/__pycache__/skill_integrator.cpython-311.pyc +0 -0
package/.agent/scripts/__pycache__/swarm_dispatcher.cpython-311.pyc +0 -0
package/.agent/scripts/__pycache__/test_runner.cpython-311.pyc +0 -0
package/.agent/scripts/__pycache__/verify_all.cpython-311.pyc +0 -0
package/.agent/scripts/compress_skills.py +167 -0
package/.agent/scripts/consolidate_skills.py +173 -0
package/.agent/scripts/deep_compress.py +202 -0
package/.agent/scripts/minify_context.py +80 -0
package/.agent/scripts/security_scan.py +1 -1
package/.agent/scripts/strip_tribunal.py +41 -0
package/.agent/skills/agent-organizer/SKILL.md +60 -100
package/.agent/skills/agentic-patterns/SKILL.md +0 -70
package/.agent/skills/ai-prompt-injection-defense/SKILL.md +108 -53
package/.agent/skills/api-patterns/SKILL.md +197 -257
package/.agent/skills/api-security-auditor/SKILL.md +125 -57
package/.agent/skills/app-builder/SKILL.md +326 -50
package/.agent/skills/app-builder/templates/SKILL.md +13 -15
package/.agent/skills/app-builder/templates/astro-static/TEMPLATE.md +16 -16
package/.agent/skills/app-builder/templates/chrome-extension/TEMPLATE.md +22 -22
package/.agent/skills/app-builder/templates/cli-tool/TEMPLATE.md +18 -18
package/.agent/skills/app-builder/templates/electron-desktop/TEMPLATE.md +20 -20
package/.agent/skills/app-builder/templates/express-api/TEMPLATE.md +17 -17
package/.agent/skills/app-builder/templates/flutter-app/TEMPLATE.md +18 -18
package/.agent/skills/app-builder/templates/monorepo-turborepo/TEMPLATE.md +21 -21
package/.agent/skills/app-builder/templates/nextjs-fullstack/TEMPLATE.md +19 -19
package/.agent/skills/app-builder/templates/nextjs-saas/TEMPLATE.md +26 -26
package/.agent/skills/app-builder/templates/nextjs-static/TEMPLATE.md +26 -26
package/.agent/skills/app-builder/templates/nuxt-app/TEMPLATE.md +19 -19
package/.agent/skills/app-builder/templates/python-fastapi/TEMPLATE.md +18 -18
package/.agent/skills/app-builder/templates/react-native-app/TEMPLATE.md +20 -20
package/.agent/skills/appflow-wireframe/SKILL.md +71 -98
package/.agent/skills/architecture/SKILL.md +161 -200
package/.agent/skills/authentication-best-practices/SKILL.md +121 -54
package/.agent/skills/bash-linux/SKILL.md +71 -166
package/.agent/skills/behavioral-modes/SKILL.md +8 -69
package/.agent/skills/brainstorming/SKILL.md +345 -127
package/.agent/skills/building-native-ui/SKILL.md +125 -57
package/.agent/skills/clean-code/SKILL.md +266 -149
package/.agent/skills/code-review-checklist/SKILL.md +0 -62
package/.agent/skills/config-validator/SKILL.md +73 -131
package/.agent/skills/csharp-developer/SKILL.md +434 -73
package/.agent/skills/database-design/SKILL.md +190 -275
package/.agent/skills/deployment-procedures/SKILL.md +81 -158
package/.agent/skills/devops-engineer/SKILL.md +255 -94
package/.agent/skills/devops-incident-responder/SKILL.md +50 -69
package/.agent/skills/doc.md +5 -5
package/.agent/skills/documentation-templates/SKILL.md +19 -63
package/.agent/skills/edge-computing/SKILL.md +75 -165
package/.agent/skills/extract-design-system/SKILL.md +84 -58
package/.agent/skills/framer-motion-expert/SKILL.md +195 -0
package/.agent/skills/frontend-design/SKILL.md +151 -499
package/.agent/skills/game-design-expert/SKILL.md +71 -0
package/.agent/skills/game-engineering-expert/SKILL.md +88 -0
package/.agent/skills/geo-fundamentals/SKILL.md +52 -178
package/.agent/skills/github-operations/SKILL.md +197 -272
package/.agent/skills/gsap-expert/SKILL.md +194 -0
package/.agent/skills/i18n-localization/SKILL.md +60 -172
package/.agent/skills/intelligent-routing/SKILL.md +123 -103
package/.agent/skills/lint-and-validate/SKILL.md +8 -52
package/.agent/skills/llm-engineering/SKILL.md +281 -195
package/.agent/skills/local-first/SKILL.md +76 -159
package/.agent/skills/mcp-builder/SKILL.md +48 -188
package/.agent/skills/mobile-design/SKILL.md +213 -219
package/.agent/skills/motion-engineering/SKILL.md +184 -0
package/.agent/skills/nextjs-react-expert/SKILL.md +184 -203
package/.agent/skills/nodejs-best-practices/SKILL.md +403 -185
package/.agent/skills/observability/SKILL.md +211 -203
package/.agent/skills/parallel-agents/SKILL.md +53 -146
package/.agent/skills/performance-profiling/SKILL.md +171 -151
package/.agent/skills/plan-writing/SKILL.md +49 -153
package/.agent/skills/platform-engineer/SKILL.md +57 -103
package/.agent/skills/playwright-best-practices/SKILL.md +110 -63
package/.agent/skills/powershell-windows/SKILL.md +61 -179
package/.agent/skills/python-patterns/SKILL.md +7 -35
package/.agent/skills/python-pro/SKILL.md +273 -114
package/.agent/skills/react-specialist/SKILL.md +227 -108
package/.agent/skills/readme-builder/SKILL.md +15 -85
package/.agent/skills/realtime-patterns/SKILL.md +216 -243
package/.agent/skills/red-team-tactics/SKILL.md +10 -51
package/.agent/skills/rust-pro/SKILL.md +525 -142
package/.agent/skills/seo-fundamentals/SKILL.md +92 -153
package/.agent/skills/server-management/SKILL.md +110 -166
package/.agent/skills/shadcn-ui-expert/SKILL.md +154 -55
package/.agent/skills/skill-creator/SKILL.md +18 -58
package/.agent/skills/sql-pro/SKILL.md +543 -68
package/.agent/skills/supabase-postgres-best-practices/SKILL.md +28 -68
package/.agent/skills/swiftui-expert/SKILL.md +124 -57
package/.agent/skills/systematic-debugging/SKILL.md +49 -151
package/.agent/skills/tailwind-patterns/SKILL.md +433 -149
package/.agent/skills/tdd-workflow/SKILL.md +63 -169
package/.agent/skills/test-result-analyzer/SKILL.md +33 -73
package/.agent/skills/testing-patterns/SKILL.md +437 -130
package/.agent/skills/trend-researcher/SKILL.md +30 -71
package/.agent/skills/ui-ux-pro-max/SKILL.md +0 -41
package/.agent/skills/ui-ux-researcher/SKILL.md +51 -91
package/.agent/skills/vue-expert/SKILL.md +225 -119
package/.agent/skills/vulnerability-scanner/SKILL.md +264 -226
package/.agent/skills/web-accessibility-auditor/SKILL.md +141 -58
package/.agent/skills/web-design-guidelines/SKILL.md +17 -61
package/.agent/skills/webapp-testing/SKILL.md +71 -196
package/.agent/skills/whimsy-injector/SKILL.md +58 -132
package/.agent/skills/workflow-optimizer/SKILL.md +28 -68
package/.agent/workflows/api-tester.md +96 -224
package/.agent/workflows/audit.md +81 -122
package/.agent/workflows/brainstorm.md +69 -105
package/.agent/workflows/changelog.md +65 -97
package/.agent/workflows/create.md +73 -88
package/.agent/workflows/debug.md +80 -111
package/.agent/workflows/deploy.md +119 -92
package/.agent/workflows/enhance.md +80 -91
package/.agent/workflows/fix.md +68 -97
package/.agent/workflows/generate.md +165 -164
package/.agent/workflows/migrate.md +106 -109
package/.agent/workflows/orchestrate.md +103 -86
package/.agent/workflows/performance-benchmarker.md +77 -268
package/.agent/workflows/plan.md +120 -98
package/.agent/workflows/preview.md +39 -96
package/.agent/workflows/refactor.md +105 -97
package/.agent/workflows/review-ai.md +63 -102
package/.agent/workflows/review.md +71 -110
package/.agent/workflows/session.md +53 -113
package/.agent/workflows/status.md +42 -88
package/.agent/workflows/strengthen-skills.md +90 -51
package/.agent/workflows/swarm.md +114 -129
package/.agent/workflows/test.md +125 -102
package/.agent/workflows/tribunal-backend.md +60 -78
package/.agent/workflows/tribunal-database.md +62 -100
package/.agent/workflows/tribunal-frontend.md +62 -82
package/.agent/workflows/tribunal-full.md +56 -100
package/.agent/workflows/tribunal-mobile.md +65 -94
package/.agent/workflows/tribunal-performance.md +62 -105
package/.agent/workflows/ui-ux-pro-max.md +72 -121
package/README.md +11 -15
package/package.json +1 -1
package/.agent/skills/api-patterns/api-style.md +0 -42
package/.agent/skills/api-patterns/auth.md +0 -24
package/.agent/skills/api-patterns/documentation.md +0 -26
package/.agent/skills/api-patterns/graphql.md +0 -41
package/.agent/skills/api-patterns/rate-limiting.md +0 -31
package/.agent/skills/api-patterns/response.md +0 -37
package/.agent/skills/api-patterns/rest.md +0 -40
package/.agent/skills/api-patterns/security-testing.md +0 -122
package/.agent/skills/api-patterns/trpc.md +0 -41
package/.agent/skills/api-patterns/versioning.md +0 -22
package/.agent/skills/app-builder/agent-coordination.md +0 -71
package/.agent/skills/app-builder/feature-building.md +0 -53
package/.agent/skills/app-builder/project-detection.md +0 -34
package/.agent/skills/app-builder/scaffolding.md +0 -118
package/.agent/skills/app-builder/tech-stack.md +0 -40
package/.agent/skills/architecture/context-discovery.md +0 -43
package/.agent/skills/architecture/examples.md +0 -94
package/.agent/skills/architecture/pattern-selection.md +0 -68
package/.agent/skills/architecture/patterns-reference.md +0 -50
package/.agent/skills/architecture/trade-off-analysis.md +0 -77
package/.agent/skills/brainstorming/dynamic-questioning.md +0 -360
package/.agent/skills/database-design/database-selection.md +0 -43
package/.agent/skills/database-design/indexing.md +0 -39
package/.agent/skills/database-design/migrations.md +0 -48
package/.agent/skills/database-design/optimization.md +0 -36
package/.agent/skills/database-design/orm-selection.md +0 -30
package/.agent/skills/database-design/schema-design.md +0 -56
package/.agent/skills/dotnet-core-expert/SKILL.md +0 -103
package/.agent/skills/framer-motion-animations/SKILL.md +0 -74
package/.agent/skills/frontend-design/animation-guide.md +0 -331
package/.agent/skills/frontend-design/color-system.md +0 -329
package/.agent/skills/frontend-design/decision-trees.md +0 -418
package/.agent/skills/frontend-design/motion-graphics.md +0 -306
package/.agent/skills/frontend-design/typography-system.md +0 -363
package/.agent/skills/frontend-design/ux-psychology.md +0 -1116
package/.agent/skills/frontend-design/visual-effects.md +0 -383
package/.agent/skills/game-development/2d-games/SKILL.md +0 -119
package/.agent/skills/game-development/3d-games/SKILL.md +0 -135
package/.agent/skills/game-development/SKILL.md +0 -236
package/.agent/skills/game-development/game-art/SKILL.md +0 -185
package/.agent/skills/game-development/game-audio/SKILL.md +0 -190
package/.agent/skills/game-development/game-design/SKILL.md +0 -129
package/.agent/skills/game-development/mobile-games/SKILL.md +0 -108
package/.agent/skills/game-development/multiplayer/SKILL.md +0 -132
package/.agent/skills/game-development/pc-games/SKILL.md +0 -144
package/.agent/skills/game-development/vr-ar/SKILL.md +0 -123
package/.agent/skills/game-development/web-games/SKILL.md +0 -150
package/.agent/skills/intelligent-routing/router-manifest.md +0 -65
package/.agent/skills/mobile-design/decision-trees.md +0 -516
package/.agent/skills/mobile-design/mobile-backend.md +0 -491
package/.agent/skills/mobile-design/mobile-color-system.md +0 -420
package/.agent/skills/mobile-design/mobile-debugging.md +0 -122
package/.agent/skills/mobile-design/mobile-design-thinking.md +0 -357
package/.agent/skills/mobile-design/mobile-navigation.md +0 -458
package/.agent/skills/mobile-design/mobile-performance.md +0 -767
package/.agent/skills/mobile-design/mobile-testing.md +0 -356
package/.agent/skills/mobile-design/mobile-typography.md +0 -433
package/.agent/skills/mobile-design/platform-android.md +0 -666
package/.agent/skills/mobile-design/platform-ios.md +0 -561
package/.agent/skills/mobile-design/touch-psychology.md +0 -537
package/.agent/skills/nextjs-react-expert/1-async-eliminating-waterfalls.md +0 -312
package/.agent/skills/nextjs-react-expert/2-bundle-bundle-size-optimization.md +0 -240
package/.agent/skills/nextjs-react-expert/3-server-server-side-performance.md +0 -490
package/.agent/skills/nextjs-react-expert/4-client-client-side-data-fetching.md +0 -264
package/.agent/skills/nextjs-react-expert/5-rerender-re-render-optimization.md +0 -581
package/.agent/skills/nextjs-react-expert/6-rendering-rendering-performance.md +0 -432
package/.agent/skills/nextjs-react-expert/7-js-javascript-performance.md +0 -684
package/.agent/skills/nextjs-react-expert/8-advanced-advanced-patterns.md +0 -150
package/.agent/skills/vulnerability-scanner/checklists.md +0 -121

package/.agent/skills/llm-engineering/SKILL.md CHANGED

@@ -1,258 +1,344 @@
 ---
 name: llm-engineering
-description: LLM engineering principles for production AI systems. RAG pipeline design, vector store selection, prompt engineering, evals, and LLMOps. Use when building AI features, chat interfaces, semantic search, or any system calling an LLM API.
+description: LLM engineering mastery for production AI systems. Prompt engineering, RAG pipeline design, vector store selection, embedding strategies, chunking, reranking, structured output, function calling, streaming, evals, guard-rails, cost optimization, and LLMOps. Use when building AI features, chat interfaces, semantic search, or any system calling an LLM API.
 allowed-tools: Read, Write, Edit, Glob, Grep
-version: 1.0.0
-last-updated: 2026-03-12
-applies-to-model: gemini-2.5-pro, claude-3-7-sonnet
+version: 3.2.0
+last-updated: 2026-04-07
+applies-to-model: gemini-3-1-pro, claude-3-7-sonnet
 ---
-# LLM Engineering Principles
-> An LLM is a probabilistic function, not a deterministic API.
-> Design your system to be correct despite that — not because you got lucky.
+# LLM Engineering — Production AI Systems Mastery
 ---
-## When This Skill Activates
+## Model Selection
-- Adding AI chat, completion, or summarization to an app
-- Building a RAG (Retrieval-Augmented Generation) pipeline
-- Integrating with OpenAI, Anthropic, Google Gemini, or local models
-- Designing semantic search
-- Setting up AI evals or monitoring
+```
+Model                    │ Use Case                              │ Cost Tier
+─────────────────────────┼───────────────────────────────────────┼──────────
+GPT-4o                   │ Complex reasoning, vision, code       │ $$$
+GPT-4o-mini              │ Classification, summaries, chat       │ $
+o3-mini                  │ Deep reasoning, math, code review     │ $$
+Claude 3.7 Sonnet        │ Long documents, analysis, code        │ $$$
+Claude 3.5 Haiku         │ Fast responses, simple tasks          │ $
+Gemini 3.1 Pro (High)    │ Large context, multimodal, code       │ $$$
+Gemini 3.0 Flash         │ High throughput, cost-efficient       │ $
+Llama 3.3 70B (open)     │ Self-hosted, data privacy             │ Free*
+Mistral Large 2          │ European data residency, code         │ $$
+* = compute costs only
+Selection rules:
+1. Start with the cheapest model that passes your evals
+2. Upgrade only when eval scores require it
+3. Use large models for complex reasoning, small for classification/routing
+4. Fine-tune ONLY after prompt engineering and RAG are exhausted
+5. ❌ HALLUCINATION TRAP: Model names change frequently — always verify current names
+   from provider docs before hardcoding (e.g. "gpt-4o" vs "gpt-4o-2024-11-20")
+```
 ---
-## Core Architecture Decision: What Pattern?
+## Prompt Engineering
-| Pattern | Use When | Avoid When |
-|---|---|---|
-| **Simple prompt** | Single-turn, no user docs | Needs accuracy on user data |
-| **RAG** | Answers must cite user/company docs | Data changes every second |
-| **Fine-tuning** | Consistent tone/style at scale | You have < 1000 examples |
-| **Agent loop** | Multi-step tasks, tool use | Single-answer questions |
-| **Hybrid** | RAG + agent (most production apps) | Over-engineering simple use case |
+### System Prompt Design
----
+```typescript
+const SYSTEM_PROMPT = `You are a customer support agent for Acme Corp.
-## RAG Pipeline Design
+## Rules
+1. Answer ONLY questions about Acme products and services.
+2. If you don't know the answer, say "I'll connect you with a specialist."
+3. Never discuss competitors.
+4. Never make up product features or pricing.
+5. Keep responses under 200 words.
-The core pattern for grounding LLMs in real data:
+## Response Format
+- Use bullet points for lists
+- Include product links when relevant
+- End with a follow-up question
-```
-INGEST                    RETRIEVE                  GENERATE
-─────────                 ─────────                 ─────────
-Documents                 User query                Retrieved chunks
-    │                         │                         │
-    ▼                         ▼                         ▼
-Chunk (512 tokens)    Embed query vector     Rerank by relevance
-    │                         │                         │
-    ▼                         ▼                         ▼
-Embed chunks          ANN search in          Build prompt:
-    │                 vector store           [system] + [chunks] + [query]
-    ▼                         │                         │
-Store in vector DB    Top-K results          Call LLM → stream response
+## Context
+Current date: ${new Date().toISOString().split("T")[0]}
+User plan: {{user_plan}}
+`;
+// Note: {{user_plan}} is a literal placeholder here; substitute it per request before sending
+// ❌ HALLUCINATION TRAP: System prompts are NOT secrets
+// Users can extract system prompts with jailbreak techniques
+// Never put API keys, internal URLs, or secrets in system prompts
 ```
-### Chunking Strategy
+### Structured Output (JSON Mode)
-```ts
-// ❌ Fixed-size chunks break semantic units
-chunk(document, { size: 512 });  // Splits mid-sentence
+```typescript
+import { z } from "zod";
+import OpenAI from "openai";
+const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment by default
-// ✅ Semantic chunking — split at natural boundaries
-chunk(document, {
-  strategy: 'markdown-headers',   // Or 'sentence', 'paragraph'
-  maxTokens: 512,
-  overlap: 64,                    // Overlap to preserve context at boundaries
+const SentimentSchema = z.object({
+  sentiment: z.enum(["positive", "negative", "neutral"]),
+  confidence: z.number().min(0).max(1),
+  reasoning: z.string(),
+  topics: z.array(z.string()),
 });
-```
-### Embedding Model Selection
-| Scale | Model | Dimensions | Notes |
-|---|---|---|---|
-| General English | `text-embedding-3-small` | 1536 | Best quality/cost ratio |
-| Multilingual | `multilingual-e5-large` | 1024 | Open source, self-hostable |
-| Code | `text-embedding-3-large` | 3072 | Higher cost, better code retrieval |
-| Local/private | `nomic-embed-text` | 768 | Runs on CPU via Ollama |
+// OpenAI — json_schema mode (strict = true enforces schema exactly)
+async function analyzeSentiment(text: string) {
+  const response = await openai.chat.completions.create({
+    model: "gpt-4o-mini",
+    response_format: {
+      type: "json_schema",
+      json_schema: {
+        name: "sentiment_analysis",
+        strict: true,
+        schema: {
+          type: "object",
+          properties: {
+            sentiment: { type: "string", enum: ["positive", "negative", "neutral"] },
+            confidence: { type: "number" },
+            reasoning: { type: "string" },
+            topics: { type: "array", items: { type: "string" } },
+          },
+          required: ["sentiment", "confidence", "reasoning", "topics"],
+          additionalProperties: false, // required for strict mode
+        },
+      },
+    },
+    messages: [{ role: "system", content: "Analyze sentiment." }, { role: "user", content: text }],
+  });
+  const raw = JSON.parse(response.choices[0].message.content ?? "{}");
+  return SentimentSchema.parse(raw); // always validate with Zod even in strict mode
+}
----
+// Gemini: responseMimeType + responseSchema (camelCase in the Node SDK)
+import { GoogleGenerativeAI, SchemaType } from "@google/generative-ai";
+const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
+const model = genAI.getGenerativeModel({
+  model: "gemini-2.0-flash",
+  generationConfig: {
+    responseMimeType: "application/json",
+    responseSchema: {
+      type: SchemaType.OBJECT,
+      properties: {
+        sentiment: { type: SchemaType.STRING, enum: ["positive", "negative", "neutral"] },
+        confidence: { type: SchemaType.NUMBER },
+        topics: { type: SchemaType.ARRAY, items: { type: SchemaType.STRING } },
+      },
+      required: ["sentiment", "confidence", "topics"],
+    },
+  },
+});
-## Vector Store Selection
-| Need | Choose | Why |
-|---|---|---|
-| Already on PostgreSQL | `pgvector` | Zero infra, SQL joins with metadata |
-| Managed, billion-scale | Pinecone | Hosted ANN, hybrid search built-in |
-| Open source, self-hosted | Qdrant | Rust-native, fast, rich filtering |
-| Already on Weaviate | Weaviate | GraphQL API, multimodal support |
-| Embedded/local | ChromaDB | Zero infra, great for prototyping |
-```ts
-// pgvector — stays inside your existing PostgreSQL
-import { pgvector } from '@pgvector/pg';
-// Store
-await db.query(
-  'INSERT INTO documents (content, embedding) VALUES ($1, $2)',
-  [text, JSON.stringify(embedding)]  // embedding is float[]
-);
-// Query — cosine similarity
-await db.query(
-  'SELECT content FROM documents ORDER BY embedding <=> $1 LIMIT 5',
-  [JSON.stringify(queryEmbedding)]
-);
+// ❌ HALLUCINATION TRAP: Always validate LLM JSON output with Zod/schema
+// Strict mode constrains the shape, but output can still be truncated, refused,
+// or out of range (the JSON schema above puts no bounds on confidence)
+// ❌ const result = JSON.parse(response); // trust blindly
+// ✅ const result = Schema.parse(JSON.parse(response)); // validate always
 ```
----
-## Prompt Engineering Principles
+### Function Calling / Tool Use
-### Message Structure
-```ts
-const messages = [
+```typescript
+const tools: OpenAI.ChatCompletionTool[] = [
   {
-    role: 'system',
-    content: `You are a helpful assistant for [Company].
-You ONLY answer questions based on the provided context.
-If the answer is not in the context, say "I don't have that information."
-Do NOT make up information.`,
+    type: "function",
+    function: {
+      name: "search_products",
+      description: "Search products by name, category, or price range",
+      parameters: {
+        type: "object",
+        properties: {
+          query: { type: "string", description: "Search query" },
+          category: { type: "string", enum: ["electronics", "clothing", "home"] },
+          max_price: { type: "number", description: "Maximum price in USD" },
+        },
+        required: ["query"],
+      },
+    },
   },
   {
-    // Retrieved chunks injected here — NOT into system prompt
-    role: 'user',
-    content: `Context:\n${retrievedChunks.join('\n\n')}\n\nQuestion: ${userQuery}`,
+    type: "function",
+    function: {
+      name: "get_order_status",
+      description: "Get the status of an order by order ID",
+      parameters: {
+        type: "object",
+        properties: {
+          order_id: { type: "string", description: "The order ID (e.g., ORD-12345)" },
+        },
+        required: ["order_id"],
+      },
+    },
   },
 ];
-```
-### Few-Shot Examples
-```ts
-// ❌ Zero-shot on complex tasks — model guesses the format
-"Extract entities from: John called Mary at 5pm"
-// ✅ Few-shot — show the expected output format
-`Extract entities. Output as JSON array.
+// Tool execution loop
+async function chatWithTools(userMessage: string) {
+  const messages: OpenAI.ChatCompletionMessageParam[] = [
+    { role: "system", content: SYSTEM_PROMPT },
+    { role: "user", content: userMessage },
+  ];
+  let response = await openai.chat.completions.create({
+    model: "gpt-4o-mini",
+    messages,
+    tools,
+  });
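+  // Process tool calls until the model returns a final answer
+  while (response.choices[0].finish_reason === "tool_calls") {
+    const toolCalls = response.choices[0].message.tool_calls ?? [];
+    messages.push(response.choices[0].message);
+    for (const call of toolCalls) {
+      // Arguments arrive as a model-generated JSON string; validate before use
+      const args = JSON.parse(call.function.arguments);
+      // executeFunction is assumed to be defined elsewhere (dispatches by tool name)
+      const result = await executeFunction(call.function.name, args);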
+      messages.push({
+        role: "tool",
+        tool_call_id: call.id,
+        content: JSON.stringify(result),
+      });
+    }
-Example:
-Input: "Alice met Bob in London"
-Output: [{"name":"Alice","type":"person"},{"name":"Bob","type":"person"},{"name":"London","type":"location"}]
+    response = await openai.chat.completions.create({
+      model: "gpt-4o-mini",
+      messages,
+      tools,
+    });
+  }
-Input: "${userText}"
-Output:`
+  return response.choices[0].message.content;
+}
 ```
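The loop above passes `JSON.parse(call.function.arguments)` straight to the tool. A minimal sketch of defensive parsing follows. It is not part of the published package, and it assumes the `search_products` parameter shape shown in the hunk. `parseSearchArgs`, `SearchArgs`, and `ParseResult` are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper: parse and validate model-generated arguments for the
// `search_products` tool before anything is executed with them.
type Category = "electronics" | "clothing" | "home";

interface SearchArgs {
  query: string;
  category?: Category;
  max_price?: number;
}

type ParseResult =
  | { ok: true; args: SearchArgs }
  | { ok: false; error: string };

const CATEGORIES: readonly string[] = ["electronics", "clothing", "home"];

function parseSearchArgs(rawJson: string): ParseResult {
  let value: unknown;
  try {
    value = JSON.parse(rawJson);
  } catch {
    return { ok: false, error: "arguments are not valid JSON" };
  }
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return { ok: false, error: "arguments must be a JSON object" };
  }
  const obj = value as Record<string, unknown>;

  const query = obj["query"];
  if (typeof query !== "string" || query.trim() === "") {
    return { ok: false, error: "query must be a non-empty string" };
  }
  const args: SearchArgs = { query };

  const category = obj["category"];
  if (category !== undefined) {
    if (typeof category !== "string" || !CATEGORIES.includes(category)) {
      return { ok: false, error: "category is not one of the allowed values" };
    }
    args.category = category as Category;
  }

  const maxPrice = obj["max_price"];
  if (maxPrice !== undefined) {
    if (typeof maxPrice !== "number" || !Number.isFinite(maxPrice) || maxPrice < 0) {
      return { ok: false, error: "max_price must be a non-negative number" };
    }
    args.max_price = maxPrice;
  }
  return { ok: true, args };
}
```

On failure, one option is to send the `error` string back as the `tool` message content so the model can correct its call instead of crashing the loop.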
 ---
-## Evals: How to Know If It's Working
+## RAG (Retrieval-Augmented Generation)
+### Pipeline
 ```
-Deterministic evals:   Output matches expected exactly → code comparison
-LLM-as-judge evals:    Another LLM grades the output (1-5 scale)
-Human evals:           Gold standard, expensive, for calibration
-A/B testing:           Compare model/prompt versions on live traffic
+User Query
+    ↓
+[1] Embed query → vector
+    ↓
+[2] Search vector DB → top K chunks
+    ↓
+[3] (Optional) Rerank results → top N
+    ↓
+[4] Build prompt: system + context chunks + query
+    ↓
+[5] LLM generates answer with citations
+    ↓
+[6] Validate response (hallucination check)
 ```
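Steps [2] and [4] of the pipeline above can be sketched without a vector database. This illustration is not part of the published package: it ranks in-memory chunks by cosine similarity and numbers them so the answer can cite sources. `StoredChunk`, `topK`, and `buildPrompt` are hypothetical names, and the embeddings are assumed to come from whichever embedding model the system uses.

```typescript
// In-memory stand-in for steps [2] and [4]; a production system would
// query a vector store instead of scanning an array.
interface StoredChunk {
  id: string;
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Step [2]: rank stored chunks by similarity to the query vector, keep the top K.
function topK(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Step [4]: number each chunk so the model can cite sources as [1], [2], ...
function buildPrompt(question: string, chunks: StoredChunk[]): string {
  const context = chunks.map((c, i) => `[${i + 1}] ${c.text}`).join("\n\n");
  return `Context:\n${context}\n\nQuestion: ${question}`;
}
```

A linear scan like this is only reasonable for small corpora; the point is the data flow from query vector to numbered context.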
-### Eval Categories
-| Category | What It Measures | Tooling |
-|---|---|---|
-| **Faithfulness** | Does answer match sources? | Ragas, ARES |
-| **Relevance** | Does answer address the question? | LLM-as-judge |
-| **Completeness** | Missing important info? | Human + LLM |
-| **Groundedness** | Hallucination rate | Ragas |
-| **Latency** | p50/p95 response time | OpenTelemetry |
----
-## LLMOps: Production Concerns
-### Cost Control
-```ts
-// Track tokens per request
-const response = await openai.chat.completions.create({ ... });
-const { prompt_tokens, completion_tokens } = response.usage;
-logger.info({ prompt_tokens, completion_tokens, model: 'gpt-4o', cost_usd: calcCost() });
-// Cache identical prompts — LLMs are deterministic at temp=0
-const cacheKey = hash(systemPrompt + userQuery);
-const cached = await cache.get(cacheKey);
-if (cached) return cached;
-```
+### Chunking Strategy
-### Retry with Exponential Backoff
-```ts
-async function callWithRetry(fn: () => Promise<any>, maxRetries = 3) {
-  for (let attempt = 0; attempt < maxRetries; attempt++) {
-    try {
-      return await fn();
-    } catch (err: any) {
-      if (err.status === 429 || err.status === 503) {
-        const delay = Math.pow(2, attempt) * 1000 + Math.random() * 500;
-        await sleep(delay);
-        continue;
-      }
-      throw err;  // Non-retryable errors bubble up immediately
+```typescript
+// ❌ BAD: Arbitrary character splitting
+const chunks = text.match(/.{1,1000}/g); // breaks mid-sentence, mid-word
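+// ✅ GOOD: Semantic chunking (paragraph boundaries) with overlap
+// (tokenCount, Chunk, and ChunkOptions are assumed to be defined elsewhere)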
+function chunkDocument(text: string, options: ChunkOptions = {}): Chunk[] {
+  const {
+    maxTokens = 512,      // chunk size
+    overlapTokens = 50,    // overlap between chunks
+    separator = "\n\n",    // split on paragraph boundaries first
+  } = options;
+  const paragraphs = text.split(separator);
+  const chunks: Chunk[] = [];
+  let current = "";
+  for (const para of paragraphs) {
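+    if (current && tokenCount(current + separator + para) > maxTokens) {
+      chunks.push({ text: current.trim(), tokens: tokenCount(current) });
+      // Keep overlap from previous chunk (approximated as the last N words, not true tokens)
+      const overlap = overlapTokens > 0 ? current.trim().split(/\s+/).slice(-overlapTokens).join(" ") : "";
+      current = overlap ? overlap + separator + para : para;
+    } else {
+      // Avoid a leading separator on the first paragraph
+      current = current ? current + separator + para : para;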
     }
   }
-  throw new Error('Max retries exceeded');
-}
-```
+  if (current.trim()) chunks.push({ text: current.trim(), tokens: tokenCount(current) });
----
+  return chunks;
+}
-## Output Format
+// Chunk size guidelines:
+// 256-512 tokens → precise retrieval (Q&A, support)
+// 512-1024 tokens → balanced (general RAG)
+// 1024-2048 tokens → broad context (summarization)
+```
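The chunker in the hunk relies on `tokenCount`, `Chunk`, and `ChunkOptions`, none of which it defines. A self-contained sketch of the same idea follows. It is not part of the published package, and it assumes whitespace-separated words are an acceptable stand-in for tokens. A real tokenizer would give different counts.

```typescript
interface Chunk {
  text: string;
  tokens: number;
}

interface ChunkOptions {
  maxTokens?: number;
  overlapTokens?: number;
  separator?: string;
}

// Assumption: words approximate tokens. Swap in a real tokenizer for accuracy.
function tokenCount(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function chunkDocument(text: string, options: ChunkOptions = {}): Chunk[] {
  const { maxTokens = 512, overlapTokens = 50, separator = "\n\n" } = options;
  const chunks: Chunk[] = [];
  let current = "";

  for (const para of text.split(separator)) {
    if (para.trim() === "") continue; // skip empty paragraphs
    const candidate = current ? current + separator + para : para;
    if (current && tokenCount(candidate) > maxTokens) {
      chunks.push({ text: current.trim(), tokens: tokenCount(current) });
      // Carry the tail of the finished chunk forward as overlap.
      // slice(-0) would return the whole array, so zero overlap is handled explicitly.
      const tail = overlapTokens > 0
        ? current.trim().split(/\s+/).slice(-overlapTokens).join(" ")
        : "";
      current = tail ? tail + separator + para : para;
    } else {
      current = candidate;
    }
  }
  if (current.trim()) chunks.push({ text: current.trim(), tokens: tokenCount(current) });
  return chunks;
}
```

This sketch never splits inside a paragraph, so a single paragraph longer than `maxTokens` still becomes one oversized chunk. Because the overlap is prepended before the next paragraph, a chunk can also exceed `maxTokens` by up to `overlapTokens`.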
-When this skill produces or reviews code, structure your output as follows:
+### Vector Store Selection
 ```
-━━━ Llm Engineering Report ━━━━━━━━━━━━━━━━━━━━━━━━
-Skill:       Llm Engineering
-Language:    [detected language / framework]
-Scope:       [N files · N functions]
-─────────────────────────────────────────────────
-✅ Passed:   [checks that passed, or "All clean"]
-⚠️  Warnings: [non-blocking issues, or "None"]
-❌ Blocked:  [blocking issues requiring fix, or "None"]
-─────────────────────────────────────────────────
-VBC status:  PENDING → VERIFIED
-Evidence:    [test output / lint pass / compile success]
+pgvector (PostgreSQL)  → Already using Postgres, <10M vectors, simple
+Pinecone               → Managed, serverless, easy scaling
+Weaviate               → Hybrid search (vector + keyword), multimodal
+Qdrant                 → High performance, Rust-based, self-hostable
+Chroma                 → Local development, prototyping
+Milvus                 → Enterprise scale, GPU acceleration
+// ❌ HALLUCINATION TRAP: Vector search is NOT keyword search
+// "Apple CEO" might not find "Tim Cook runs Apple Inc."
+// Use HYBRID search (vector + BM25 keyword) for production
 ```
-**VBC (Verification-Before-Completion) is mandatory.**
-Do not mark status as VERIFIED until concrete terminal evidence is provided.
 ---
-## 🏛️ Tribunal Integration (Anti-Hallucination)
+## Streaming
+```typescript
+// Server-Sent Events for AI token streaming
+app.get("/api/chat", async (req, res) => {
+  res.setHeader("Content-Type", "text/event-stream");
+  res.setHeader("Cache-Control", "no-cache");
+  res.setHeader("Connection", "keep-alive");
+  const stream = await openai.chat.completions.create({
+    model: "gpt-4o-mini",
+    messages: [{ role: "user", content: req.query.message as string }],
+    stream: true,
+  });
+  for await (const chunk of stream) {
+    const content = chunk.choices[0]?.delta?.content;
+    if (content) {
+      res.write(`data: ${JSON.stringify({ content })}\n\n`);
+    }
+  }
-**Slash command: `/review-ai`**
-**Active reviewers: `logic` · `security` · `ai-code-reviewer`**
+  res.write("data: [DONE]\n\n");
+  res.end();
+});
-### ❌ Forbidden AI Tropes in LLM Engineering
+// Client-side consumption
+const eventSource = new EventSource(`/api/chat?message=${encodeURIComponent(msg)}`);
+eventSource.onmessage = (event) => {
+  if (event.data === "[DONE]") { eventSource.close(); return; }
+  const { content } = JSON.parse(event.data);
+  appendToChat(content);
+};
+```
-1. **Hallucinated model names** — `gpt-5`, `claude-4`, `gemini-ultra-3` — verify against current provider docs.
-2. **Prompt injection via concatenation** — never `systemPrompt + userInput`. Use separate message roles.
-3. **No eval strategy** — shipping LLM features with zero eval coverage is shipping blind.
-4. **Ignoring token limits** — context exceeding `max_tokens` silently fails or truncates unpredictably.
-5. **No cost tracking** — LLM costs compound at scale — always instrument from day one.
-6. **Synchronous LLM calls** — all LLM API calls are async. Never block the event loop waiting for them.
+---
-### ✅ Pre-Flight Self-Audit
+## Cost Optimization
 ```
-✅ Are all model names verified against current provider documentation?
-✅ Is user input isolated in role:"user" messages, never concatenated into system prompt?
-✅ Is there retry logic with backoff for 429 / 503 errors?
-✅ Is token usage logged per request for cost tracking?
-✅ Is there an eval strategy (even minimal) to detect regressions?
-✅ Are context windows respected — chunked or summarized if approaching limits?
+1. Prompt caching        → Cache system prompts (OpenAI, Anthropic support this)
+2. Output token limiting → Set max_tokens to prevent runaway responses
+3. Tiered models         → Use cheap models for classification, expensive for reasoning
+4. Batch processing      → Use batch APIs for offline processing (50% discount)
+5. Chunked context       → Send only relevant chunks, not entire documents
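+6. Response streaming    → Improves perceived latency (TTFT, time to first token); does not cut token cost
+7. Structured output     → Shorter JSON responses vs verbose prose
+// Cost estimation (approximate list prices; verify against the provider's pricing page):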
+// GPT-4o: ~$2.50/1M input, ~$10/1M output
+// GPT-4o-mini: ~$0.15/1M input, ~$0.60/1M output
+// 1M tokens ≈ 750,000 words ≈ 3,000 pages
 ```
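The per-token prices above turn into a budget with simple arithmetic. The sketch below is not part of the published package. It copies the two approximate prices from the hunk, and `estimateCostUsd` and the traffic numbers are hypothetical.

```typescript
// Prices in USD per 1M tokens, copied from the estimates above.
// Assumption: these are approximate list prices and will drift over time.
const PRICES_PER_MILLION = {
  "gpt-4o": { input: 2.5, output: 10 },
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
} as const;

type PricedModel = keyof typeof PRICES_PER_MILLION;

function estimateCostUsd(
  model: PricedModel,
  inputTokens: number,
  outputTokens: number,
): number {
  const price = PRICES_PER_MILLION[model];
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// Hypothetical workload: 10,000 requests/day, 1,500 input and 300 output tokens each.
const requestsPerDay = 10_000;
const dailyMini = estimateCostUsd("gpt-4o-mini", requestsPerDay * 1_500, requestsPerDay * 300);
const daily4o = estimateCostUsd("gpt-4o", requestsPerDay * 1_500, requestsPerDay * 300);
console.log(`gpt-4o-mini: $${dailyMini.toFixed(2)}/day, gpt-4o: $${daily4o.toFixed(2)}/day`);
```

Under these assumed prices the same workload costs roughly $4 per day on the small model and $67.50 per day on the large one, which is the gap that rule 3 (tiered models) is meant to exploit.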
+---