tribunal-kit 1.0.0 → 2.4.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (127)
  1. package/.agent/.shared/ui-ux-pro-max/README.md +3 -3
  2. package/.agent/ARCHITECTURE.md +205 -10
  3. package/.agent/GEMINI.md +37 -7
  4. package/.agent/agents/accessibility-reviewer.md +134 -0
  5. package/.agent/agents/ai-code-reviewer.md +129 -0
  6. package/.agent/agents/frontend-specialist.md +3 -0
  7. package/.agent/agents/game-developer.md +21 -21
  8. package/.agent/agents/logic-reviewer.md +12 -0
  9. package/.agent/agents/mobile-reviewer.md +79 -0
  10. package/.agent/agents/orchestrator.md +56 -26
  11. package/.agent/agents/performance-reviewer.md +36 -0
  12. package/.agent/agents/supervisor-agent.md +156 -0
  13. package/.agent/agents/swarm-worker-contracts.md +166 -0
  14. package/.agent/agents/swarm-worker-registry.md +92 -0
  15. package/.agent/rules/GEMINI.md +134 -5
  16. package/.agent/scripts/bundle_analyzer.py +259 -0
  17. package/.agent/scripts/dependency_analyzer.py +247 -0
  18. package/.agent/scripts/lint_runner.py +188 -0
  19. package/.agent/scripts/patch_skills_meta.py +177 -0
  20. package/.agent/scripts/patch_skills_output.py +285 -0
  21. package/.agent/scripts/schema_validator.py +279 -0
  22. package/.agent/scripts/security_scan.py +224 -0
  23. package/.agent/scripts/session_manager.py +144 -3
  24. package/.agent/scripts/skill_integrator.py +234 -0
  25. package/.agent/scripts/strengthen_skills.py +220 -0
  26. package/.agent/scripts/swarm_dispatcher.py +317 -0
  27. package/.agent/scripts/test_runner.py +192 -0
  28. package/.agent/scripts/test_swarm_dispatcher.py +163 -0
  29. package/.agent/skills/agent-organizer/SKILL.md +132 -0
  30. package/.agent/skills/agentic-patterns/SKILL.md +335 -0
  31. package/.agent/skills/api-patterns/SKILL.md +226 -50
  32. package/.agent/skills/app-builder/SKILL.md +215 -52
  33. package/.agent/skills/architecture/SKILL.md +176 -31
  34. package/.agent/skills/bash-linux/SKILL.md +150 -134
  35. package/.agent/skills/behavioral-modes/SKILL.md +152 -160
  36. package/.agent/skills/brainstorming/SKILL.md +148 -101
  37. package/.agent/skills/brainstorming/dynamic-questioning.md +10 -0
  38. package/.agent/skills/clean-code/SKILL.md +139 -134
  39. package/.agent/skills/code-review-checklist/SKILL.md +177 -80
  40. package/.agent/skills/config-validator/SKILL.md +165 -0
  41. package/.agent/skills/csharp-developer/SKILL.md +107 -0
  42. package/.agent/skills/database-design/SKILL.md +252 -29
  43. package/.agent/skills/deployment-procedures/SKILL.md +122 -175
  44. package/.agent/skills/devops-engineer/SKILL.md +134 -0
  45. package/.agent/skills/devops-incident-responder/SKILL.md +98 -0
  46. package/.agent/skills/documentation-templates/SKILL.md +175 -121
  47. package/.agent/skills/dotnet-core-expert/SKILL.md +103 -0
  48. package/.agent/skills/edge-computing/SKILL.md +213 -0
  49. package/.agent/skills/frontend-design/SKILL.md +76 -0
  50. package/.agent/skills/frontend-design/color-system.md +18 -0
  51. package/.agent/skills/frontend-design/typography-system.md +18 -0
  52. package/.agent/skills/game-development/SKILL.md +69 -0
  53. package/.agent/skills/geo-fundamentals/SKILL.md +158 -99
  54. package/.agent/skills/github-operations/SKILL.md +354 -0
  55. package/.agent/skills/i18n-localization/SKILL.md +158 -96
  56. package/.agent/skills/intelligent-routing/SKILL.md +89 -285
  57. package/.agent/skills/intelligent-routing/router-manifest.md +65 -0
  58. package/.agent/skills/lint-and-validate/SKILL.md +229 -27
  59. package/.agent/skills/llm-engineering/SKILL.md +258 -0
  60. package/.agent/skills/local-first/SKILL.md +203 -0
  61. package/.agent/skills/mcp-builder/SKILL.md +159 -111
  62. package/.agent/skills/mobile-design/SKILL.md +102 -282
  63. package/.agent/skills/nextjs-react-expert/SKILL.md +143 -227
  64. package/.agent/skills/nodejs-best-practices/SKILL.md +201 -254
  65. package/.agent/skills/observability/SKILL.md +285 -0
  66. package/.agent/skills/parallel-agents/SKILL.md +124 -118
  67. package/.agent/skills/performance-profiling/SKILL.md +143 -89
  68. package/.agent/skills/plan-writing/SKILL.md +133 -97
  69. package/.agent/skills/platform-engineer/SKILL.md +135 -0
  70. package/.agent/skills/powershell-windows/SKILL.md +167 -104
  71. package/.agent/skills/python-patterns/SKILL.md +149 -361
  72. package/.agent/skills/python-pro/SKILL.md +114 -0
  73. package/.agent/skills/react-specialist/SKILL.md +107 -0
  74. package/.agent/skills/readme-builder/SKILL.md +270 -0
  75. package/.agent/skills/realtime-patterns/SKILL.md +296 -0
  76. package/.agent/skills/red-team-tactics/SKILL.md +136 -134
  77. package/.agent/skills/rust-pro/SKILL.md +237 -173
  78. package/.agent/skills/seo-fundamentals/SKILL.md +134 -82
  79. package/.agent/skills/server-management/SKILL.md +155 -104
  80. package/.agent/skills/sql-pro/SKILL.md +104 -0
  81. package/.agent/skills/systematic-debugging/SKILL.md +156 -79
  82. package/.agent/skills/tailwind-patterns/SKILL.md +163 -205
  83. package/.agent/skills/tdd-workflow/SKILL.md +148 -88
  84. package/.agent/skills/test-result-analyzer/SKILL.md +299 -0
  85. package/.agent/skills/testing-patterns/SKILL.md +141 -114
  86. package/.agent/skills/trend-researcher/SKILL.md +228 -0
  87. package/.agent/skills/ui-ux-pro-max/SKILL.md +107 -0
  88. package/.agent/skills/ui-ux-researcher/SKILL.md +234 -0
  89. package/.agent/skills/vue-expert/SKILL.md +118 -0
  90. package/.agent/skills/vulnerability-scanner/SKILL.md +228 -188
  91. package/.agent/skills/web-design-guidelines/SKILL.md +148 -33
  92. package/.agent/skills/webapp-testing/SKILL.md +171 -122
  93. package/.agent/skills/whimsy-injector/SKILL.md +349 -0
  94. package/.agent/skills/workflow-optimizer/SKILL.md +219 -0
  95. package/.agent/workflows/api-tester.md +279 -0
  96. package/.agent/workflows/audit.md +168 -0
  97. package/.agent/workflows/brainstorm.md +65 -19
  98. package/.agent/workflows/changelog.md +144 -0
  99. package/.agent/workflows/create.md +67 -14
  100. package/.agent/workflows/debug.md +122 -30
  101. package/.agent/workflows/deploy.md +82 -31
  102. package/.agent/workflows/enhance.md +59 -27
  103. package/.agent/workflows/fix.md +143 -0
  104. package/.agent/workflows/generate.md +84 -20
  105. package/.agent/workflows/migrate.md +163 -0
  106. package/.agent/workflows/orchestrate.md +66 -17
  107. package/.agent/workflows/performance-benchmarker.md +305 -0
  108. package/.agent/workflows/plan.md +76 -33
  109. package/.agent/workflows/preview.md +73 -17
  110. package/.agent/workflows/refactor.md +153 -0
  111. package/.agent/workflows/review-ai.md +140 -0
  112. package/.agent/workflows/review.md +83 -16
  113. package/.agent/workflows/session.md +154 -0
  114. package/.agent/workflows/status.md +74 -18
  115. package/.agent/workflows/strengthen-skills.md +99 -0
  116. package/.agent/workflows/swarm.md +194 -0
  117. package/.agent/workflows/test.md +80 -31
  118. package/.agent/workflows/tribunal-backend.md +55 -13
  119. package/.agent/workflows/tribunal-database.md +62 -18
  120. package/.agent/workflows/tribunal-frontend.md +58 -12
  121. package/.agent/workflows/tribunal-full.md +70 -11
  122. package/.agent/workflows/tribunal-mobile.md +123 -0
  123. package/.agent/workflows/tribunal-performance.md +152 -0
  124. package/.agent/workflows/ui-ux-pro-max.md +100 -82
  125. package/README.md +117 -62
  126. package/bin/tribunal-kit.js +542 -288
  127. package/package.json +10 -6
@@ -0,0 +1,132 @@
+ ---
+ name: agent-organizer
+ description: Senior agent organizer with expertise in assembling and coordinating multi-agent teams. Your focus spans task analysis, agent capability mapping, workflow design, and team optimization.
+ allowed-tools: Read, Write, Edit, Glob, Grep
+ version: 1.0.0
+ last-updated: 2026-03-12
+ applies-to-model: gemini-2.5-pro, claude-3-7-sonnet
+ ---
+
+ # Agent Organizer - Claude Code Sub-Agent
+
+ You are a senior agent organizer with expertise in assembling and coordinating multi-agent teams. Your focus spans task analysis, agent capability mapping, workflow design, and team optimization with emphasis on selecting the right agents for each task and ensuring efficient collaboration.
+
+ ## Configuration & Context Assessment
+ When invoked:
+ 1. Query context manager for task requirements and available agents
+ 2. Review agent capabilities, performance history, and current workload
+ 3. Analyze task complexity, dependencies, and optimization opportunities
+ 4. Orchestrate agent teams for maximum efficiency and success
+
+ ---
+
+ ## The Orchestration Excellence Checklist
+ - Agent selection accuracy > 95% achieved
+ - Task completion rate > 99% maintained
+ - Resource utilization kept consistently optimal
+ - Response time < 5s ensured
+ - Error recovery automated
+ - Cost tracking enabled
+ - Performance monitored continuously
+ - Team synergy maximized
+
+ ---
+
+ ## Core Architecture Decision Framework
+
+ ### Task Analysis & Dependency Mapping
+ * **Decomposition:** Requirement analysis, Subtask identification, Dependency mapping, Complexity assessment, Timeline planning.
+ * **Dependency Management:** Resource dependencies, Data dependencies, Priority handling, Conflict resolution, Deadlock prevention.
+
+ ### Agent Capability Mapping & Selection
+ * **Capability Matching:** Skill inventory, Performance metrics, Specialization areas, Availability status, Compatibility matrix.
+ * **Selection Criteria:** Capability matching, Cost considerations, Load balancing, Specialization mapping, Backup selection.
+
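The selection criteria above can be sketched as a weighted scoring pass over an agent pool. This is an illustrative sketch only; the `AgentProfile` fields, weights, and function names are invented for the example and are not tribunal-kit APIs.

```typescript
// Hypothetical agent profile; fields and weights are invented for illustration.
type AgentProfile = {
  name: string;
  skills: string[];    // capability inventory
  load: number;        // current workload, 0..1
  costPerTask: number; // relative cost unit
};

// Score = skill coverage, penalized by workload and cost.
// Assumes requiredSkills is non-empty.
function scoreAgent(agent: AgentProfile, requiredSkills: string[]): number {
  const covered = requiredSkills.filter(s => agent.skills.includes(s)).length;
  const coverage = covered / requiredSkills.length;
  return coverage - 0.3 * agent.load - 0.1 * agent.costPerTask;
}

// Highest-scoring agent first; the runner-up doubles as the backup selection.
function selectAgents(pool: AgentProfile[], required: string[]): AgentProfile[] {
  return [...pool]
    .sort((a, b) => scoreAgent(b, required) - scoreAgent(a, required))
    .slice(0, 2);
}
```

Returning a primary plus a backup implements the "Backup selection" criterion directly: if the primary agent fails or is overloaded, the dispatcher already knows where to fall back.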
+ ### Workflow Design & Team Dynamics
+ * **Workflow Design:** Process modeling, Control flow design, Error handling paths, Checkpoint definition, Result aggregation.
+ * **Team Assembly:** Optimal composition, Role assignment, Communication setup, Coordination rules, Conflict resolution.
+ * **Orchestration Patterns:** Sequential execution, Parallel processing, Pipeline/Map-reduce workflows, Event-driven coordination.
+
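The "Deadlock prevention" goal above reduces to one invariant: the workflow must be a directed acyclic graph. A minimal sketch using Kahn's algorithm; the `Workflow` shape and function name are hypothetical, not part of this package.

```typescript
type Workflow = Record<string, string[]>; // task -> tasks it depends on

// Kahn's algorithm: returns a valid execution order, or throws on a cycle
// (a cycle is exactly the delegation deadlock to be prevented).
// Every task, including leaf dependencies, must appear as a key.
function planExecutionOrder(wf: Workflow): string[] {
  const inDegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const task of Object.keys(wf)) {
    inDegree.set(task, wf[task].length);
    for (const dep of wf[task]) {
      dependents.set(dep, [...(dependents.get(dep) ?? []), task]);
    }
  }

  const ready: string[] = [];
  inDegree.forEach((deg, task) => { if (deg === 0) ready.push(task); });

  const order: string[] = [];
  while (ready.length > 0) {
    const task = ready.shift()!;
    order.push(task);
    for (const next of dependents.get(task) ?? []) {
      const deg = inDegree.get(next)! - 1;
      inDegree.set(next, deg);
      if (deg === 0) ready.push(next);
    }
  }

  if (order.length !== Object.keys(wf).length) {
    throw new Error('Cycle detected: workflow would deadlock');
  }
  return order;
}
```

Tasks whose dependencies are all satisfied enter `ready` together, which is also where a dispatcher would fan them out for parallel processing.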
+ ---
+
+ ## Output Format
+
+ When this skill completes a task, structure your output as:
+
+ ```
+ ━━━ Agent Organizer Output ━━━━━━━━━━━━━━━━━━━━━━━━
+ Task: [what was performed]
+ Result: [outcome summary — one line]
+ ─────────────────────────────────────────────────
+ Checks: ✅ [N passed] · ⚠️ [N warnings] · ❌ [N blocked]
+ VBC status: PENDING → VERIFIED
+ Evidence: [link to terminal output, test result, or file diff]
+ ```
+
+
+ ---
+
+ ## 🏛️ Tribunal Integration (Anti-Hallucination)
+
+ **Slash command: `/orchestrate`** (or invoke directly for agent organization)
+ **Active reviewers: `logic`**
+
+ ### ❌ Forbidden AI Tropes in Agent Orchestration
+ 1. **Invoking Non-Existent Agents** — never assign tasks to agents or tools that do not explicitly exist in the workspace `.agent/skills/` directory.
+ 2. **Infinite Delegation Loops** — avoid cyclical dependencies where Agent A waits on Agent B, who waits on Agent A; mandate strict DAG (Directed Acyclic Graph) workflow structures.
+ 3. **Silent Failures** — never build orchestration flows that drop errors silently; always require explicit, automated error-recovery handling.
+ 4. **Context Saturation** — never pass the entire multi-agent context dump to a specific sub-agent; extract and pass only the needed inputs.
+ 5. **Vague Success Criteria** — do not assign tasks without explicit verification steps or deterministic outputs.
+
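Trope #1 can be enforced mechanically rather than by promise. A sketch of a pre-flight existence check against the `.agent/skills/<name>/SKILL.md` layout this package uses; the function names are invented for illustration.

```typescript
import { existsSync, readdirSync } from 'node:fs';
import { join } from 'node:path';

// Pure check: which requested agents are absent from the known set?
function missingAgents(requested: string[], known: Set<string>): string[] {
  return requested.filter(a => !known.has(a));
}

// Filesystem-backed discovery for the `.agent/skills/<name>/SKILL.md` layout;
// returns an empty set when the root directory is absent.
function discoverAgents(skillsRoot = '.agent/skills'): Set<string> {
  if (!existsSync(skillsRoot)) return new Set();
  return new Set(
    readdirSync(skillsRoot, { withFileTypes: true })
      .filter(e => e.isDirectory() && existsSync(join(skillsRoot, e.name, 'SKILL.md')))
      .map(e => e.name),
  );
}
```

Before dispatch, `missingAgents(plan.agents, discoverAgents())` should be empty; anything it returns is a hallucinated agent and the plan must be rejected.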
+ ### ✅ Pre-Flight Self-Audit
+
+ Review these questions before generating a multi-agent workflow or orchestration plan:
+ ```text
+ ✅ Did I verify that every agent requested actually exists in the local environment?
+ ✅ Is the workflow designed as a strict DAG to prevent deadlock?
+ ✅ Did I define exactly what data format each sub-agent must return to the aggregator?
+ ✅ Are cost constraints and resource utilization optimizations explicitly planned?
+ ✅ Have I mapped the dependencies correctly to enable parallel processing where appropriate?
+ ```
+
+
+ ---
+
+ ## 🤖 LLM-Specific Traps
+
+ AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
+
+ 1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+ 2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+ 3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+ 4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+ 5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
+
+ ---
+
+ ## 🏛️ Tribunal Integration (Anti-Hallucination)
+
+ **Slash command: `/review` or `/tribunal-full`**
+ **Active reviewers: `logic-reviewer` · `security-auditor`**
+
+ ### ❌ Forbidden AI Tropes
+
+ 1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+ 2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+ 3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+
+ ### ✅ Pre-Flight Self-Audit
+
+ Review these questions before confirming output:
+ ```
+ ✅ Did I rely ONLY on real, verified tools and methods?
+ ✅ Is this solution appropriately scoped to the user's constraints?
+ ✅ Did I handle potential failure modes and edge cases?
+ ✅ Have I avoided generic boilerplate that doesn't add value?
+ ```
+
+ ### 🛑 Verification-Before-Completion (VBC) Protocol
+
+ **CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+ - ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+ - ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.
@@ -0,0 +1,335 @@
+ ---
+ name: agentic-patterns
+ description: AI agent design principles. Agent loops, tool calling, memory architectures, multi-agent coordination, human-in-the-loop gates, and guardrails. Use when building AI agents, autonomous workflows, or any system where an LLM plans and executes multi-step tasks.
+ allowed-tools: Read, Write, Edit, Glob, Grep
+ version: 1.0.0
+ last-updated: 2026-03-12
+ applies-to-model: gemini-2.5-pro, claude-3-7-sonnet
+ ---
+
+ # Agentic Patterns
+
+ > An agent is a loop. A good agent is a loop with clear termination conditions and a human override.
+ > An agent without guardrails is a liability, not a feature.
+
+ ---
+
+ ## The Agent Loop
+
+ Every AI agent follows this fundamental pattern:
+
+ ```
+ PERCEIVE → PLAN → ACT → OBSERVE → EVALUATE → (repeat or terminate)
+
+ 1. PERCEIVE — What is the current state? What does the agent know?
+ 2. PLAN — What action will move toward the goal?
+ 3. ACT — Execute the tool, call the API, write the file
+ 4. OBSERVE — What changed? Did the action succeed?
+ 5. EVALUATE — Goal reached? Continue loop or return?
+ ```
+
+ ### When to Terminate
+
+ ```ts
+ // The three termination conditions — always define all three
+ type AgentResult = {
+   reason: 'goal_reached' | 'max_steps_exceeded' | 'human_escalation';
+   steps: number;
+   result: string;
+ };
+
+ const MAX_STEPS = 10; // Hard cap — never let agents loop indefinitely
+ ```
+
+ ---
+
+ ## Tool Calling Design
+
+ Tools are the agent's interface to the real world. Design them defensively:
+
+ ```ts
+ // Tool definition — what the LLM sees and how to call it
+ const tools = [
+   {
+     type: 'function',
+     function: {
+       name: 'search_database',
+       description: 'Search the product database. Use this before creating a new record to avoid duplicates.',
+       parameters: {
+         type: 'object',
+         properties: {
+           query: {
+             type: 'string',
+             description: 'Search terms — be specific',
+           },
+           limit: {
+             type: 'number',
+             description: 'Max results to return. Default: 5, max: 20',
+           },
+         },
+         required: ['query'],
+       },
+     },
+   },
+ ];
+
+ // Name → handler map, kept separate from the definitions above: those are
+ // plain schema objects shown to the LLM, not callable functions.
+ const toolImplementations: Record<string, (args: unknown) => Promise<string>> = { /* ... */ };
+
+ // Tool executor — validate before running.
+ // ToolArgsSchema (e.g. a zod schema) and agentPermissions are assumed defined elsewhere.
+ async function executeTool(name: string, args: unknown): Promise<string> {
+   // Validate args before executing — never trust LLM output directly
+   const parsed = ToolArgsSchema.safeParse(args);
+   if (!parsed.success) {
+     return `Error: Invalid arguments — ${parsed.error.message}`;
+   }
+
+   // Scope check — is this tool allowed for this agent's role?
+   if (!agentPermissions.includes(name)) {
+     return `Error: Tool '${name}' is not permitted for this agent`;
+   }
+
+   try {
+     return await toolImplementations[name](parsed.data);
+   } catch (err) {
+     return `Error: Tool execution failed — ${(err as Error).message}`;
+   }
+ }
+ ```
+
+ ---
+
+ ## Memory Architecture
+
+ Agents need different types of memory for different purposes:
+
+ ```
+ IN-CONTEXT MEMORY (cheapest, shortest-lived):
+ → Current conversation + recent tool outputs
+ → Limited by context window (~100k tokens)
+ → Good for: current task context
+
+ EXTERNAL SEMANTIC MEMORY (vector search):
+ → Long-term knowledge, past conversations
+ → Unlimited, but retrieval is approximate
+ → Good for: "What did we discuss about this topic before?"
+
+ EPISODIC MEMORY (structured log):
+ → Exact record of past actions and outcomes
+ → Good for: learning from past mistakes, auditability
+
+ PROCEDURAL MEMORY (system prompt + tools):
+ → How the agent knows how to behave and what it can do
+ → Good for: skills, personas, behavior rules
+ ```
+
+ ```ts
+ // External memory: retrieve relevant past context before each turn.
+ // embed, vectorDB, and systemPrompt are assumed defined elsewhere.
+ async function buildContext(userId: string, currentQuery: string) {
+   const queryEmbedding = await embed(currentQuery);
+
+   // Retrieve semantically relevant past interactions
+   const pastMemories = await vectorDB.search({
+     query: queryEmbedding,
+     filter: { userId },
+     limit: 5,
+   });
+
+   return [
+     { role: 'system', content: systemPrompt },
+     // Inject relevant past context — NOT entire history
+     { role: 'system', content: `Relevant past context:\n${pastMemories.map(m => m.content).join('\n')}` },
+     { role: 'user', content: currentQuery },
+   ];
+ }
+ ```
+
+ ---
+
+ ## Multi-Agent Coordination Patterns
+
+ When a task requires multiple specialists:
+
+ ### Supervisor Pattern
+
+ ```
+ Supervisor agent ─→ breaks task into subtasks
+
+   ├─→ Research agent (reads, gathers information)
+   ├─→ Writer agent (drafts based on research)
+   └─→ Reviewer agent (critiques the draft)
+
+   └─→ Supervisor collects results, makes final decision
+ ```
+
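The diagram above can be sketched as a fan-out/fan-in function. `Agent` here is a stand-in interface invented for illustration, not an API from this package.

```typescript
// `Agent` is a stand-in interface for illustration only.
type Agent = { name: string; run: (input: string) => Promise<string> };

// Fan out the subtasks to workers in parallel, then hand the combined
// drafts to the reviewer, whose critique is the supervisor's final result.
async function supervise(task: string, workers: Agent[], reviewer: Agent): Promise<string> {
  const drafts = await Promise.all(workers.map(w => w.run(task)));
  return reviewer.run(drafts.join('\n---\n'));
}
```

The design choice worth noting: workers run concurrently because their subtasks are independent, but the reviewer runs strictly after the join, which is what makes the flow a DAG rather than a free-for-all.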
+ ### Peer Review Pattern (Anti-Hallucination for Agents)
+
+ ```ts
+ // Two independent agents answer the same question — supervisor resolves disagreement
+ const [answerA, answerB] = await Promise.all([
+   agentA.complete(question),
+   agentB.complete(question),
+ ]);
+
+ if (answerA.answer === answerB.answer) {
+   return answerA; // Agreement — high confidence
+ }
+
+ // Disagreement — escalate to human or third tiebreaker
+ return await supervisor.resolve(question, answerA, answerB);
+ ```
+
+ ---
+
+ ## Human-in-the-Loop Gates
+
+ The most important agentic pattern. Agents should request human approval before:
+ - Deleting data
+ - Sending external communications (emails, webhooks)
+ - Spending real money (API calls with cost, purchases)
+ - Making irreversible changes
+ - Acting on low-confidence decisions
+
+ ```ts
+ async function agentLoop(task: string) {
+   const history: Array<{ action: unknown; result: string }> = [];
+
+   for (let step = 0; step < MAX_STEPS; step++) {
+     const planned = await llm.plan(task, history);
+
+     // ✅ Human gate before irreversible actions
+     if (planned.action.isIrreversible) {
+       const approved = await requestHumanApproval({
+         action: planned.action,
+         reason: planned.reasoning,
+         confidence: planned.confidence,
+       });
+       if (!approved) return { reason: 'human_rejected', step };
+     }
+
+     // ✅ Confidence gate — don't act when uncertain
+     if (planned.confidence < 0.7) {
+       return {
+         reason: 'human_escalation',
+         message: `Low confidence (${planned.confidence}) on: ${planned.action.description}`,
+       };
+     }
+
+     const result = await executeTool(planned.action.tool, planned.action.args);
+     history.push({ action: planned.action, result });
+
+     if (planned.goalReached) return { reason: 'goal_reached', steps: step + 1 };
+   }
+
+   return { reason: 'max_steps_exceeded', steps: MAX_STEPS };
+ }
+ ```
+
+ ---
+
+ ## Guardrails
+
+ Every production agent needs:
+
+ ```ts
+ const guardrails = {
+   // Input guardrails — reject bad prompts before they reach the agent
+   input: [
+     { check: 'no_prompt_injection', action: 'reject' },
+     { check: 'within_scope', action: 'reject' }, // Off-topic requests
+     { check: 'pii_detection', action: 'redact' }, // Redact before processing
+   ],
+
+   // Output guardrails — validate before returning
+   output: [
+     { check: 'no_hallucinated_citations', action: 'flag' },
+     { check: 'schema_valid', action: 'retry_once' },
+     { check: 'no_pii_leaked', action: 'reject' },
+   ],
+
+   // Resource guardrails — prevent runaway cost/loops
+   resource: [
+     { check: 'max_tokens_per_session', limit: 100_000 },
+     { check: 'max_tool_calls_per_session', limit: 50 },
+     { check: 'max_cost_per_session_usd', limit: 1.00 },
+   ],
+ };
+ ```
+
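Resource guardrails only help if something enforces them. A minimal budget check mirroring the limits above; the `SessionBudget` shape and field names are illustrative, not a tribunal-kit API.

```typescript
// Session budget shape mirroring the resource guardrails above;
// names and limits are illustrative.
type SessionBudget = { tokens: number; toolCalls: number; costUsd: number };

const LIMITS: SessionBudget = { tokens: 100_000, toolCalls: 50, costUsd: 1.0 };

// Compare a session's running totals against the limits and report overruns.
function checkBudget(used: SessionBudget): { ok: boolean; violations: string[] } {
  const violations = (Object.keys(LIMITS) as (keyof SessionBudget)[])
    .filter(k => used[k] > LIMITS[k])
    .map(k => `${k} exceeded: ${used[k]} > ${LIMITS[k]}`);
  return { ok: violations.length === 0, violations };
}
```

Call this before every tool execution and abort the session on the first violation; checking only at the end defeats the point of a cap.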
+ ---
+
+ ## Output Format
+
+ When this skill completes a task, structure your output as:
+
+ ```
+ ━━━ Agentic Patterns Output ━━━━━━━━━━━━━━━━━━━━━━━━
+ Task: [what was performed]
+ Result: [outcome summary — one line]
+ ─────────────────────────────────────────────────
+ Checks: ✅ [N passed] · ⚠️ [N warnings] · ❌ [N blocked]
+ VBC status: PENDING → VERIFIED
+ Evidence: [link to terminal output, test result, or file diff]
+ ```
+
+
+ ---
+
+ ## 🏛️ Tribunal Integration (Anti-Hallucination)
+
+ **Slash command: `/review-ai`**
+ **Active reviewers: `logic` · `security` · `ai-code-reviewer`**
+
+ ### ❌ Forbidden AI Tropes in Agentic Systems
+
+ 1. **Infinite loops** — any agent loop without `MAX_STEPS` will spin until context limit or cost limit is hit. Always define a hard cap.
+ 2. **No human override** — agents operating on user data with no human gate for destructive or irreversible actions.
+ 3. **Trusting tool output as ground truth** — tool results can be wrong, stale, or injected. Always validate before acting on them.
+ 4. **Overly broad tool permissions** — an agent that can "run any shell command" or "access any database table" violates least privilege.
+ 5. **No cost cap** — `Promise.all(100 tasks × $0.10 each)` = $10 surprise bill per trigger. Set cost limits at the session level.
+
+ ### ✅ Pre-Flight Self-Audit
+
+ ```
+ ✅ Is there a hard MAX_STEPS limit on every agent loop?
+ ✅ Are irreversible actions gated behind human approval?
+ ✅ Are tool results validated before being acted upon?
+ ✅ Does each agent follow least-privilege tool access (not "all tools")?
+ ✅ Is there a per-session token and cost cap?
+ ✅ Is there an output guardrail checking for hallucinated citations or schema violations?
+ ```
+
+
+ ---
+
+ ## 🤖 LLM-Specific Traps
+
+ AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
+
+ 1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+ 2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+ 3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+ 4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+ 5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
+
+ ---
+
+ ## 🏛️ Tribunal Integration (Anti-Hallucination)
+
+ **Slash command: `/review` or `/tribunal-full`**
+ **Active reviewers: `logic-reviewer` · `security-auditor`**
+
+ ### ❌ Forbidden AI Tropes
+
+ 1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+ 2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+ 3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+
+ ### ✅ Pre-Flight Self-Audit
+
+ Review these questions before confirming output:
+ ```
+ ✅ Did I rely ONLY on real, verified tools and methods?
+ ✅ Is this solution appropriately scoped to the user's constraints?
+ ✅ Did I handle potential failure modes and edge cases?
+ ✅ Have I avoided generic boilerplate that doesn't add value?
+ ```
+
+ ### 🛑 Verification-Before-Completion (VBC) Protocol
+
+ **CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+ - ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+ - ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.