tribunal-kit 4.0.1 → 4.2.0
- package/.agent/GEMINI.md +4 -2
- package/.agent/agents/api-architect.md +66 -0
- package/.agent/agents/db-latency-auditor.md +216 -0
- package/.agent/agents/precedence-reviewer.md +41 -4
- package/.agent/agents/resilience-reviewer.md +88 -0
- package/.agent/agents/schema-reviewer.md +67 -0
- package/.agent/agents/throughput-optimizer.md +299 -0
- package/.agent/agents/vitals-reviewer.md +223 -0
- package/.agent/history/case-law/cases/case-0001.json +33 -0
- package/.agent/history/case-law/index.json +35 -0
- package/.agent/rules/GEMINI.md +20 -3
- package/.agent/scripts/case_law_manager.py +237 -7
- package/.agent/skills/agent-organizer/SKILL.md +42 -0
- package/.agent/skills/agentic-patterns/SKILL.md +42 -0
- package/.agent/skills/ai-prompt-injection-defense/SKILL.md +42 -0
- package/.agent/skills/api-patterns/SKILL.md +42 -0
- package/.agent/skills/api-security-auditor/SKILL.md +42 -0
- package/.agent/skills/app-builder/SKILL.md +42 -0
- package/.agent/skills/app-builder/templates/SKILL.md +70 -0
- package/.agent/skills/appflow-wireframe/SKILL.md +42 -0
- package/.agent/skills/architecture/SKILL.md +42 -0
- package/.agent/skills/authentication-best-practices/SKILL.md +42 -0
- package/.agent/skills/bash-linux/SKILL.md +42 -0
- package/.agent/skills/behavioral-modes/SKILL.md +42 -0
- package/.agent/skills/brainstorming/SKILL.md +42 -0
- package/.agent/skills/building-native-ui/SKILL.md +42 -0
- package/.agent/skills/clean-code/SKILL.md +42 -0
- package/.agent/skills/code-review-checklist/SKILL.md +42 -0
- package/.agent/skills/config-validator/SKILL.md +42 -0
- package/.agent/skills/csharp-developer/SKILL.md +42 -0
- package/.agent/skills/data-validation-schemas/SKILL.md +320 -0
- package/.agent/skills/database-design/SKILL.md +42 -0
- package/.agent/skills/deployment-procedures/SKILL.md +42 -0
- package/.agent/skills/devops-engineer/SKILL.md +42 -0
- package/.agent/skills/devops-incident-responder/SKILL.md +42 -0
- package/.agent/skills/documentation-templates/SKILL.md +42 -0
- package/.agent/skills/edge-computing/SKILL.md +42 -0
- package/.agent/skills/error-resilience/SKILL.md +420 -0
- package/.agent/skills/extract-design-system/SKILL.md +42 -0
- package/.agent/skills/framer-motion-expert/SKILL.md +42 -0
- package/.agent/skills/frontend-design/SKILL.md +42 -0
- package/.agent/skills/game-design-expert/SKILL.md +42 -0
- package/.agent/skills/game-engineering-expert/SKILL.md +42 -0
- package/.agent/skills/geo-fundamentals/SKILL.md +42 -0
- package/.agent/skills/github-operations/SKILL.md +42 -0
- package/.agent/skills/gsap-core/SKILL.md +302 -0
- package/.agent/skills/gsap-frameworks/SKILL.md +201 -0
- package/.agent/skills/gsap-performance/SKILL.md +127 -0
- package/.agent/skills/gsap-plugins/SKILL.md +474 -0
- package/.agent/skills/gsap-react/SKILL.md +183 -0
- package/.agent/skills/gsap-scrolltrigger/SKILL.md +344 -0
- package/.agent/skills/gsap-timeline/SKILL.md +155 -0
- package/.agent/skills/gsap-utils/SKILL.md +332 -0
- package/.agent/skills/i18n-localization/SKILL.md +42 -0
- package/.agent/skills/intelligent-routing/SKILL.md +72 -1
- package/.agent/skills/lint-and-validate/SKILL.md +42 -0
- package/.agent/skills/llm-engineering/SKILL.md +42 -0
- package/.agent/skills/local-first/SKILL.md +42 -0
- package/.agent/skills/mcp-builder/SKILL.md +42 -0
- package/.agent/skills/mobile-design/SKILL.md +42 -0
- package/.agent/skills/monorepo-management/SKILL.md +326 -0
- package/.agent/skills/motion-engineering/SKILL.md +42 -0
- package/.agent/skills/nextjs-react-expert/SKILL.md +42 -0
- package/.agent/skills/nodejs-best-practices/SKILL.md +42 -0
- package/.agent/skills/observability/SKILL.md +42 -0
- package/.agent/skills/parallel-agents/SKILL.md +42 -0
- package/.agent/skills/performance-profiling/SKILL.md +42 -0
- package/.agent/skills/plan-writing/SKILL.md +42 -0
- package/.agent/skills/platform-engineer/SKILL.md +42 -0
- package/.agent/skills/playwright-best-practices/SKILL.md +42 -0
- package/.agent/skills/powershell-windows/SKILL.md +42 -0
- package/.agent/skills/project-idioms/SKILL.md +42 -0
- package/.agent/skills/python-patterns/SKILL.md +42 -0
- package/.agent/skills/python-pro/SKILL.md +42 -0
- package/.agent/skills/react-specialist/SKILL.md +42 -0
- package/.agent/skills/readme-builder/SKILL.md +42 -0
- package/.agent/skills/realtime-patterns/SKILL.md +42 -0
- package/.agent/skills/red-team-tactics/SKILL.md +42 -0
- package/.agent/skills/rust-pro/SKILL.md +42 -0
- package/.agent/skills/seo-fundamentals/SKILL.md +42 -0
- package/.agent/skills/server-management/SKILL.md +42 -0
- package/.agent/skills/shadcn-ui-expert/SKILL.md +42 -0
- package/.agent/skills/skill-creator/SKILL.md +42 -0
- package/.agent/skills/sql-pro/SKILL.md +42 -0
- package/.agent/skills/supabase-postgres-best-practices/SKILL.md +42 -0
- package/.agent/skills/swiftui-expert/SKILL.md +42 -0
- package/.agent/skills/systematic-debugging/SKILL.md +42 -0
- package/.agent/skills/tailwind-patterns/SKILL.md +42 -0
- package/.agent/skills/tdd-workflow/SKILL.md +42 -0
- package/.agent/skills/test-result-analyzer/SKILL.md +42 -0
- package/.agent/skills/testing-patterns/SKILL.md +42 -0
- package/.agent/skills/trend-researcher/SKILL.md +42 -0
- package/.agent/skills/typescript-advanced/SKILL.md +327 -0
- package/.agent/skills/ui-ux-pro-max/SKILL.md +42 -0
- package/.agent/skills/ui-ux-researcher/SKILL.md +42 -0
- package/.agent/skills/vue-expert/SKILL.md +42 -0
- package/.agent/skills/vulnerability-scanner/SKILL.md +42 -0
- package/.agent/skills/web-accessibility-auditor/SKILL.md +42 -0
- package/.agent/skills/web-design-guidelines/SKILL.md +42 -0
- package/.agent/skills/webapp-testing/SKILL.md +42 -0
- package/.agent/skills/whimsy-injector/SKILL.md +42 -0
- package/.agent/skills/workflow-optimizer/SKILL.md +42 -0
- package/.agent/workflows/tribunal-backend.md +13 -2
- package/.agent/workflows/tribunal-full.md +15 -8
- package/.agent/workflows/tribunal-speed.md +183 -0
- package/bin/tribunal-kit.js +10 -2
- package/package.json +2 -2
- package/.agent/skills/gsap-expert/SKILL.md +0 -194
@@ -22,9 +22,11 @@ import os
 import sys
 import json
 import hashlib
+import math
 import re
 from pathlib import Path
 from datetime import datetime
+from collections import Counter
 
 # ── Colours ──────────────────────────────────────────────────────────────────
 GREEN = "\033[92m"
@@ -59,7 +61,28 @@ VALID_DOMAINS = {
     "performance", "mobile", "testing", "devops", "general"
 }
 
-VALID_VERDICTS = {"REJECTED", "APPROVED_WITH_CONDITIONS", "PRECEDENT_SET"}
+VALID_VERDICTS = {"REJECTED", "APPROVED_WITH_CONDITIONS", "PRECEDENT_SET", "OVERRULED"}
+
+# ── Noise filter (skip trivial rejections during auto-record) ────────────────
+NOISE_PATTERNS = [
+    r"\bformatting\b",
+    r"\bwhitespace\b",
+    r"\bindent(ation)?\b",
+    r"\bimport\s+order\b",
+    r"\btrailing\s+(comma|space|whitespace)\b",
+    r"\bsemicolon\b",
+    r"\bprettier\b",
+    r"\beslint.*fix\b",
+    r"\blint.*only\b",
+]
+
+def is_noise_rejection(reason: str) -> bool:
+    """Return True if the rejection reason is trivial (formatting/lint-only)."""
+    lower = reason.lower()
+    for pattern in NOISE_PATTERNS:
+        if re.search(pattern, lower):
+            return True
+    return False
 
 # ── Trivial-change filter (Semantic Delta) ────────────────────────────────────
 TRIVIAL_PATTERNS = [
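As a rough illustration of how the noise filter added in this hunk behaves, the sketch below mirrors `NOISE_PATTERNS` and `is_noise_rejection` from the diff and runs them on a few sample reasons. The sample strings are invented for this example and do not come from the package.

```python
import re

# Patterns mirrored from the hunk above; each is matched against the lower-cased reason.
NOISE_PATTERNS = [
    r"\bformatting\b",
    r"\bwhitespace\b",
    r"\bindent(ation)?\b",
    r"\bimport\s+order\b",
    r"\btrailing\s+(comma|space|whitespace)\b",
    r"\bsemicolon\b",
    r"\bprettier\b",
    r"\beslint.*fix\b",
    r"\blint.*only\b",
]


def is_noise_rejection(reason: str) -> bool:
    """Return True if any noise pattern matches the rejection reason."""
    lower = reason.lower()
    return any(re.search(pattern, lower) for pattern in NOISE_PATTERNS)


# Hypothetical rejection reasons, invented for this example.
samples = [
    "Fix import order in utils.py",
    "Unparameterised SQL query allows injection",
    "Missing semicolon, but also leaks the auth token",
]
for reason in samples:
    print(is_noise_rejection(reason), reason)
```

The third sample suggests one consequence of the design: a single matching keyword marks the whole reason as noise, even when the same sentence also describes a substantive problem.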
@@ -158,9 +181,54 @@ def extract_tags(text: str) -> list[str]:
                 break
     return tags
 
-# ── Similarity scoring
+# ── Similarity scoring (TF-IDF Cosine — token-free) ──────────────────────────
+def _build_idf(corpus: list[list[str]]) -> dict[str, float]:
+    """Compute Inverse Document Frequency across all case tag-lists."""
+    n = len(corpus)
+    if n == 0:
+        return {}
+    doc_freq: dict[str, int] = Counter()
+    for tags in corpus:
+        for unique_tag in set(tags):
+            doc_freq[unique_tag] += 1
+    return {term: math.log((n + 1) / (df + 1)) + 1.0 for term, df in doc_freq.items()}
+
+
+def tfidf_cosine_similarity(query_tags: list[str], case_tags: list[str],
+                            idf: dict[str, float]) -> float:
+    """
+    TF-IDF weighted cosine similarity. No LLM required.
+    Significantly more accurate than Jaccard for code pattern matching.
+    """
+    if not query_tags or not case_tags:
+        return 0.0
+
+    # Term frequency vectors
+    tf_q = Counter(query_tags)
+    tf_c = Counter(case_tags)
+
+    # All unique terms
+    all_terms = set(tf_q) | set(tf_c)
+
+    # Weighted vectors
+    dot = 0.0
+    mag_q = 0.0
+    mag_c = 0.0
+    for term in all_terms:
+        w_q = tf_q.get(term, 0) * idf.get(term, 1.0)
+        w_c = tf_c.get(term, 0) * idf.get(term, 1.0)
+        dot += w_q * w_c
+        mag_q += w_q ** 2
+        mag_c += w_c ** 2
+
+    if mag_q == 0 or mag_c == 0:
+        return 0.0
+    return dot / (math.sqrt(mag_q) * math.sqrt(mag_c))
+
+
+# Backward-compatibility alias
 def jaccard_similarity(tags_a: list[str], tags_b: list[str]) -> float:
-    """
+    """Legacy fallback — kept for compatibility but no longer primary."""
     if not tags_a or not tags_b:
         return 0.0
     set_a, set_b = set(tags_a), set(tags_b)
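To make the scoring in this hunk easier to follow, here is a minimal, self-contained sketch of the same smoothed-IDF and cosine computation. The function names `build_idf` and `tfidf_cosine` and the three-case corpus are assumptions made for illustration; only the formulas are taken from the diff.

```python
import math
from collections import Counter


def build_idf(corpus: list[list[str]]) -> dict[str, float]:
    """Smoothed IDF: log((n + 1) / (df + 1)) + 1, the formula used in the diff."""
    n = len(corpus)
    if n == 0:
        return {}
    doc_freq = Counter()
    for tags in corpus:
        doc_freq.update(set(tags))  # count each tag once per case
    return {t: math.log((n + 1) / (df + 1)) + 1.0 for t, df in doc_freq.items()}


def tfidf_cosine(query: list[str], case: list[str], idf: dict[str, float]) -> float:
    """Cosine similarity between TF-IDF weighted tag vectors."""
    if not query or not case:
        return 0.0
    tf_q, tf_c = Counter(query), Counter(case)
    dot = mag_q = mag_c = 0.0
    for term in set(tf_q) | set(tf_c):
        weight = idf.get(term, 1.0)  # terms missing from the corpus get weight 1.0
        w_q, w_c = tf_q[term] * weight, tf_c[term] * weight
        dot += w_q * w_c
        mag_q += w_q ** 2
        mag_c += w_c ** 2
    if mag_q == 0 or mag_c == 0:
        return 0.0
    return dot / (math.sqrt(mag_q) * math.sqrt(mag_c))


# Hypothetical tag lists standing in for stored cases.
corpus = [["sql", "injection"], ["sql", "orm"], ["react", "hooks"]]
idf = build_idf(corpus)
query = ["sql", "injection"]
for case in corpus:
    print(case, round(tfidf_cosine(query, case, idf), 3))
```

In this toy corpus `sql` appears in two of three cases and `injection` in one, so `injection` receives the larger weight. A case that shares only the common tag therefore scores well below an exact match, and a case with no shared tags scores zero.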
@@ -271,10 +339,14 @@ def cmd_search_cases(args: list[str]) -> None:
         print(f"{YELLOW}No cases recorded yet. Use 'add-case' to record your first rejection.{RESET}")
         return
 
-    #
+    # Build corpus IDF from all stored cases
+    corpus = [entry.get("tags", []) for entry in index["cases"]]
+    idf = _build_idf(corpus)
+
+    # Score every case with TF-IDF cosine
     scored = []
     for entry in index["cases"]:
-        score =
+        score = tfidf_cosine_similarity(query_tags, entry.get("tags", []), idf)
         if score > 0.0:
             scored.append((score, entry))
 
@@ -438,6 +510,160 @@ def cmd_stats(args: list[str]) -> None:
     print(f"{CYAN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━{RESET}\n")
 
 
+def cmd_auto_record(args: list[str]) -> None:
+    """
+    Non-interactive auto-recording for AI-driven case creation.
+    Called by the precedence-reviewer after a Tribunal rejection.
+
+    Usage:
+        python case_law_manager.py auto-record \\
+            --diff "code snippet" \\
+            --reason "why rejected" \\
+            --domain security \\
+            --verdict REJECTED \\
+            --reviewer security-auditor
+    """
+    # Parse flags
+    def get_flag(name: str) -> str:
+        flag = f"--{name}"
+        all_args = sys.argv[1:]
+        if flag in all_args:
+            idx = all_args.index(flag)
+            if idx + 1 < len(all_args):
+                return all_args[idx + 1]
+        return ""
+
+    diff_text = get_flag("diff")
+    reason = get_flag("reason")
+    domain = get_flag("domain") or "general"
+    verdict = get_flag("verdict") or "REJECTED"
+    reviewer = get_flag("reviewer") or None
+    pr_ref = get_flag("pr-ref") or None
+
+    if not diff_text or not reason:
+        print(f"{RED}✖ auto-record requires --diff and --reason flags.{RESET}")
+        print(f" Usage: auto-record --diff \"code\" --reason \"why\" --domain security --reviewer agent-name")
+        sys.exit(1)
+
+    # Noise filter — skip trivial rejections
+    if is_noise_rejection(reason):
+        print(f"{DIM}⊘ Skipped: trivial rejection (noise filter matched).{RESET}")
+        return
+
+    if domain not in VALID_DOMAINS:
+        domain = "general"
+    if verdict not in VALID_VERDICTS:
+        verdict = "REJECTED"
+
+    # Duplicate check: fingerprint match
+    fingerprint = content_hash(diff_text)
+    index = load_index()
+    for existing in index["cases"]:
+        if existing.get("fingerprint") == fingerprint:
+            print(f"{YELLOW}⊘ Duplicate: Case #{existing['id']:04d} already records this pattern.{RESET}")
+            return
+
+    # Build and persist
+    delta = semantic_delta(diff_text)
+    tags = extract_tags(diff_text + " " + reason)
+    case_id = index["next_id"]
+
+    case_record = {
+        "id": case_id,
+        "fingerprint": fingerprint,
+        "timestamp": datetime.now().isoformat(timespec="seconds"),
+        "domain": domain,
+        "verdict": verdict,
+        "reason": reason.strip(),
+        "pr_ref": pr_ref,
+        "reviewer": reviewer,
+        "tags": tags,
+        "diff_raw": diff_text.strip(),
+        "diff_delta": delta,
+        "auto_recorded": True
+    }
+
+    save_case(case_record)
+
+    index["cases"].append({
+        "id": case_id,
+        "fingerprint": fingerprint,
+        "domain": domain,
+        "verdict": verdict,
+        "tags": tags,
+        "timestamp": case_record["timestamp"],
+        "reason_summary": reason.strip()[:120]
+    })
+    index["next_id"] = case_id + 1
+    save_index(index)
+
+    print(f"{GREEN}✔ Auto-recorded Case #{case_id:04d}{RESET} [{verdict}] domain={domain}")
+    print(f" {DIM}Reason: {reason[:80]}{RESET}")
+
+
+def cmd_overrule(args: list[str]) -> None:
+    """
+    Formally overrule a past precedent. Does NOT delete the case —
+    marks it as OVERRULED with a reason, preserving legal history.
+    """
+    case_id = None
+    if "--id" in args:
+        try:
+            case_id = int(args[args.index("--id") + 1])
+        except (IndexError, ValueError):
+            pass
+
+    if case_id is None:
+        print(f"{RED}✖ Provide a case ID: overrule --id 7{RESET}")
+        sys.exit(1)
+
+    case_record = load_case(case_id)
+    if not case_record:
+        print(f"{RED}✖ Case #{case_id:04d} not found.{RESET}")
+        sys.exit(1)
+
+    if case_record["verdict"] == "OVERRULED":
+        print(f"{YELLOW}Case #{case_id:04d} is already OVERRULED.{RESET}")
+        return
+
+    # Get reason for overruling
+    reason = None
+    if "--reason" in args:
+        try:
+            reason = args[args.index("--reason") + 1]
+        except (IndexError, ValueError):
+            pass
+
+    if not reason:
+        reason = prompt_line("Reason for overruling this precedent:")
+
+    if not reason or not reason.strip():
+        print(f"{RED}✖ An overrule reason is required.{RESET}")
+        sys.exit(1)
+
+    # Preserve history
+    old_verdict = case_record["verdict"]
+    case_record["verdict"] = "OVERRULED"
+    case_record["overruled_at"] = datetime.now().isoformat(timespec="seconds")
+    case_record["overrule_reason"] = reason.strip()
+    case_record["previous_verdict"] = old_verdict
+    save_case(case_record)
+
+    # Update index entry
+    index = load_index()
+    for entry in index["cases"]:
+        if entry["id"] == case_id:
+            entry["verdict"] = "OVERRULED"
+            break
+    save_index(index)
+
+    print(f"\n{GREEN}✔ Case #{case_id:04d} OVERRULED{RESET}")
+    print(f" {DIM}Previous verdict : {old_verdict}{RESET}")
+    print(f" {DIM}Overrule reason : {reason.strip()}{RESET}")
+    print(f" {DIM}The case is preserved in history but no longer blocks reviews.{RESET}")
+    print()
+
+
 # ── Input helpers ─────────────────────────────────────────────────────────────
 def prompt_multiline(prompt: str, sentinel: str) -> str:
     print(f" {BOLD}{prompt}{RESET}")
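The overrule step in this hunk keeps the case and records what it used to be. The sketch below isolates that state change as a pure function, under some assumptions: the name `overrule` and the sample record are hypothetical, and unlike `cmd_overrule` in the diff it returns a copy and raises `ValueError` instead of mutating in place and calling `sys.exit`.

```python
from datetime import datetime


def overrule(case_record: dict, reason: str) -> dict:
    """Return a copy of the case marked OVERRULED, keeping the earlier verdict."""
    if not reason or not reason.strip():
        raise ValueError("An overrule reason is required.")
    if case_record["verdict"] == "OVERRULED":
        return case_record  # already overruled; nothing to change
    updated = dict(case_record)
    updated["previous_verdict"] = case_record["verdict"]
    updated["verdict"] = "OVERRULED"
    updated["overruled_at"] = datetime.now().isoformat(timespec="seconds")
    updated["overrule_reason"] = reason.strip()
    return updated


# Hypothetical case record; only the fields used here are shown.
case = {"id": 7, "verdict": "REJECTED", "reason": "Raw SQL built from user input"}
result = overrule(case, "  ORM now parameterises this query  ")
print(result["verdict"], result["previous_verdict"], result["overrule_reason"])
```

Because the earlier verdict moves to `previous_verdict` rather than being overwritten, the record still shows why the pattern was once rejected.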
@@ -480,9 +706,11 @@ def prompt_choice(label: str, choices: list[str], default: str) -> str:
 # ── Main ──────────────────────────────────────────────────────────────────────
 COMMANDS = {
     "add-case": cmd_add_case,
+    "auto-record": cmd_auto_record,
     "search-cases": cmd_search_cases,
     "list": cmd_list,
     "show": cmd_show,
+    "overrule": cmd_overrule,
     "export": cmd_export,
     "stats": cmd_stats,
 }
@@ -498,10 +726,12 @@ def main() -> None:
 {BOLD}case_law_manager.py{RESET} — Tribunal Case Law Engine
 
 {BOLD}Commands:{RESET}
-add-case Record a new rejected pattern
-
+add-case Record a new rejected pattern (interactive)
+auto-record --diff --reason Record a rejection (non-interactive, for AI agents)
+search-cases --query <text> Find relevant precedents (TF-IDF cosine, token-free)
 list [--domain <domain>] List all recorded cases
 show --id <N> Show full diff for a case
+overrule --id <N> Formally overrule a past precedent
 export [--stdout] Export all cases to Markdown
 stats Show breakdown by domain/verdict
 
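The `auto-record` flags listed in this help text are read by the small `get_flag` helper shown earlier in the diff. The sketch below reproduces that lookup with the argument list passed in explicitly, which is an assumption made so the example is testable; the diff reads `sys.argv[1:]` directly. The sample command line is hypothetical.

```python
def get_flag(argv: list[str], name: str) -> str:
    """Return the token following --name, or "" if the flag or its value is missing."""
    flag = f"--{name}"
    if flag in argv:
        idx = argv.index(flag)
        if idx + 1 < len(argv):
            return argv[idx + 1]
    return ""


# Hypothetical command line, split the way sys.argv[1:] would be.
argv = [
    "auto-record",
    "--diff", "eval(user_input)",
    "--reason", "Arbitrary code execution",
    "--domain", "security",
    "--reviewer",
]

diff_text = get_flag(argv, "diff")
reason = get_flag(argv, "reason")
domain = get_flag(argv, "domain") or "general"
verdict = get_flag(argv, "verdict") or "REJECTED"  # flag absent: default applies
reviewer = get_flag(argv, "reviewer") or None      # flag present but no value follows
print(diff_text, reason, domain, verdict, reviewer, sep=" | ")
```

One caveat of this style of parsing: the helper returns whatever token follows the flag, so a flag given without a value in the middle of the command line would pick up the next flag name as its value.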
@@ -98,3 +98,45 @@ Automation without oversight is reckless. The Organizer manages when to pause an
 2. **Recovery Gate (After 3 Failures):** "The database migration script has failed 3 times. I am halting. How would you like to proceed?"
 
 ---
+
+
+---
+
+## 🤖 LLM-Specific Traps
+
+AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
+
+1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
+
+---
+
+## 🏛️ Tribunal Integration (Anti-Hallucination)
+
+**Slash command: `/review` or `/tribunal-full`**
+**Active reviewers: `logic-reviewer` · `security-auditor`**
+
+### ❌ Forbidden AI Tropes
+
+1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+
+### ✅ Pre-Flight Self-Audit
+
+Review these questions before confirming output:
+```
+✅ Did I rely ONLY on real, verified tools and methods?
+✅ Is this solution appropriately scoped to the user's constraints?
+✅ Did I handle potential failure modes and edge cases?
+✅ Have I avoided generic boilerplate that doesn't add value?
+```
+
+### 🛑 Verification-Before-Completion (VBC) Protocol
+
+**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+- ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.
@@ -263,3 +263,45 @@ Evidence: [link to terminal output, test result, or file diff]
 ```
 
 ---
+
+
+---
+
+## 🤖 LLM-Specific Traps
+
+AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
+
+1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
+
+---
+
+## 🏛️ Tribunal Integration (Anti-Hallucination)
+
+**Slash command: `/review` or `/tribunal-full`**
+**Active reviewers: `logic-reviewer` · `security-auditor`**
+
+### ❌ Forbidden AI Tropes
+
+1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+
+### ✅ Pre-Flight Self-Audit
+
+Review these questions before confirming output:
+```
+✅ Did I rely ONLY on real, verified tools and methods?
+✅ Is this solution appropriately scoped to the user's constraints?
+✅ Did I handle potential failure modes and edge cases?
+✅ Have I avoided generic boilerplate that doesn't add value?
+```
+
+### 🛑 Verification-Before-Completion (VBC) Protocol
+
+**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+- ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.
@@ -132,3 +132,45 @@ Many injections occur because the LLM includes malicious data in its output, whi
 - **Enforce JSON Schemas.** If the LLM goes off-script and starts blabbering, Zod validation should instantly fail the parsing and reject the output.
 
 ---
+
+
+---
+
+## 🤖 LLM-Specific Traps
+
+AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
+
+1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
+
+---
+
+## 🏛️ Tribunal Integration (Anti-Hallucination)
+
+**Slash command: `/review` or `/tribunal-full`**
+**Active reviewers: `logic-reviewer` · `security-auditor`**
+
+### ❌ Forbidden AI Tropes
+
+1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+
+### ✅ Pre-Flight Self-Audit
+
+Review these questions before confirming output:
+```
+✅ Did I rely ONLY on real, verified tools and methods?
+✅ Is this solution appropriately scoped to the user's constraints?
+✅ Did I handle potential failure modes and edge cases?
+✅ Have I avoided generic boilerplate that doesn't add value?
+```
+
+### 🛑 Verification-Before-Completion (VBC) Protocol
+
+**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+- ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.
@@ -195,3 +195,45 @@ Protect against:
 | **OAuth 2.0 / OIDC** | Third-party login, delegated access |
 | **API Key** | Server-to-server, public API consumers |
 | **Passkey (WebAuthn)** | Modern passwordless (2026+) |
+
+
+---
+
+## 🤖 LLM-Specific Traps
+
+AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
+
+1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
+
+---
+
+## 🏛️ Tribunal Integration (Anti-Hallucination)
+
+**Slash command: `/review` or `/tribunal-full`**
+**Active reviewers: `logic-reviewer` · `security-auditor`**
+
+### ❌ Forbidden AI Tropes
+
+1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+
+### ✅ Pre-Flight Self-Audit
+
+Review these questions before confirming output:
+```
+✅ Did I rely ONLY on real, verified tools and methods?
+✅ Is this solution appropriately scoped to the user's constraints?
+✅ Did I handle potential failure modes and edge cases?
+✅ Have I avoided generic boilerplate that doesn't add value?
+```
+
+### 🛑 Verification-Before-Completion (VBC) Protocol
+
+**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+- ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.
@@ -141,3 +141,45 @@ const server = new ApolloServer({
 ```
 
 ---
+
+
+---
+
+## 🤖 LLM-Specific Traps
+
+AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
+
+1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
+2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
+3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
+4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
+
+---
+
+## 🏛️ Tribunal Integration (Anti-Hallucination)
+
+**Slash command: `/review` or `/tribunal-full`**
+**Active reviewers: `logic-reviewer` · `security-auditor`**
+
+### ❌ Forbidden AI Tropes
+
+1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
+2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
+3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
+
+### ✅ Pre-Flight Self-Audit
+
+Review these questions before confirming output:
+```
+✅ Did I rely ONLY on real, verified tools and methods?
+✅ Is this solution appropriately scoped to the user's constraints?
+✅ Did I handle potential failure modes and edge cases?
+✅ Have I avoided generic boilerplate that doesn't add value?
+```
+
+### 🛑 Verification-Before-Completion (VBC) Protocol
+
+**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
+- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
+- ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.
@@ -520,3 +520,45 @@ Monorepo:
|
|
|
520
520
|
|Payment|Stripe|LemonSqueezy, Paddle|
|
|
521
521
|
|Email|-|Resend, SendGrid|
|
|
522
522
|
|Search|-|Algolia, Typesense|
|
|
523
|
+
|
|
524
|
+
|
|
525
|
+
---
|
|
526
|
+
|
|
527
|
+
## 🤖 LLM-Specific Traps
|
|
528
|
+
|
|
529
|
+
AI coding assistants often fall into specific bad habits when dealing with this domain. These are strictly forbidden:
|
|
530
|
+
|
|
531
|
+
1. **Over-engineering:** Proposing complex abstractions or distributed systems when a simpler approach suffices.
|
|
532
|
+
2. **Hallucinated Libraries/Methods:** Using non-existent methods or packages. Always `// VERIFY` or check `package.json` / `requirements.txt`.
|
|
533
|
+
3. **Skipping Edge Cases:** Writing the "happy path" and ignoring error handling, timeouts, or data validation.
|
|
534
|
+
4. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
|
|
535
|
+
5. **Silent Degradation:** Catching and suppressing errors without logging or re-raising.
|
|
536
|
+
|
|
537
|
+
---
|
|
538
|
+
|
|
539
|
+
## 🏛️ Tribunal Integration (Anti-Hallucination)
|
|
540
|
+
|
|
541
|
+
**Slash command: `/review` or `/tribunal-full`**
|
|
542
|
+
**Active reviewers: `logic-reviewer` · `security-auditor`**
|
|
543
|
+
|
|
544
|
+
### ❌ Forbidden AI Tropes
|
|
545
|
+
|
|
546
|
+
1. **Blind Assumptions:** Never make an assumption without documenting it clearly with `// VERIFY: [reason]`.
|
|
547
|
+
2. **Silent Degradation:** Catching and suppressing errors without logging or handling.
|
|
548
|
+
3. **Context Amnesia:** Forgetting the user's constraints and offering generic advice instead of tailored solutions.
|
|
549
|
+
|
|
550
|
+
### ✅ Pre-Flight Self-Audit
|
|
551
|
+
|
|
552
|
+
Review these questions before confirming output:
|
|
553
|
+
```
|
|
554
|
+
✅ Did I rely ONLY on real, verified tools and methods?
|
|
555
|
+
✅ Is this solution appropriately scoped to the user's constraints?
|
|
556
|
+
✅ Did I handle potential failure modes and edge cases?
|
|
557
|
+
✅ Have I avoided generic boilerplate that doesn't add value?
|
|
558
|
+
```
|
|
559
|
+
|
|
560
|
+
### 🛑 Verification-Before-Completion (VBC) Protocol
|
|
561
|
+
|
|
562
|
+
**CRITICAL:** You must follow a strict "evidence-based closeout" state machine.
|
|
563
|
+
- ❌ **Forbidden:** Declaring a task complete because the output "looks correct."
|
|
564
|
+
- ✅ **Required:** You are explicitly forbidden from finalizing any task without providing **concrete evidence** (terminal output, passing tests, compile success, or equivalent proof) that your output works as intended.
|
|
@@ -35,3 +35,73 @@ allowed-tools: Read, Glob, Grep
|
|
|
35
35
|
2. Match to appropriate template
|
|
36
36
|
3. Read ONLY that template's TEMPLATE.md
|
|
37
37
|
4. Follow its tech stack and structure
|
|
38
|
+
|
|
39
|
+
---
|
|
40
|
+
|
|
41
|
+
## 🚨 LLM Trap Table
|
|
42
|
+
|
|
43
|
+
|Pattern|What AI Does Wrong|What Is Actually Correct|
|
|
44
|
+
|:---|:---|:---|
|
|
45
|
+
|[domain-specific trap 1]|[hallucination]|[correct behavior]|
|
|
46
|
+
|[domain-specific trap 2]|[hallucination]|[correct behavior]|
|
|
47
|
+
|[domain-specific trap 3]|[hallucination]|[correct behavior]|
|
|
48
|
+
|
|
49
|
+
---
|
|
50
|
+
|
|
51
|
+
## ✅ Pre-Flight Self-Audit
|
|
52
|
+
|
|
53
|
+
Before producing any output, verify:
|
|
54
|
+
``
|
|
55
|
+
✅ Did I read the actual files before making claims about them?
|
|
56
|
+
✅ Did I verify all method names against official documentation?
|
|
57
|
+
✅ Did I add // VERIFY: on any uncertain API calls?
|
|
58
|
+
✅ Are all imports from packages that actually exist in package.json?
|
|
59
|
+
✅ Did I test my logic with edge cases (null, empty, 0, max)?
|
|
60
|
+
✅ Did I avoid generating code for more than one module at a time?
|
|
61
|
+
✅ Am I working from evidence, not assumption?
|
|
62
|
+
``
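The checklist item about imports existing in package.json can be partly automated. The sketch below is an assumption about how such a check might look, not something the package ships: `undeclared_imports` and `package_name` are hypothetical helpers. It only consults `dependencies` and `devDependencies`, and it would flag Node built-ins that are imported without the `node:` prefix.

```python
import json
from typing import Iterable, List


def package_name(specifier: str) -> str:
    """Reduce an import specifier to its package name.

    "react/jsx-runtime" -> "react", "@scope/ui/button" -> "@scope/ui".
    """
    parts = specifier.split("/")
    return "/".join(parts[:2]) if specifier.startswith("@") else parts[0]


def undeclared_imports(package_json_text: str, specifiers: Iterable[str]) -> List[str]:
    """Return imported package names that package.json does not declare."""
    manifest = json.loads(package_json_text)
    declared = set(manifest.get("dependencies", {})) | set(
        manifest.get("devDependencies", {})
    )
    missing: List[str] = []
    for spec in specifiers:
        if spec.startswith((".", "/", "node:")):
            continue  # relative paths, absolute paths, and prefixed built-ins
        name = package_name(spec)
        if name not in declared and name not in missing:
            missing.append(name)
    return missing
```

Any name returned by `undeclared_imports` is a candidate for a `// VERIFY:` comment or for removal before the output is presented.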
|
|
63
|
+
|
|
64
|
+
---
|
|
65
|
+
|
|
66
|
+
## 🔠VBC Protocol (Verify → Build → Confirm)
|
|
67
|
+
|
|
68
|
+
``
|
|
69
|
+
VERIFY: Read the actual codebase before writing anything
|
|
70
|
+
BUILD: Generate the smallest meaningful unit of code
|
|
71
|
+
CONFIRM: Verify the output is correct before presenting
|
|
72
|
+
``
|
|
73
|
+
|
|
74
|
+
---
|
|
75
|
+
|
|
76
|
+
## 🚨 LLM Trap Table
|
|
77
|
+
|
|
78
|
+
|Pattern|What AI Does Wrong|What Is Actually Correct|
|
|
79
|
+
|:---|:---|:---|
|
|
80
|
+
|[domain-specific trap 1]|[hallucination]|[correct behavior]|
|
|
81
|
+
|[domain-specific trap 2]|[hallucination]|[correct behavior]|
|
|
82
|
+
|[domain-specific trap 3]|[hallucination]|[correct behavior]|
|
|
83
|
+
|
|
84
|
+
---
|
|
85
|
+
|
|
86
|
+
## ✅ Pre-Flight Self-Audit
|
|
87
|
+
|
|
88
|
+
Before producing any output, verify:
|
|
89
|
+
``
|
|
90
|
+
✅ Did I read the actual files before making claims about them?
|
|
91
|
+
✅ Did I verify all method names against official documentation?
|
|
92
|
+
✅ Did I add // VERIFY: on any uncertain API calls?
|
|
93
|
+
✅ Are all imports from packages that actually exist in package.json?
|
|
94
|
+
✅ Did I test my logic with edge cases (null, empty, 0, max)?
|
|
95
|
+
✅ Did I avoid generating code for more than one module at a time?
|
|
96
|
+
✅ Am I working from evidence, not assumption?
|
|
97
|
+
``
|
|
98
|
+
|
|
99
|
+
---
|
|
100
|
+
|
|
101
|
+
## 🔠VBC Protocol (Verify → Build → Confirm)
|
|
102
|
+
|
|
103
|
+
``
|
|
104
|
+
VERIFY: Read the actual codebase before writing anything
|
|
105
|
+
BUILD: Generate the smallest meaningful unit of code
|
|
106
|
+
CONFIRM: Verify the output is correct before presenting
|
|
107
|
+
``
|