@kevinrabun/judges 3.20.2 → 3.20.4
- package/CHANGELOG.md +14 -0
- package/README.md +16 -16
- package/dist/tools/deep-review.d.ts.map +1 -1
- package/dist/tools/deep-review.js +7 -1
- package/dist/tools/deep-review.js.map +1 -1
- package/package.json +2 -2
- package/server.json +5 -5
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,20 @@
 
 All notable changes to **@kevinrabun/judges** are documented here.
 
+## [3.20.4] — 2026-03-03
+
+### Fixed
+- **Stale documentation counts** — Updated all references across README, docs, server.json, action.yml, package.json, Dockerfile, extension metadata, examples, and scripts from "35 judges" → "37 judges", "47 patches" → "53 patches", and test badge "1515" → "1557". Historical changelog entries left unchanged.
+
+### Tests
+- **Doc-claim verification tests** — Added 42 new tests covering: JUDGES array count assertion (exactly 37), judge schema validation (id, name, domain, description), unique judge ID enforcement, scoring penalty constants (critical=30, high=18, medium=10, low=5, info=2), confidence-weighted deductions, score floor/ceiling, positive signal bonuses (+3/+3/+3/+2/+2/+2/+2/+1/+1/+1 with cap at 15), verdict threshold logic (fail/warning/pass boundaries), and STRUCT threshold rules not previously covered: STRUCT-001 (CC>10), STRUCT-007 (file CC>40), STRUCT-008 (CC>20), STRUCT-010 (>150 lines).
+- All 1,557 tests pass (976 judges + 218 negative + 251 subsystems + 70 extension + 42 tool-routing)
+
+## [3.20.3] — 2026-03-03
+
+### Fixed
+- **Azure resource ID false positive** — Layer 2 deep review no longer flags Azure resource identifiers (policy definition IDs, role definition IDs, tenant IDs, subscription GUIDs) as "invalid GUIDs" when they contain characters outside the hex range. All three deep-review builders (single-judge, tribunal, simplified) now include explicit guidance that Azure resource IDs are opaque platform constants and must not be validated for strict UUID compliance.
+
 ## [3.20.2] — 2026-03-03
 
 ### Fixed
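The scoring constants asserted by the new doc-claim tests in the 3.20.4 changelog entry can be read as a small scoring function. The following is a minimal sketch, not the package's implementation: it assumes a baseline of 100, a confidence value between 0 and 1 that scales each deduction, and a floor and ceiling of 0 and 100. The names `Finding`, `PENALTY`, and `scoreFindings` are hypothetical; only the penalty values and the bonus cap of 15 come from the changelog.

```typescript
type Severity = "critical" | "high" | "medium" | "low" | "info";

// Hypothetical finding shape; `confidence` is assumed to lie in [0, 1].
interface Finding {
  severity: Severity;
  confidence?: number;
}

// Penalty constants as listed in the changelog entry.
const PENALTY: Record<Severity, number> = {
  critical: 30,
  high: 18,
  medium: 10,
  low: 5,
  info: 2,
};

// Positive-signal bonuses are capped at 15 per the changelog.
const BONUS_CAP = 15;

function scoreFindings(findings: Finding[], bonuses: number[] = []): number {
  // Assumption: each deduction is the penalty scaled by the finding's confidence.
  const deduction = findings.reduce(
    (sum, f) => sum + PENALTY[f.severity] * (f.confidence ?? 1),
    0,
  );
  const bonus = Math.min(BONUS_CAP, bonuses.reduce((sum, b) => sum + b, 0));
  // Assumption: scores start at 100 and are clamped to the 0..100 range.
  return Math.max(0, Math.min(100, 100 - deduction + bonus));
}
```

Under these assumptions a single full-confidence critical finding yields 70, and the ten listed bonuses (which sum to 20) contribute at most 15.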
package/README.md
CHANGED
@@ -1,6 +1,6 @@
 # Judges Panel
 
-An MCP (Model Context Protocol) server that provides a panel of **
+An MCP (Model Context Protocol) server that provides a panel of **37 specialized judges** to evaluate AI-generated code — acting as an independent quality gate regardless of which project is being reviewed. Combines **deterministic pattern matching & AST analysis** (instant, offline, zero LLM calls) with **LLM-powered deep-review prompts** that let your AI assistant perform expert-persona analysis across all 37 domains.
 
 **Highlights:**
 - Includes an **App Builder Workflow (3-step)** demo for release decisions, plain-language risk summaries, and prioritized fixes — see [Try the Demo](#2-try-the-demo).
@@ -11,7 +11,7 @@ An MCP (Model Context Protocol) server that provides a panel of **35 specialized
 [](https://www.npmjs.com/package/@kevinrabun/judges)
 [](https://www.npmjs.com/package/@kevinrabun/judges)
 [](https://opensource.org/licenses/MIT)
-[](https://github.com/KevinRabun/judges/actions)
 
 ---
 
@@ -21,10 +21,10 @@ AI code generators (Copilot, Cursor, Claude, ChatGPT, etc.) write code fast —
 
 | | ESLint / Biome | SonarQube | Semgrep / CodeQL | **Judges** |
 |---|---|---|---|---|
-| **Scope** | Style + some bugs | Bugs + code smells | Security patterns | **
+| **Scope** | Style + some bugs | Bugs + code smells | Security patterns | **37 domains**: security, cost, compliance, a11y, API design, cloud, UX, … |
 | **AI-generated code focus** | No | No | Partial | **Purpose-built** for AI output failure modes |
 | **Setup** | Config per project | Server + scanner | Cloud or local | **One command**: `npx @kevinrabun/judges eval file.ts` |
-| **Auto-fix patches** | Some | No | No | **
+| **Auto-fix patches** | Some | No | No | **53 deterministic patches** — instant, offline |
 | **Non-technical output** | No | Dashboard | No | **Plain-language findings** with What/Why/Next |
 | **MCP native** | No | No | No | **Yes** — works inside Copilot, Claude, Cursor |
 | **SARIF output** | No | Yes | Yes | **Yes** — upload to GitHub Code Scanning |
@@ -79,7 +79,7 @@ judges eval --min-score 80 src/api.ts
 # One-line summary for scripts
 judges eval --summary src/api.ts
 
-# List all
+# List all 37 judges
 judges list
 ```
 
@@ -190,7 +190,7 @@ npm run build
 
 ### 2. Try the Demo
 
-Run the included demo to see all
+Run the included demo to see all 37 judges evaluate a purposely flawed API server:
 
 ```bash
 npm run demo
@@ -293,7 +293,7 @@ Install the **[Judges Panel](https://marketplace.visualstudio.com/items?itemName
 
 - **Inline diagnostics & quick-fixes** on every file save
 - **`@judges` chat participant** — type `@judges` in Copilot Chat, or just ask for a "judges panel review" and Copilot routes automatically
-- **Auto-configured MCP server** — all
+- **Auto-configured MCP server** — all 37 expert-persona prompts available to Copilot with zero setup
 
 ```bash
 code --install-extension kevinrabun.judges-panel
@@ -420,7 +420,7 @@ All commands support `--help` for usage details.
 
 ### `judges eval`
 
-Evaluate a file with all
+Evaluate a file with all 37 judges or a single judge.
 
 | Flag | Description |
 |------|-------------|
@@ -667,13 +667,13 @@ The tribunal operates in three layers:
 
 2. **AST-Based Structural Analysis** — The Code Structure judge (`STRUCT-*` rules) uses real Abstract Syntax Tree parsing to measure cyclomatic complexity, nesting depth, function length, parameter count, dead code, and type safety with precision that regex cannot achieve. All supported languages — **TypeScript, JavaScript, Python, Rust, Go, Java, C#, and C++** — are parsed via **tree-sitter WASM grammars** (real syntax trees compiled to WebAssembly, in-process, zero native dependencies). A scope-tracking structural parser is kept as a fallback when WASM grammars are unavailable. No external AST server required.
 
-3. **LLM-Powered Deep Analysis (Prompts)** — The server exposes MCP prompts (e.g., `judge-data-security`, `full-tribunal`) that provide each judge's expert persona as a system prompt. When used by an LLM-based client (Copilot, Claude, Cursor, etc.), the host LLM performs deeper, context-aware probabilistic analysis beyond what static patterns can detect. This is where the `systemPrompt` on each judge comes alive — Judges itself makes no LLM calls, but it provides the expert criteria so your AI assistant can act as
+3. **LLM-Powered Deep Analysis (Prompts)** — The server exposes MCP prompts (e.g., `judge-data-security`, `full-tribunal`) that provide each judge's expert persona as a system prompt. When used by an LLM-based client (Copilot, Claude, Cursor, etc.), the host LLM performs deeper, context-aware probabilistic analysis beyond what static patterns can detect. This is where the `systemPrompt` on each judge comes alive — Judges itself makes no LLM calls, but it provides the expert criteria so your AI assistant can act as 37 specialized reviewers.
 
 ---
 
 ## Composable by Design
 
-Judges Panel is a **dual-layer** review system: instant **deterministic tools** (offline, no API keys) for pattern and AST analysis, plus **
+Judges Panel is a **dual-layer** review system: instant **deterministic tools** (offline, no API keys) for pattern and AST analysis, plus **37 expert-persona MCP prompts** that unlock LLM-powered deep analysis when connected to an AI client. It does not try to be a CVE scanner or a linter. Those capabilities belong in dedicated MCP servers that an AI agent can orchestrate alongside Judges.
 
 ### Built-in AST Analysis (v2.0.0+)
 
@@ -722,7 +722,7 @@ When your AI coding assistant connects to multiple MCP servers, each one contrib
 
 | Layer | What It Does | Example Servers |
 |-------|-------------|-----------------|
-| **Judges Panel** |
+| **Judges Panel** | 37-judge quality gate — security patterns, AST analysis, cost, scalability, a11y, compliance, sovereignty, ethics, dependency health, agent instruction governance, AI code safety, framework safety | This server |
 | **CVE / SBOM** | Vulnerability scanning against live databases — known CVEs, license risks, supply chain | OSV, Snyk, Trivy, Grype MCP servers |
 | **Linting** | Language-specific style and correctness rules | ESLint, Ruff, Clippy MCP servers |
 | **Runtime Profiling** | Memory, CPU, latency measurement on running code | Custom profiling MCP servers |
@@ -876,7 +876,7 @@ Generated from https://github.com/microsoft/vscode on 2026-02-21T12:00:00.000Z.
 List all available judges with their domains and descriptions.
 
 ### `evaluate_code`
-Submit code to the **full judges panel**. All
+Submit code to the **full judges panel**. All 37 judges evaluate independently and return a combined verdict.
 
 | Parameter | Type | Required | Description |
 |-----------|------|----------|-------------|
@@ -900,7 +900,7 @@ Submit code to a **specific judge** for targeted review.
 | `config` | object | no | Inline configuration (see [Configuration](#configuration)) |
 
 ### `evaluate_project`
-Submit multiple files for **project-level analysis**. All
+Submit multiple files for **project-level analysis**. All 37 judges evaluate each file, plus cross-file architectural analysis detects code duplication, inconsistent error handling, and dependency cycles.
 
 | Parameter | Type | Required | Description |
 |-----------|------|----------|-------------|
@@ -911,7 +911,7 @@ Submit multiple files for **project-level analysis**. All 35 judges evaluate eac
 | `config` | object | no | Inline configuration (see [Configuration](#configuration)) |
 
 ### `evaluate_diff`
-Evaluate only the **changed lines** in a code diff. Runs all
+Evaluate only the **changed lines** in a code diff. Runs all 37 judges on the full file but filters findings to lines you specify. Ideal for PR reviews and incremental analysis.
 
 | Parameter | Type | Required | Description |
 |-----------|------|----------|-------------|
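The `evaluate_diff` description in the hunk above says the judges run on the full file and the findings are then filtered to the lines you specify. A minimal sketch of that filtering step follows, assuming each finding carries a single line number. The `LineFinding` shape and the `filterToChangedLines` helper are hypothetical illustrations, not the tool's actual parameter or result types.

```typescript
// Hypothetical finding shape: a rule ID plus the 1-based line it points at.
interface LineFinding {
  ruleId: string;
  line: number;
}

// Keep only findings that land on one of the changed lines.
function filterToChangedLines(
  findings: LineFinding[],
  changedLines: number[],
): LineFinding[] {
  const changed = new Set(changedLines);
  return findings.filter((finding) => changed.has(finding.line));
}

// Usage: findings come from evaluating the whole file; only line 10 was changed.
const allFindings: LineFinding[] = [
  { ruleId: "X-001", line: 3 },
  { ruleId: "X-002", line: 10 },
];
const reported = filterToChangedLines(allFindings, [10, 11]);
```

Evaluating the whole file first keeps cross-line context available to each judge, while the filter keeps a PR review focused on what actually changed.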
@@ -979,7 +979,7 @@ Each judge has a corresponding prompt for LLM-powered deep analysis:
 | `judge-agent-instructions` | Deep review of agent instruction markdown quality and safety |
 | `judge-ai-code-safety` | Deep review of AI-generated code risks: prompt injection, insecure LLM output handling, debug defaults, missing validation |
 | `judge-framework-safety` | Deep review of framework-specific safety: React hooks, Express middleware, Next.js SSR/SSG, Angular/Vue patterns |
-| `full-tribunal` | All
+| `full-tribunal` | All 37 judges in a single prompt |
 
 ---
 
@@ -1101,7 +1101,7 @@ Each judge scores the code from **0 to 100**:
 - **WARNING** — Any high finding, any medium finding, or score < 80
 - **PASS** — Score ≥ 80 with no critical, high, or medium findings
 
-The **overall tribunal score** is the average of all
+The **overall tribunal score** is the average of all 37 judges. The overall verdict fails if **any** judge fails.
 
 ---
 
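The aggregation rule in the last README hunk (the overall score is the average across judges, and the overall verdict fails if any judge fails) can be sketched as below. This is an illustration under stated assumptions rather than the package's code: `JudgeResult` and `overallTribunal` are hypothetical names, and the diff does not say how an overall warning is derived, so the sketch assumes that any warning judge makes the overall verdict a warning.

```typescript
type Verdict = "pass" | "warning" | "fail";

// Hypothetical per-judge result; scores range from 0 to 100 per the README.
interface JudgeResult {
  judgeId: string;
  score: number;
  verdict: Verdict;
}

function overallTribunal(results: JudgeResult[]): { score: number; verdict: Verdict } {
  if (results.length === 0) {
    throw new Error("at least one judge result is required");
  }
  // Overall score: plain average of all judge scores.
  const score = results.reduce((sum, r) => sum + r.score, 0) / results.length;

  let verdict: Verdict = "pass";
  if (results.some((r) => r.verdict === "fail")) {
    // Stated in the README: any failing judge fails the tribunal.
    verdict = "fail";
  } else if (results.some((r) => r.verdict === "warning")) {
    // Assumption: warnings propagate the same way; the diff does not confirm this.
    verdict = "warning";
  }
  return { score, verdict };
}
```

One consequence of the stated rule is that a high average cannot rescue a run: a single failing judge fails the tribunal even when the other judges score well.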
package/dist/tools/deep-review.d.ts.map
CHANGED
@@ -1 +1 @@
-
{"version":3,"file":"deep-review.d.ts","sourceRoot":"","sources":["../../src/tools/deep-review.ts"],"names":[],"mappings":"AAOA,OAAO,KAAK,EAAE,eAAe,EAAE,MAAM,aAAa,CAAC;AAqBnD;;;;GAIG;AACH,wBAAgB,sBAAsB,CAAC,YAAY,EAAE,MAAM,GAAG,OAAO,CAIpE;AAiBD,6DAA6D;AAC7D,eAAO,MAAM,wBAAwB,QAGoD,CAAC;AAE1F,sDAAsD;AACtD,eAAO,MAAM,oBAAoB,QAMqC,CAAC;AAIvE,wBAAgB,iCAAiC,CAAC,KAAK,EAAE,eAAe,EAAE,QAAQ,EAAE,MAAM,EAAE,OAAO,CAAC,EAAE,MAAM,GAAG,MAAM,
+
{"version":3,"file":"deep-review.d.ts","sourceRoot":"","sources":["../../src/tools/deep-review.ts"],"names":[],"mappings":"AAOA,OAAO,KAAK,EAAE,eAAe,EAAE,MAAM,aAAa,CAAC;AAqBnD;;;;GAIG;AACH,wBAAgB,sBAAsB,CAAC,YAAY,EAAE,MAAM,GAAG,OAAO,CAIpE;AAiBD,6DAA6D;AAC7D,eAAO,MAAM,wBAAwB,QAGoD,CAAC;AAE1F,sDAAsD;AACtD,eAAO,MAAM,oBAAoB,QAMqC,CAAC;AAIvE,wBAAgB,iCAAiC,CAAC,KAAK,EAAE,eAAe,EAAE,QAAQ,EAAE,MAAM,EAAE,OAAO,CAAC,EAAE,MAAM,GAAG,MAAM,CA8CpH;AAID,wBAAgB,8BAA8B,CAAC,MAAM,EAAE,eAAe,EAAE,EAAE,QAAQ,EAAE,MAAM,EAAE,OAAO,CAAC,EAAE,MAAM,GAAG,MAAM,CA6CpH;AAOD,wBAAgB,gCAAgC,CAAC,QAAQ,EAAE,MAAM,EAAE,OAAO,CAAC,EAAE,MAAM,GAAG,MAAM,CAiC3F"}
package/dist/tools/deep-review.js
CHANGED
@@ -73,6 +73,10 @@ export function buildSingleJudgeDeepReviewSection(judge, language, context) {
 md += `Every finding MUST cite specific code evidence (exact line numbers, API calls, variable names, or patterns). `;
 md += `Do NOT flag the absence of a feature unless you can identify where it SHOULD have been implemented and why it is required for THIS code. `;
 md += `Speculative findings erode developer trust — prefer fewer, high-confidence findings over many uncertain ones.\n\n`;
+md += `**IaC identifier handling:** Azure resource identifiers (policy definition IDs, role definition IDs, `;
+md += `built-in policy assignments, subscription GUIDs, tenant IDs, etc.) are opaque platform identifiers `;
+md += `provided by Microsoft. Do NOT validate them for strict UUID/GUID hex compliance or flag them as \"invalid\" — `;
+md += `they may contain characters outside the hex range and are still correct. Treat all Azure resource IDs as verbatim constants.\n\n`;
 md += `### False Positive Review\n\n`;
 md += `Before adding new findings, **review each pattern-based finding above for false positives.** `;
 md += `Static pattern matching can flag code that is actually correct — for example:\n`;
@@ -110,7 +114,7 @@ export function buildTribunalDeepReviewSection(judges, language, context) {
 for (const judge of judges) {
 md += `### ${judge.name} — ${judge.domain}\n\n`;
 md += `${judge.description}\n\n`;
-md += `**Rule prefix:** \`${judge.rulePrefix}-\` · **Precision Mandate:** Every finding MUST cite specific code evidence. Do NOT flag absent features speculatively. Prefer fewer, high-confidence findings over many uncertain ones.\n\n`;
+md += `**Rule prefix:** \`${judge.rulePrefix}-\` · **Precision Mandate:** Every finding MUST cite specific code evidence. Do NOT flag absent features speculatively. Do NOT validate Azure resource identifiers for strict UUID/GUID hex compliance — they are opaque platform constants. Prefer fewer, high-confidence findings over many uncertain ones.\n\n`;
 md += `---\n\n`;
 }
 md += `### False Positive Review\n\n`;
@@ -158,6 +162,8 @@ export function buildSimplifiedDeepReviewSection(language, context) {
 md += `7. **Infrastructure** — IaC best practices, network rules, identity management (if applicable)\n\n`;
 md += `### Precision Mandate\n\n`;
 md += `Every finding MUST cite specific code evidence. Do NOT flag absent features speculatively. `;
+md += `Do NOT validate Azure resource identifiers (policy IDs, role IDs, tenant IDs) for strict UUID/GUID hex compliance — `;
+md += `they are opaque platform constants provided by Microsoft. `;
 md += `Prefer fewer, high-confidence findings over many uncertain ones.\n\n`;
 md += `### Response Format\n\n`;
 md += `For each finding provide: severity (critical/high/medium/low/info), title, description, `;
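The guidance added to the three builders is prompt text for the host LLM, not a runtime check. To illustrate the false positive it targets, the sketch below contrasts a strict UUID test with the "treat as opaque" behavior the prompts ask for. Everything here is hypothetical: the helper names, the `isAzureResourceId` flag, and the sample identifier (made up for illustration, not a real Azure ID) are assumptions.

```typescript
// Strict 8-4-4-4-12 hexadecimal layout, case-insensitive.
const STRICT_UUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isStrictUuid(value: string): boolean {
  return STRICT_UUID.test(value);
}

// Hypothetical reviewer decision: Azure resource identifiers are treated as
// opaque platform constants and never flagged, whatever characters they contain.
function shouldFlagAsInvalidGuid(value: string, isAzureResourceId: boolean): boolean {
  if (isAzureResourceId) {
    return false;
  }
  return !isStrictUuid(value);
}

// Made-up identifier containing characters outside the hex range.
const opaqueId = "0000zzzz-0000-0000-0000-00000000000g";

// A strict check rejects it, which is the false positive described in the changelog.
const strictResult = isStrictUuid(opaqueId);
// With the opaque-constant rule applied, it is not flagged.
const flagged = shouldFlagAsInvalidGuid(opaqueId, true);
```

How a reviewer recognizes that a value is an Azure resource identifier is left open here; in the package that judgment is delegated to the LLM through the prompt wording shown in the hunks above.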
package/dist/tools/deep-review.js.map
CHANGED
@@ -1 +1 @@
-
{"version":3,"file":"deep-review.js","sourceRoot":"","sources":["../../src/tools/deep-review.ts"],"names":[],"mappings":"AAAA,gFAAgF;AAChF,sEAAsE;AACtE,wEAAwE;AACxE,wEAAwE;AACxE,oBAAoB;AACpB,iFAAiF;AAIjF,gFAAgF;AAEhF,mFAAmF;AACnF,MAAM,gBAAgB,GAAG;IACvB,iCAAiC;IACjC,+BAA+B;IAC/B,2BAA2B;IAC3B,yBAAyB;IACzB,gCAAgC;IAChC,wBAAwB;IACxB,yBAAyB;IACzB,yBAAyB;IACzB,wBAAwB;IACxB,0BAA0B;IAC1B,oBAAoB;IACpB,2BAA2B;IAC3B,4BAA4B;CAC7B,CAAC;AAEF;;;;GAIG;AACH,MAAM,UAAU,sBAAsB,CAAC,YAAoB;IACzD,IAAI,YAAY,CAAC,MAAM,GAAG,GAAG;QAAE,OAAO,KAAK,CAAC;IAC5C,MAAM,KAAK,GAAG,YAAY,CAAC,WAAW,EAAE,CAAC,IAAI,EAAE,CAAC;IAChD,OAAO,gBAAgB,CAAC,IAAI,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,KAAK,CAAC,QAAQ,CAAC,CAAC,CAAC,CAAC,CAAC;AACzD,CAAC;AAED,gFAAgF;AAChF,8EAA8E;AAC9E,yEAAyE;AACzE,8EAA8E;AAE9E,MAAM,kBAAkB,GACtB,qFAAqF;IACrF,4EAA4E;IAC5E,mFAAmF;IACnF,0EAA0E;IAC1E,0DAA0D;IAC1D,mFAAmF,CAAC;AAEtF,gFAAgF;AAEhF,6DAA6D;AAC7D,MAAM,CAAC,MAAM,wBAAwB,GACnC,8EAA8E;IAC9E,4EAA4E;IAC5E,uFAAuF,CAAC;AAE1F,sDAAsD;AACtD,MAAM,CAAC,MAAM,oBAAoB,GAC/B,gFAAgF;IAChF,iFAAiF;IACjF,sEAAsE;IACtE,0EAA0E;IAC1E,yEAAyE;IACzE,oEAAoE,CAAC;AAEvE,gFAAgF;AAEhF,MAAM,UAAU,iCAAiC,CAAC,KAAsB,EAAE,QAAgB,EAAE,OAAgB;IAC1G,IAAI,EAAE,GAAG,aAAa,CAAC;IACvB,EAAE,IAAI,2CAA2C,CAAC;IAClD,EAAE,IAAI,kBAAkB,CAAC;IACzB,EAAE,IAAI,oEAAoE,CAAC;IAC3E,EAAE,IAAI,yFAAyF,CAAC;IAChG,EAAE,IAAI,0FAA0F,CAAC;IACjG,EAAE,IAAI,kEAAkE,QAAQ,yCAAyC,CAAC;IAC1H,EAAE,IAAI,kKAAkK,CAAC;IAEzK,IAAI,OAAO,EAAE,CAAC;QACZ,EAAE,IAAI,yBAAyB,OAAO,MAAM,CAAC;IAC/C,CAAC;IAED,EAAE,IAAI,OAAO,KAAK,CAAC,IAAI,MAAM,KAAK,CAAC,MAAM,MAAM,CAAC;IAChD,EAAE,IAAI,GAAG,KAAK,CAAC,WAAW,MAAM,CAAC;IACjC,EAAE,IAAI,2BAA2B,CAAC;IAClC,EAAE,IAAI,+GAA+G,CAAC;IACtH,EAAE,IAAI,2IAA2I,CAAC;IAClJ,EAAE,IAAI,mHAAmH,CAAC;
+
{"version":3,"file":"deep-review.js","sourceRoot":"","sources":["../../src/tools/deep-review.ts"],"names":[],"mappings":"AAAA,gFAAgF;AAChF,sEAAsE;AACtE,wEAAwE;AACxE,wEAAwE;AACxE,oBAAoB;AACpB,iFAAiF;AAIjF,gFAAgF;AAEhF,mFAAmF;AACnF,MAAM,gBAAgB,GAAG;IACvB,iCAAiC;IACjC,+BAA+B;IAC/B,2BAA2B;IAC3B,yBAAyB;IACzB,gCAAgC;IAChC,wBAAwB;IACxB,yBAAyB;IACzB,yBAAyB;IACzB,wBAAwB;IACxB,0BAA0B;IAC1B,oBAAoB;IACpB,2BAA2B;IAC3B,4BAA4B;CAC7B,CAAC;AAEF;;;;GAIG;AACH,MAAM,UAAU,sBAAsB,CAAC,YAAoB;IACzD,IAAI,YAAY,CAAC,MAAM,GAAG,GAAG;QAAE,OAAO,KAAK,CAAC;IAC5C,MAAM,KAAK,GAAG,YAAY,CAAC,WAAW,EAAE,CAAC,IAAI,EAAE,CAAC;IAChD,OAAO,gBAAgB,CAAC,IAAI,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,KAAK,CAAC,QAAQ,CAAC,CAAC,CAAC,CAAC,CAAC;AACzD,CAAC;AAED,gFAAgF;AAChF,8EAA8E;AAC9E,yEAAyE;AACzE,8EAA8E;AAE9E,MAAM,kBAAkB,GACtB,qFAAqF;IACrF,4EAA4E;IAC5E,mFAAmF;IACnF,0EAA0E;IAC1E,0DAA0D;IAC1D,mFAAmF,CAAC;AAEtF,gFAAgF;AAEhF,6DAA6D;AAC7D,MAAM,CAAC,MAAM,wBAAwB,GACnC,8EAA8E;IAC9E,4EAA4E;IAC5E,uFAAuF,CAAC;AAE1F,sDAAsD;AACtD,MAAM,CAAC,MAAM,oBAAoB,GAC/B,gFAAgF;IAChF,iFAAiF;IACjF,sEAAsE;IACtE,0EAA0E;IAC1E,yEAAyE;IACzE,oEAAoE,CAAC;AAEvE,gFAAgF;AAEhF,MAAM,UAAU,iCAAiC,CAAC,KAAsB,EAAE,QAAgB,EAAE,OAAgB;IAC1G,IAAI,EAAE,GAAG,aAAa,CAAC;IACvB,EAAE,IAAI,2CAA2C,CAAC;IAClD,EAAE,IAAI,kBAAkB,CAAC;IACzB,EAAE,IAAI,oEAAoE,CAAC;IAC3E,EAAE,IAAI,yFAAyF,CAAC;IAChG,EAAE,IAAI,0FAA0F,CAAC;IACjG,EAAE,IAAI,kEAAkE,QAAQ,yCAAyC,CAAC;IAC1H,EAAE,IAAI,kKAAkK,CAAC;IAEzK,IAAI,OAAO,EAAE,CAAC;QACZ,EAAE,IAAI,yBAAyB,OAAO,MAAM,CAAC;IAC/C,CAAC;IAED,EAAE,IAAI,OAAO,KAAK,CAAC,IAAI,MAAM,KAAK,CAAC,MAAM,MAAM,CAAC;IAChD,EAAE,IAAI,GAAG,KAAK,CAAC,WAAW,MAAM,CAAC;IACjC,EAAE,IAAI,2BAA2B,CAAC;IAClC,EAAE,IAAI,+GAA+G,CAAC;IACtH,EAAE,IAAI,2IAA2I,CAAC;IAClJ,EAAE,IAAI,mHAAmH,CAAC;IAC1H,EAAE,IAAI,uGAAuG,CAAC;IAC9G,EAAE,IAAI,qGAAqG,CAAC;IAC5G,EAAE,IAAI,gHAAgH,CAAC;IACvH,EAAE,IAAI,kIAAkI,CAAC;IAEzI,EAAE,IAAI,+BAA+B,CAAC;IACtC,EAAE,IAAI,+FAA+F,CAAC;IACtG,EAAE,IAAI,iFAAiF,CAAC;IACxF,EAAE,IAAI,gIAAgI,CAAC;IACvI,EAAE,IAAI,kEAAkE,CAAC;IACzE,EAAE,IAAI,kFAAkF,CAAC;IACzF,EAAE,IAAI,yEAAyE,CAAC;IAChF,EAAE,IAA
I,oHAAoH,CAAC;IAC3H,EAAE,IAAI,0BAA0B,CAAC;IACjC,EAAE,IAAI,yDAAyD,CAAC;IAEhE,EAAE,IAAI,yBAAyB,CAAC;IAChC,EAAE,IAAI,0EAA0E,CAAC;IACjF,EAAE,IAAI,uBAAuB,KAAK,CAAC,UAAU,OAAO,CAAC;IACrD,EAAE,IAAI,4DAA4D,CAAC;IACnE,EAAE,IAAI,gFAAgF,CAAC;IACvF,EAAE,IAAI,mHAAmH,CAAC;IAC1H,EAAE,IAAI,gGAAgG,CAAC;IACvG,EAAE,IAAI,oIAAoI,CAAC;IAE3I,OAAO,EAAE,CAAC;AACZ,CAAC;AAED,gFAAgF;AAEhF,MAAM,UAAU,8BAA8B,CAAC,MAAyB,EAAE,QAAgB,EAAE,OAAgB;IAC1G,IAAI,EAAE,GAAG,aAAa,CAAC;IACvB,EAAE,IAAI,2CAA2C,CAAC;IAClD,EAAE,IAAI,kBAAkB,CAAC;IACzB,EAAE,IAAI,6EAA6E,CAAC;IACpF,EAAE,IAAI,yFAAyF,CAAC;IAChG,EAAE,IAAI,0FAA0F,CAAC;IACjG,EAAE,IAAI,kEAAkE,QAAQ,qCAAqC,MAAM,CAAC,MAAM,iBAAiB,CAAC;IACpJ,EAAE,IAAI,wKAAwK,CAAC;IAC/K,EAAE,IAAI,qLAAqL,CAAC;IAE5L,IAAI,OAAO,EAAE,CAAC;QACZ,EAAE,IAAI,yBAAyB,OAAO,MAAM,CAAC;IAC/C,CAAC;IAED,KAAK,MAAM,KAAK,IAAI,MAAM,EAAE,CAAC;QAC3B,EAAE,IAAI,OAAO,KAAK,CAAC,IAAI,MAAM,KAAK,CAAC,MAAM,MAAM,CAAC;QAChD,EAAE,IAAI,GAAG,KAAK,CAAC,WAAW,MAAM,CAAC;QACjC,EAAE,IAAI,sBAAsB,KAAK,CAAC,UAAU,mTAAmT,CAAC;QAChW,EAAE,IAAI,SAAS,CAAC;IAClB,CAAC;IAED,EAAE,IAAI,+BAA+B,CAAC;IACtC,EAAE,IAAI,+FAA+F,CAAC;IACtG,EAAE,IAAI,iFAAiF,CAAC;IACxF,EAAE,IAAI,gIAAgI,CAAC;IACvI,EAAE,IAAI,kEAAkE,CAAC;IACzE,EAAE,IAAI,kFAAkF,CAAC;IACzF,EAAE,IAAI,yEAAyE,CAAC;IAChF,EAAE,IAAI,oHAAoH,CAAC;IAC3H,EAAE,IAAI,0BAA0B,CAAC;IACjC,EAAE,IAAI,yDAAyD,CAAC;IAEhE,EAAE,IAAI,yBAAyB,CAAC;IAChC,EAAE,IAAI,4FAA4F,CAAC;IACnG,EAAE,IAAI,gCAAgC,CAAC;IACvC,EAAE,IAAI,4DAA4D,CAAC;IACnE,EAAE,IAAI,kFAAkF,CAAC;IACzF,EAAE,IAAI,sIAAsI,CAAC;IAC7I,EAAE,IAAI,uKAAuK,CAAC;IAC9K,EAAE,IAAI,2CAA2C,CAAC;IAClD,EAAE,IAAI,mDAAmD,CAAC;IAC1D,EAAE,IAAI,mDAAmD,CAAC;IAE1D,OAAO,EAAE,CAAC;AACZ,CAAC;AAED,gFAAgF;AAChF,6EAA6E;AAC7E,8EAA8E;AAC9E,wEAAwE;AAExE,MAAM,UAAU,gCAAgC,CAAC,QAAgB,EAAE,OAAgB;IACjF,IAAI,EAAE,GAAG,aAAa,CAAC;IACvB,EAAE,IAAI,2CAA2C,CAAC;IAClD,EAAE,IAAI,kBAAkB,CAAC;IACzB,EAAE,IAAI,qFAAqF,CAAC;IAC5F,EAAE,IAAI,+BAA+B,QAAQ,4CAA4C,CAAC;IAE1F,IAAI,OAAO,EAAE,CAAC;QACZ,EAAE,IAAI,yBAAyB,OAAO,MAAM,CAAC;IAC/C,CAAC;IAED,EAAE,IAAI,yCAAyC,CAAC;IAChD,EAAE,IAAI,qGAAqG,CAAC;IAC5G,EAAE,IAAI
,+GAA+G,CAAC;IACtH,EAAE,IAAI,8EAA8E,CAAC;IACrF,EAAE,IAAI,+EAA+E,CAAC;IACtF,EAAE,IAAI,2FAA2F,CAAC;IAClG,EAAE,IAAI,6FAA6F,CAAC;IACpG,EAAE,IAAI,oGAAoG,CAAC;IAE3G,EAAE,IAAI,2BAA2B,CAAC;IAClC,EAAE,IAAI,6FAA6F,CAAC;IACpG,EAAE,IAAI,sHAAsH,CAAC;IAC7H,EAAE,IAAI,4DAA4D,CAAC;IACnE,EAAE,IAAI,sEAAsE,CAAC;IAE7E,EAAE,IAAI,yBAAyB,CAAC;IAChC,EAAE,IAAI,0FAA0F,CAAC;IACjG,EAAE,IAAI,0FAA0F,CAAC;IACjG,EAAE,IAAI,8DAA8D,CAAC;IACrE,EAAE,IAAI,uGAAuG,CAAC;IAE9G,OAAO,EAAE,CAAC;AACZ,CAAC"}
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
 "name": "@kevinrabun/judges",
-"version": "3.20.
-"description": "
+"version": "3.20.4",
+"description": "37 specialized judges that evaluate AI-generated code for security, cost, and quality.",
 "mcpName": "io.github.KevinRabun/judges",
 "type": "module",
 "main": "dist/index.js",
package/server.json
CHANGED
@@ -2,17 +2,17 @@
 "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
 "name": "io.github.KevinRabun/judges",
 "title": "Judges Panel",
-"description": "
+"description": "37 judges that evaluate AI-generated code for security, cost, and quality with built-in AST.",
 "repository": {
 "url": "https://github.com/kevinrabun/judges",
 "source": "github"
 },
-"version": "3.20.
+"version": "3.20.4",
 "packages": [
 {
 "registryType": "npm",
 "identifier": "@kevinrabun/judges",
-"version": "3.20.
+"version": "3.20.4",
 "transport": {
 "type": "stdio"
 }
@@ -21,7 +21,7 @@
 "tools": [
 {
 "name": "evaluate_code",
-"description": "Submit code to the full
+"description": "Submit code to the full 37-judge tribunal for security, cost, and quality analysis. Handles all code types including application code, infrastructure-as-code (Bicep, Terraform, ARM), and configuration files."
 },
 {
 "name": "evaluate_code_single_judge",
@@ -59,7 +59,7 @@
 "prompts": [
 {
 "name": "full-tribunal",
-"description": "Convene all
+"description": "Convene all 37 judges for a comprehensive LLM-powered deep review."
 },
 {
 "name": "judge-{id}",