@kevinrabun/judges 3.7.0 → 3.7.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +222 -0
- package/LICENSE +21 -0
- package/package.json +2 -1
- package/server.json +2 -2
package/CHANGELOG.md
ADDED
@@ -0,0 +1,222 @@
+# Changelog
+
+All notable changes to **@kevinrabun/judges** are documented here.
+
+## [3.7.1] — 2026-03-01
+
+### Fixed
+- Added root `LICENSE` file (MIT) — was referenced in `package.json` `files` but missing from tarball.
+- Added `CHANGELOG.md` to npm `files` array so it ships in the published package.
+- Fixed CHANGELOG date and test count accuracy.
+- VS Code extension: switched to `bundler` module resolution, fixed ESM/CJS import errors.
+- VS Code extension: added `.vscodeignore` tuning, `galleryBanner` metadata, esbuild bundling.
+
+---
+
+## [3.7.0] — 2026-03-01
+
+### Added
+- **`judges --version` command** — display installed version with update check.
+- **`--fix` flag on eval** — evaluate and auto-fix in one step: `judges eval --fix src/app.ts`.
+- **Glob / multi-file eval** — evaluate directories and patterns: `judges eval src/**/*.ts`.
+- **Progress indicators** — `[1/12] src/app.ts…` progress during multi-file evaluation.
+- **VS Code extension** — diagnostics, code actions, and quick-fix integration (`vscode-extension/`).
+- **README terminal mockup** — SVG-based visual showing evaluation output.
+- **`.judgesrc.example.json`** — annotated example configuration file.
+- **GitHub Marketplace metadata** — enhanced `action.yml` for Marketplace discovery.
+
+### Changed
+- `server.json` version synced to `3.7.0`.
+- README test badge updated to **842**.
+- Total test count: **842**.
+
+---
+
+## [3.6.0] — 2026-03-07
+
+### Added
+- **Plugin system** (`--plugin`) — load custom evaluator plugins from npm packages or local files.
+- **Finding fingerprints** — stable content-hash IDs for tracking findings across runs.
+- **Calibration mode** (`judges calibrate`) — tune judge thresholds against known-good codebases.
+- **Diagnostics format** (`--format diagnostics`) — LSP-compatible diagnostic output for editor integration.
+- **Comparison command** (`judges compare`) — side-by-side feature matrix vs ESLint, SonarQube, Semgrep, CodeQL.
+- **Language packs** (`judges pack`) — manage language-specific rule extensions.
+- **Config sharing** (`judges config export/import`) — export and import team configurations.
+- **Custom rules** (`judges rule create`) — define and manage custom evaluation rules.
+- **Fix history** — track applied patches with undo support.
+- **Smart output** — auto-detect terminal width and format output accordingly.
+- **Feedback command** (`judges feedback`) — submit false-positive feedback for rule tuning.
+- **Benchmark command** (`judges benchmark`) — run detection accuracy benchmarks against test suites.
+- **14 new subsystem tests** for plugins, fingerprinting, calibration, and diagnostics.
+
+### Changed
+- CLI expanded from 14 to 22 commands.
+- Output formats expanded from 7 to 8 (added `diagnostics`).
+- Total test count: **819** (up from 754).
+
+---
+
+### Added
+- **`judges diff` command** — evaluate only changed lines from unified diff / git diff output. Pipe `git diff` directly or pass a patch file.
+- **`judges deps` command** — analyze project dependencies for supply-chain risks across 11 manifest types (package.json, requirements.txt, Cargo.toml, go.mod, pom.xml, etc.).
+- **`judges baseline create` command** — create a baseline JSON file from current findings for future suppression.
+- **`judges completions` command** — generate shell completion scripts for bash, zsh, fish, and PowerShell.
+- **`judges docs` command** — generate per-judge rule documentation in Markdown format, with `--output` for file output.
+- **JUnit XML formatter** (`--format junit`) — CI/CD compatible output for Jenkins, Azure DevOps, GitHub Actions, GitLab CI.
+- **CodeClimate JSON formatter** (`--format codeclimate`) — GitLab Code Quality widget compatible output with MD5 fingerprints.
+- **Named presets** (`--preset`) — 6 built-in profiles: `strict`, `lenient`, `security-only`, `startup`, `compliance`, `performance`.
+- **Config file support** (`--config`) — auto-discovers `.judgesrc` / `.judgesrc.json` in project root with full JSON Schema validation support.
+- **`judgesrc.schema.json`** — JSON Schema for `.judgesrc` files with IDE autocomplete and validation.
+- **`--min-score` flag** — exit non-zero when overall score falls below threshold (e.g. `--min-score 80`).
+- **`--verbose` flag** — timing statistics and file-level detail in output.
+- **`--quiet` flag** — suppress informational output, only show findings.
+- **`--no-color` flag** — disable ANSI color codes for piped output.
+- **CI Templates** — `judges ci-templates github` generates GitHub Actions workflow YAML.
+- **24 new tests** covering all new formatters, commands, presets, and JSON Schema validation.
+
+### Changed
+- CLI expanded from 8 to 14 commands.
+- Output formats expanded from 5 to 7 (added `junit`, `codeclimate`).
+- Total test count: **754** (up from 730).
+
+---
+
+## [3.4.0] — 2026-03-04
+
+### Added
+- **Init wizard** (`judges init`) — interactive project setup generating `.judgesrc` config.
+- **Fix command** (`judges fix`) — auto-apply suggested patches from findings with `--apply` flag.
+- **Watch mode** (`judges watch`) — file-system watcher for continuous evaluation during development.
+- **Report command** (`judges report`) — full project analysis with HTML/JSON/Markdown output.
+- **Hook command** (`judges hook`) — git pre-commit hook installation.
+- **HTML formatter** — interactive browser-based report with severity filters and per-judge sections.
+- **Baseline suppression** — suppress known findings from previous runs.
+- **CI template generator** — `judges ci-templates` for GitLab CI, Azure Pipelines, Bitbucket Pipelines.
+
+### Changed
+- Total test count: **730**.
+
+---
+
+## [3.3.0] — 2026-03-02
+
+### Changed
+- **Unified tree-sitter AST** — consolidated `typescript-ast.ts` into `tree-sitter-ast.ts`, single parser for all 8 languages.
+- Removed legacy TypeScript Compiler API dependency.
+
+---
+
+## [3.2.0] — 2026-02-29
+
+### Added
+- **Tree-sitter WASM integration** — structural AST analysis for 8 languages (TypeScript, JavaScript, Python, Go, Rust, Java, C#, C++).
+- Language-specific structural patterns for each grammar.
+
+---
+
+## [3.1.1] — 2026-02-28
+
+### Added
+- **GitHub Action** (`action.yml`) — composite action for CI/CD with SARIF upload, fail-on-findings, and job summary.
+- **Dockerfile** — multi-stage Node 20 Alpine build with non-root user for containerized usage.
+- **GitHub Pages dashboard** (`docs/index.html`) — dark-themed dashboard showing project analysis results and judge directory.
+- **Real-world evidence document** (`docs/real-world-evidence.md`) — Express.js, Flask, FastAPI analysis + before/after showcase.
+- **Pages deployment workflow** (`.github/workflows/pages.yml`).
+
+---
+
+## [3.1.0] — 2026-02-28
+
+### Added
+- **CLI evaluation mode** — `npx @kevinrabun/judges eval --file app.ts` runs the full tribunal from the command line, no MCP setup required. Supports `--language`, `--format`, `--judge`, and stdin piping.
+- **Enhanced Python AST** — class-aware method extraction (`ClassName.method_name`), decorator detection, async function detection, self/cls parameter filtering, multi-line import handling.
+- **Framework-aware analysis** — detects 14 frameworks (Express, React, Django, Flask, Spring, FastAPI, etc.) and reduces confidence on framework-idiomatic findings to cut false positives.
+- **Content-hash LRU caching** — caches AST structure, taint flow, and tribunal results by content hash for faster re-evaluation of unchanged files.
+- **SARIF 2.1.0 structural validator** — `validateSarifLog()` checks all mandatory SARIF properties before output.
+- **Multi-line auto-fix patches** — 5 structural patch rules for Express helmet, CORS, rate limiting, error handlers, and health endpoints.
+- **Confidence-weighted scoring** — findings now carry estimated confidence; low-confidence findings have reduced score impact.
+- **Finding provenance** — every finding includes `provenance` field with rule ID and evidence trail for auditability.
+- **Absence-based finding demotion** — findings flagging *missing* patterns are demoted from critical/high to medium to reduce false positives.
+- **28 negative tests** for false positive prevention.
+- **169 subsystem unit tests** (scoring, dedup, config, patches, suppression, SARIF, Python parser).
+- **Quickstart example** (`examples/quickstart.ts`) using the package API.
+- **CHANGELOG.md** with full version history.
+
+### Fixed
+- `server.json` version now stays in sync with `package.json`.
+- MCP server version string updated from `2.0.0` to `3.1.0`.
+- Demo example includes guidance for both in-repo and package-installed usage.
+
+### Changed
+- Total test count: **899** (702 integration + 28 negative + 169 subsystem).
+- Python structural parser fully rewritten with two-pass class boundary detection.
+- Class name extraction added for all supported languages (Python, Java, C#, Rust, Go).
+
+---
+
+## [3.0.3] — 2026-02-27
+
+### Fixed
+- Resolved all 14 CodeQL ReDoS alerts via atomic character classes and possessive-style patterns.
+- Suppressed 4 intentional vulnerability alerts in `examples/sample-vulnerable-api.ts` (test fixture).
+- Resolved Dependabot `hono` IP spoofing alert via `overrides`.
+- GitHub Releases now auto-created on tag push (`publish-mcp.yml`).
+
+---
+
+## [3.0.2] — 2026-02-26
+
+### Fixed
+- Publish workflow repaired (npm provenance, correct trigger).
+- Removed dead code from build artifacts.
+
+---
+
+## [3.0.1] — 2026-02-26
+
+### Fixed
+- Dropped Node 18 from CI matrix (ESLint 10 requires Node >= 20).
+- Added adversarial mandate to code-structure and framework-safety judges.
+- Fixed `FW-` rule prefix in README documentation.
+
+---
+
+## [3.0.0] — 2026-02-25
+
+### Added
+- **Monolith decomposition**: 35 specialized judges split from single evaluator file.
+- **Built-in AST analysis** via TypeScript Compiler API — no separate parser needed.
+- **App Builder Workflow** (3-step): release decision, plain-language risk summaries, prioritized remediation tasks.
+- **V2 context-aware evaluation** with policy profiles, evidence calibration, specialty feedback, confidence scoring.
+- **Public repository URL reporting** — clone any public repo and generate a full tribunal report.
+- **Project-level analysis** with cross-file architectural detection (duplication, dependency cycles, god modules).
+- **Diff evaluation** — analyze only changed lines for PR reviews.
+- **Dependency analysis** — supply-chain manifest scanning.
+- **SARIF output** for GitHub Code Scanning integration.
+- **Inline suppression** via `judges-disable` comments.
+- CI/CD infrastructure with GitHub Actions (CI, publish, PR review, daily automation).
+
+---
+
+## [2.3.0] — 2026-02-24
+
+### Added
+- AI Code Safety judge with 12 AICS rules.
+- Full `suggestedFix` and `confidence` coverage across all 427 findings.
+- Multi-language detection via language pattern system.
+
+---
+
+[3.7.0]: https://github.com/KevinRabun/judges/compare/v3.6.0...v3.7.0
+[3.6.0]: https://github.com/KevinRabun/judges/compare/v3.5.0...v3.6.0
+[3.5.0]: https://github.com/KevinRabun/judges/compare/v3.4.0...v3.5.0
+[3.4.0]: https://github.com/KevinRabun/judges/compare/v3.3.0...v3.4.0
+[3.3.0]: https://github.com/KevinRabun/judges/compare/v3.2.0...v3.3.0
+[3.2.0]: https://github.com/KevinRabun/judges/compare/v3.1.1...v3.2.0
+[3.1.1]: https://github.com/KevinRabun/judges/compare/v3.1.0...v3.1.1
+[3.1.0]: https://github.com/KevinRabun/judges/compare/v3.0.3...v3.1.0
+[3.0.3]: https://github.com/KevinRabun/judges/compare/v3.0.2...v3.0.3
+[3.0.2]: https://github.com/KevinRabun/judges/compare/v3.0.1...v3.0.2
+[3.0.1]: https://github.com/KevinRabun/judges/compare/v3.0.0...v3.0.1
+[3.0.0]: https://github.com/KevinRabun/judges/compare/v2.3.0...v3.0.0
+[2.3.0]: https://github.com/KevinRabun/judges/releases/tag/v2.3.0
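The changelog added in this diff pairs `## [x.y.z]` version headings with link reference definitions at the bottom of the file. Comparing the two lists shows that `[3.7.1]` has a heading but no link definition, while `[3.5.0]` has a link definition but no heading. A minimal sketch of that cross-check follows; `checkChangelogLinks` is a hypothetical helper written for illustration and is not something shipped in the package.

```typescript
// Hypothetical helper, not part of @kevinrabun/judges.
interface LinkCheck {
  headingsWithoutLink: string[];
  linksWithoutHeading: string[];
}

function checkChangelogLinks(markdown: string): LinkCheck {
  const headings = new Set<string>();
  const links = new Set<string>();
  for (const line of markdown.split(/\r?\n/)) {
    // Version headings start with "## [x.y.z]".
    const heading = /^## \[(\d+\.\d+\.\d+)\]/.exec(line);
    if (heading) headings.add(heading[1]);
    // Link reference definitions look like "[x.y.z]: https://...".
    const link = /^\[(\d+\.\d+\.\d+)\]:\s+\S+/.exec(line);
    if (link) links.add(link[1]);
  }
  return {
    headingsWithoutLink: Array.from(headings).filter((v) => !links.has(v)),
    linksWithoutHeading: Array.from(links).filter((v) => !headings.has(v)),
  };
}

// Trimmed excerpt of the changelog above (dates and entries omitted).
const changelogExcerpt = [
  "## [3.7.1]",
  "## [3.7.0]",
  "## [3.6.0]",
  "### Added",
  "[3.7.0]: https://github.com/KevinRabun/judges/compare/v3.6.0...v3.7.0",
  "[3.6.0]: https://github.com/KevinRabun/judges/compare/v3.5.0...v3.6.0",
  "[3.5.0]: https://github.com/KevinRabun/judges/compare/v3.4.0...v3.5.0",
].join("\n");

const linkCheck = checkChangelogLinks(changelogExcerpt);
console.log(linkCheck.headingsWithoutLink); // contains only "3.7.1"
console.log(linkCheck.linksWithoutHeading); // contains only "3.5.0"
```

Run against the full file, a check like this would be one way to catch a missing heading or link definition before publishing.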
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Kevin Rabun
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "@kevinrabun/judges",
-"version": "3.7.
+"version": "3.7.1",
 "description": "35 specialized judges that evaluate AI-generated code for security, cost, and quality.",
 "mcpName": "io.github.KevinRabun/judges",
 "type": "module",
@@ -59,6 +59,7 @@
 "grammars",
 "server.json",
 "judgesrc.schema.json",
+"CHANGELOG.md",
 "README.md",
 "LICENSE"
 ],
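The second hunk adds `CHANGELOG.md` to the `files` array, and the 3.7.1 changelog entry notes that `LICENSE` was already listed there while the file itself was missing. A pre-publish check along the following lines could flag an entry that points at nothing. This is only a sketch: `missingFilesEntries` is a hypothetical name, the `onDisk` set is an illustrative stand-in for the package root, and entries are treated as literal paths even though npm also accepts glob patterns in `files`.

```typescript
// Hypothetical pre-publish check, not part of @kevinrabun/judges or npm.
function missingFilesEntries(
  files: readonly string[],
  exists: (relativePath: string) => boolean,
): string[] {
  return files
    // Assumption: skip entries that look like glob patterns or negations.
    .filter((entry) => !/[*?!\[\]{}]/.test(entry))
    // Keep the literal entries that do not exist under the package root.
    .filter((entry) => !exists(entry));
}

// Tail of the `files` array as it appears in the hunk above.
const filesTail = [
  "grammars",
  "server.json",
  "judgesrc.schema.json",
  "CHANGELOG.md",
  "README.md",
  "LICENSE",
];

// Illustrative directory listing in which LICENSE has not been created yet.
const onDisk = new Set([
  "grammars",
  "server.json",
  "judgesrc.schema.json",
  "CHANGELOG.md",
  "README.md",
]);

const missingEntries = missingFilesEntries(filesTail, (p) => onDisk.has(p));
console.log(missingEntries); // contains only "LICENSE"
```

In a real script the `exists` callback would typically wrap a filesystem lookup relative to the directory that holds `package.json`; passing it in as a function keeps the check easy to test.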
package/server.json
CHANGED
@@ -7,12 +7,12 @@
 "url": "https://github.com/kevinrabun/judges",
 "source": "github"
 },
-"version": "3.7.
+"version": "3.7.1",
 "packages": [
 {
 "registryType": "npm",
 "identifier": "@kevinrabun/judges",
-"version": "3.7.
+"version": "3.7.1",
 "transport": {
 "type": "stdio"
 }
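Both `version` fields in `server.json` move to `3.7.1` together with `package.json`, and the changelog lists keeping the two files in sync as a past fix. One way to guard against drift is a small comparison step before publishing. The sketch below assumes only the fields visible in this hunk; `ServerManifest` and `versionMismatches` are hypothetical names, and the stale `3.7.0` value is an illustrative input rather than a reading of the truncated removed lines.

```typescript
// Hypothetical shapes covering only the fields shown in the diff.
interface ServerPackage {
  identifier: string;
  version: string;
}

interface ServerManifest {
  version: string;
  packages: ServerPackage[];
}

// Returns one message per version field that differs from package.json.
function versionMismatches(pkgVersion: string, server: ServerManifest): string[] {
  const problems: string[] = [];
  if (server.version !== pkgVersion) {
    problems.push(`server.json version is ${server.version}, expected ${pkgVersion}`);
  }
  server.packages.forEach((pkg, index) => {
    if (pkg.version !== pkgVersion) {
      problems.push(
        `packages[${index}] (${pkg.identifier}) version is ${pkg.version}, expected ${pkgVersion}`,
      );
    }
  });
  return problems;
}

const syncedManifest: ServerManifest = {
  version: "3.7.1",
  packages: [{ identifier: "@kevinrabun/judges", version: "3.7.1" }],
};

// Illustrative stale manifest: the nested package version was not bumped.
const staleManifest: ServerManifest = {
  version: "3.7.1",
  packages: [{ identifier: "@kevinrabun/judges", version: "3.7.0" }],
};

console.log(versionMismatches("3.7.1", syncedManifest).length); // 0
console.log(versionMismatches("3.7.1", staleManifest).length); // 1
```

Returning messages instead of throwing lets a caller decide whether a mismatch should fail the build or only print a warning.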