circle-ir-ai 2.8.3 → 2.8.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2)
  1. package/CHANGELOG.md +78 -0
  2. package/package.json +1 -1
package/CHANGELOG.md CHANGED
@@ -5,6 +5,84 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [2.8.5] - 2026-06-09
+
+ ### Fixed
+
+ - **#84: CWE-Bench-Java runner produced 2 unrecoverable JSON parse
+ errors with `gemma3:12b`** (and any other local Ollama model) on
+ the 2026-06-09 run. Root cause turned out to be two distinct bugs
+ in `benchmarks/runners/run-cwe-bench-java.ts`:
+
+ **(1) Deterministic context overflow on large files (#118
+ rocketmq).** The Ollama `/v1/chat/completions` (OpenAI-compat)
+ endpoint defaults to `num_ctx=8192` — much smaller than the
+ model's native context window. `AdminBrokerProcessor.java`
+ (2655 lines, ~35K tokens) filled the entire prompt buffer,
+ leaving exactly 1 token for the response ("Okay"). The parser
+ then logged `No JSON found in response`.
+
+ Repro confirmed with `num_ctx∈{8192,16384}` → `eval_count=1`,
+ `num_ctx∈{32768,49152}` → `eval_count≈630`, valid array of 5
+ entries.
+
+ Fix: the runner now sets `options.num_ctx=32768` for any
+ `localhost:11434` / `127.0.0.1:11434` base URL. Honors
+ `LLM_OLLAMA_NUM_CTX` override for users with smaller VRAM or
+ models that don't support 32K (rare). 32K covers every file
+ in CWE-Bench-Java; gemma3:12b / qwen3-coder:30b / llama3 all
+ support 32K natively in <10GB VRAM.
+
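As an illustration of fix (1), here is a minimal TypeScript sketch of how a runner might pick `num_ctx`. Only the env var name `LLM_OLLAMA_NUM_CTX`, the 32768 default, and the two host:port pairs come from the changelog entry. The helper names (`isLocalOllama`, `resolveNumCtx`, `withNumCtx`) and the request-body shape are assumptions made for this example, not the runner's actual code.

```typescript
// Default taken from the changelog entry.
const DEFAULT_NUM_CTX = 32768;

// Hypothetical helper: true when the base URL points at a local Ollama
// instance on the port named in the changelog entry.
function isLocalOllama(baseUrl: string): boolean {
  return /^https?:\/\/(localhost|127\.0\.0\.1):11434(\/|$)/.test(baseUrl);
}

// Hypothetical helper: returns the context size to request, or undefined
// when the endpoint is not local Ollama and num_ctx should be left alone.
function resolveNumCtx(
  baseUrl: string,
  env: Record<string, string | undefined>,
): number | undefined {
  if (!isLocalOllama(baseUrl)) return undefined;
  const raw = env.LLM_OLLAMA_NUM_CTX;
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  // Fall back to the default when the override is missing or not a positive integer.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_NUM_CTX;
}

// Assumed request shape for this sketch; the real runner's body may differ.
interface ChatBody {
  model: string;
  messages: { role: string; content: string }[];
  options?: { num_ctx: number };
}

// Attach options.num_ctx only when a value was resolved.
function withNumCtx(body: ChatBody, numCtx: number | undefined): ChatBody {
  return numCtx === undefined ? body : { ...body, options: { num_ctx: numCtx } };
}
```

Returning `undefined` for non-local endpoints keeps the request body unchanged for hosted providers, which matches the entry's statement that the variable is consulted only for local Ollama.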
+ **(2) Transient temp=0 stochasticity on tiny files (#109
+ spring-security).** `DefaultHttpFirewall.java` is 68 lines and
+ the parse error did NOT reproduce on 3 fresh repro attempts —
+ diagnosed as KV-cache / batch-grouping non-determinism that
+ surfaces at ~1% rate even with `temperature=0`.
+
+ Fix: added a single retry on any JSON parse failure
+ (`PARSE_ERR_ARRAY`, `PARSE_ERR_OBJECT`, `NO_JSON`). One retry
+ is sufficient because the failure is non-deterministic; a
+ second consecutive failure indicates a real prompt/model
+ problem worth recording as `parseError` in stats. Adds at
+ most ~1% extra LLM calls in the worst case, ~0 in the
+ common case.
+
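The single-retry policy in fix (2) could be sketched as follows. The three error codes and the `parseError` stat name come from the changelog entry. The parser, the `Stats` shape, and the function names are hypothetical, and how the real parser assigns each code is not shown in the entry, so the mapping below is a guess made for illustration.

```typescript
type ParseErrorCode = "PARSE_ERR_ARRAY" | "PARSE_ERR_OBJECT" | "NO_JSON";
type ParseResult =
  | { ok: true; value: unknown[] }
  | { ok: false; error: ParseErrorCode };

// Hypothetical parser: extracts the outermost [...] span and parses it.
function parseJsonArray(text: string): ParseResult {
  const start = text.indexOf("[");
  const end = text.lastIndexOf("]");
  if (start === -1 || end <= start) return { ok: false, error: "NO_JSON" };
  try {
    return { ok: true, value: JSON.parse(text.slice(start, end + 1)) as unknown[] };
  } catch (_err) {
    return { ok: false, error: "PARSE_ERR_ARRAY" };
  }
}

// Hypothetical stats shape; only the parseError counter is named in the entry.
interface Stats {
  calls: number;
  parseError: number;
}

// Call the model, and on any parse failure call it exactly once more.
// Two consecutive failures are recorded as a parseError.
async function completeWithOneRetry(
  callModel: () => Promise<string>,
  stats: Stats,
): Promise<unknown[] | null> {
  const maxAttempts = 2; // first call plus a single retry
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    stats.calls++;
    const result = parseJsonArray(await callModel());
    if (result.ok) return result.value;
  }
  stats.parseError++;
  return null;
}
```

Capping the loop at two attempts reflects the entry's reasoning: a transient failure is unlikely to repeat, so a second failure is treated as a real problem and counted rather than retried again.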
+ Together these fixes should drop gemma3:12b's failure rate
+ from 2/109 (1.8%) to ~0/109. Smoke-tested on #118 only —
+ full re-run will happen on next benchmark cycle.
+
+ New env var: `LLM_OLLAMA_NUM_CTX` (integer, defaults to
+ 32768). Only consulted when the LLM base URL is local
+ Ollama.
+
+ ## [2.8.4] - 2026-06-09
+
+ ### Fixed
+
+ - **#72: benchmark runners ignored externally-set env vars (e.g.
+ `LLM_ENRICHMENT_MODEL`).** Symptom: `LLM_ENRICHMENT_MODEL=gpt-oss-120b
+ npm run benchmark:cwe` silently used whatever value was in the local
+ `.env` instead — masking LLM uplift in CWE-Bench-Java runs and
+ producing static-only numbers when the user had explicitly requested
+ an LLM model on the command line.
+
+ Root cause: 4 benchmark runners loaded `.env` via `dotenv.config()`
+ with its default `override: true` behavior, so `.env` clobbered any
+ pre-existing `process.env` value (the opposite of POSIX
+ precedence).
+
+ Fix: pass `{ override: false }` in all four call sites:
+ - `benchmarks/runners/run-cwe-bench-java.ts`
+ - `benchmarks/runners/run-all-benchmarks-parallel.ts`
+ - `benchmarks/instruction-safety/run-benchmark.ts`
+ - `benchmarks/skills/run-skills-benchmark.ts`
+
+ External env vars (CLI invocation, exported shell vars) now win;
+ `.env` is consulted only for keys not already set. `circle-pack`'s
+ `src/api/server.ts` is intentionally left as-is — different threat
+ model (production REST server where `.env` is the canonical config
+ source).
+
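The precedence rule described in the 2.8.4 entry (externally set variables win, `.env` fills only missing keys) can be illustrated with a dependency-free sketch. This is not dotenv's API or the runners' code: `parseDotenv` and `applyDotenv` are hypothetical helpers, the parser handles only plain `KEY=VALUE` lines, and the values `local-model`, `OTHER_KEY`, and `from-file` are made up for the example.

```typescript
type Env = Record<string, string | undefined>;

// Parse a minimal KEY=VALUE file body. No quoting or multiline support.
function parseDotenv(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.charAt(0) === "#") continue; // skip blanks and comments
    const eq = trimmed.indexOf("=");
    if (eq <= 0) continue; // skip lines without a key
    out[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return out;
}

// Copy file values into env. With override=false, keys that are already
// set keep their existing value; with override=true the file value wins.
function applyDotenv(env: Env, fileVars: Record<string, string>, override: boolean): void {
  for (const key of Object.keys(fileVars)) {
    if (override || env[key] === undefined) env[key] = fileVars[key];
  }
}

// Usage: the model was set on the command line, the file names another one.
const fileBody = "LLM_ENRICHMENT_MODEL=local-model\nOTHER_KEY=from-file";

const keepExternal: Env = { LLM_ENRICHMENT_MODEL: "gpt-oss-120b" };
applyDotenv(keepExternal, parseDotenv(fileBody), false);
// keepExternal.LLM_ENRICHMENT_MODEL is still "gpt-oss-120b"
// keepExternal.OTHER_KEY is "from-file"

const clobbered: Env = { LLM_ENRICHMENT_MODEL: "gpt-oss-120b" };
applyDotenv(clobbered, parseDotenv(fileBody), true);
// clobbered.LLM_ENRICHMENT_MODEL is now "local-model"
```

The second call reproduces the symptom from the entry: with overriding enabled, the file value silently replaces the one given on the command line.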
  ## [2.7.19] - 2026-05-28
 
  ### Versioning policy
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "circle-ir-ai",
- "version": "2.8.3",
+ "version": "2.8.5",
  "description": "LLM-enhanced SAST analysis built on circle-ir",
  "main": "dist/index.js",
  "module": "dist/index.js",