claudeos-core 2.3.1 → 2.3.2

package/CHANGELOG.md CHANGED
@@ -1,5 +1,730 @@
1
1
  # Changelog
2
2
 
3
+ ## [2.3.2] — 2026-04-23
4
+
5
+ Internal refactor + UX polish + prompt/validator co-evolution for
6
+ path-hallucination defense + stack-detector hardening. Five co-shipped
7
+ changes: (1) `bin/commands/init.js` — `cmdInit` decomposed from a
8
+ single 970-line function into 16 focused stage helpers plus a 107-line
9
+ orchestrator; (2) `content-validator` output reframed from the
10
+ vocabulary of generation failures to the vocabulary of quality
11
+ advisories; (3) library-convention hallucination warning in
12
+ `pass3-footer.md` / `pass4.md` rescoped from filename-binding to
13
+ topic-binding, with validator-side placeholder-pattern expansion
14
+ (`Xxx` / `XXX` / glob-star) and a narrow file-level exclusion for
15
+ `00.core/52.ai-work-rules.md`, plus a follow-up hypothetical /
16
+ future-tense framing guard that closes the "if this feature were
17
+ added, it would live at `src/middleware.ts`" class of path
18
+ fabrication; (4) `claude-md-scaffold.md` Section 1
19
+ generation rules hardened with a canonical 10-language translation
20
+ table, and Section heading parenthetical gloss reclassified from
21
+ optional to required (for non-English output) / forbidden (for
22
+ English output), with a 10-language × 8-section gloss table;
23
+ (5) `plan-installer/stack-detector.js` extended to cover Gradle
24
+ variable-reference patterns (`sourceCompatibility = "${var}"`,
25
+ ext-block Spring Boot version), Maven property references, Spring
26
+ property-placeholder ports (`${APP_PORT:8090}`), iBatis detection as
27
+ distinct from MyBatis, multi-dialect database arrays, MariaDB
28
+ detection (previously missing from `DB_KEYWORD_RULES`), and
29
+ logging-framework identification (Logback / Log4j2 / log4jdbc / Log4j 1.x
30
+ with oauth-style false-positive guards). Also
31
+ `pass-prompts/templates/java-spring/pass3.md` and `kotlin-spring/pass3.md` logging-rule glob
32
+ extended to cover `.properties`, `.groovy`, and `log4jdbc*` file
33
+ patterns; Pass 1 Java / Kotlin prompts now include an explicit
34
+ "configuration file verification" block instructing the LLM to read
35
+ `build.gradle` / `pom.xml` / `application*.yml` directly as
36
+ ground-truth sources when stack metadata is incomplete. Zero
37
+ functional regression: identical pipeline behavior, identical exit
38
+ codes for CI consumers. Test suite 694 / 694 pass (up from 662).
39
+
40
+ ### Refactor — `cmdInit` decomposition
41
+
42
+ - **Problem addressed.** The main entry-point function had accumulated
43
+ 970 lines, 77 `if` statements, and 17 `try` blocks as each new
44
+ pipeline stage (Pass 1 batching, Pass 2 structural validation,
45
+ Pass 3 split + resume, Pass 3 stale-marker detection, Pass 4
46
+ gap-fill, lint, content-validator) was spliced into the same linear
47
+ body. The function was readable one stage at a time but not as a
48
+ whole, and every new contribution required paging through the
49
+ entire body to locate the correct insertion point. This release
50
+ extracts each stage into a named helper, leaving `cmdInit` as a
51
+ top-to-bottom pipeline of 16 function calls with progress
52
+ accounting between them (107 lines, 2 `if`, 0 `try`; estimated
53
+ McCabe complexity ≥94 → ≤5).
54
+
55
+ - **Extracted stage helpers.** Each owns exactly one phase of the
56
+ pipeline and nothing else:
57
+ `checkPrerequisites`, `resolveLanguage`, `applyResumeMode`,
58
+ `ensureDirectories`, `loadDomainGroups`, `loadPass1Prompts`,
59
+ `makeProgressBar`, `runPass1Loop`, `runPass2`,
60
+ `buildPass3ContextJson`, `handlePass3StaleMarker`, `dispatchPass3`,
61
+ `runPass4`, `runVerificationTools`, `runLint`,
62
+ `runContentValidator`, `printCompletionBanner`. Stage functions
63
+ that advance the outer progress bar return a step-delta that
64
+ `cmdInit` accumulates into its local `completedSteps` counter,
65
+ preserving the `completedSteps++` token required by the
66
+ `pass3-marker.test.js` stale-region regex.
67
+
68
+ - **`runPass3Split` intentionally NOT extracted.** Eight test files
69
+ (`pass3-marker`, `master-plan-removal`, `pass3-batch-subdivision`,
70
+ `pass4-marker-validation`, `pass3-guards`, `translation-skip-env`,
71
+ `pass2-validation`, `pass4-claude-md-untouched`) read
72
+ `bin/commands/init.js` as source text and grep for internal
73
+ patterns (`runStage("3d-aux"`, `function computeBatches`,
74
+ `DOMAINS_PER_BATCH = 15`, `if (isBatched) { ... runStage("3b-core"`
75
+ proximity, etc.). Moving `runPass3Split` to a separate module
76
+ would require re-designing those eight source-parity checks
77
+ against importable exports. That is a deliberate follow-up; this
78
+ patch keeps the test boundary untouched so the refactor is pure
79
+ mechanical decomposition.
80
+
81
+ - **Semantic preservation.** All user-visible behavior is identical:
82
+ every log line, every `InitError` message, every banner frame,
83
+ every progress-bar tick, every resume/fresh branch, every
84
+ stale-marker code path, the static-fallback marker body, and the
85
+ `applyStaticFallback` gap-fill sequence are byte-identical to
86
+ v2.3.1. The only change is *where* the code lives within the same
87
+ file.
88
+
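The step-delta convention described above might look roughly like the following. This is a minimal sketch, not the actual `init.js`: the stage bodies, the `ctx` object, and the returned deltas are placeholders assumed for illustration, and the sketch uses `+=` rather than the `completedSteps++` token the changelog says the real file preserves. Only the helper names and `completedSteps` come from the changelog.

```javascript
// Hypothetical stage helpers: each owns one phase and returns how many
// outer progress-bar steps it completed (the "step-delta").
function runPass2(ctx) {
  ctx.log.push("pass2");
  return 1;
}

function runPass4(ctx) {
  ctx.log.push("pass4");
  return 1;
}

function runLint(ctx) {
  ctx.log.push("lint");
  return 0; // assumed: a stage that does not advance the outer bar
}

// The orchestrator stays a flat, top-to-bottom sequence of calls and
// only does progress accounting between them.
function cmdInit(ctx) {
  let completedSteps = 0;
  completedSteps += runPass2(ctx);
  completedSteps += runPass4(ctx);
  completedSteps += runLint(ctx);
  return completedSteps;
}

const ctx = { log: [] };
const total = cmdInit(ctx);
console.log(total, ctx.log.join(" > "));
```

Returning the delta instead of mutating a shared counter keeps each helper testable in isolation, which is one plausible reason for the design.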
89
+ ### UX — `content-validator` advisory vocabulary
90
+
91
+ - **Problem addressed.** When `init` finished cleanly and the user
92
+ saw the celebratory `✅ ClaudeOS-Core — Complete` banner, the
93
+ previous step's output already said `❌ ERRORS (6): [STALE_PATH] ...`.
94
+ The ordering produced a "success or failure?" flinch even though
95
+ the two messages were describing different questions: `init` had
96
+ succeeded (files are on disk, structure valid, tests pass);
97
+ `content-validator` had merely observed that some LLM-guessed
98
+ filenames inside the generated rules don't resolve on disk. Those
99
+ are quality advisories — the generated docs are usable — but the
100
+ word "ERRORS" made users reach for `init --force`, which does not
101
+ reliably fix the advisories (re-running Pass 3 with the same fact
102
+ JSON often produces the same mis-inference).
103
+
104
+ - **Fix.** Purely linguistic. No logic changes.
105
+
106
+ - **`content-validator/index.js` — output relabeling.** The banner
107
+ `❌ ERRORS (N)` becomes `ℹ️ ADVISORIES (N)`; `⚠️ WARNINGS (M)`
108
+ becomes `⚠️ NOTES (M)`; the final summary `Total: N errors,
109
+ M warnings` becomes `Total: N advisories, M notes`. The internal
110
+ arrays stay named `errors` and `warnings` because they encode
111
+ severity for programmatic consumers.
112
+
113
+ - **Exit code preserved at source.** `content-validator` still
114
+ calls `process.exit(1)` when advisories exist. This is a
115
+ deliberate asymmetry: the tool reports advisories softly in
116
+ output but still signals a non-zero exit code, because
117
+ `npx claudeos-core health` and any CI pipeline wired to it need
118
+ a real gate. Stripping the exit code would silently pass
119
+ `STALE_PATH` / `MANIFEST_DRIFT` findings through `health-checker`
120
+ (which branches on tool exit code + `warnOnly` flag), destroying
121
+ the detection signal v2.3.0 was built for.
122
+
123
+ - **`bin/commands/init.js` `runContentValidator` — advisory
124
+ framing.** The post-subprocess message is rewritten as
125
+ "Content advisories detected — these are quality notes, NOT
126
+ generation failures. Your generated docs are ready to use as-is."
127
+ The guidance pointer reads "npx claudeos-core health (standalone
128
+ gate with exit code)" so users who want a hard gate know where
129
+ to find one.
130
+
131
+ - **`stale-report.json` schema unchanged.** Fields `contentErrors`
132
+ and `contentWarnings` keep their names — they are part of the
133
+ public schema read by `health-checker` and any external CI
134
+ consumer.
135
+
136
+ - **Why this is not severity down-grading.** A naive fix would move
137
+ `STALE_PATH` and `MANIFEST_DRIFT` from the `errors[]` array into
138
+ the `warnings[]` array and exit 0. That would flatten the signal
139
+ in `health-checker` (which distinguishes pass/fail/warn by exit
140
+ code + `warnOnly` flag), so an advisory-heavy project would report
141
+ "✅ All systems operational" even with 20 stale paths — the exact
142
+ silent-failure class v2.3.0 eliminated. This release instead keeps
143
+ the severity distinction intact inside the tool and stale-report,
144
+ and only changes the words the user reads.
145
+
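The asymmetry between soft wording and a hard exit code can be sketched as below. The function name `summarize` and its return shape are assumptions made for illustration, as is the sample finding. Only the labels, the summary line, and the `errors` / `warnings` array names are taken from the changelog.

```javascript
// Sketch: advisory vocabulary in the output, unchanged severity in the
// exit code. `errors` and `warnings` keep their internal names.
function summarize(errors, warnings) {
  const lines = [];
  if (errors.length > 0) lines.push(`ℹ️ ADVISORIES (${errors.length})`);
  if (warnings.length > 0) lines.push(`⚠️ NOTES (${warnings.length})`);
  lines.push(`Total: ${errors.length} advisories, ${warnings.length} notes`);
  // A non-zero exit code is kept whenever advisories exist, so a CI
  // gate wired to the tool still trips.
  const exitCode = errors.length > 0 ? 1 : 0;
  return { lines, exitCode };
}

// Illustrative finding; the path is made up for this example.
const report = summarize(["[STALE_PATH] src/missing.ts"], []);
console.log(report.lines.join("\n"));
console.log("exit code:", report.exitCode);
```

A real caller would presumably pass `exitCode` to `process.exit`. The sketch returns it instead so the behavior can be inspected without terminating the process.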
146
+ ### Prompt + Validator — Library-convention hallucination
147
+
148
+ - **Problem addressed.** The library-convention warning in
149
+ `pass3-footer.md` / `pass4.md` was previously scoped to specific
150
+ filenames (`testing-strategy.md`, `styling-patterns.md`,
151
+ `state-management.md`). When a file's topic matched (testing,
152
+ env typing, styling, state management) but its filename did not,
153
+ the LLM ignored the warning and cited canonical library paths
154
+ from training data (`src/test/setup.ts`, `src/types/env.d.ts`,
155
+ `src/__mocks__/handlers.ts`, etc.) that do not exist in the
156
+ project. `content-validator [10/10]` then flagged these as
157
+ `STALE_PATH` advisories.
158
+
159
+ A second, distinct failure class exists when a prompt enumerates
160
+ convention-trap paths as a denylist: the LLM, when generating a
161
+ file whose purpose is to teach future sessions about hallucination
162
+ traps (notably `52.ai-work-rules.md`), treats the denylist as
163
+ source material and copies the literal paths into the output as
164
+ cautionary illustrations ("AI sessions should not invent paths
165
+ like these"). `content-validator`'s path-claim check is
166
+ content-blind and treats the illustrations as literal claims. This is
167
+ **prompt-to-output educational leakage** — not a hallucination,
168
+ but a teaching example that the validator cannot distinguish from
169
+ a real claim.
170
+
171
+ - **Fix.** Four coordinated changes across prompt and validator:
172
+
173
+ - **Scope expansion to topic-binding.** The warning block in
174
+ `pass3-footer.md` and `pass4.md` was rescoped from
175
+ filename-binding to topic-binding: the trigger is "the topic the file
176
+ is about", not "the filename of the document". A "Scope note
177
+ (v2.3.2+)" paragraph makes this explicit.
178
+
179
+ - **No literal convention paths in prompt templates.** The
180
+ enumerated denylist approach was abandoned. The warning
181
+ describes the class behaviorally ("PROJECT-CHOICE files",
182
+ "library's canonical path may not exist here") and points to
183
+ abstract replacement forms ("a shared setup module under a
184
+ test directory of your choice", "augment `ImportMetaEnv` in a
185
+ type-declaration file of your choosing"). Literal example
186
+ paths have been removed from anti-pattern blocks in both
187
+ templates and rewritten as mechanism labels (e.g.
188
+ `Framework-convention entry-point invention`, `Parent-directory
189
+ or constant-name renormalization`, `Plausibly-named utility
190
+ invention`) with prose explanations but no `src/...` strings.
191
+
192
+ - **Educational-example placeholder guidance.** A new block in
193
+ both `pass3-footer.md` and `pass4.md` explains that rule files
194
+ which need to illustrate bad path habits (notably
195
+ `52.ai-work-rules.md`) should use abstract placeholders —
196
+ `{placeholder}`, `Xxx` / `XXX`, glob stars, or prose — rather
197
+ than literal paths. Literal example paths are interpreted as
198
+ real claims by `content-validator [10/10]` regardless of
199
+ surrounding prose.
200
+
201
+ - **Validator: placeholder detection expanded.** The
202
+ `hasPlaceholder(path)` predicate in `content-validator/index.js`
203
+ now skips three placeholder forms:
204
+ 1. `{...}` — the original v2.3.0 curly-brace form.
205
+ 2. `X{3,}` / `Xxx` — uppercase-XXX / `Xxx` placeholder
206
+ tokens. No word boundaries, so `useXXX_CONFIG` and
207
+ `XXXParser.ts` are both correctly skipped.
208
+ 3. `*` — glob wildcards describing a class of files.
209
+
210
+ - **Validator: file-level exclusion for by-design educational
211
+ files.** A new `PATH_CLAIM_EXCLUDE_FILES` set in
212
+ `content-validator/index.js` skips path-claim verification on
213
+ files whose purpose is to cite convention-trap paths as
214
+ warnings. Currently one file: `00.core/52.ai-work-rules.md`
215
+ (the AI Work Rules file). The exclusion is narrow, explicit,
216
+ and documented in a code comment explaining why the exclusion
217
+ is a design choice rather than a band-aid. The output line
218
+ shows "(N file(s) excluded by design)" so users understand the
219
+ count is reduced intentionally.
220
+
221
+ - **Why a split (prompt + validator) rather than prompt-only.** The
222
+ prompt change alone cannot guarantee the LLM will never produce a
223
+ literal example path when writing an educational rule — the LLM
224
+ may genuinely believe a concrete example is pedagogically clearer.
225
+ The validator change alone (exclusion only) would let a genuine
226
+ hallucination in `52.ai-work-rules.md` go undetected. The
227
+ combination is defense-in-depth: the prompt nudges toward
228
+ placeholder form (reducing false positives at source); the
229
+ validator tolerates educational examples in the one file where
230
+ they are expected (eliminating the remaining false positives);
231
+ and genuine hallucinations in every other file continue to be
232
+ flagged as before.
233
+
234
+ - **Test impact.** `tests/pass4-prompt.test.js`'s `pass4 enforces
235
+ path fact grounding` test was updated: literal-path matchers
236
+ (e.g., ``/❌ `src\/__mocks__\/handlers\.ts`/``) were replaced
237
+ with topic-level and mechanism-label matchers (`/Library-
238
+ convention canonical paths | testing.*env typing.*styling/`,
239
+ `/Framework-convention entry-point invention/`, etc.). The test's
240
+ intent is unchanged: it still verifies that the Pass 4 prompt
241
+ warns about library-convention hallucinations; only the form of
242
+ the warning has evolved.
243
+
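The two validator-side changes in this section (the three placeholder forms and the file-level exclusion) could be combined roughly as follows. The exact regexes and the `shouldVerify` helper are assumptions, not the shipped code. `hasPlaceholder` and `PATH_CLAIM_EXCLUDE_FILES` are the names the changelog uses, and the sample paths are illustrative only.

```javascript
// Assumed regex: three or more uppercase X, or the literal "Xxx",
// deliberately without word boundaries.
const X_TOKEN = /X{3,}|Xxx/;

function hasPlaceholder(path) {
  return /\{[^}]*\}/.test(path) // 1. curly-brace form
    || X_TOKEN.test(path)        // 2. XXX / Xxx tokens
    || path.includes("*");       // 3. glob wildcards
}

// Files whose purpose is to cite convention-trap paths as warnings.
const PATH_CLAIM_EXCLUDE_FILES = new Set(["00.core/52.ai-work-rules.md"]);

// Hypothetical helper: decide whether a claimed path gets checked on disk.
function shouldVerify(ruleFile, claimedPath) {
  if (PATH_CLAIM_EXCLUDE_FILES.has(ruleFile)) return false;
  return !hasPlaceholder(claimedPath);
}

console.log(hasPlaceholder("useXXX_CONFIG"));   // placeholder token inside an identifier
console.log(hasPlaceholder("src/app/page.tsx")); // literal path, no placeholder
console.log(shouldVerify("00.core/52.ai-work-rules.md", "src/app/page.tsx"));
```

Dropping word boundaries in the X-token pattern is what lets identifiers such as `useXXX_CONFIG` count as placeholders. The cost is that any real path containing `XXX` or `Xxx` is also skipped.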
244
+ #### Follow-up: hypothetical / future-tense framing guard
245
+
246
+ - **Problem addressed.** The library-convention fix closed the
247
+ "canonical path exists here" failure mode, but a sibling failure
248
+ mode was observed: when describing *future* or *hypothetical*
249
+ feature additions, the LLM would wrap a framework-canonical path
250
+ in conditional framing ("if middleware is added later, place it
251
+ at `src/middleware.ts`", "for a future health endpoint,
252
+ `src/app/api/health/route.ts`") and write the literal path
253
+ verbatim. `content-validator [10/10]` is content-blind: it treats
254
+ every backticked `src/...` path as a path claim regardless of the
255
+ conditional prose around it, so these hypothetical examples are
256
+ flagged as `STALE_PATH` advisories even though the author
257
+ understood they were speculative.
258
+
259
+ The topic-binding library-convention warning did not cover this
260
+ case because the framing shifts the register from "this project
261
+ HAS X" to "this project WOULD HAVE X if …" — a different surface
262
+ form the original warning did not name.
263
+
264
+ - **Fix (prompt-only, two files).** Added a dedicated "Hypothetical
265
+ / future-tense framing is NOT a loophole" block to both
266
+ `pass-prompts/templates/common/pass3-footer.md` and
267
+ `pass-prompts/templates/common/pass4.md`. Key rules:
268
+
269
+ - **Conditional framing does not change the validator's
270
+ decision.** `if we adopted X`, `were this feature introduced,
271
+ it would live at …`, `for a future Y`, `when Z is added later`
272
+ (and translated equivalents in any output language) do NOT
273
+ make a literal `src/...` path safe. The block states this
274
+ invariance explicitly so the LLM does not interpret conditional
275
+ prose as a validator-bypass.
276
+
277
+ - **Role / directory form, not filename.** The correct
278
+ hypothetical is expressed as a ROLE + DIRECTORY description
279
+ without committing to a filename (e.g., "If middleware is
280
+ added later, place it at the path the routing convention
281
+ expects — do not cite a specific filename until the file
282
+ actually exists"). Three worked `✅ RIGHT` examples cover the
283
+ middleware, health-endpoint, and env-typing cases.
284
+
285
+ - **OMIT as last resort.** If the LLM cannot name the role +
286
+ directory without committing to a `src/...` path that does NOT
287
+ appear in `pass3a-facts.md`, the guidance is to omit the
288
+ example entirely. An omitted example is better than a
289
+ fabricated path downstream readers may treat as authoritative.
290
+ The OMIT condition is double-gated: (a) role + directory
291
+ description is not possible, AND (b) the path is not in
292
+ `pass3a-facts.md` — paths that DO appear in the allowlist
293
+ continue to be written verbatim per the existing
294
+ "directory-scoped rule is correct" guidance.
295
+
296
+ - **Language-invariant.** The rule explicitly states that
297
+ translated conditional phrases in any output language (Korean,
298
+ Japanese, Chinese, etc.) are subject to the same constraint,
299
+ because the validator matches on the literal path string, not
300
+ on the surrounding prose.
301
+
302
+ - **Placement in `pass4.md`.** The new block is added as a fifth
303
+ ❌ mechanism (after `Framework-convention entry-point invention`,
304
+ `Parent-directory or constant-name renormalization`,
305
+ `Plausibly-named utility invention`, and the topic-binding
306
+ `Library-convention canonical paths` block), immediately before
307
+ the ✅ guidance that says "If pass3a-facts.md shows a specific
308
+ filename and path for a role, write that exact path verbatim".
309
+ The adjacency makes the interaction between the prohibition
310
+ (hypothetical fabrication) and the permission (existing
311
+ allowlist) visible to the LLM at a glance.
312
+
313
+ - **Why prompt-only, not validator-side.** Distinguishing
314
+ "assertive claim about an existing file" from "conditional
315
+ description of a future file" would require NLP-level prose
316
+ understanding, which is out of scope for the structural
317
+ `content-validator`. The existing placeholder forms
318
+ (`{placeholder}`, `Xxx`/`XXX`, glob `*`) remain as the
319
+ validator-side defense-in-depth: an LLM that cannot phrase the
320
+ hypothetical in role/directory form can still fall back to a
321
+ placeholder.
322
+
323
+ - **Test impact.** Five independent verification surfaces now
324
+ cover this block: (1) template-content checks (header,
325
+ ✅/❌ examples, OMIT fallback, language-invariant clause, CJK
326
+ absence); (2) related unit tests unchanged — `pass4-prompt.test.js`
327
+ (12/12) and `prompt-generator.test.js` (33/33) continue to pass
328
+ because existing mechanism-label matchers are unaffected by the
329
+ new fifth block; (3) end-to-end prompt-generation smoke confirms
330
+ the block survives assembly into `pass3-prompt.md` and
331
+ `pass4-prompt.md`; (4) full suite 694/694 unchanged; (5) the
332
+ 4-mechanism ordering invariant
333
+ (`Framework-convention → Parent-directory → Plausibly-named →
334
+ Hypothetical`) is asserted via regex proximity match in the
335
+ smoke test.
336
+
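To see why conditional framing cannot help, consider a content-blind extractor of the kind described above. The regex and the function name `extractPathClaims` are assumptions for illustration, and the real validator's matching rules may differ. The point is only that the surrounding prose never enters the decision.

```javascript
// Assumed extraction rule: any backticked token that starts with "src/".
function extractPathClaims(markdown) {
  const claims = [];
  const pattern = /`(src\/[^`\s]+)`/g;
  for (const match of markdown.matchAll(pattern)) {
    claims.push(match[1]);
  }
  return claims;
}

const assertive = "Middleware lives at `src/middleware.ts`.";
const conditional = "If middleware is added later, place it at `src/middleware.ts`.";
const roleForm = "If middleware is added later, place it at the path the routing convention expects.";

console.log(extractPathClaims(assertive));   // one claim
console.log(extractPathClaims(conditional)); // the same claim, despite the "if"
console.log(extractPathClaims(roleForm));    // no claim at all
```

The assertive and conditional sentences yield the same claim, while the role/directory form yields none. That is the behavior the prompt rule relies on.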
337
+ ### Prompt — CLAUDE.md Section 1 language localization
338
+
339
+ - **Problem addressed.** For non-English `--lang` targets,
340
+ `claude-md-scaffold.md` Section 1 generation rules previously
341
+ instructed "emit in the target output language" but immediately
342
+ followed with a fixed English template containing `{OUTPUT_LANG}`
343
+ as the only substitution slot:
344
+
345
+ ```
346
+ As the senior developer for this repository, you are responsible
347
+ for writing, modifying, and reviewing code. Responses must be
348
+ written in {OUTPUT_LANG}.
349
+ ```
350
+
351
+ The specific English sentence acted as a stronger signal than the
352
+ abstract instruction to translate — LLMs copy concrete templates
353
+ verbatim when the template's only visible variable is a
354
+ substitution slot. The generated output therefore carried a
355
+ Section 1 Line 1 in English for non-English targets, producing
356
+ the ironic effect of "Responses must be written in {LANG}" where
357
+ {LANG} is correctly substituted yet the containing sentence
358
+ itself is in English. Other sections escaped this trap because
359
+ their templates were table-shaped or keyword-shaped (`Language |
360
+ {value}`, etc.); Section 1 was unique in carrying a complete
361
+ English sentence as "template".
362
+
363
+ - **Fix.** Added canonical translations for all 10 supported
364
+ languages (`en`, `ko`, `zh-CN`, `ja`, `es`, `vi`, `hi`, `ru`,
365
+ `fr`, `de`) directly inside `claude-md-scaffold.md` Section 1
366
+ generation rules. Each translation is paired with its language
367
+ code; the LLM picks the one matching `{OUTPUT_LANG}` and emits it
368
+ verbatim. Languages outside the canonical 10 fall back to the
369
+ semantic structure described by the English reference.
370
+
371
+ Supporting changes:
372
+
373
+ - **Scaffold body warning comment.** The body template's Line 1
374
+ (still English, since it serves as the generic slot) now
375
+ carries an inline `{!-- ... --}` comment instructing the LLM
376
+ to replace with the canonical translation when
377
+ `{OUTPUT_LANG} != en`. This defends against LLMs that scan the
378
+ body template first and overlook the generation rules lower in
379
+ the same file.
380
+
381
+ - **Checklist augmentation.** The scaffold's verification
382
+ checklist gained a new item: "Section 1 Line 1 is in
383
+ `{OUTPUT_LANG}` — matches the canonical translation (if
384
+ `{OUTPUT_LANG}` is one of the 10 canonical codes). If Line 1
385
+ contains 'As the senior developer' while `{OUTPUT_LANG}` is
386
+ NOT `en`, the translation was skipped — fix it." This gives
387
+ the LLM an explicit self-check predicate before finalizing
388
+ output.
389
+
390
+ - **Example block framing.** The "Example: Section 1 for
391
+ different stacks" block's framing comment was upgraded from
392
+ "Emit the final output in the target output language; the
393
+ semantic content should match" (weak) to an explicit
394
+ `⚠️ Language note:` block stating that the English examples
395
+ show SEMANTIC structure only and pointing back to the
396
+ canonical translations for Line 1.
397
+
398
+ - **Why scaffold-level and not code-level.** This is not a
399
+ post-processing concern. The translation must happen at
400
+ generation time inside the LLM context, not as a sed/replace
401
+ step afterward — sed would catch only the English reference
402
+ sentence but would miss subsequent rephrased variants the LLM
403
+ might produce. Making the scaffold explicit about the canonical
404
+ text eliminates ambiguity at source.
405
+
406
+ - **Test impact — none.** Scaffold files are runtime resources;
407
+ no test asserts on the text of `claude-md-scaffold.md`.
408
+
409
+ #### Follow-up: Section heading gloss now required (not optional)
410
+
411
+ - **Problem addressed.** A second localization inconsistency existed
412
+ in `##` section headings: run-to-run variation in whether headings
413
+ carried their native-language gloss. Some runs emitted
414
+ `## 1. Role Definition ({gloss})` (English canonical +
415
+ target-language gloss); others emitted only `## 1. Role Definition`,
416
+ omitting the gloss entirely. Both outputs were technically
417
+ compliant with the v2.3.1 scaffold rules, which stated the gloss
418
+ was "optional" and "a courtesy, not a requirement". The
419
+ inconsistency broke the operator's expectation that two runs of
420
+ the same project would produce the same heading format, and
421
+ removed a useful intelligibility cue for non-English readers.
422
+
423
+ - **Fix.** Reclassified the parenthetical gloss from "optional" to
424
+ "REQUIRED when `{OUTPUT_LANG}` != `en`" / "OMITTED when
425
+ `{OUTPUT_LANG}` == `en`". This is now a deterministic rule with
426
+ no LLM-side discretion.
427
+
428
+ - **`claude-md-scaffold.md` "Section heading format" rewrite.**
429
+ The format rule now reads: primary English canonical REQUIRED;
430
+ parenthetical native-language gloss REQUIRED when non-English,
431
+ OMITTED when English. A canonical gloss table covering all 10
432
+ supported languages × all 8 sections (80 entries) was added
433
+ below the rule so the LLM picks the exact gloss verbatim. The
434
+ example blocks (ko, ja, en) were expanded to show both the
435
+ correct form and two failure modes each: missing gloss on
436
+ non-English output, and gloss present on English output.
437
+
438
+ - **Scaffold body template annotation.** A `{!-- SECTION HEADING
439
+ RULE --}` comment was added at the top of the scaffold body
440
+ template pointing to the gloss table above. This defends
441
+ against LLMs that scan the body template first and copy its
442
+ English-only headings verbatim without consulting the format
443
+ rule.
444
+
445
+ - **`pass3-footer.md` STEP 4b rewrite.** The title determinism check
446
+ (executed as a post-generation self-audit by the LLM) was
447
+ upgraded from "a native-language translation may follow in
448
+ parentheses" to explicit `(a)` + `(b)` clauses: (a) English
449
+ canonical as primary (language-invariant); (b) parenthetical
450
+ native-language gloss required when non-English, omitted when
451
+ English. Worked examples for `en`, `ko`, `ja` output
452
+ illustrate each case.
453
+
454
+ - **Checklist augmentation (two new items).** The scaffold's
455
+ verification checklist gained a "Section heading gloss rule"
456
+ item requiring all 8 headings to carry the parenthetical gloss
457
+ when `{OUTPUT_LANG}` != `en`, and a paired
458
+ "English gloss-absence rule" item requiring gloss to be OMITTED when
459
+ `{OUTPUT_LANG}` == `en`. Both items name-check the canonical
460
+ table so the LLM knows where to resolve the exact gloss text.
461
+
462
+ - **Why strictly a follow-up, not a separate change.** The
463
+ underlying problem is the same class as the Section 1 Line 1
464
+ bug: the scaffold left room for LLM discretion on
465
+ language-localization decisions, and two runs of the same project
466
+ produced divergent results. The Line 1 fix addressed one
467
+ specific slot with a canonical translation; this follow-up
468
+ applies the same "canonical translations, no discretion"
469
+ pattern to the heading gloss slot.
470
+
471
+ - **Test impact — none.** No test asserts on scaffold text;
472
+ `claude-md-validator`'s heading check (which predates this
473
+ release) already tolerates the gloss via a regex that matches
474
+ "English canonical, optionally followed by parenthetical text",
475
+ so the stricter scaffold rule does not require validator
476
+ changes to enforce.
477
+
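The deterministic gloss rule is enforced through the prompt, not in code, but it is simple enough to express as a predicate. The sketch below is hypothetical and is not part of `claude-md-validator`. The regex is one possible reading of "English canonical, optionally followed by parenthetical text", and `(gloss)` stands in for a real entry from the canonical table.

```javascript
// Assumed heading shape: "## N. Title" with an optional " (gloss)" suffix.
const HEADING = /^## (\d+)\. ([^()]+?)(?: \((.+)\))?$/;

// Gloss REQUIRED for non-English output, OMITTED for English output.
function headingFollowsGlossRule(line, outputLang) {
  const m = HEADING.exec(line);
  if (!m) return false;
  const hasGloss = m[3] !== undefined;
  return outputLang === "en" ? !hasGloss : hasGloss;
}

console.log(headingFollowsGlossRule("## 1. Role Definition", "en"));
console.log(headingFollowsGlossRule("## 1. Role Definition (gloss)", "ko"));
console.log(headingFollowsGlossRule("## 1. Role Definition", "ko"));
```

A check like this only confirms that a gloss is present or absent. Verifying that the gloss text matches the canonical table would need the table itself.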
478
+ ### Stack detector — variable-reference patterns, iBatis, multi-dialect DBs, logging frameworks
479
+
480
+ - **Problem addressed.** `plan-installer/stack-detector.js` is the
481
+ static analyzer that produces `project-analysis.json`, the input
482
+ to every Pass 1 run. A class of hallucinations in generated
483
+ CLAUDE.md (incorrect Java version, server port, or
484
+ logging-framework labels) traces to the same root cause: the
485
+ stack-detector regex returns `null` for a field, and the Pass 1 LLM
486
+ fills the gap by assuming framework defaults (e.g. "Java 17+"
487
+ for any Spring Boot 3.x project, "port 8080" for any Spring
488
+ Boot project). Tracing the regexes surfaced a broader gap:
489
+ multiple modern Gradle/Maven patterns, legacy iBatis projects,
490
+ multi-dialect backends, and logging-framework identification
491
+ were all outside the detector's coverage.
492
+
493
+ - **Fix — Gradle Java version (4 patterns, not 1).** The v2.3.1
494
+ regex `sourceCompatibility\s*=\s*['"]?(\d+)['"]?` only matched
495
+ the direct-literal form. Extended to four patterns, tried in
496
+ order:
497
+ 1. Direct literal: `sourceCompatibility = 21` / `'21'` / `"21"`
498
+ (also matches `targetCompatibility`).
499
+ 2. `JavaVersion` enum: `sourceCompatibility = JavaVersion.VERSION_21`
500
+ (with `VERSION_1_8` → Java 8 legacy form).
501
+ 3. Toolchain block: `JavaLanguageVersion.of(21)` inside
502
+ `java { toolchain { ... } }`.
503
+ 4. Variable-reference fallback: when `sourceCompatibility =
504
+ "${javaVersion}"`, resolve the variable name inside the same
505
+ file's `ext` block. The RegExp for the resolution
506
+ dynamically escapes the variable name with the standard
507
+ regex-meta-character escape pattern.
508
+
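The four-pattern cascade can be sketched as below. This is an illustration under assumptions, not the shipped detector: the function name `detectGradleJavaVersion` is hypothetical, the regexes are approximations of the forms listed above, and the variable lookup searches the whole file rather than only the `ext` block.

```javascript
// Standard escape for regex metacharacters, applied to the variable
// name before it is interpolated into a new RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function detectGradleJavaVersion(gradle) {
  // 1. Direct literal: sourceCompatibility = 21 / '21' / "21"
  let m = /(?:source|target)Compatibility\s*=\s*['"]?(\d+)['"]?/.exec(gradle);
  if (m) return m[1];

  // 2. JavaVersion enum, including the legacy VERSION_1_8 form
  m = /(?:source|target)Compatibility\s*=\s*JavaVersion\.VERSION_(?:1_)?(\d+)/.exec(gradle);
  if (m) return m[1];

  // 3. Toolchain block
  m = /JavaLanguageVersion\.of\(\s*(\d+)\s*\)/.exec(gradle);
  if (m) return m[1];

  // 4. Variable reference, resolved within the same file
  m = /(?:source|target)Compatibility\s*=\s*['"]?\$\{?(\w+)\}?['"]?/.exec(gradle);
  if (m) {
    const definition = new RegExp(escapeRegExp(m[1]) + "\\s*=\\s*['\"]?(\\d+)['\"]?");
    const resolved = definition.exec(gradle);
    if (resolved) return resolved[1];
  }
  return null; // leave the gap visible instead of guessing a default
}

const sample = 'ext { javaVersion = 21 }\nsourceCompatibility = "${javaVersion}"';
console.log(detectGradleJavaVersion(sample));
```

Returning `null` when nothing matches keeps the gap explicit, so a later stage can read the build file directly rather than assume a framework default.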
509
+ - **Fix — Gradle Spring Boot version variable reference.** Parallel
510
+ fallback for `ext { springBootVersion = '3.5.5' }` combined with
511
+ `id 'org.springframework.boot' version "${springBootVersion}"`.
512
+ The three existing patterns are tried first; only when none
513
+ captures a numeric value (captures starting with `${` are
514
+ rejected as variable references) does the fallback resolve the
515
+ variable inside the same file.
516
+
517
+ - **Fix — Maven Java version (3 patterns).** Extended from
518
+ `<java.version>\d+` literal-only to:
519
+ 1. Direct `<java.version>` value.
520
+ 2. `<maven.compiler.source>` / `<maven.compiler.target>`
521
+ values.
522
+ 3. Property reference like
523
+ `<java.version>${project.javaVersion}</java.version>` where
524
+ the referenced property is defined earlier in `<properties>`.
525
+ Cross-file resolution (parent POM, BOM) is intentionally out
526
+ of scope — those cases fall through to LLM-side analysis.
527
+
528
+ - **Fix — Yml server port Spring placeholder (4 patterns).** The
529
+ v2.3.1 regexes `server:\n port: (\d+)` and
530
+ `server\.port[=:](\d+)` only matched literal port numbers.
531
+ Spring Boot accepts property-placeholder defaults like
532
+ `port: ${APP_PORT:8090}` — extended to capture the post-colon
533
+ default value in both yml-nested and flat-key forms. The default
534
+ is the correct value because it represents what the application
535
+ falls back to when the environment variable is unset.
536
+
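A rough sketch of the placeholder-aware port match follows. The function name `detectServerPort` and the exact regexes are assumptions. The nested form here only handles `port:` on the line directly under `server:`, which is narrower than real YAML allows.

```javascript
function detectServerPort(config) {
  // Either a literal port, or the default inside ${NAME:default}.
  const value = "(?:(\\d+)|\\$\\{[^}:]+:(\\d+)\\})";
  const nested = new RegExp("server:\\s*\\n\\s+port:\\s*" + value);
  const flat = new RegExp("server\\.port\\s*[=:]\\s*" + value);
  const m = nested.exec(config) || flat.exec(config);
  if (!m) return null;
  // Group 1 is the literal form, group 2 the placeholder default.
  return Number(m[1] || m[2]);
}

console.log(detectServerPort("server:\n  port: ${APP_PORT:8090}"));
console.log(detectServerPort("server.port=9000"));
console.log(detectServerPort("server:\n  port: ${APP_PORT}"));
```

A placeholder without a default (`${APP_PORT}`) yields `null` in this sketch, since there is no value the application is known to fall back to.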
537
+ - **Feature — iBatis detection as a first-class ORM.** Apache
538
+ iBatis (EOL 2010) and Spring iBatis are distinct from MyBatis;
539
+ MyBatis evolved out of iBatis but uses a different XML namespace
540
+ and runtime architecture. Conflating them in Pass 3 output would
541
+ produce incorrect guidance. `IBATIS_REGEX` matches specific
542
+ dependency-coordinate patterns (`org.apache.ibatis`, `spring-ibatis`,
543
+ `ibatis-sqlmap`, `ibatis-core`, `ibatis-common`) and runs BEFORE
544
+ the generic ORM_RULES table in both Gradle and Maven branches.
545
+ MyBatis projects (`org.mybatis:mybatis`,
546
+ `mybatis-spring-boot-starter`) continue to resolve to
547
+ `orm: "mybatis"` — the detection boundary between the two is
548
+ precise.
549
+
550
+ - **Feature — multi-dialect database arrays (`stack.databases`).**
551
+ v2.x consumers expected a single primary DB (`stack.database`);
552
+ backends declaring multiple dialect drivers simultaneously lost
553
+ all but the first indicator. Added a second field
554
+ `stack.databases` (plural) that collects every DB keyword
555
+ across all config sources (Gradle `build.gradle`, Maven
556
+ `pom.xml`, Gradle version catalogs, yml, `.env`, Node
557
+ `package.json`, Python `requirements.txt`). Order-preserving and
558
+ deduped. `stack.database` keeps its v2.x semantics as "the
559
+ first-match primary" for backward compatibility; Pass 1 prompts
560
+ and Pass 3 standard generation should prefer `stack.databases`
561
+ when present and non-empty. Empty array (not null) when no DB
562
+ is detected, to simplify array comprehensions in prompts.
563
+
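A sketch of the order-preserving, deduplicated collection, assuming a hypothetical rule table and helper name. The real `DB_KEYWORD_RULES` entries, the exact ordering policy, and the string values stored in `stack.database` are not shown in this changelog, so everything below is illustrative:

```javascript
// Hypothetical keyword rules, not the detector's actual table.
const DB_RULES_SKETCH = [
  { db: "mariadb", pattern: /mariadb/i },
  { db: "postgresql", pattern: /postgres/i },
  { db: "mysql", pattern: /mysql/i },
  { db: "oracle", pattern: /ojdbc|oracle/i },
];

function collectDatabases(sources) {
  const seen = new Set(); // a Set iterates in insertion order, which gives dedup + ordering
  for (const content of sources) {
    for (const { db, pattern } of DB_RULES_SKETCH) {
      if (pattern.test(content)) seen.add(db);
    }
  }
  const databases = [...seen];
  // `database` stays the first match; `databases` is [] (never null) when nothing is found.
  return { database: databases[0] ?? null, databases };
}
```

Returning `[]` rather than `null` lets consumers call `databases.length` or `databases.map(...)` without a guard.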
564
+ - **Fix — MariaDB detection.** The `DB_KEYWORD_RULES` table
565
+ previously had entries for PostgreSQL, MySQL, Oracle, MongoDB,
566
+ SQLite, and H2 — but NOT for MariaDB. Projects using
567
+ `org.mariadb.jdbc:mariadb-java-client` were classified as `null`
568
+ (or as MySQL, when the MySQL driver was also present). MariaDB
569
+ is now a distinct entry in the keyword table and in the Maven
570
+ / yml inline DB scans.
571
+
572
+ - **Feature — logging framework detection (`stack.loggingFrameworks`).**
573
+ New array field enumerating JVM logging frameworks detected
574
+ from Gradle/Maven dependencies and yml `logging.config:`
575
+ references. Recognizes four frameworks:
576
+ (a) Log4j2 via `org.apache.logging.log4j:log4j-core` coord or
577
+ `log4j2-*.xml` config file;
578
+ (b) Logback via `ch.qos.logback:logback-classic` coord or
579
+ `logback-*.xml` / `logback*.groovy` config file;
580
+ (c) log4jdbc (JDBC logging adapter, reported alongside the
581
+ primary framework);
582
+ (d) Log4j 1.x (EOL 2015) via precise coord regex `log4j:log4j`
583
+ with quote/whitespace boundaries to avoid matching
584
+ `log4j-to-slf4j` or `log4j-api` (Log4j2 ecosystem
585
+ libraries that contain `log4j:log4j` as a substring). The
586
+ Log4j 1.x boundary required a specific regex form
587
+ (quote/colon/whitespace character class before the coord)
588
+ because `\b` also matches after the `.` in `org.apache.logging.log4j:log4j-api`, so word boundaries alone were insufficient.
589
+
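The Log4j 1.x boundary problem can be reproduced with a short sketch. Both regexes below are illustrative: the leading character class follows the description above, while the trailing lookahead is an added assumption, not necessarily part of the shipped pattern:

```javascript
// Naive form: \b matches between "." and "l", so Log4j2 coordinates match too.
const NAIVE_LOG4J1 = /\blog4j:log4j\b/;

// Stricter sketch: the coord must start at the beginning of input or right
// after a quote, colon, or whitespace character, and the artifactId must be
// followed by a colon, quote, whitespace, or end of input.
const LOG4J1_SKETCH = /(?:^|["'\s:])log4j:log4j(?=[:"'\s]|$)/;

const isLog4j1 = (content) => LOG4J1_SKETCH.test(content);
```

With the naive pattern, `"org.apache.logging.log4j:log4j-api:2.20.0"` is reported as Log4j 1.x; with the stricter one it is not, because the character before `log4j:log4j` is a `.`.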
590
+ - **Fix — Pass 3 logging rule glob extended.** The Pass 3 prompt
591
+ for Java and Kotlin Spring stacks specified auto-load paths as
592
+ `["**/*.java", "**/logback*.xml", "**/log4j*.xml"]`. This
593
+ missed three file types commonly present in real Spring
594
+ projects: Logback's Groovy DSL configuration (`logback*.groovy`),
595
+ Log4j / Log4j2 properties files (`log4j*.properties`), and
596
+ log4jdbc adapter configuration (`log4jdbc*.properties`).
597
+ Extended the glob to cover all five logging-config file patterns.
598
+
599
+ - **Fix — Pass 1 prompts include configuration-file verification
600
+ block.** Both `java-spring/pass1.md` and `kotlin-spring/pass1.md`
601
+ now begin with a "MANDATORY: Configuration file verification"
602
+ section instructing the LLM to read `build.gradle` (or
603
+ `build.gradle.kts` / `pom.xml`), `application*.yml` (and profile
604
+ variants), and referenced logging configuration files BEFORE
605
+ analyzing domain source code. The LLM is told that
606
+ `project-analysis.json`'s stack metadata may be incomplete and
607
+ that the configuration files are ground-truth sources. Explicit
608
+ examples show variable-reference resolution (`sourceCompatibility
609
+ = "${javaVersion}"` → resolve via `ext { ... }`) and Spring
610
+ placeholder port extraction (`port: ${APP_PORT:8090}` → extract
611
+ `8090`). When the analyzer output and the configuration files
612
+ disagree, the LLM is instructed to trust the configuration file
613
+ and record the discrepancy. This adds a second defensive layer:
614
+ even if future Gradle/Maven syntax evolves past the detector's
615
+ regex coverage, the LLM's direct file read catches the
616
+ discrepancy.
617
+
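The variable-reference resolution that the prompt example walks through can be sketched in code. This is a simplified illustration under stated assumptions: the helper name is hypothetical, and it looks for a plain `name = value` assignment anywhere in the file instead of parsing the `ext { ... }` block:

```javascript
// Hypothetical helper: resolves sourceCompatibility, following one level of
// ${var} indirection to a `var = "value"` assignment elsewhere in the file.
function resolveJavaVersionSketch(gradle) {
  const m = /sourceCompatibility\s*=\s*["']?(?:\$\{(\w+)\}|([\w.]+))["']?/.exec(gradle);
  if (!m) return null;
  // Literal or enum value: returned exactly as written.
  if (m[2] !== undefined) return m[2];
  // Variable reference: find the assignment to that variable name.
  const assignment = new RegExp("\\b" + m[1] + "\\s*=\\s*[\"']?([\\w.]+)[\"']?");
  const def = assignment.exec(gradle);
  return def ? def[1] : null;
}
```

An unresolved reference returns `null` here, which is the case where a direct read of the configuration file is most useful.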
618
+ - **Fix — Config file glob expanded to cover Spring's full naming
619
+ space.** The yml scan in v2.3.1 globbed only
620
+ `**/application*.yml`, missing three file classes that Spring
621
+ Boot loads identically: `application.yaml` (spec-official
622
+ extension), `application.properties` (Spring Initializr default
623
+ when no format is specified), and
624
+ `bootstrap.{yml,yaml,properties}` (Spring Cloud Config /
625
+ Consul / Eureka — loaded BEFORE `application.*` and commonly
626
+ declaring service ports and config-server URIs). The new glob
627
+ `**/{application,bootstrap}*.{yml,yaml,properties}` covers all
628
+ combinations including profile variants (`application-local.yml`,
629
+ `application-dev.properties`). The inner regex set was already
630
+ format-agnostic — yml `server:\n port: N` syntax and
631
+ `.properties`-style `server.port=N` flat-key syntax were both
632
+ covered by the same pattern list, so no additional regex work
633
+ was needed.
634
+
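To show which file names the expanded glob is meant to cover, here is a regex approximation of its basename part. The helper and the regex are assumptions for illustration; the changelog only states the glob, not how it is evaluated:

```javascript
// Approximates the basename portion of
// **/{application,bootstrap}*.{yml,yaml,properties}
const SPRING_CONFIG_BASENAME = /^(?:application|bootstrap).*\.(?:yml|yaml|properties)$/;

function isSpringConfigFile(filePath) {
  // Take the last path segment, accepting both "/" and "\" separators.
  const baseName = filePath.split(/[\\\/]/).pop();
  return SPRING_CONFIG_BASENAME.test(baseName);
}
```

Profile variants such as `application-local.yml` match because anything may sit between the `application` / `bootstrap` prefix and the extension.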
635
+ - **Fix — Comment stripping (`stripComments()` shared helper).**
636
+ Commented-out dependency lines must not match `LOGGING_RULES`
637
+ or the Maven DB / ORM / framework scans. A shared helper strips
638
+ three comment styles in a single pass:
639
+ 1. Line-level `//` (Gradle Kotlin/Groovy DSL).
640
+ 2. Line-level `#` (yml, properties, shell).
641
+ 3. Block-level `<!-- ... -->` (Maven `pom.xml`, XML config;
642
+ non-greedy multi-line, so a commented-out `<dependency>`
643
+ block spanning many lines is handled in a single regex
644
+ pass).
645
+
646
+ `detectLogging` runs on `stripComments(content)`. The Maven
647
+ branch of `detectStack` derives `pomClean = stripComments(pom)`
648
+ after `<properties>` parsing is complete, and uses `pomClean`
649
+ for ALL dependency-layer scans (framework check, ORM, iBatis,
650
+ DB keyword array, H2, logging). The raw `pom` is retained for
651
+ `<properties>` reads because commented-out property definitions
652
+ are rare in practice and the property-reference resolution
653
+ already scopes itself to the declared property name.
654
+
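A minimal sketch of a single-pass stripper for the three comment styles. It assumes that "line-level" means whole-line comments only, which avoids cutting `//` inside URLs; the shipped `stripComments()` may behave differently:

```javascript
// Hypothetical re-implementation for illustration only.
// One regex, one pass: XML comment blocks (non-greedy, may span lines),
// or whole lines whose first non-blank characters are "//" or "#".
const COMMENT_PATTERN = /<!--[\s\S]*?-->|^[ \t]*(?:\/\/|#).*$/gm;

function stripCommentsSketch(content) {
  return content.replace(COMMENT_PATTERN, "");
}
```

Because only whole-line `//` and `#` comments are removed, a value such as `url: http://localhost:8080` passes through unchanged.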
655
+ - **Feature — Maven XML form for Log4j2 / Logback detection.** The
656
+ Gradle coord regex `org\.apache\.logging\.log4j[.:]log4j-core`
657
+ expects a `:` or `.` separator between groupId and artifactId.
658
+ In Maven XML, the two are in separate tags, so the separator
659
+ is `</groupId>...<artifactId>`, not a single character. Paired
660
+ regexes now match the XML form within a 300-character window
661
+ (large enough to span typical whitespace and `<version>` /
662
+ `<scope>` siblings, small enough that unrelated `<dependency>`
663
+ blocks further down the file do not falsely pair):
664
+ - Log4j2: `<groupId>\s*org\.apache\.logging\.log4j\s*<\/groupId>[\s\S]{0,300}?<artifactId>\s*log4j-core\s*<\/artifactId>`.
665
+ The `log4j-core` artifactId is required — `log4j-to-slf4j`
666
+ and `log4j-api` (bridges) must NOT trigger "Log4j2 is the
667
+ primary framework".
668
+ - Logback: `<groupId>\s*ch\.qos\.logback\s*<\/groupId>[\s\S]{0,300}?<artifactId>\s*logback-(?:classic|core)\s*<\/artifactId>`.
669
+ Both `logback-classic` (runtime shipped with Spring Boot)
670
+ and `logback-core` are recognized.
671
+
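The two XML-form patterns quoted above can be exercised directly. The regexes are copied from this entry; the wrapper function and the `"log4j2"` / `"logback"` labels are assumptions made for the example:

```javascript
// Regexes as quoted in the changelog entry above.
const MAVEN_LOG4J2 =
  /<groupId>\s*org\.apache\.logging\.log4j\s*<\/groupId>[\s\S]{0,300}?<artifactId>\s*log4j-core\s*<\/artifactId>/;
const MAVEN_LOGBACK =
  /<groupId>\s*ch\.qos\.logback\s*<\/groupId>[\s\S]{0,300}?<artifactId>\s*logback-(?:classic|core)\s*<\/artifactId>/;

// Hypothetical wrapper: returns the frameworks whose XML form is present.
function detectMavenLoggingSketch(pom) {
  const found = [];
  if (MAVEN_LOG4J2.test(pom)) found.push("log4j2");
  if (MAVEN_LOGBACK.test(pom)) found.push("logback");
  return found;
}
```

A `pom.xml` that declares only `log4j-to-slf4j` or `log4j-api` yields an empty array, since neither artifactId satisfies the `log4j-core` requirement.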
672
+ - **Fix — Placeholder regex boundary relaxation (`X{3,}` without
673
+ word boundary).** The v2.3.0 `hasPlaceholder` predicate in
674
+ `content-validator/index.js` used `/\bX{3,}\b|Xxx/` to
675
+ recognize uppercase-XXX placeholder tokens. The `\b` boundaries
676
+ caused two false negatives:
677
+ - `XXXParser.ts`: the right `\b` expects a non-word character
678
+ after the X run, but `Parser` is alphanumeric.
679
+ - `useXXX_CONFIG`: the left `\b` requires a non-word
680
+ character before the X run, but `useXXX` has `e` directly
681
+ before.
682
+ Removed both word boundaries. The predicate is now `/X{3,}/`
683
+ (with the separate `/Xxx/` branch preserved for the
684
+ capital-lower-lower convention). Audited against a curated set
685
+ of typical identifier patterns (`matrix`, `XMLParser`,
686
+ `indexXY`, `taxi`, `examineX`, `textX`, `XX1`): none contain
687
+ three or more consecutive uppercase X's, so the relaxation
688
+ introduces no new false positives.
689
+
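The two false negatives and the audit list can be checked with a few lines. The old pattern is quoted from this entry; combining the new `/X{3,}/` and `/Xxx/` branches into one alternation is a presentational assumption:

```javascript
// v2.3.0 predicate, as quoted above.
const OLD_PLACEHOLDER = /\bX{3,}\b|Xxx/;
// Relaxed predicate; the two branches are merged into one regex for brevity.
const NEW_PLACEHOLDER = /X{3,}|Xxx/;

// Identifiers from the audit list that must stay non-placeholders.
const AUDITED = ["matrix", "XMLParser", "indexXY", "taxi", "examineX", "textX", "XX1"];

const previouslyMissed = ["XXXParser.ts", "useXXX_CONFIG"].filter(
  (name) => !OLD_PLACEHOLDER.test(name) && NEW_PLACEHOLDER.test(name)
);
const newFalsePositives = AUDITED.filter((name) => NEW_PLACEHOLDER.test(name));
```

Here `previouslyMissed` contains both problem identifiers and `newFalsePositives` is empty, matching the audit result stated above.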
690
+ - **Tests added.** 32 new unit tests in `stack-detector.test.js`:
691
+ - 8 for Java-version patterns and port patterns (literal,
692
+ JavaVersion enum, toolchain, ext-variable reference; yml
693
+ literal, flat-key, yml placeholder, flat-key placeholder).
694
+ - 18 for iBatis vs MyBatis distinction (4), Maven Java
695
+ version patterns (3), Gradle ext Spring Boot version
696
+ reference (1), multi-dialect databases incl. MariaDB (4),
697
+ logging framework detection incl. false-positive prevention
698
+ from `log4j-to-slf4j` and comment-stripping (6).
699
+ - 6 for config-file glob expansion (`.properties`, `.yaml`,
700
+ `bootstrap.yml`, profile variants) and comment-stripping.
701
+
702
+ ### Combined guarantees
703
+
704
+ - **Test suite.** 694 / 694 pass (up from 662 in v2.3.1 — 32 new
705
+ tests for the stack-detector extensions), with one existing
706
+ test updated (`pass4-prompt.test.js` assertions migrated from
707
+ literal-path matchers to topic-level and mechanism-label
708
+ matchers as part of the library-convention warning rewrite).
709
+ `tests/content-validator.test.js` line 103
710
+ (`notStrictEqual(result.code, 0, "should exit non-zero")`)
711
+ still passes because the exit code is preserved. No stdout
712
+ assertions reference the strings `ERRORS` or `WARNINGS` — they
713
+ match on advisory types (`STALE_PATH`, `MANIFEST_DRIFT`,
714
+ `STALE_SKILL_ENTRY`) which are untouched.
715
+
716
+ - **No new dependencies. No CLI surface changes.** Template
717
+ changes are limited to prompt-layer guidance: the library-
718
+ convention warning block in `pass3-footer.md` and `pass4.md`
719
+ gained topic-binding scope; `claude-md-scaffold.md` Section 1
720
+ gained a 10-language canonical translation table plus
721
+ verification checklist items — all targeted expansions of
722
+ existing anti-hallucination / language-localization guidance,
723
+ not structural changes to Pass 3, Pass 4, or CLAUDE.md format.
724
+ Same two runtime deps (`glob`, `gray-matter`). Same commands,
725
+ same flags, same outputs (just different labels for
726
+ `content-validator`).
727
+
3
728
  ## [2.3.1] — 2026-04-23
4
729
 
5
730
  Patch release. Fixes Windows CI breakage in `npm test`.