haechi 1.3.1 → 1.3.3

@@ -1,6 +1,6 @@
  # Reliability Hardening Track
 
- - Status: Plan (pinned 2026-06-12; grounded in a 5-lens read-only audit of the 1.1.1 core)
+ - Status: Shipped — WS1–WS6 all delivered and cut in core 1.2.0 (release gate G7 Pass). This doc is retained as the planning/audit record. (Pinned 2026-06-12; grounded in a 5-lens read-only audit of the 1.1.1 core.)
  - Target line: 1.1.2 (patch) → 1.2.0 (minor); no new product surface
  - Purpose: raise Haechi to **commercial-solution-level reliability** — the trust, operability, and detection-quality density a production AI-security gateway is expected to have. This is a quality objective, not a commercialization plan. Every item **tightens, measures, or documents what already exists**; none adds a new feature.
 
@@ -14,9 +14,9 @@ Haechi는 `1.x` stable 라인을 출시했습니다. developer preview 게이트
  | 구분 | 판단 | 이유 |
  |---|---|---|
  | GitHub public | 허용 | 보안 한계, threat model, shared responsibility가 문서화됨 |
- | GitHub release/tag | 허용 (`v1.3.1` 릴리스됨) | `v1.3.1` 보완 컷이 태깅·릴리스됨; §5.7 항목이 모두 Resolved이고 G9은 Pass |
- | npm stable | `haechi@1.3.1` publish됨 | 코드리뷰 보완이 `haechi@1.3.1` attested OIDC publish(2026-06-16)로 발행됨; 이전 `1.3.0`은 수정 이전 동작을 담고 있음 |
- | production use | 운영자 게이트; `1.3.1`로 업그레이드 | 운영자 네트워크 통제, 인가/인증, key custody가 있을 때만 지원; `haechi@1.3.0` 운영자는 민감한 제3자 업스트림 트래픽을 프록시로 라우팅하기 전에 프록시 헤더 경계 수정(P0-CR-001)을 반영하도록 `1.3.1`로 업그레이드해야 함 |
+ | GitHub release/tag | 허용 (`v1.3.3` 릴리스됨) | `v1.3.3`이 현재 릴리스(CR2 컷 `1.3.2` 위의 선제적 하드닝 패치); §5.7 §5.8(`CR2-001..008`) 항목은 모두 Resolved 유지, G9/G10은 Pass |
+ | npm stable | `haechi@1.3.3` publish됨 | `1.3.3`은 CR2-보완된 `1.3.2` 기준 위에 response-direction marker-skip 강화 + cosign 서명 GHCR 컨테이너 이미지를 더한 attested OIDC publish |
+ | production use | 운영자 게이트; `1.3.3`로 업그레이드 | 운영자 네트워크 통제, 인가/인증, key custody가 있을 때만 지원; 운영자는 민감한 제3자 업스트림 트래픽을 프록시로 라우팅하기 전에 최신 `haechi@1.3.3`(1.3.2의 CR2 수정 + marker-skip 하드닝 포함)을 실행해야 함 |
 
  ## 2. 릴리스 게이트
 
@@ -32,6 +32,7 @@ Haechi는 `1.x` stable 라인을 출시했습니다. developer preview 게이트
  | G7 | 1.2.0 신뢰성 강화 트랙 (WS1–WS6) | 탐지 품질 측정+강화(WS2: 라벨 코퍼스 precision/recall `bench:detection` 게이트, 자격증명+국제 PII 커버리지, 하드블록 타입 불변식이 적용된 `filters.minConfidence` / `filters.allowlist`, offset 무결성을 갖춘 NFKC 유니코드 회피 폴딩); WS3 주입 가능한 `rateLimiter` 시임 + bounded fixed-window map; WS4 운영성(`/__haechi/live`+`/ready` 분리, 주입 가능한 `/metrics`, 구조적 로그 + 요청별 `correlationId`, graceful drain, max-in-flight backpressure, env overlay, 하드닝 Dockerfile/compose/runbook, `configVersion`); WS6 proxy TLS / remote-bind 하드닝(`proxy.tls` / `proxy.trustForwardedProto`, fail-closed `assertSafeProxyTransport`) + OWASP-LLM/NIST 컨트롤 매핑 백서 + RFC 9116 `security.txt` + 취약점 공개 경로. 모든 변경은 1.1 동작을 보존하는 기본값 뒤의 additive(`tests/api-contract.test.mjs` 통과); no-plaintext-in-audit 불변식이 텔레메트리까지 확장; core는 zero runtime dependency 유지; core 1.2.0 bump(additive 마이너) | Pass |
  | G8 | 1.3.0 백엔드 + 탐지 커버리지 확장 | **Anthropic Messages API**(`/v1/messages`, content-block + SSE `delta.text`, `event:` 라인 보존 재직렬화)와 **Google Gemini API**(model-in-path `:generateContent`/`:streamGenerateContent`, 기존 정확-매칭 어댑터를 바이트 동일하게 두는 additive `:method`-suffix 라우트 매처) 프로토콜 어댑터 추가; 탐지 커버리지 확장 — 클라우드/SaaS provider 키(OpenAI/Anthropic/Google-OAuth/SendGrid/Twilio/npm/Azure, anchored)와 국제 PII(FR/ES/JP + IT/SG/IN/DE/NL 국가 ID, 체크섬 validator), 각 하드블록-대-dial-eligible 결정은 측정된 충돌률 기반(하드블록은 비숫자 앵커 또는 비현실적으로 드문 형태가 필요; 흔한 길이의 bare-digit run은 allowlist로 정리 가능 유지); `bench:throughput` proxy 부하 벤치; `haechi-ratelimit-redis` 공유 저장소 rate-limiter 위성(WS3 시임의 운영 소비자; proxy가 이제 `rateLimiter.allow`를 `await`); `haechi-dashboard`가 요청별 `correlationId` 노출. 모든 변경은 additive — 새 `target.type`/탐지타입/`privacy.profile` *값*이며 새 config 키가 아님(`configVersion`은 `1` 유지); `tests/api-contract.test.mjs` 통과; core는 zero runtime dependency 유지; core 1.3.0 bump(additive 마이너) | Pass |
  | G9 | 2026-06-16 전체 코드리뷰 보완 게이트 (1.3.1로 발행) | `P0-CR-001` 및 `P1-CR-002`부터 `P1-CR-005`까지 해결 또는 책임자 명시 수용; P2 항목은 해결 또는 명시적 non-blocking 근거와 일정 기록; 연결된 등록부 갱신. **13개 `P*-CR-*` 항목이 모두 Resolved이며(§5.7) `haechi@1.3.1`(2026-06-16, attested OIDC publish)로 발행되었습니다; core가 1.3.0 → 1.3.1로 bump(patch, 보완 전용 — API/config 표면 변경 없음, `configVersion`은 `1` 유지)되었습니다.** | Pass (`haechi@1.3.1`, 2026-06-16) |
+ | G10 | 2026-06-16 코드리뷰 round 2 (CR2) 보완 게이트 | CR2 등록부(`code-review-risk-register-2026-06-16-round2.md`, §5.8)는 **P0/P1을 발견하지 못했습니다**; 세 개의 P2(`CR2-001` 프록시 upstream-cancel, `CR2-002` token-vault audit hygiene, `CR2-003` plugin IPC reply 경계)와 P3 묶음(`CR2-004..008`)이 모두 **Resolved이며 `haechi@1.3.2`로 발행되었고**(`CR2-009` won't-fix, `CR2-010` accepted) 연결된 등록부가 갱신되었습니다. | Pass (`haechi@1.3.2`, 2026-06-16) |
 
  ## 3. P0 배포 차단 리스크 상태
 
@@ -151,6 +152,23 @@ base64/인코딩 값 디코딩 검사, query string 검사, audit tail truncatio
  | P2-CR-012 | KMS vault IPv6 loopback carve-out의 IPv6 테스트 부족 | Resolved | `satellites/crypto-kms/vault.test.mjs`에 전용 IPv6 loopback 정책 테스트("…enforces the IPv6 loopback policy (::1, [::1], dotted + hex mapped) — P2-CR-012")를 추가해 bare `::1`, bracketed `[::1]`, dotted `::ffff:127.0.0.1`, hex `::ffff:7f00:1`/`::ffff:7f00:0001`(및 bracketed 변형)을 검증하고, 공인 mapped 주소(`::ffff:8.8.8.8`/`::ffff:808:808`)가 과차단되지 않음을 단언; 확장된 range table과 `ssrf-parity.test.mjs`가 auth-jwt와의 dotted+hex 일치를 고정 |
  | P2-CR-013 | SSE multi-line `data:` 필드를 newline separator 없이 합침 | Resolved | `parseFrame`이 여러 `data:` line을 `join("\n")`(스펙 separator)으로 합치고 line별 스펙 선행 공백 1개만 제거; multi-line JSON은 여전히 `JSON.parse`되고 multi-line plain text는 newline과 함께 재구성되어 검사되며 `serializeTextFrame`가 multi-line payload를 여러 `data:` line으로 재방출; `tests/stream-filter.test.mjs`가 multi-line JSON event와 PII 포함 multi-line plain-text event를 커버 |
 
+ ## 5.8 2026-06-16 코드리뷰 Round 2 (CR2) 상태 — 게이트 G10
+
+ 권위 있는 항목별 등록부는 `docs/current/code-review-risk-register-2026-06-16-round2.md`입니다; 이 절은 릴리스 게이트 요약입니다. 1.3.1 컷 이후 진행한 2차 심층 리뷰는 **P0도 P1도 발견하지 못했습니다**(외부에서 P1로 보고된 두 항목 모두 검증 결과 P2로 내려갔습니다 — 둘 다 stored-plaintext leak도, auth/SSRF 우회도 아닙니다). 세 개의 P2 + P3 묶음(`CR2-001..008`)은 **Resolved이며 `haechi@1.3.2`로 발행되었습니다**; 보고된 한 항목은 **false positive**(`CR2-009`, won't-fix)였고 한 항목은 **이미 문서화된 수용 잔여 리스크**(`CR2-010`, accepted)였습니다. **G10은 Pass입니다.**
+
+ | ID | 리스크 | 상태 | 종료에 필요한 증거 |
+ |---|---|---|---|
+ | CR2-001 | pass-through streaming이 downstream disconnect 시 upstream reader를 절대 취소하지 않음(`pipeUpstreamBodyBounded`가 `drain`에서 영원히 park) — 인증되지 않은 resource leak | Resolved | per-request `AbortController` + upstream reader를 취소하고 fetch를 abort하는 클라이언트 `close`/`aborted` listener; `drain` 대기를 `close`와 race; 스트림 도중 disconnect가 reader를 즉시 취소하는 회귀 테스트 |
+ | CR2-002 | token-vault reveal/purge가 호출자 제공 raw `token` + `error.message`(token interpolate됨)를 audit event에 기록; `FORBIDDEN_KEYS`는 key 이름으로만 제거 | Resolved | 일반화된 오류 메시지; 기록 이전에 `token`을 keyed-HMAC하거나 `tok_` 형태로 검증; `error.message` 대신 enum `reasonCode`; raw token이 `reason`/`token`에 도달하지 않는다는 회귀 테스트; 불변식 표현 정합화 |
+ | CR2-003 | plugin IPC reply가 `JSON.parse` 이전에 size-bound되지 않음; process child에 heap cap 없음 → 적대적 signed plugin으로 인한 event-loop 정지 + 메모리 급증 | Resolved | 두 sandbox 모두에서 parse 이전 reply byte-length 검사(oversized를 deny로 drop); 새 `resourceLimits` knob을 통한 process child의 `--max-old-space-size` heap cap; oversized-reply fixture 회귀 테스트 |
+ | CR2-004 | `sanitizeResponseHeaders`가 변환된 응답에 stale body-coupled validator(`etag`/`content-md5`/`digest`/`last-modified`)를 유지 | Resolved | 모든 body-mutating 경로에서 해당 헤더 drop + `cache-control: no-store`; 변경된 응답이 upstream `ETag`를 drop하는 테스트 |
+ | CR2-005 | `maxBytes` 초과 request body가 (유한한) Node `requestTimeout`까지 read-and-discard됨 — socket teardown 없음 | Resolved | 413 경로에서 `request.pause()`/`destroy()`(또는 `Connection: close`); 선택적으로 non-null 기본 timeout |
+ | CR2-006 | `mcp-wrap --stderr filter`가 라인 지향이라 newline-split secret이 회피함(본질적; single-line secret은 잡힘, `drop` 사용 가능) | Resolved | `COMMAND_HELP` + 등록부 노트; 고민감 도구에 `--stderr drop` 권장 |
+ | CR2-007 | README가 mcp-wrap "stderr ... pass through"라고 하지만 기본값은 이제 `--stderr filter` | Resolved | README + `README.ko.md` 수정 |
+ | CR2-008 | README streaming split-match 주장이 범위 한정 없음(cross-frame buffering은 delta 채널만) | Resolved | README 두 구절 + `README.ko.md`를 delta 채널로 한정 |
+ | CR2-009 | (보고된 P2) credential `maxMessageBytes` 검사 이후 append된 `keyMaterial` | Won't fix (FALSE POSITIVE) | `keyMaterial`은 운영자 통제 + fetcher `maxBytes`로 hard-bound; 공격자 증폭 없음 — 선택적 cosmetic re-assert만 |
+ | CR2-010 | (보고된 P2) 두 NON-JSON SSE frame에 걸쳐 분할된 secret 미포착 | Accepted (documented) | round-1 `P1-CR-005`, `threat-model.md`, in-code comment에 이미 범위 외; JSON delta 채널은 `maxMatchBytes`까지 buffering함 |
+
  ## 6. P2 제품/문서 리스크 상태
 
  | ID | 기존 리스크 | 상태 | 해소 증거 |
@@ -164,7 +182,7 @@ base64/인코딩 값 디코딩 검사, query string 검사, audit tail truncatio
 
  이 체크리스트는 `1.x` stable 라인의 모든 릴리스에 대한 상시 배포 전 템플릿이며, `0.3.2` developer preview에서 처음 적용되었습니다. 그 결과를 아래에 참조 기록으로 보존합니다.
 
- 2026-06-16 현재 상태: G9은 `Pass`입니다 — 코드리뷰 보완이 `haechi@1.3.1`로 발행되었습니다.체크리스트는 해당 컷에 대해 해제되었습니다.
+ 2026-06-16 현재 상태: G9은 `Pass`입니다(round-1 보완이 `haechi@1.3.1`로 발행됨). 게이트 **G10**(CR2, §5.8)은 이제 `Pass`입니다 — CR2 P2 + P3 묶음(`CR2-001..008`)이 Resolved이며 `haechi@1.3.2`로 발행되었으므로, 그 컷에 대해 이 체크리스트가 해제되었습니다.
 
  외부 npm 게이트 확인 결과(`0.3.2` developer preview, 2026-06-10, 배포 후)는 다음과 같습니다.
 
@@ -14,9 +14,9 @@ Haechi has shipped its `1.x` stable line. The developer-preview gate (G2, `haech
  | Category | Judgment | Rationale |
  |---|---|---|
  | GitHub public | Allowed | Security limitations, threat model, and shared responsibility are documented |
- | GitHub release/tag | Allowed (`v1.3.1` released) | The `v1.3.1` remediation cut is tagged and released; all §5.7 findings are Resolved and G9 is Pass |
- | npm stable | `haechi@1.3.1` published | The code-review remediation shipped in the `haechi@1.3.1` attested OIDC publish (2026-06-16); the prior `1.3.0` carries the pre-fix behavior |
- | Production use | Operator-gated; upgrade to `1.3.1` | Supported only with operator network controls, authz/authn, and key custody; operators on `haechi@1.3.0` should upgrade to `1.3.1` to pick up the proxy header-boundary fix (P0-CR-001) before routing sensitive third-party upstream traffic through the proxy |
+ | GitHub release/tag | Allowed (`v1.3.3` released) | `v1.3.3` is the current release (a proactive-hardening patch over the CR2 cut `1.3.2`); all §5.7 and §5.8 (`CR2-001..008`) findings remain Resolved and G9/G10 are Pass |
+ | npm stable | `haechi@1.3.3` published | `1.3.3` is an attested OIDC publish adding the response-direction marker-skip tightening + a cosign-signed GHCR container image, over the CR2-remediated `1.3.2` baseline |
+ | Production use | Operator-gated; upgrade to `1.3.3` | Supported only with operator network controls, authz/authn, and key custody; operators should run the latest `haechi@1.3.3` (it carries the CR2 fixes from `1.3.2` plus the marker-skip hardening) before routing sensitive third-party upstream traffic through the proxy |
 
  ## 2. Release Gates
 
@@ -32,6 +32,7 @@ Haechi has shipped its `1.x` stable line. The developer-preview gate (G2, `haech
  | G7 | 1.2.0 Reliability Hardening Track (WS1–WS6) | Detection quality measured + tightened (WS2: a labeled-corpus precision/recall `bench:detection` gate, credential + international-PII coverage, `filters.minConfidence` / `filters.allowlist` with the hard-block-types invariant, NFKC unicode-evasion folding with offset-integrity); WS3 injectable `rateLimiter` seam + bounded fixed-window map; WS4 operability (`/__haechi/live`+`/ready` split, injectable `/metrics`, structured logs + per-request `correlationId`, graceful drain, max-in-flight backpressure, env overlay, hardened Dockerfile/compose/runbook, `configVersion`); WS6 proxy TLS / remote-bind hardening (`proxy.tls` / `proxy.trustForwardedProto`, fail-closed `assertSafeProxyTransport`) + OWASP-LLM/NIST control-mapping whitepaper + RFC 9116 `security.txt` + vulnerability-disclosure path. Every change is additive behind 1.1-preserving defaults (`tests/api-contract.test.mjs` green); the no-plaintext-in-audit invariant extends to telemetry; core stays zero runtime dependency; core bumped to 1.2.0 (additive minor) | Pass |
  | G8 | 1.3.0 backend + detection coverage expansion | New protocol adapters for the **Anthropic Messages API** (`/v1/messages`, content-block + SSE `delta.text` with `event:`-line-preserving re-serialize) and the **Google Gemini API** (model-in-path `:generateContent`/`:streamGenerateContent` via an additive `:method`-suffix route matcher that leaves the exact-match adapters byte-identical); detection coverage expansion — cloud/SaaS provider keys (OpenAI/Anthropic/Google-OAuth/SendGrid/Twilio/npm/Azure, anchored) and international PII (FR/ES/JP + IT/SG/IN/DE/NL national IDs with checksum validators), each hard-block-vs-dial-eligible decision driven by measured collision rates (a non-numeric anchor or implausibly-rare shape is required for hard-block; a bare-digit run over a common length stays allowlist-clearable); a `bench:throughput` proxy load benchmark; the `haechi-ratelimit-redis` shared-store rate-limiter satellite (the WS3 seam's production consumer; the proxy now `await`s `rateLimiter.allow`); `haechi-dashboard` surfaces the per-request `correlationId`. Every change is additive — new `target.type`/detection-type/`privacy.profile` *values*, not new config keys (`configVersion` stays `1`); `tests/api-contract.test.mjs` green; core stays zero runtime dependency; core bumped to 1.3.0 (additive minor) | Pass |
  | G9 | 2026-06-16 full code-review remediation gate (shipped in 1.3.1) | `P0-CR-001` and `P1-CR-002` through `P1-CR-005` resolved or formally accepted; P2 items either resolved or scheduled with explicit non-blocking rationale; linked register updated. **All 13 `P*-CR-*` findings are Resolved (§5.7) and shipped in `haechi@1.3.1` (2026-06-16, attested OIDC publish); core bumped 1.3.0 → 1.3.1 (patch, remediation-only — no API/config surface change, `configVersion` stays `1`).** | Pass (`haechi@1.3.1`, 2026-06-16) |
+ | G10 | 2026-06-16 code-review round 2 (CR2) remediation gate | The CR2 register (`code-review-risk-register-2026-06-16-round2.md`, §5.8) found **no P0/P1**; its three P2s (`CR2-001` proxy upstream-cancel, `CR2-002` token-vault audit hygiene, `CR2-003` plugin IPC reply bound) plus the P3 cluster (`CR2-004..008`) are all **Resolved and shipped in `haechi@1.3.2`** (`CR2-009` won't-fix, `CR2-010` accepted) and the linked register is updated. | Pass (`haechi@1.3.2`, 2026-06-16) |
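
The `CR2-003` item named in the G10 row (bounding a plugin IPC reply before `JSON.parse`) can be pictured with a minimal sketch. The helper name `parseBoundedReply`, the `maxBytes` parameter, and the `reason` strings are hypothetical and chosen for illustration only; this is not Haechi's sandbox code.

```javascript
// Hypothetical sketch: reject an oversized reply BEFORE parsing, so a hostile
// plugin cannot force a large JSON.parse on the host event loop.
function parseBoundedReply(raw, maxBytes) {
  if (typeof raw !== "string") {
    return { ok: false, reason: "not_a_string" };
  }
  // Measure encoded bytes rather than UTF-16 code units: multi-byte text is
  // larger on the wire than raw.length suggests.
  if (Buffer.byteLength(raw, "utf8") > maxBytes) {
    return { ok: false, reason: "oversized_reply" };
  }
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch {
    return { ok: false, reason: "invalid_json" };
  }
}
```

Treating an oversized reply as a plain denial, rather than throwing, keeps the failure on the same path as any other rejected plugin decision.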
 
  ## 3. P0 Distribution-Blocking Risk Status
 
@@ -159,6 +160,23 @@ The authoritative itemized register is `docs/current/code-review-risk-register-2
  | P2-CR-012 | KMS vault IPv6 loopback carve-out lacks IPv6-focused tests | Resolved | `satellites/crypto-kms/vault.test.mjs` adds a dedicated IPv6 loopback policy test ("…enforces the IPv6 loopback policy (::1, [::1], dotted + hex mapped) — P2-CR-012") covering bare `::1`, bracketed `[::1]`, dotted `::ffff:127.0.0.1`, and hex `::ffff:7f00:1`/`::ffff:7f00:0001` (plus bracketed variants), and asserts a public mapped address (`::ffff:8.8.8.8`/`::ffff:808:808`) is NOT over-blocked; the extended range table and `ssrf-parity.test.mjs` lock the dotted+hex agreement with auth-jwt |
  | P2-CR-013 | SSE multi-line `data:` fields are joined without newline separators | Resolved | `parseFrame` joins multiple `data:` lines with `join("\n")` (spec separator) and strips only the single spec leading space per line; multi-line JSON still `JSON.parse`s, multi-line plain text is reconstructed with newlines for inspection, and `serializeTextFrame` re-emits a multi-line payload as multiple `data:` lines; `tests/stream-filter.test.mjs` covers a multi-line JSON event and a multi-line plain-text event with PII |
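
The `P2-CR-013` behavior (joining several `data:` lines with a newline and stripping only one leading space per line) follows the Server-Sent Events field rules. The sketch below is a simplified stand-in written for illustration; `parseFrameData` is a hypothetical name and is not the project's `parseFrame`.

```javascript
// Hypothetical sketch of SSE data-field reassembly for one frame.
function parseFrameData(frame) {
  const dataLines = [];
  for (const line of frame.split(/\r\n|\n|\r/)) {
    if (line === "data") {
      // A bare field name with no colon carries an empty value.
      dataLines.push("");
    } else if (line.startsWith("data:")) {
      let value = line.slice("data:".length);
      // Strip exactly one leading space; any further spaces are payload.
      if (value.startsWith(" ")) {
        value = value.slice(1);
      }
      dataLines.push(value);
    }
  }
  // The newline is the separator between consecutive data lines.
  return dataLines.join("\n");
}
```

With this rule a JSON payload split across two `data:` lines still parses, because a newline between tokens is valid JSON whitespace, whereas joining with an empty string could fuse adjacent tokens.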
 
+ ## 5.8 2026-06-16 Code Review Round 2 (CR2) Status — gate G10
+
+ The authoritative itemized register is `docs/current/code-review-risk-register-2026-06-16-round2.md`; this is the release-gate summary. A second deep review after the 1.3.1 cut found **no P0 and no P1** (the two externally-reported P1s both verified down to P2 — neither is a stored-plaintext leak or an auth/SSRF bypass). The three P2s + the P3 cluster (`CR2-001..008`) are **Resolved and shipped in `haechi@1.3.2`**; one reported item was a **false positive** (`CR2-009`, won't-fix) and one is an **already-documented accepted residual** (`CR2-010`, accepted). **G10 is Pass.**
+
+ | ID | Risk | Status | Required closure evidence |
+ |---|---|---|---|
+ | CR2-001 | Pass-through streaming never cancels the upstream reader on downstream disconnect (`pipeUpstreamBodyBounded` parks on `drain` forever) — unauthenticated resource leak | Resolved | Per-request `AbortController` + client `close`/`aborted` listener that cancels the upstream reader and aborts the fetch; `drain` wait raced against `close`; regression test that a mid-stream disconnect cancels the reader promptly |
+ | CR2-002 | Token-vault reveal/purge writes the raw caller-supplied `token` + `error.message` (token-interpolated) into the audit event; `FORBIDDEN_KEYS` strips by key name only | Resolved | Generic error messages; keyed-HMAC or `tok_`-shape-validate the `token` before recording; enum `reasonCode` instead of `error.message`; regression test that no raw token reaches `reason`/`token`; reconcile the invariant wording |
+ | CR2-003 | Plugin IPC reply not size-bounded before `JSON.parse`; process child has no heap cap → event-loop stall + memory spike from a hostile signed plugin | Resolved | Reply byte-length check before parse in both sandboxes (drop oversized as deny); `--max-old-space-size` heap cap on the process child via a new `resourceLimits` knob; regression test with an oversized-reply fixture |
+ | CR2-004 | `sanitizeResponseHeaders` keeps stale body-coupled validators (`etag`/`content-md5`/`digest`/`last-modified`) on a transformed response | Resolved | Drop those headers on every body-mutating path + `cache-control: no-store`; test that a mutated response drops the upstream `ETag` |
+ | CR2-005 | Over-`maxBytes` request body is read-and-discarded until the (finite) Node `requestTimeout` — no socket teardown | Resolved | `request.pause()`/`destroy()` (or `Connection: close`) on the 413 path; optionally non-null default timeouts |
+ | CR2-006 | `mcp-wrap --stderr filter` is line-oriented, so a newline-split secret evades it (inherent; single-line secrets caught, `drop` available) | Resolved | `COMMAND_HELP` + register note; recommend `--stderr drop` for high-sensitivity tools |
+ | CR2-007 | README says mcp-wrap "stderr ... pass through" but the default is now `--stderr filter` | Resolved | Correct README + `README.ko.md` |
+ | CR2-008 | README streaming split-match claim is unscoped (cross-frame buffering is delta-channel only) | Resolved | Scope both README passages + `README.ko.md` to the delta channel |
+ | CR2-009 | (reported P2) `keyMaterial` appended after the credential `maxMessageBytes` check | Won't fix (FALSE POSITIVE) | `keyMaterial` is operator-controlled + hard-bounded by the fetcher `maxBytes`; no attacker amplification — optional cosmetic re-assert only |
+ | CR2-010 | (reported P2) secret split across two NON-JSON SSE frames not caught | Accepted (documented) | Already out-of-scope in round-1 `P1-CR-005`, `threat-model.md`, and an in-code comment; the JSON delta channel does buffer up to `maxMatchBytes` |
+
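
The closure evidence for `CR2-001` (cancel the upstream reader when the client goes away) can be sketched roughly as follows. `pipeWithCancel` and its parameters are hypothetical names, and the sketch assumes the per-request `AbortSignal` is aborted by a client `close`/`aborted` listener elsewhere; it is not the project's `pipeUpstreamBodyBounded`.

```javascript
// Hypothetical sketch: copy chunks from an upstream ReadableStream to a sink,
// and cancel the upstream reader as soon as the per-request signal aborts.
async function pipeWithCancel(upstream, write, signal) {
  const reader = upstream.getReader();
  const onAbort = () => {
    // Cancelling releases the upstream instead of leaving it parked.
    reader.cancel(signal.reason).catch(() => {});
  };
  if (signal.aborted) {
    onAbort();
  } else {
    signal.addEventListener("abort", onAbort, { once: true });
  }
  let chunks = 0;
  try {
    for (;;) {
      const { done, value } = await reader.read();
      if (done || signal.aborted) {
        break;
      }
      await write(value);
      chunks += 1;
    }
  } finally {
    signal.removeEventListener("abort", onAbort);
  }
  return chunks;
}
```

In a real proxy the same signal would typically also be passed to `fetch`, so that aborting it tears down the upstream request as well as the reader.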
  ## 6. P2 Product/Documentation Risk Status
 
  | ID | Risk | Status | Resolution evidence |
@@ -172,7 +190,7 @@ The authoritative itemized register is `docs/current/code-review-risk-register-2
 
  This checklist is the standing pre-distribution template for every release on the `1.x` stable line; it was first exercised for the `0.3.2` developer preview, whose results are retained below as the reference record.
 
- Current 2026-06-16 status: G9 is `Pass` — the code-review remediation shipped in `haechi@1.3.1`. This checklist is cleared for that cut.
+ Current 2026-06-16 status: G9 is `Pass` (round-1 remediation shipped in `haechi@1.3.1`). Gate **G10** (CR2, §5.8) is now `Pass` — the CR2 P2s + P3 cluster (`CR2-001..008`) are Resolved and shipped in `haechi@1.3.2`, so the checklist is cleared for that cut.
 
  External npm gate check results (`0.3.2` developer preview, 2026-06-10, post-publish):
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "haechi",
-   "version": "1.3.1",
+   "version": "1.3.3",
    "description": "Self-hosted AI context enforcement across LLM, MCP, vLLM, Ollama, and agent traffic — a stable, zero-dependency security gateway.",
    "license": "Apache-2.0",
    "type": "module",
@@ -66,6 +66,7 @@
    ],
    "scripts": {
      "test": "node --test",
+     "test:inference:live": "node --test tests/local-inference.integration.test.mjs",
      "check:types": "tsc -p jsconfig.json --noEmit",
      "pack:dry": "npm pack --dry-run",
      "scan:stale-names": "node scripts/stale-name-scan.mjs",
@@ -766,7 +766,7 @@ const COMMAND_HELP = {
  "mcp-wrap": {
  usage: "haechi mcp-wrap [--config haechi.config.json] [--stderr filter|drop|inherit] -- <command> [args...]",
  summary: "Wrap an MCP server with bidirectional stdio protection.",
- detail: "Spawns <command>, applies the method allowlist + params protection client→server, and result protection + injection heuristics server→client. Drop-in for MCP client configs. --stderr controls the child's stderr: filter (default) protects each line with the same policy before re-emitting, drop discards it, inherit passes it through raw (an explicit, opt-in local-process boundary). filter follows the configured policy mode — in dry-run/report-only it detects but does not transform (like the rest of the pipeline), so set policy.mode=enforce for stderr redaction to take effect."
+ detail: "Spawns <command>, applies the method allowlist + params protection client→server, and result protection + injection heuristics server→client. Drop-in for MCP client configs. --stderr controls the child's stderr: filter (default) protects each line with the same policy before re-emitting, drop discards it, inherit passes it through raw (an explicit, opt-in local-process boundary). filter follows the configured policy mode — in dry-run/report-only it detects but does not transform (like the rest of the pipeline), so set policy.mode=enforce for stderr redaction to take effect. filter protects each COMPLETE line independently, so it cannot catch a secret a child deliberately splits across a newline; use drop for high-sensitivity tools."
  },
  auth: {
  usage: "haechi auth add --type user|service|agent [--scope k:v ...] [--label k=v ...]\n haechi auth list [--config haechi.config.json]\n haechi auth revoke <id> [--config haechi.config.json]",
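
The limitation added to the `mcp-wrap` help text (a line-oriented `filter` cannot catch a secret split across a newline) is easy to reproduce in miniature. The pattern and the `filterCompleteLines` helper below are hypothetical stand-ins for the configured policy, shown only to illustrate why `--stderr drop` is the safer choice for high-sensitivity tools.

```javascript
// Hypothetical single-rule stand-in for the configured policy.
const SECRET_PATTERN = /sk-[a-z0-9]{8,}/g;

// Protect each complete line independently, the way a line-oriented filter does.
function filterCompleteLines(text) {
  return text
    .split("\n")
    .map((line) => line.replace(SECRET_PATTERN, "[REDACTED:secret]"))
    .join("\n");
}

// A secret on one line is caught; the same secret split by a newline is not,
// because neither half matches the rule on its own.
const singleLine = filterCompleteLines("key=sk-abcd1234efgh");
const splitLines = filterCompleteLines("sk-abcd\n1234efgh");
```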
@@ -1022,7 +1022,11 @@ function resolveAuthProvider(config, providers, cryptoProvider, auditSink) {
  return createProcessIsolatedAuthProviderSync({
    ...common,
    netEnforcement: plugin.netEnforcement ?? "require-permission",
-   keyMaterial: plugin.keyMaterial ?? null
+   keyMaterial: plugin.keyMaterial ?? null,
+   // CR2-003: reuse the worker's resourceLimits.maxOldGenerationSizeMb knob to
+   // cap the child heap. Optional for the process runtime (the sandbox defaults
+   // when absent), so pass it through whether or not the config supplied it.
+   resourceLimits: plugin.resourceLimits ?? null
  });
  }
  return createSandboxedAuthProviderSync({ ...common, resourceLimits: plugin.resourceLimits });
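
One way to picture how a `resourceLimits.maxOldGenerationSizeMb` value could become a `--max-old-space-size` flag on the process child is sketched below. The helper name `heapCapExecArgv` and the 64 MB fallback are assumptions for illustration; the source does not state the sandbox's actual default.

```javascript
// Hypothetical sketch: derive the child heap-cap flag from resourceLimits.
// The 64 MB fallback is an assumed placeholder, not Haechi's real default.
function heapCapExecArgv(resourceLimits, defaultMb = 64) {
  const requested = resourceLimits?.maxOldGenerationSizeMb;
  // Accept only a positive integer; anything else falls back to the default.
  const capMb = Number.isInteger(requested) && requested > 0 ? requested : defaultMb;
  return [`--max-old-space-size=${capMb}`];
}
```

Passing `null` when the config omits `resourceLimits`, as the hunk above does, keeps the default in one place (the sandbox) instead of duplicating it at the call site.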
@@ -540,12 +540,15 @@ function scanEntry(entry, rules, context = {}) {
  // own token. This is response-only on purpose: a REQUEST that contains a
  // marker-shaped string is NOT Haechi output (Haechi hasn't transformed it yet),
  // so it is scanned normally — otherwise an attacker could wrap a real secret in
- // a fake `[TOKEN:…]` to evade request-side detection.
+ // a fake `[TOKEN:…]` to evade request-side detection. On the RESPONSE side the
+ // same wrap-a-secret risk is closed by haechiMarkerSpans recording a span only
+ // when the inner content matches a GENUINE emitted format — a fake marker
+ // wrapping a real secret stays in the scan and is detected/blocked.
  // Markers are pure ASCII and NFKC-stable, so their spans are computed on the
  // ORIGINAL value exactly as before — they line up with the same-length
  // normalized scan (Case 2 below) and are irrelevant to the whole-leaf scan
  // (Case 3).
- const markerSpans = context?.direction === "response" ? haechiMarkerSpans(entry.value) : [];
+ const markerSpans = context?.direction === "response" ? haechiMarkerSpans(entry.value, rules, context) : [];
 
  // WS2d — Unicode evasion via NFKC normalization. A client can defeat every
  // regex rule by sending PII/secrets in a Unicode form that folds to ASCII
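
The response-only marker skip described in the comment above can be illustrated with a reduced sketch. It reuses the TOKEN and REDACTED shapes quoted in this diff but omits the `HAECHI_ENC` envelope check and the inner re-scan; `skippableSpans` is a hypothetical name, not the project's function.

```javascript
// Shapes copied from the diff; the function itself is a simplified sketch.
const VAULT_TOKEN = /^tok_[a-z0-9_]+_[a-f0-9]{16,}$/;
const NONVAULT_TOKEN = /^[a-z][a-z0-9_]*:[a-f0-9]{8,}$/;
const TYPE_NAME = /^[a-z][a-z0-9_]*$/;

function skippableSpans(text, direction) {
  // Requests are never skipped: a marker-shaped string there is not our output.
  if (direction !== "response") {
    return [];
  }
  const spans = [];
  for (const m of text.matchAll(/\[(TOKEN|REDACTED):([^\]]*)\]/g)) {
    const inner = m[2];
    const genuine =
      m[1] === "TOKEN"
        ? VAULT_TOKEN.test(inner) || NONVAULT_TOKEN.test(inner)
        : TYPE_NAME.test(inner);
    // Only a genuine-shaped marker is excluded from the scan.
    if (genuine) {
      spans.push([m.index, m.index + m[0].length]);
    }
  }
  return spans;
}
```

For the input `[REDACTED:email] [TOKEN:sk-ant-api03-SECRET]` only the first marker yields a span, so the fake TOKEN wrapper and whatever it contains remain visible to the detection rules.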
@@ -824,12 +827,157 @@ function isPositionStableNfkc(value, normalized) {
    return rebuilt === normalized;
  }
 
- // Spans of Haechi's own transform markers in a string, so detection can skip
- // them: `[TOKEN:…]`, `[HAECHI_ENC:…]`, `[REDACTED:…]`.
- function haechiMarkerSpans(text) {
+ // Spans of Haechi's own transform markers in a string, so RESPONSE-direction
+ // detection can skip them: `[TOKEN:…]`, `[HAECHI_ENC:…]`, `[REDACTED:…]`. A
+ // tokenized round-trip echoed by the model would otherwise be re-flagged as a
+ // secret (Haechi blocking its own output).
+ //
+ // CR-???: a span is recorded ONLY when its inner content matches a GENUINE
+ // format actually emitted by core's transform (packages/core/index.mjs
+ // replacementFor). Without this check the marker frame `[(?:TOKEN|…):[^\]]*]`
+ // would skip ANY inner content, so a hostile model could exfiltrate a real
+ // secret by wrapping it in a FAKE marker — `[TOKEN:sk-ant-api03-<secret>]`,
+ // `[HAECHI_ENC:<secret>]`, `[REDACTED:<secret>]` — and that span would be
+ // dropped from the scan. A marker-SHAPED string whose inner content is not
+ // genuine is left in the scan, so the wrapped secret is detected/blocked.
+ // Genuine inner formats:
+ //   [REDACTED:<type>]          <type> is a detection type name (lowercase
+ //                              identifier: [a-z][a-z0-9_]*).
+ //   [TOKEN:<vaultTokenId>]     vault id shape `tok_<type>_<hexhash>`
+ //                              (matches token-vault VAULT_TOKEN_SHAPE).
+ //   [TOKEN:<type>:<shortHash>] non-vault deterministic token: type name + hex.
+ //   [HAECHI_ENC:<base64url>]   base64url that decodes to a VALID envelope
+ //                              JSON object (cryptoProvider.encrypt envelope:
+ //                              has `kid`+`aadHash`). A real secret string will
+ //                              not base64url-decode to such an object.
+ // Markers are pure ASCII / NFKC-stable and spans are computed on the ORIGINAL
+ // entry.value, so offset integrity is unchanged.
+
+ // Detection-type name shape (the `detection.type` written by core into REDACTED
+ // and the type segment of a non-vault TOKEN). Built-in rule types and custom
+ // rule types are lowercase identifiers; a real secret (hyphens, uppercase,
+ // length) does not match, so a wrapped secret stays in the scan.
+ const MARKER_TYPE_NAME = /^[a-z][a-z0-9_]*$/;
+ // Vault token id shape — mirrors token-vault VAULT_TOKEN_SHAPE
+ // (`tok_<type>_<hexhash>`, random: 16 hex, deterministic: 32 hex). Kept in sync
+ // with packages/token-vault/index.mjs (not exported from there).
+ const MARKER_VAULT_TOKEN = /^tok_[a-z0-9_]+_[a-f0-9]{16,}$/;
+ // Non-vault deterministic token: `<type>:<hex>` (core shortHash → 12 hex; allow
+ // any reasonable hex run so the check does not over-fit a single length).
+ const MARKER_NONVAULT_TOKEN = /^[a-z][a-z0-9_]*:[a-f0-9]{8,}$/;
+ // base64url alphabet only (core emits base64url with no padding).
+ const MARKER_BASE64URL = /^[A-Za-z0-9_-]+$/;
+
+ function isGenuineTokenInner(inner) {
+   return MARKER_VAULT_TOKEN.test(inner) || MARKER_NONVAULT_TOKEN.test(inner);
+ }
+
+ function isGenuineRedactedInner(inner) {
+   return MARKER_TYPE_NAME.test(inner);
+ }
+
+ // True only when `inner` base64url-decodes to a valid UTF-8 JSON object that
+ // carries the encrypt-envelope signature (`kid` + `aadHash` — the contract keys
+ // asserted by assertCryptoProviderConformance, present in the local AES-GCM
+ // envelope and any conformant external provider). Any decode/parse failure or a
+ // non-envelope shape → NOT a genuine marker (so a wrapped secret is scanned).
+ function isGenuineEncInner(inner) {
+   if (!MARKER_BASE64URL.test(inner)) {
+     return false;
+   }
+   try {
+     const bytes = Buffer.from(inner, "base64url");
+     // Reject inputs that do not round-trip through base64url (e.g. an invalid
+     // tail that Buffer silently truncates): a genuine marker always round-trips.
+     if (bytes.toString("base64url") !== inner) {
+       return false;
+     }
+     if (!isUtf8(bytes)) {
+       return false;
+     }
+     const parsed = JSON.parse(bytes.toString("utf8"));
+     return (
+       parsed !== null &&
+       typeof parsed === "object" &&
+       !Array.isArray(parsed) &&
+       typeof parsed.kid === "string" &&
+       typeof parsed.aadHash === "string"
+     );
+   } catch {
+     return false;
+   }
+ }
+
+ // Belt-and-suspenders for the genuine-marker shapes: even a correctly-SHAPED
+ // TOKEN/REDACTED inner must not itself carry a detectable secret. The lowercase-
+ // identifier classes (MARKER_TYPE_NAME, the type segments of the token shapes)
+ // overlap the body of real lowercase-bodied secrets (notably GitHub `gh[pousr]_`
+ // tokens), so a hostile model could smuggle such a secret as the `<type>` segment
+ // of an otherwise genuine-shaped marker. Re-scan the inner with the SAME rules and
+ // refuse to treat it as genuine if anything detectable is inside — this un-skips a
+ // marker exactly when skipping it would hide a leak.
+ function textHasDetection(text, rules, context) {
+   for (const rule of rules) {
+     if (rule.direction && rule.direction !== context?.direction) {
+       continue;
+     }
+     const regex = new RegExp(rule.pattern, rule.flags.includes("g") ? rule.flags : `${rule.flags}g`);
+     for (const match of text.matchAll(regex)) {
+       if (!rule.validate || rule.validate(match[0])) {
+         return true;
+       }
+     }
+   }
+   return false;
+ }
+
+ // The attacker-controllable segment(s) of a genuine-shaped marker inner — i.e. the
+ // `<type>` position(s) a hostile model could smuggle a secret into. For TOKEN we
+ // peel off the structural framing (`tok_<type>_<hex>` → `<type>`, `<type>:<hex>` →
+ // `<type>`) and scan the segment IN ISOLATION as well as the whole inner: a `\b`-
+ // anchored rule (e.g. GitHub `\bghp_…`) misses a token glued to the `tok_` prefix
+ // (no word boundary after `_`), but matches the segment scanned on its own.
+ function markerSecretSurfaces(kind, inner) {
+   const surfaces = [inner];
+   if (kind === "TOKEN") {
+     const vault = /^tok_(.+)_[a-f0-9]{16,}$/.exec(inner);
+     if (vault) {
+       surfaces.push(vault[1]);
+     }
+     const nonVault = /^(.+):[a-f0-9]{8,}$/.exec(inner);
+     if (nonVault) {
+       surfaces.push(nonVault[1]);
+     }
+   }
+   return surfaces;
+ }
+
+ function innerContainsDetection(kind, inner, rules, context) {
+   return markerSecretSurfaces(kind, inner).some((surface) => textHasDetection(surface, rules, context));
+ }
+
+ function haechiMarkerSpans(text, rules = [], context = {}) {
    const spans = [];
-   for (const m of text.matchAll(/\[(?:TOKEN|HAECHI_ENC|REDACTED):[^\]]*\]/g)) {
-     spans.push([m.index, m.index + m[0].length]);
+   for (const m of text.matchAll(/\[(TOKEN|HAECHI_ENC|REDACTED):([^\]]*)\]/g)) {
+     const kind = m[1];
+     const inner = m[2];
+     let genuine = false;
+     if (kind === "TOKEN") {
+       genuine = isGenuineTokenInner(inner);
+     } else if (kind === "REDACTED") {
+       genuine = isGenuineRedactedInner(inner);
+     } else {
+       genuine = isGenuineEncInner(inner);
+     }
+     // HAECHI_ENC is exempt from the inner re-scan: its inner is an opaque base64url
+     // envelope validated by decode above (a raw secret cannot forge a valid
+     // envelope, and the envelope's base64url body is not a detectable leaf).
+     if (genuine && kind !== "HAECHI_ENC" && innerContainsDetection(kind, inner, rules, context)) {
+       genuine = false;
+     }
+     if (genuine) {
+       spans.push([m.index, m.index + m[0].length]);
+     }
    }
    return spans;
  }
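The word-boundary point made in the `markerSecretSurfaces` comment can be checked in isolation. The sketch below is illustrative only: `githubLike` is a hypothetical stand-in pattern (haechi's real rule set is not shown in this diff) and the secret is a dummy value. It suggests why the segment is scanned on its own in addition to the whole inner.

```javascript
// Hypothetical stand-in for a `\b`-anchored secret rule; NOT haechi's real rule.
const githubLike = /\bghp_[A-Za-z0-9]{36}\b/g;

// Dummy secret smuggled into the <type> segment of a vault-shaped TOKEN inner.
const secret = "ghp_" + "a".repeat(36);
const inner = `tok_${secret}_0123456789abcdef`;

// Same framing regex that markerSecretSurfaces uses for the vault shape.
const vault = /^tok_(.+)_[a-f0-9]{16,}$/.exec(inner);
const segment = vault ? vault[1] : null;

// `_` is a word character, so there is no `\b` between `tok_` and `ghp_`:
// the rule cannot match when the whole inner is scanned.
const matchesWhole = [...inner.matchAll(githubLike)].length > 0;

// Scanned on its own, the segment begins and ends at a word boundary.
const matchesSegment = segment !== null && [...segment.matchAll(githubLike)].length > 0;

console.log({ segment, matchesWhole, matchesSegment });
```

Under these assumptions the whole-inner scan misses the dummy secret while the isolated-segment scan catches it, which is the case the extra surface is meant to cover.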
@@ -44,8 +44,19 @@ import {
  // The child flags. `--permission` enables the deny-by-default Node permission
  // model; we pass NO --allow-* grant, so fs/child-process/worker/addons/wasi/net
  // are all kernel-denied. `--disable-proto=delete` removes Object.prototype.__proto__.
+ // A `--max-old-space-size=<mb>` heap cap is appended PER-SPAWN (see spawnAndLoad):
+ // unlike the worker (resourceLimits OOMs a runaway), a process child has NO heap
+ // cap by default, so a hostile/buggy signed plugin could build a reply up to the
+ // child's default V8 heap. The cap bounds the child; the host-side reply-size bound
+ // (CR2-003) bounds the host regardless.
  const CHILD_FLAGS = Object.freeze(["--permission", "--disable-proto=delete"]);

+ // Default child heap cap (MB) when a process-runtime config does not supply
+ // resourceLimits.maxOldGenerationSizeMb. Non-breaking: the worker REQUIRES the
+ // knob, but the process runtime defaults rather than throwing so an isolation:
+ // process config without resourceLimits keeps working.
+ const DEFAULT_MAX_OLD_GEN_MB = 128;
+
  // A CONSTANT bootstrap harness, passed via `node -e`. It is identical for every
  // plugin (the plugin bytes arrive over IPC, NOT on the command line — so there is
  // no ARG_MAX limit and the harness never varies). It runs as CommonJS under -e and
@@ -155,6 +166,10 @@ function createProcessIsolatedAuthProviderHandle({
    timeoutMs,
    maxPendingCalls = 8,
    maxMessageBytes = 16384,
+   // Child V8 heap cap. Reuses the worker's resourceLimits.maxOldGenerationSizeMb
+   // knob (CR2-003). Optional for the process runtime: a config that omits it falls
+   // back to DEFAULT_MAX_OLD_GEN_MB rather than throwing (non-breaking).
+   resourceLimits = null,
    coreVersion = null,
    now = Date.now,
    allowedLabelKeys,
@@ -201,6 +216,21 @@ function createProcessIsolatedAuthProviderHandle({
    if (!Number.isInteger(maxMessageBytes) || maxMessageBytes < 1) {
      throw new Error("maxMessageBytes must be a positive integer");
    }
+   // Resolve the child heap cap (MB). Optional for the process runtime; if supplied
+   // it must be a positive-integer maxOldGenerationSizeMb (same shape as the worker),
+   // else default to DEFAULT_MAX_OLD_GEN_MB (non-breaking — never throws on absence).
+   let maxOldGenerationSizeMb = DEFAULT_MAX_OLD_GEN_MB;
+   if (resourceLimits !== null && resourceLimits !== undefined) {
+     if (typeof resourceLimits !== "object" || Array.isArray(resourceLimits)) {
+       throw new Error("createProcessIsolatedAuthProvider resourceLimits must be an object");
+     }
+     if (resourceLimits.maxOldGenerationSizeMb !== undefined) {
+       if (!Number.isInteger(resourceLimits.maxOldGenerationSizeMb) || resourceLimits.maxOldGenerationSizeMb <= 0) {
+         throw new Error("createProcessIsolatedAuthProvider resourceLimits.maxOldGenerationSizeMb must be a positive integer");
+       }
+       maxOldGenerationSizeMb = resourceLimits.maxOldGenerationSizeMb;
+     }
+   }
    // Fail-closed network containment. PR1 supports only the "require-permission"
    // mode; if this Node cannot enforce --allow-net, refuse to construct rather than
    // run a plugin whose network egress is uncontained.
@@ -273,7 +303,11 @@ function createProcessIsolatedAuthProviderHandle({
    // any failure kills the child and throws → fail closed. NOTE the plugin source
    // crosses over IPC (not the command line) so there is no ARG_MAX limit.
    async function spawnAndLoad({ entrySource, pluginId: pid }) {
-     const c = spawn(execPath, [...CHILD_FLAGS, "-e", PROCESS_HARNESS], {
+     // Build the spawn args by spreading the frozen base flags + the per-spawn heap
+     // cap. `--max-old-space-size` composes with `--permission`/`--disable-proto=
+     // delete` and the data:-URL load (verified). The cap bounds a runaway child;
+     // the host-side reply-size bound bounds the host regardless of the child heap.
+     const c = spawn(execPath, [...CHILD_FLAGS, `--max-old-space-size=${maxOldGenerationSizeMb}`, "-e", PROCESS_HARNESS], {
        stdio: ["ignore", "ignore", "ignore", "ipc"],
        serialization: "json",
        env: scrubbedEnv(),
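The heap-cap resolution and the per-spawn flag from the hunks above can be read together as one small pipeline. The following is a minimal sketch under stated assumptions: `resolveHeapCapMb` and `childArgs` are hypothetical helper names introduced only for illustration (the diff inlines this logic), and the harness string is a placeholder.

```javascript
// Hypothetical helpers mirroring the inlined logic in the diff; the names
// resolveHeapCapMb and childArgs are assumptions, not part of haechi.
const DEFAULT_MAX_OLD_GEN_MB = 128;
const CHILD_FLAGS = Object.freeze(["--permission", "--disable-proto=delete"]);

function resolveHeapCapMb(resourceLimits) {
  // Absent config is non-breaking: fall back to the default cap.
  if (resourceLimits === null || resourceLimits === undefined) {
    return DEFAULT_MAX_OLD_GEN_MB;
  }
  if (typeof resourceLimits !== "object" || Array.isArray(resourceLimits)) {
    throw new Error("resourceLimits must be an object");
  }
  const mb = resourceLimits.maxOldGenerationSizeMb;
  if (mb === undefined) {
    return DEFAULT_MAX_OLD_GEN_MB;
  }
  // A supplied value must be a positive integer; anything else fails closed.
  if (!Number.isInteger(mb) || mb <= 0) {
    throw new Error("resourceLimits.maxOldGenerationSizeMb must be a positive integer");
  }
  return mb;
}

function childArgs(resourceLimits, harness) {
  // The frozen base flags are spread, never mutated; the cap is added per spawn.
  return [...CHILD_FLAGS, `--max-old-space-size=${resolveHeapCapMb(resourceLimits)}`, "-e", harness];
}

console.log(childArgs({ maxOldGenerationSizeMb: 64 }, "/* harness placeholder */"));
```

Keeping `CHILD_FLAGS` frozen and appending the cap at spawn time means two handles with different limits never share mutable argument state.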
@@ -291,6 +325,27 @@ function createProcessIsolatedAuthProviderHandle({
      const failed = new Promise((_, reject) => { onFail = reject; });

      c.on("message", (raw) => {
+       // REPLY SIZE BOUND (CR2-003): bound host-side work BEFORE JSON.parse. Unlike
+       // the worker (resourceLimits OOMs a runaway), the child has only the
+       // --max-old-space-size cap, so it can still build a reply up to that heap and
+       // process.send it; a synchronous JSON.parse of a multi-MB string stalls the
+       // host event loop (the per-call timeout cannot fire mid-parse). The reply is a
+       // STRING (serialization:'json'); measure its byte length and, if it exceeds the
+       // SAME maxMessageBytes ceiling the outbound credential obeys, drop the frame as
+       // an oversized DENY WITHOUT parsing. The auth reply is the only attacker-sized
+       // frame (claims come from the plugin); the tiny ready/loaded/load-error control
+       // frames are always far under the ceiling, so the uniform bound never harms the
+       // handshake. Single-occupancy: settle the one live pending call as oversized.
+       const replyBytes = typeof raw === "string"
+         ? Buffer.byteLength(raw, "utf8")
+         : Buffer.byteLength(String(raw), "utf8");
+       if (replyBytes > maxMessageBytes) {
+         for (const [cid, settle] of pending) {
+           pending.delete(cid);
+           settle({ __oversized: true });
+         }
+         return;
+       }
        let parsed;
        try {
          parsed = JSON.parse(typeof raw === "string" ? raw : String(raw));
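The reply-size bound in the hunk above measures UTF-8 bytes, not string length, before any parsing happens. A minimal sketch of that ordering follows; `boundedParse` is a hypothetical helper introduced only for illustration, and the result shape (`oversized`, `value`) is an assumption rather than haechi's internal settle format.

```javascript
// Hypothetical helper: check the byte size first, parse only if under the ceiling.
function boundedParse(raw, maxMessageBytes) {
  const text = typeof raw === "string" ? raw : String(raw);
  // Byte length, not text.length: multi-byte characters count fully.
  if (Buffer.byteLength(text, "utf8") > maxMessageBytes) {
    return { oversized: true, value: null }; // never reaches JSON.parse
  }
  try {
    return { oversized: false, value: JSON.parse(text) };
  } catch {
    return { oversized: false, value: null }; // malformed but within bounds
  }
}

// A small reply passes; a large one is rejected without being parsed.
const small = boundedParse('{"ok":true}', 16384);
const large = boundedParse("x".repeat(20000), 16384);

// 12 UTF-16 code units but 22 UTF-8 bytes: a length check would let this through.
const multiByte = JSON.stringify("é".repeat(10));
const overByBytes = boundedParse(multiByte, 15);

console.log(small.oversized, large.oversized, multiByte.length, overByBytes.oversized);
```

Measuring bytes keeps the inbound ceiling comparable to the outbound `maxMessageBytes` limit, which the source comments describe as the same bound.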
@@ -150,6 +150,29 @@ function createSandboxedAuthProviderHandle({
        workerData: {}
      });
      w.on("message", (raw) => {
+       // REPLY SIZE BOUND (CR2-003): bound host-side work BEFORE JSON.parse. The
+       // worker has an implicit heap cap (resourceLimits), but enforce the same
+       // maxMessageBytes ceiling on the INBOUND plugin→host reply that the OUTBOUND
+       // host→plugin credential message obeys — a hostile/buggy plugin can build a
+       // multi-MB reply and a synchronous JSON.parse would stall the host event loop
+       // (the per-call timeout cannot fire mid-parse). The reply is a STRING posted
+       // via JSON.stringify; measure its byte length and, if oversized, settle the
+       // matched call as an oversized DENY (mirroring the credential deny) WITHOUT
+       // parsing. We must locate the pending settle WITHOUT parsing the cid, so an
+       // oversized reply settles the single live pending call (single-occupancy: at
+       // most one entry is ever live).
+       const replyBytes = typeof raw === "string"
+         ? Buffer.byteLength(raw, "utf8")
+         : Buffer.byteLength(String(raw), "utf8");
+       if (replyBytes > maxMessageBytes) {
+         // Single-occupancy: settle the one live pending call as oversized, never
+         // touching JSON.parse on the oversized payload.
+         for (const [cid, settle] of pending) {
+           pending.delete(cid);
+           settle({ __oversized: true });
+         }
+         return;
+       }
        let parsed;
        try {
          parsed = JSON.parse(typeof raw === "string" ? raw : String(raw));