npm - haechi - Versions diffs - 1.3.1 → 1.3.3 - Mend

haechi 1.3.1 → 1.3.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (20)

package/README.ko.md +3 -3
package/README.md +3 -3
package/docs/current/code-review-risk-register-2026-06-16-round2.ko.md +142 -0
package/docs/current/code-review-risk-register-2026-06-16-round2.md +142 -0
package/docs/current/operations-runbook.ko.md +32 -1
package/docs/current/operations-runbook.md +39 -1
package/docs/current/release-process.ko.md +15 -6
package/docs/current/release-process.md +15 -6
package/docs/current/reliability-hardening-track.ko.md +1 -1
package/docs/current/reliability-hardening-track.md +1 -1
package/docs/current/risk-register-release-gate.ko.md +22 -4
package/docs/current/risk-register-release-gate.md +22 -4
package/package.json +2 -1
package/packages/cli/bin/haechi.mjs +1 -1
package/packages/cli/runtime.mjs +5 -1
package/packages/filter/index.mjs +155 -7
package/packages/plugin/process-sandbox.mjs +56 -1
package/packages/plugin/sandbox.mjs +23 -0
package/packages/proxy/index.mjs +128 -12
package/packages/token-vault/index.mjs +46 -5

package/docs/current/reliability-hardening-track.md CHANGED

@@ -1,6 +1,6 @@
 # Reliability Hardening Track
-- Status: Plan (pinned 2026-06-12; grounded in a 5-lens read-only audit of the 1.1.1 core)
+- Status: Shipped — WS1–WS6 all delivered and cut in core 1.2.0 (release gate G7 Pass). This doc is retained as the planning/audit record. (Pinned 2026-06-12; grounded in a 5-lens read-only audit of the 1.1.1 core.)
 - Target line: 1.1.2 (patch) → 1.2.0 (minor); no new product surface
 - Purpose: raise Haechi to **commercial-solution-level reliability** — the trust, operability, and detection-quality density a production AI-security gateway is expected to have. This is a quality objective, not a commercialization plan. Every item **tightens, measures, or documents what already exists**; none adds a new feature.

package/docs/current/risk-register-release-gate.ko.md CHANGED

@@ -14,9 +14,9 @@ Haechi는 `1.x` stable 라인을 출시했습니다. developer preview 게이트
 | 구분 | 판단 | 이유 |
 |---|---|---|
 | GitHub public | 허용 | 보안 한계, threat model, shared responsibility가 문서화됨 |
-| GitHub release/tag | 허용 (`v1.3.1` 릴리스됨) | `v1.3.1` 보완 컷이 태깅·릴리스됨; §5.7 항목이 모두 Resolved이고 G9은 Pass |
-| npm stable | `haechi@1.3.1` publish됨 | 코드리뷰 보완이 `haechi@1.3.1` attested OIDC publish(2026-06-16)로 발행됨; 이전 `1.3.0`은 수정 이전 동작을 담고 있음 |
-| production use | 운영자 게이트; `1.3.1`로 업그레이드 | 운영자 네트워크 통제, 인가/인증, key custody가 있을 때만 지원; `haechi@1.3.0` 운영자는 민감한 제3자 업스트림 트래픽을 프록시로 라우팅하기 전에 프록시 헤더 경계 수정(P0-CR-001)을 반영하도록 `1.3.1`로 업그레이드해야 함 |
+| GitHub release/tag | 허용 (`v1.3.3` 릴리스됨) | `v1.3.3`이 현재 릴리스(CR2 컷 `1.3.2` 위의 선제적 하드닝 패치); §5.7 및 §5.8(`CR2-001..008`) 항목은 모두 Resolved 유지, G9/G10은 Pass |
+| npm stable | `haechi@1.3.3` publish됨 | `1.3.3`은 CR2-보완된 `1.3.2` 기준 위에 response-direction marker-skip 강화 + cosign 서명 GHCR 컨테이너 이미지를 더한 attested OIDC publish |
+| production use | 운영자 게이트; `1.3.3`로 업그레이드 | 운영자 네트워크 통제, 인가/인증, key custody가 있을 때만 지원; 운영자는 민감한 제3자 업스트림 트래픽을 프록시로 라우팅하기 전에 최신 `haechi@1.3.3`(1.3.2의 CR2 수정 + marker-skip 하드닝 포함)을 실행해야 함 |
 ## 2. 릴리스 게이트
@@ -32,6 +32,7 @@ Haechi는 `1.x` stable 라인을 출시했습니다. developer preview 게이트
 | G7 | 1.2.0 신뢰성 강화 트랙 (WS1–WS6) | 탐지 품질 측정+강화(WS2: 라벨 코퍼스 precision/recall `bench:detection` 게이트, 자격증명+국제 PII 커버리지, 하드블록 타입 불변식이 적용된 `filters.minConfidence` / `filters.allowlist`, offset 무결성을 갖춘 NFKC 유니코드 회피 폴딩); WS3 주입 가능한 `rateLimiter` 시임 + bounded fixed-window map; WS4 운영성(`/__haechi/live`+`/ready` 분리, 주입 가능한 `/metrics`, 구조적 로그 + 요청별 `correlationId`, graceful drain, max-in-flight backpressure, env overlay, 하드닝 Dockerfile/compose/runbook, `configVersion`); WS6 proxy TLS / remote-bind 하드닝(`proxy.tls` / `proxy.trustForwardedProto`, fail-closed `assertSafeProxyTransport`) + OWASP-LLM/NIST 컨트롤 매핑 백서 + RFC 9116 `security.txt` + 취약점 공개 경로. 모든 변경은 1.1 동작을 보존하는 기본값 뒤의 additive(`tests/api-contract.test.mjs` 통과); no-plaintext-in-audit 불변식이 텔레메트리까지 확장; core는 zero runtime dependency 유지; core 1.2.0 bump(additive 마이너) | Pass |
 | G8 | 1.3.0 백엔드 + 탐지 커버리지 확장 | **Anthropic Messages API**(`/v1/messages`, content-block + SSE `delta.text`, `event:` 라인 보존 재직렬화)와 **Google Gemini API**(model-in-path `:generateContent`/`:streamGenerateContent`, 기존 정확-매칭 어댑터를 바이트 동일하게 두는 additive `:method`-suffix 라우트 매처) 프로토콜 어댑터 추가; 탐지 커버리지 확장 — 클라우드/SaaS provider 키(OpenAI/Anthropic/Google-OAuth/SendGrid/Twilio/npm/Azure, anchored)와 국제 PII(FR/ES/JP + IT/SG/IN/DE/NL 국가 ID, 체크섬 validator), 각 하드블록-대-dial-eligible 결정은 측정된 충돌률 기반(하드블록은 비숫자 앵커 또는 비현실적으로 드문 형태가 필요; 흔한 길이의 bare-digit run은 allowlist로 정리 가능 유지); `bench:throughput` proxy 부하 벤치; `haechi-ratelimit-redis` 공유 저장소 rate-limiter 위성(WS3 시임의 운영 소비자; proxy가 이제 `rateLimiter.allow`를 `await`); `haechi-dashboard`가 요청별 `correlationId` 노출. 모든 변경은 additive — 새 `target.type`/탐지타입/`privacy.profile` *값*이며 새 config 키가 아님(`configVersion`은 `1` 유지); `tests/api-contract.test.mjs` 통과; core는 zero runtime dependency 유지; core 1.3.0 bump(additive 마이너) | Pass |
 | G9 | 2026-06-16 전체 코드리뷰 보완 게이트 (1.3.1로 발행) | `P0-CR-001` 및 `P1-CR-002`부터 `P1-CR-005`까지 해결 또는 책임자 명시 수용; P2 항목은 해결 또는 명시적 non-blocking 근거와 일정 기록; 연결된 등록부 갱신. **13개 `P*-CR-*` 항목이 모두 Resolved이며(§5.7) `haechi@1.3.1`(2026-06-16, attested OIDC publish)로 발행되었습니다; core가 1.3.0 → 1.3.1로 bump(patch, 보완 전용 — API/config 표면 변경 없음, `configVersion`은 `1` 유지)되었습니다.** | Pass (`haechi@1.3.1`, 2026-06-16) |
+| G10 | 2026-06-16 코드리뷰 round 2 (CR2) 보완 게이트 | CR2 등록부(`code-review-risk-register-2026-06-16-round2.md`, §5.8)는 **P0/P1을 발견하지 못했습니다**; 세 개의 P2(`CR2-001` 프록시 upstream-cancel, `CR2-002` token-vault audit hygiene, `CR2-003` plugin IPC reply 경계)와 P3 묶음(`CR2-004..008`)이 모두 **Resolved이며 `haechi@1.3.2`로 발행되었고**(`CR2-009` won't-fix, `CR2-010` accepted) 연결된 등록부가 갱신되었습니다. | Pass (`haechi@1.3.2`, 2026-06-16) |
 ## 3. P0 배포 차단 리스크 상태
@@ -151,6 +152,23 @@ base64/인코딩 값 디코딩 검사, query string 검사, audit tail truncatio
 | P2-CR-012 | KMS vault IPv6 loopback carve-out의 IPv6 테스트 부족 | Resolved | `satellites/crypto-kms/vault.test.mjs`에 전용 IPv6 loopback 정책 테스트("…enforces the IPv6 loopback policy (::1, [::1], dotted + hex mapped) — P2-CR-012")를 추가해 bare `::1`, bracketed `[::1]`, dotted `::ffff:127.0.0.1`, hex `::ffff:7f00:1`/`::ffff:7f00:0001`(및 bracketed 변형)을 검증하고, 공인 mapped 주소(`::ffff:8.8.8.8`/`::ffff:808:808`)가 과차단되지 않음을 단언; 확장된 range table과 `ssrf-parity.test.mjs`가 auth-jwt와의 dotted+hex 일치를 고정 |
 | P2-CR-013 | SSE multi-line `data:` 필드를 newline separator 없이 합침 | Resolved | `parseFrame`이 여러 `data:` line을 `join("\n")`(스펙 separator)으로 합치고 line별 스펙 선행 공백 1개만 제거; multi-line JSON은 여전히 `JSON.parse`되고 multi-line plain text는 newline과 함께 재구성되어 검사되며 `serializeTextFrame`가 multi-line payload를 여러 `data:` line으로 재방출; `tests/stream-filter.test.mjs`가 multi-line JSON event와 PII 포함 multi-line plain-text event를 커버 |
+## 5.8 2026-06-16 코드리뷰 Round 2 (CR2) 상태 — 게이트 G10
+권위 있는 항목별 등록부는 `docs/current/code-review-risk-register-2026-06-16-round2.md`입니다; 이 절은 릴리스 게이트 요약입니다. 1.3.1 컷 이후 진행한 2차 심층 리뷰는 **P0도 P1도 발견하지 못했습니다**(외부에서 P1로 보고된 두 항목 모두 검증 결과 P2로 내려갔습니다 — 둘 다 stored-plaintext leak도, auth/SSRF 우회도 아닙니다). 세 개의 P2 + P3 묶음(`CR2-001..008`)은 **Resolved이며 `haechi@1.3.2`로 발행되었습니다**; 보고된 한 항목은 **false positive**(`CR2-009`, won't-fix)였고 한 항목은 **이미 문서화된 수용 잔여 리스크**(`CR2-010`, accepted)였습니다. **G10은 Pass입니다.**
+| ID | 리스크 | 상태 | 종료에 필요한 증거 |
+|---|---|---|---|
+| CR2-001 | pass-through streaming이 downstream disconnect 시 upstream reader를 절대 취소하지 않음(`pipeUpstreamBodyBounded`가 `drain`에서 영원히 park) — 인증되지 않은 resource leak | Resolved | per-request `AbortController` + upstream reader를 취소하고 fetch를 abort하는 클라이언트 `close`/`aborted` listener; `drain` 대기를 `close`와 race; 스트림 도중 disconnect가 reader를 즉시 취소하는 회귀 테스트 |
+| CR2-002 | token-vault reveal/purge가 호출자 제공 raw `token` + `error.message`(token interpolate됨)를 audit event에 기록; `FORBIDDEN_KEYS`는 key 이름으로만 제거 | Resolved | 일반화된 오류 메시지; 기록 이전에 `token`을 keyed-HMAC하거나 `tok_` 형태로 검증; `error.message` 대신 enum `reasonCode`; raw token이 `reason`/`token`에 도달하지 않는다는 회귀 테스트; 불변식 표현 정합화 |
+| CR2-003 | plugin IPC reply가 `JSON.parse` 이전에 size-bound되지 않음; process child에 heap cap 없음 → 적대적 signed plugin으로 인한 event-loop 정지 + 메모리 급증 | Resolved | 두 sandbox 모두에서 parse 이전 reply byte-length 검사(oversized를 deny로 drop); 새 `resourceLimits` knob을 통한 process child의 `--max-old-space-size` heap cap; oversized-reply fixture 회귀 테스트 |
+| CR2-004 | `sanitizeResponseHeaders`가 변환된 응답에 stale body-coupled validator(`etag`/`content-md5`/`digest`/`last-modified`)를 유지 | Resolved | 모든 body-mutating 경로에서 해당 헤더 drop + `cache-control: no-store`; 변경된 응답이 upstream `ETag`를 drop하는 테스트 |
+| CR2-005 | `maxBytes` 초과 request body가 (유한한) Node `requestTimeout`까지 read-and-discard됨 — socket teardown 없음 | Resolved | 413 경로에서 `request.pause()`/`destroy()`(또는 `Connection: close`); 선택적으로 non-null 기본 timeout |
+| CR2-006 | `mcp-wrap --stderr filter`가 라인 지향이라 newline-split secret이 회피함(본질적; single-line secret은 잡힘, `drop` 사용 가능) | Resolved | `COMMAND_HELP` + 등록부 노트; 고민감 도구에 `--stderr drop` 권장 |
+| CR2-007 | README가 mcp-wrap "stderr ... pass through"라고 하지만 기본값은 이제 `--stderr filter` | Resolved | README + `README.ko.md` 수정 |
+| CR2-008 | README streaming split-match 주장이 범위 한정 없음(cross-frame buffering은 delta 채널만) | Resolved | README 두 구절 + `README.ko.md`를 delta 채널로 한정 |
+| CR2-009 | (보고된 P2) credential `maxMessageBytes` 검사 이후 append된 `keyMaterial` | Won't fix (FALSE POSITIVE) | `keyMaterial`은 운영자 통제 + fetcher `maxBytes`로 hard-bound; 공격자 증폭 없음 — 선택적 cosmetic re-assert만 |
+| CR2-010 | (보고된 P2) 두 NON-JSON SSE frame에 걸쳐 분할된 secret 미포착 | Accepted (documented) | round-1 `P1-CR-005`, `threat-model.md`, in-code comment에 이미 범위 외; JSON delta 채널은 `maxMatchBytes`까지 buffering함 |
 ## 6. P2 제품/문서 리스크 상태
 | ID | 기존 리스크 | 상태 | 해소 증거 |
@@ -164,7 +182,7 @@ base64/인코딩 값 디코딩 검사, query string 검사, audit tail truncatio
 이 체크리스트는 `1.x` stable 라인의 모든 릴리스에 대한 상시 배포 전 템플릿이며, `0.3.2` developer preview에서 처음 적용되었습니다. 그 결과를 아래에 참조 기록으로 보존합니다.
-2026-06-16 현재 상태: G9은 `Pass`입니다 — 코드리뷰 보완이 `haechi@1.3.1`로 발행되었습니다. 이 체크리스트는 해당 컷에 대해 해제되었습니다.
+2026-06-16 현재 상태: G9은 `Pass`입니다(round-1 보완이 `haechi@1.3.1`로 발행됨). 게이트 **G10**(CR2, §5.8)은 이제 `Pass`입니다 — CR2 P2 + P3 묶음(`CR2-001..008`)이 Resolved이며 `haechi@1.3.2`로 발행되었으므로, 그 컷에 대해 이 체크리스트가 해제되었습니다.
 외부 npm 게이트 확인 결과(`0.3.2` developer preview, 2026-06-10, 배포 후)는 다음과 같습니다.

package/docs/current/risk-register-release-gate.md CHANGED

@@ -14,9 +14,9 @@ Haechi has shipped its `1.x` stable line. The developer-preview gate (G2, `haech
 | Category | Judgment | Rationale |
 |---|---|---|
 | GitHub public | Allowed | Security limitations, threat model, and shared responsibility are documented |
-| GitHub release/tag | Allowed (`v1.3.1` released) | The `v1.3.1` remediation cut is tagged and released; all §5.7 findings are Resolved and G9 is Pass |
-| npm stable | `haechi@1.3.1` published | The code-review remediation shipped in the `haechi@1.3.1` attested OIDC publish (2026-06-16); the prior `1.3.0` carries the pre-fix behavior |
-| Production use | Operator-gated; upgrade to `1.3.1` | Supported only with operator network controls, authz/authn, and key custody; operators on `haechi@1.3.0` should upgrade to `1.3.1` to pick up the proxy header-boundary fix (P0-CR-001) before routing sensitive third-party upstream traffic through the proxy |
+| GitHub release/tag | Allowed (`v1.3.3` released) | `v1.3.3` is the current release (a proactive-hardening patch over the CR2 cut `1.3.2`); all §5.7 and §5.8 (`CR2-001..008`) findings remain Resolved and G9/G10 are Pass |
+| npm stable | `haechi@1.3.3` published | `1.3.3` is an attested OIDC publish adding the response-direction marker-skip tightening + a cosign-signed GHCR container image, over the CR2-remediated `1.3.2` baseline |
+| Production use | Operator-gated; upgrade to `1.3.3` | Supported only with operator network controls, authz/authn, and key custody; operators should run the latest `haechi@1.3.3` (it carries the CR2 fixes from `1.3.2` plus the marker-skip hardening) before routing sensitive third-party upstream traffic through the proxy |
 ## 2. Release Gates
@@ -32,6 +32,7 @@ Haechi has shipped its `1.x` stable line. The developer-preview gate (G2, `haech
 | G7 | 1.2.0 Reliability Hardening Track (WS1–WS6) | Detection quality measured + tightened (WS2: a labeled-corpus precision/recall `bench:detection` gate, credential + international-PII coverage, `filters.minConfidence` / `filters.allowlist` with the hard-block-types invariant, NFKC unicode-evasion folding with offset-integrity); WS3 injectable `rateLimiter` seam + bounded fixed-window map; WS4 operability (`/__haechi/live`+`/ready` split, injectable `/metrics`, structured logs + per-request `correlationId`, graceful drain, max-in-flight backpressure, env overlay, hardened Dockerfile/compose/runbook, `configVersion`); WS6 proxy TLS / remote-bind hardening (`proxy.tls` / `proxy.trustForwardedProto`, fail-closed `assertSafeProxyTransport`) + OWASP-LLM/NIST control-mapping whitepaper + RFC 9116 `security.txt` + vulnerability-disclosure path. Every change is additive behind 1.1-preserving defaults (`tests/api-contract.test.mjs` green); the no-plaintext-in-audit invariant extends to telemetry; core stays zero runtime dependency; core bumped to 1.2.0 (additive minor) | Pass |
 | G8 | 1.3.0 backend + detection coverage expansion | New protocol adapters for the **Anthropic Messages API** (`/v1/messages`, content-block + SSE `delta.text` with `event:`-line-preserving re-serialize) and the **Google Gemini API** (model-in-path `:generateContent`/`:streamGenerateContent` via an additive `:method`-suffix route matcher that leaves the exact-match adapters byte-identical); detection coverage expansion — cloud/SaaS provider keys (OpenAI/Anthropic/Google-OAuth/SendGrid/Twilio/npm/Azure, anchored) and international PII (FR/ES/JP + IT/SG/IN/DE/NL national IDs with checksum validators), each hard-block-vs-dial-eligible decision driven by measured collision rates (a non-numeric anchor or implausibly-rare shape is required for hard-block; a bare-digit run over a common length stays allowlist-clearable); a `bench:throughput` proxy load benchmark; the `haechi-ratelimit-redis` shared-store rate-limiter satellite (the WS3 seam's production consumer; the proxy now `await`s `rateLimiter.allow`); `haechi-dashboard` surfaces the per-request `correlationId`. Every change is additive — new `target.type`/detection-type/`privacy.profile` *values*, not new config keys (`configVersion` stays `1`); `tests/api-contract.test.mjs` green; core stays zero runtime dependency; core bumped to 1.3.0 (additive minor) | Pass |
 | G9 | 2026-06-16 full code-review remediation gate (shipped in 1.3.1) | `P0-CR-001` and `P1-CR-002` through `P1-CR-005` resolved or formally accepted; P2 items either resolved or scheduled with explicit non-blocking rationale; linked register updated. **All 13 `P*-CR-*` findings are Resolved (§5.7) and shipped in `haechi@1.3.1` (2026-06-16, attested OIDC publish); core bumped 1.3.0 → 1.3.1 (patch, remediation-only — no API/config surface change, `configVersion` stays `1`).** | Pass (`haechi@1.3.1`, 2026-06-16) |
+| G10 | 2026-06-16 code-review round 2 (CR2) remediation gate | The CR2 register (`code-review-risk-register-2026-06-16-round2.md`, §5.8) found **no P0/P1**; its three P2s (`CR2-001` proxy upstream-cancel, `CR2-002` token-vault audit hygiene, `CR2-003` plugin IPC reply bound) plus the P3 cluster (`CR2-004..008`) are all **Resolved and shipped in `haechi@1.3.2`** (`CR2-009` won't-fix, `CR2-010` accepted) and the linked register is updated. | Pass (`haechi@1.3.2`, 2026-06-16) |
 ## 3. P0 Distribution-Blocking Risk Status
@@ -159,6 +160,23 @@ The authoritative itemized register is `docs/current/code-review-risk-register-2
 | P2-CR-012 | KMS vault IPv6 loopback carve-out lacks IPv6-focused tests | Resolved | `satellites/crypto-kms/vault.test.mjs` adds a dedicated IPv6 loopback policy test ("…enforces the IPv6 loopback policy (::1, [::1], dotted + hex mapped) — P2-CR-012") covering bare `::1`, bracketed `[::1]`, dotted `::ffff:127.0.0.1`, and hex `::ffff:7f00:1`/`::ffff:7f00:0001` (plus bracketed variants), and asserts a public mapped address (`::ffff:8.8.8.8`/`::ffff:808:808`) is NOT over-blocked; the extended range table and `ssrf-parity.test.mjs` lock the dotted+hex agreement with auth-jwt |
 | P2-CR-013 | SSE multi-line `data:` fields are joined without newline separators | Resolved | `parseFrame` joins multiple `data:` lines with `join("\n")` (spec separator) and strips only the single spec leading space per line; multi-line JSON still `JSON.parse`s, multi-line plain text is reconstructed with newlines for inspection, and `serializeTextFrame` re-emits a multi-line payload as multiple `data:` lines; `tests/stream-filter.test.mjs` covers a multi-line JSON event and a multi-line plain-text event with PII |
+## 5.8 2026-06-16 Code Review Round 2 (CR2) Status — gate G10
+The authoritative itemized register is `docs/current/code-review-risk-register-2026-06-16-round2.md`; this is the release-gate summary. A second deep review after the 1.3.1 cut found **no P0 and no P1** (the two externally-reported P1s both verified down to P2 — neither is a stored-plaintext leak or an auth/SSRF bypass). The three P2s + the P3 cluster (`CR2-001..008`) are **Resolved and shipped in `haechi@1.3.2`**; one reported item was a **false positive** (`CR2-009`, won't-fix) and one is an **already-documented accepted residual** (`CR2-010`, accepted). **G10 is Pass.**
+| ID | Risk | Status | Required closure evidence |
+|---|---|---|---|
+| CR2-001 | Pass-through streaming never cancels the upstream reader on downstream disconnect (`pipeUpstreamBodyBounded` parks on `drain` forever) — unauthenticated resource leak | Resolved | Per-request `AbortController` + client `close`/`aborted` listener that cancels the upstream reader and aborts the fetch; `drain` wait raced against `close`; regression test that a mid-stream disconnect cancels the reader promptly |
+| CR2-002 | Token-vault reveal/purge writes the raw caller-supplied `token` + `error.message` (token-interpolated) into the audit event; `FORBIDDEN_KEYS` strips by key name only | Resolved | Generic error messages; keyed-HMAC or `tok_`-shape-validate the `token` before recording; enum `reasonCode` instead of `error.message`; regression test that no raw token reaches `reason`/`token`; reconcile the invariant wording |
+| CR2-003 | Plugin IPC reply not size-bounded before `JSON.parse`; process child has no heap cap → event-loop stall + memory spike from a hostile signed plugin | Resolved | Reply byte-length check before parse in both sandboxes (drop oversized as deny); `--max-old-space-size` heap cap on the process child via a new `resourceLimits` knob; regression test with an oversized-reply fixture |
+| CR2-004 | `sanitizeResponseHeaders` keeps stale body-coupled validators (`etag`/`content-md5`/`digest`/`last-modified`) on a transformed response | Resolved | Drop those headers on every body-mutating path + `cache-control: no-store`; test that a mutated response drops the upstream `ETag` |
+| CR2-005 | Over-`maxBytes` request body is read-and-discarded until the (finite) Node `requestTimeout` — no socket teardown | Resolved | `request.pause()`/`destroy()` (or `Connection: close`) on the 413 path; optionally non-null default timeouts |
+| CR2-006 | `mcp-wrap --stderr filter` is line-oriented, so a newline-split secret evades it (inherent; single-line secrets caught, `drop` available) | Resolved | `COMMAND_HELP` + register note; recommend `--stderr drop` for high-sensitivity tools |
+| CR2-007 | README says mcp-wrap "stderr ... pass through" but the default is now `--stderr filter` | Resolved | Correct README + `README.ko.md` |
+| CR2-008 | README streaming split-match claim is unscoped (cross-frame buffering is delta-channel only) | Resolved | Scope both README passages + `README.ko.md` to the delta channel |
+| CR2-009 | (reported P2) `keyMaterial` appended after the credential `maxMessageBytes` check | Won't fix (FALSE POSITIVE) | `keyMaterial` is operator-controlled + hard-bounded by the fetcher `maxBytes`; no attacker amplification — optional cosmetic re-assert only |
+| CR2-010 | (reported P2) secret split across two NON-JSON SSE frames not caught | Accepted (documented) | Already out-of-scope in round-1 `P1-CR-005`, `threat-model.md`, and an in-code comment; the JSON delta channel does buffer up to `maxMatchBytes` |
 ## 6. P2 Product/Documentation Risk Status
 | ID | Risk | Status | Resolution evidence |
@@ -172,7 +190,7 @@ The authoritative itemized register is `docs/current/code-review-risk-register-2
 This checklist is the standing pre-distribution template for every release on the `1.x` stable line; it was first exercised for the `0.3.2` developer preview, whose results are retained below as the reference record.
-Current 2026-06-16 status: G9 is `Pass` — the code-review remediation shipped in `haechi@1.3.1`. This checklist is cleared for that cut.
+Current 2026-06-16 status: G9 is `Pass` (round-1 remediation shipped in `haechi@1.3.1`). Gate **G10** (CR2, §5.8) is now `Pass` — the CR2 P2s + P3 cluster (`CR2-001..008`) are Resolved and shipped in `haechi@1.3.2`, so the checklist is cleared for that cut.
 External npm gate check results (`0.3.2` developer preview, 2026-06-10, post-publish):
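
The CR2-004 row in the §5.8 table above describes dropping body-coupled validators and forcing `cache-control: no-store` on every body-mutating path. A minimal sketch of that idea follows, assuming headers arrive as a plain object; `headersForMutatedBody` is a hypothetical helper written for illustration and is not the package's `sanitizeResponseHeaders`.

```javascript
// Hypothetical helper for illustration; NOT Haechi's sanitizeResponseHeaders.
// Header names listed in the CR2-004 row of the release-gate table.
const BODY_COUPLED_VALIDATORS = new Set(["etag", "content-md5", "digest", "last-modified"]);

// Returns a new header object suitable for a response whose body was transformed.
function headersForMutatedBody(upstreamHeaders) {
  const out = {};
  for (const [name, value] of Object.entries(upstreamHeaders)) {
    const lower = name.toLowerCase();
    // Validators describe the ORIGINAL bytes, so they are stale once the body changes.
    // The upstream cache-control is also skipped because it is replaced below.
    if (BODY_COUPLED_VALIDATORS.has(lower) || lower === "cache-control") {
      continue;
    }
    out[lower] = value;
  }
  // A transformed body should not be cached under the upstream's caching policy.
  out["cache-control"] = "no-store";
  return out;
}

const sanitized = headersForMutatedBody({
  "Content-Type": "application/json",
  ETag: '"abc123"',
  "Last-Modified": "Tue, 16 Jun 2026 00:00:00 GMT",
  "Cache-Control": "max-age=3600"
});
console.log(sanitized);
```

Lower-casing the names keeps the comparison case-insensitive, which matters because HTTP header names are case-insensitive while object keys are not.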

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "haechi",
-  "version": "1.3.1",
+  "version": "1.3.3",
   "description": "Self-hosted AI context enforcement across LLM, MCP, vLLM, Ollama, and agent traffic — a stable, zero-dependency security gateway.",
   "license": "Apache-2.0",
   "type": "module",
@@ -66,6 +66,7 @@
   ],
   "scripts": {
     "test": "node --test",
+    "test:inference:live": "node --test tests/local-inference.integration.test.mjs",
     "check:types": "tsc -p jsconfig.json --noEmit",
     "pack:dry": "npm pack --dry-run",
     "scan:stale-names": "node scripts/stale-name-scan.mjs",

package/packages/cli/bin/haechi.mjs CHANGED

@@ -766,7 +766,7 @@ const COMMAND_HELP = {
   "mcp-wrap": {
     usage: "haechi mcp-wrap [--config haechi.config.json] [--stderr filter|drop|inherit] -- <command> [args...]",
     summary: "Wrap an MCP server with bidirectional stdio protection.",
-    detail: "Spawns <command>, applies the method allowlist + params protection client→server, and result protection + injection heuristics server→client. Drop-in for MCP client configs. --stderr controls the child's stderr: filter (default) protects each line with the same policy before re-emitting, drop discards it, inherit passes it through raw (an explicit, opt-in local-process boundary). filter follows the configured policy mode — in dry-run/report-only it detects but does not transform (like the rest of the pipeline), so set policy.mode=enforce for stderr redaction to take effect."
+    detail: "Spawns <command>, applies the method allowlist + params protection client→server, and result protection + injection heuristics server→client. Drop-in for MCP client configs. --stderr controls the child's stderr: filter (default) protects each line with the same policy before re-emitting, drop discards it, inherit passes it through raw (an explicit, opt-in local-process boundary). filter follows the configured policy mode — in dry-run/report-only it detects but does not transform (like the rest of the pipeline), so set policy.mode=enforce for stderr redaction to take effect. filter protects each COMPLETE line independently, so it cannot catch a secret a child deliberately splits across a newline; use drop for high-sensitivity tools."
   },
   auth: {
     usage: "haechi auth add --type user|service|agent [--scope k:v ...] [--label k=v ...]\n  haechi auth list [--config haechi.config.json]\n  haechi auth revoke <id> [--config haechi.config.json]",
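
The new `mcp-wrap` help text says `filter` protects each complete line independently and therefore cannot catch a secret split across a newline. The sketch below illustrates why, assuming a single illustrative pattern; `SECRET_PATTERN` and `filterLines` are hypothetical stand-ins, not the package's rules or its stderr filter.

```javascript
// Hypothetical stand-in for one configured detection rule (not a Haechi rule).
const SECRET_PATTERN = /sk-[A-Za-z0-9]{16,}/g;

// Line-oriented filtering: each line is inspected on its own, never joined with its neighbours.
function filterLines(text) {
  return text
    .split("\n")
    .map((line) => line.replace(SECRET_PATTERN, "[REDACTED:api_key]"))
    .join("\n");
}

// The secret sits on one line, so the per-line scan sees all of it.
const singleLine = "token=sk-abcdefghijklmnop1234\n";
// The same secret split by a newline: neither half matches the pattern on its own.
const splitAcrossLines = "token=sk-abcdefgh\nijklmnop1234\n";

const filteredSingle = filterLines(singleLine);
const filteredSplit = filterLines(splitAcrossLines);
console.log(JSON.stringify(filteredSingle));
console.log(JSON.stringify(filteredSplit));
```

Buffering across lines would change the latency and framing of the child's stderr, which is presumably why the help text points high-sensitivity tools at `--stderr drop` instead.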

package/packages/cli/runtime.mjs CHANGED

@@ -1022,7 +1022,11 @@ function resolveAuthProvider(config, providers, cryptoProvider, auditSink) {
       return createProcessIsolatedAuthProviderSync({
         ...common,
         netEnforcement: plugin.netEnforcement ?? "require-permission",
-        keyMaterial: plugin.keyMaterial ?? null
+        keyMaterial: plugin.keyMaterial ?? null,
+        // CR2-003: reuse the worker's resourceLimits.maxOldGenerationSizeMb knob to
+        // cap the child heap. Optional for the process runtime (the sandbox defaults
+        // when absent), so pass it through whether or not the config supplied it.
+        resourceLimits: plugin.resourceLimits ?? null
       });
     }
     return createSandboxedAuthProviderSync({ ...common, resourceLimits: plugin.resourceLimits });
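
The hunk above passes `resourceLimits` through to the process sandbox for CR2-003, which the release-gate table pairs with a reply byte-length check before `JSON.parse`. A rough sketch of both ideas follows. `heapCapArgs`, `parseBoundedReply`, the 128 MB fallback, and the 1 MiB reply limit are all assumptions made for illustration; only the `--max-old-space-size` flag and the `maxOldGenerationSizeMb` key come from the diff.

```javascript
// Hypothetical helpers for illustration; NOT the code in process-sandbox.mjs.

// Translate the worker-style knob into a V8 heap-cap flag for a child process.
// fallbackMb is an assumed default; the real sandbox's default is not shown in the diff.
function heapCapArgs(resourceLimits, fallbackMb = 128) {
  const configured = resourceLimits?.maxOldGenerationSizeMb;
  const mb = Number.isInteger(configured) && configured > 0 ? configured : fallbackMb;
  return [`--max-old-space-size=${mb}`];
}

// Check the reply size BEFORE parsing, so an oversized reply never reaches JSON.parse.
// maxBytes is an assumed limit chosen for the example.
function parseBoundedReply(raw, maxBytes = 1024 * 1024) {
  if (typeof raw !== "string") {
    return { ok: false, reason: "not-a-string" };
  }
  if (Buffer.byteLength(raw, "utf8") > maxBytes) {
    return { ok: false, reason: "oversized" }; // the caller treats this as a deny
  }
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch {
    return { ok: false, reason: "invalid-json" };
  }
}

const defaultArgs = heapCapArgs(null);
const configuredArgs = heapCapArgs({ maxOldGenerationSizeMb: 64 });
const smallReply = parseBoundedReply('{"allow":true}');
const hugeReply = parseBoundedReply("x".repeat(2048), 1024);
console.log(defaultArgs, configuredArgs, smallReply, hugeReply);
```

Measuring with `Buffer.byteLength` rather than `raw.length` bounds the encoded size, since a string of multi-byte characters can be several times larger in bytes than its character count.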

package/packages/filter/index.mjs CHANGED

@@ -540,12 +540,15 @@ function scanEntry(entry, rules, context = {}) {
   // own token. This is response-only on purpose: a REQUEST that contains a
   // marker-shaped string is NOT Haechi output (Haechi hasn't transformed it yet),
   // so it is scanned normally — otherwise an attacker could wrap a real secret in
-  // a fake `[TOKEN:…]` to evade request-side detection.
+  // a fake `[TOKEN:…]` to evade request-side detection. On the RESPONSE side the
+  // same wrap-a-secret risk is closed by haechiMarkerSpans recording a span only
+  // when the inner content matches a GENUINE emitted format — a fake marker
+  // wrapping a real secret stays in the scan and is detected/blocked.
   // Markers are pure ASCII and NFKC-stable, so their spans are computed on the
   // ORIGINAL value exactly as before — they line up with the same-length
   // normalized scan (Case 2 below) and are irrelevant to the whole-leaf scan
   // (Case 3).
-  const markerSpans = context?.direction === "response" ? haechiMarkerSpans(entry.value) : [];
+  const markerSpans = context?.direction === "response" ? haechiMarkerSpans(entry.value, rules, context) : [];
   // WS2d — Unicode evasion via NFKC normalization. A client can defeat every
   // regex rule by sending PII/secrets in a Unicode form that folds to ASCII
@@ -824,12 +827,157 @@ function isPositionStableNfkc(value, normalized) {
   return rebuilt === normalized;
 }
-// Spans of Haechi's own transform markers in a string, so detection can skip
-// them: `[TOKEN:…]`, `[HAECHI_ENC:…]`, `[REDACTED:…]`.
-function haechiMarkerSpans(text) {
+// Spans of Haechi's own transform markers in a string, so RESPONSE-direction
+// detection can skip them: `[TOKEN:…]`, `[HAECHI_ENC:…]`, `[REDACTED:…]`. A
+// tokenized round-trip echoed by the model would otherwise be re-flagged as a
+// secret (Haechi blocking its own output).
+//
+// CR-???: a span is recorded ONLY when its inner content matches a GENUINE
+// format actually emitted by core's transform (packages/core/index.mjs
+// replacementFor). Without this check the marker frame `[(?:TOKEN|…):[^\]]*]`
+// would skip ANY inner content, so a hostile model could exfiltrate a real
+// secret by wrapping it in a FAKE marker — `[TOKEN:sk-ant-api03-<secret>]`,
+// `[HAECHI_ENC:<secret>]`, `[REDACTED:<secret>]` — and that span would be
+// dropped from the scan. A marker-SHAPED string whose inner content is not
+// genuine is left in the scan, so the wrapped secret is detected/blocked.
+// Genuine inner formats:
+//   [REDACTED:<type>]            <type> is a detection type name (lowercase
+//                                identifier: [a-z][a-z0-9_]*).
+//   [TOKEN:<vaultTokenId>]       vault id shape `tok_<type>_<hexhash>`
+//                                (matches token-vault VAULT_TOKEN_SHAPE).
+//   [TOKEN:<type>:<shortHash>]   non-vault deterministic token: type name + hex.
+//   [HAECHI_ENC:<base64url>]     base64url that decodes to a VALID envelope
+//                                JSON object (cryptoProvider.encrypt envelope:
+//                                has `kid`+`aadHash`). A real secret string will
+//                                not base64url-decode to such an object.
+// Markers are pure ASCII / NFKC-stable and spans are computed on the ORIGINAL
+// entry.value, so offset integrity is unchanged.
+// Detection-type name shape (the `detection.type` written by core into REDACTED
+// and the type segment of a non-vault TOKEN). Built-in rule types and custom
+// rule types are lowercase identifiers; a real secret (hyphens, uppercase,
+// length) does not match, so a wrapped secret stays in the scan.
+const MARKER_TYPE_NAME = /^[a-z][a-z0-9_]*$/;
+// Vault token id shape — mirrors token-vault VAULT_TOKEN_SHAPE
+// (`tok_<type>_<hexhash>`, random: 16 hex, deterministic: 32 hex). Kept in sync
+// with packages/token-vault/index.mjs (not exported from there).
+const MARKER_VAULT_TOKEN = /^tok_[a-z0-9_]+_[a-f0-9]{16,}$/;
+// Non-vault deterministic token: `<type>:<hex>` (core shortHash → 12 hex; allow
+// any reasonable hex run so the check does not over-fit a single length).
+const MARKER_NONVAULT_TOKEN = /^[a-z][a-z0-9_]*:[a-f0-9]{8,}$/;
+// base64url alphabet only (core emits base64url with no padding).
+const MARKER_BASE64URL = /^[A-Za-z0-9_-]+$/;
+function isGenuineTokenInner(inner) {
+  return MARKER_VAULT_TOKEN.test(inner) || MARKER_NONVAULT_TOKEN.test(inner);
+}
+function isGenuineRedactedInner(inner) {
+  return MARKER_TYPE_NAME.test(inner);
+}
+// True only when `inner` base64url-decodes to a valid UTF-8 JSON object that
+// carries the encrypt-envelope signature (`kid` + `aadHash` — the contract keys
+// asserted by assertCryptoProviderConformance, present in the local AES-GCM
+// envelope and any conformant external provider). Any decode/parse failure or a
+// non-envelope shape → NOT a genuine marker (so a wrapped secret is scanned).
+function isGenuineEncInner(inner) {
+  if (!MARKER_BASE64URL.test(inner)) {
+    return false;
+  }
+  try {
+    const bytes = Buffer.from(inner, "base64url");
+    // Reject inputs that do not round-trip through base64url (e.g. an invalid
+    // tail that Buffer silently truncates): a genuine marker always round-trips.
+    if (bytes.toString("base64url") !== inner) {
+      return false;
+    }
+    if (!isUtf8(bytes)) {
+      return false;
+    }
+    const parsed = JSON.parse(bytes.toString("utf8"));
+    return (
+      parsed !== null &&
+      typeof parsed === "object" &&
+      !Array.isArray(parsed) &&
+      typeof parsed.kid === "string" &&
+      typeof parsed.aadHash === "string"
+    );
+  } catch {
+    return false;
+  }
+}
+// Belt-and-suspenders for the genuine-marker shapes: even a correctly-SHAPED
+// TOKEN/REDACTED inner must not itself carry a detectable secret. The lowercase-
+// identifier classes (MARKER_TYPE_NAME, the type segments of the token shapes)
+// overlap the body of real lowercase-bodied secrets (notably GitHub `gh[pousr]_`
+// tokens), so a hostile model could smuggle such a secret as the `<type>` segment
+// of an otherwise genuine-shaped marker. Re-scan the inner with the SAME rules and
+// refuse to treat it as genuine if anything detectable is inside — this un-skips a
+// marker exactly when skipping it would hide a leak.
+function textHasDetection(text, rules, context) {
+  for (const rule of rules) {
+    if (rule.direction && rule.direction !== context?.direction) {
+      continue;
+    }
+    const regex = new RegExp(rule.pattern, rule.flags.includes("g") ? rule.flags : `${rule.flags}g`);
+    for (const match of text.matchAll(regex)) {
+      if (!rule.validate || rule.validate(match[0])) {
+        return true;
+      }
+    }
+  }
+  return false;
+}
+// The attacker-controllable segment(s) of a genuine-shaped marker inner — i.e. the
+// `<type>` position(s) a hostile model could smuggle a secret into. For TOKEN we
+// peel off the structural framing (`tok_<type>_<hex>` → `<type>`, `<type>:<hex>` →
+// `<type>`) and scan the segment IN ISOLATION as well as the whole inner: a `\b`-
+// anchored rule (e.g. GitHub `\bghp_…`) misses a token glued to the `tok_` prefix
+// (no word boundary after `_`), but matches the segment scanned on its own.
+function markerSecretSurfaces(kind, inner) {
+  const surfaces = [inner];
+  if (kind === "TOKEN") {
+    const vault = /^tok_(.+)_[a-f0-9]{16,}$/.exec(inner);
+    if (vault) {
+      surfaces.push(vault[1]);
+    }
+    const nonVault = /^(.+):[a-f0-9]{8,}$/.exec(inner);
+    if (nonVault) {
+      surfaces.push(nonVault[1]);
+    }
+  }
+  return surfaces;
+}
+function innerContainsDetection(kind, inner, rules, context) {
+  return markerSecretSurfaces(kind, inner).some((surface) => textHasDetection(surface, rules, context));
+}
+function haechiMarkerSpans(text, rules = [], context = {}) {
   const spans = [];
-  for (const m of text.matchAll(/\[(?:TOKEN|HAECHI_ENC|REDACTED):[^\]]*\]/g)) {
-    spans.push([m.index, m.index + m[0].length]);
+  for (const m of text.matchAll(/\[(TOKEN|HAECHI_ENC|REDACTED):([^\]]*)\]/g)) {
+    const kind = m[1];
+    const inner = m[2];
+    let genuine = false;
+    if (kind === "TOKEN") {
+      genuine = isGenuineTokenInner(inner);
+    } else if (kind === "REDACTED") {
+      genuine = isGenuineRedactedInner(inner);
+    } else {
+      genuine = isGenuineEncInner(inner);
+    }
+    // HAECHI_ENC is exempt from the inner re-scan: its inner is an opaque base64url
+    // envelope validated by decode above (a raw secret cannot forge a valid
+    // envelope, and the envelope's base64url body is not a detectable leaf).
+    if (genuine && kind !== "HAECHI_ENC" && innerContainsDetection(kind, inner, rules, context)) {
+      genuine = false;
+    }
+    if (genuine) {
+      spans.push([m.index, m.index + m[0].length]);
+    }
   }
   return spans;
 }
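
The round-trip check in the HAECHI_ENC validator above can be illustrated in isolation. This is a minimal sketch, not the package's code: `roundTripsBase64url` is a hypothetical helper name, and the sample envelope fields are made up for the demonstration. It assumes a Node version whose `Buffer` supports the `base64url` encoding.

```javascript
// Hypothetical helper (not part of haechi): true only when `inner` is exactly
// the base64url text that its own decoded bytes re-encode to.
function roundTripsBase64url(inner) {
  const bytes = Buffer.from(inner, "base64url");
  return bytes.toString("base64url") === inner;
}

// Illustrative envelope; the field values are placeholders.
const genuine = Buffer.from(
  JSON.stringify({ kid: "k1", aadHash: "abc" }),
  "utf8"
).toString("base64url");

// "!" is outside the base64url alphabet, so re-encoding can never reproduce it.
const tampered = `${genuine}!`;

console.log(roundTripsBase64url(genuine));
console.log(roundTripsBase64url(tampered));
```

The first call reports a match and the second does not: whatever the decoder does with the stray character, the re-encoded text contains only base64url characters and so cannot equal the tampered input.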

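The word-boundary gap described in the `markerSecretSurfaces` comment can be reproduced with a short sketch. The rule below is a hypothetical stand-in for a `\b`-anchored GitHub token rule (the real rule set is not shown in this diff), and `vaultTypeSegment` is an illustrative name that mirrors only the `tok_<type>_<hex>` peeling step.

```javascript
// Hypothetical stand-in for a `\b`-anchored detection rule.
const githubRule = /\bghp_[A-Za-z0-9]{36}/g;

// Peel the `<type>` segment out of a vault-shaped inner `tok_<type>_<hex>`.
function vaultTypeSegment(inner) {
  const m = /^tok_(.+)_[a-f0-9]{16,}$/.exec(inner);
  return m ? m[1] : null;
}

// A made-up secret-shaped string smuggled into the `<type>` position.
const secret = "ghp_" + "abcdefghijklmnopqrstuvwxyz0123456789";
const inner = `tok_${secret}_0123456789abcdef`;

// Scanning the whole inner: `_` and `g` are both word characters, so there is
// no `\b` between them and the rule does not match.
const wholeHit = inner.match(githubRule) !== null;

// Scanning the peeled segment: the start of the string is a word boundary.
const segment = vaultTypeSegment(inner);
const segmentHit = segment !== null && segment.match(githubRule) !== null;

console.log({ wholeHit, segmentHit });
```

Under these assumptions the whole-inner scan misses the secret and the isolated-segment scan finds it, which is why both surfaces are scanned.
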
package/packages/plugin/process-sandbox.mjs CHANGED

@@ -44,8 +44,19 @@ import {
 // The child flags. `--permission` enables the deny-by-default Node permission
 // model; we pass NO --allow-* grant, so fs/child-process/worker/addons/wasi/net
 // are all denied by the Node runtime. `--disable-proto=delete` removes Object.prototype.__proto__.
+// A `--max-old-space-size=<mb>` heap cap is appended PER-SPAWN (see spawnAndLoad):
+// unlike the worker (resourceLimits OOMs a runaway), a process child has NO heap
+// cap by default, so a hostile/buggy signed plugin could build a reply up to the
+// child's default V8 heap. The cap bounds the child; the host-side reply-size bound
+// (CR2-003) bounds the host regardless.
 const CHILD_FLAGS = Object.freeze(["--permission", "--disable-proto=delete"]);
+// Default child heap cap (MB) when a process-runtime config does not supply
+// resourceLimits.maxOldGenerationSizeMb. Non-breaking: the worker REQUIRES the
+// knob, but the process runtime defaults rather than throwing so an isolation:
+// process config without resourceLimits keeps working.
+const DEFAULT_MAX_OLD_GEN_MB = 128;
 // A CONSTANT bootstrap harness, passed via `node -e`. It is identical for every
 // plugin (the plugin bytes arrive over IPC, NOT on the command line — so there is
 // no ARG_MAX limit and the harness never varies). It runs as CommonJS under -e and
@@ -155,6 +166,10 @@ function createProcessIsolatedAuthProviderHandle({
   timeoutMs,
   maxPendingCalls = 8,
   maxMessageBytes = 16384,
+  // Child V8 heap cap. Reuses the worker's resourceLimits.maxOldGenerationSizeMb
+  // knob (CR2-003). Optional for the process runtime: a config that omits it falls
+  // back to DEFAULT_MAX_OLD_GEN_MB rather than throwing (non-breaking).
+  resourceLimits = null,
   coreVersion = null,
   now = Date.now,
   allowedLabelKeys,
@@ -201,6 +216,21 @@ function createProcessIsolatedAuthProviderHandle({
   if (!Number.isInteger(maxMessageBytes) || maxMessageBytes < 1) {
     throw new Error("maxMessageBytes must be a positive integer");
   }
+  // Resolve the child heap cap (MB). Optional for the process runtime; if supplied
+  // it must be a positive-integer maxOldGenerationSizeMb (same shape as the worker),
+  // else default to DEFAULT_MAX_OLD_GEN_MB (non-breaking — never throws on absence).
+  let maxOldGenerationSizeMb = DEFAULT_MAX_OLD_GEN_MB;
+  if (resourceLimits !== null && resourceLimits !== undefined) {
+    if (typeof resourceLimits !== "object" || Array.isArray(resourceLimits)) {
+      throw new Error("createProcessIsolatedAuthProvider resourceLimits must be an object");
+    }
+    if (resourceLimits.maxOldGenerationSizeMb !== undefined) {
+      if (!Number.isInteger(resourceLimits.maxOldGenerationSizeMb) || resourceLimits.maxOldGenerationSizeMb <= 0) {
+        throw new Error("createProcessIsolatedAuthProvider resourceLimits.maxOldGenerationSizeMb must be a positive integer");
+      }
+      maxOldGenerationSizeMb = resourceLimits.maxOldGenerationSizeMb;
+    }
+  }
   // Fail-closed network containment. PR1 supports only the "require-permission"
   // mode; if this Node cannot enforce --allow-net, refuse to construct rather than
   // run a plugin whose network egress is uncontained.
@@ -273,7 +303,11 @@ function createProcessIsolatedAuthProviderHandle({
   // any failure kills the child and throws → fail closed. NOTE the plugin source
   // crosses over IPC (not the command line) so there is no ARG_MAX limit.
   async function spawnAndLoad({ entrySource, pluginId: pid }) {
-    const c = spawn(execPath, [...CHILD_FLAGS, "-e", PROCESS_HARNESS], {
+    // Build the spawn args by spreading the frozen base flags + the per-spawn heap
+    // cap. `--max-old-space-size` composes with `--permission`, `--disable-proto=delete`,
+    // and the data:-URL load (verified). The cap bounds a runaway child; the
+    // host-side reply-size bound bounds the host regardless of the child heap.
+    const c = spawn(execPath, [...CHILD_FLAGS, `--max-old-space-size=${maxOldGenerationSizeMb}`, "-e", PROCESS_HARNESS], {
       stdio: ["ignore", "ignore", "ignore", "ipc"],
       serialization: "json",
       env: scrubbedEnv(),
@@ -291,6 +325,27 @@ function createProcessIsolatedAuthProviderHandle({
     const failed = new Promise((_, reject) => { onFail = reject; });
     c.on("message", (raw) => {
+      // REPLY SIZE BOUND (CR2-003): bound host-side work BEFORE JSON.parse. Unlike
+      // the worker (resourceLimits OOMs a runaway), the child has only the
+      // --max-old-space-size cap, so it can still build a reply up to that heap and
+      // process.send it; a synchronous JSON.parse of a multi-MB string stalls the
+      // host event loop (the per-call timeout cannot fire mid-parse). The reply is a
+      // STRING (serialization:'json'); measure its byte length and, if it exceeds the
+      // SAME maxMessageBytes ceiling the outbound credential obeys, drop the frame as
+      // an oversized DENY WITHOUT parsing. The auth reply is the only attacker-sized
+      // frame (claims come from the plugin); the tiny ready/loaded/load-error control
+      // frames are always far under the ceiling, so the uniform bound never harms the
+      // handshake. Single-occupancy: settle the one live pending call as oversized.
+      const replyBytes = typeof raw === "string"
+        ? Buffer.byteLength(raw, "utf8")
+        : Buffer.byteLength(String(raw), "utf8");
+      if (replyBytes > maxMessageBytes) {
+        for (const [cid, settle] of pending) {
+          pending.delete(cid);
+          settle({ __oversized: true });
+        }
+        return;
+      }
       let parsed;
       try {
         parsed = JSON.parse(typeof raw === "string" ? raw : String(raw));

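The reply-size bound in the hunk above follows a measure-then-parse pattern that can be sketched on its own. This is an illustration under stated assumptions, not the package's handler: `parseBoundedReply` and its return shape are hypothetical, and only the 16384-byte default is taken from the diff.

```javascript
// Hypothetical helper: refuse to JSON.parse any frame whose UTF-8 byte length
// exceeds the ceiling, so an oversized reply costs only a length measurement.
function parseBoundedReply(raw, maxMessageBytes) {
  const text = typeof raw === "string" ? raw : String(raw);
  if (Buffer.byteLength(text, "utf8") > maxMessageBytes) {
    return { oversized: true, value: null };
  }
  try {
    return { oversized: false, value: JSON.parse(text) };
  } catch {
    // Malformed but small: not oversized, just unparseable.
    return { oversized: false, value: null };
  }
}

const ceiling = 16384; // same default as maxMessageBytes in the diff
const small = parseBoundedReply(JSON.stringify({ cid: 1, ok: true }), ceiling);
const large = parseBoundedReply(JSON.stringify({ claims: "x".repeat(20000) }), ceiling);

console.log(small.oversized, large.oversized);
```

Measuring bytes rather than `text.length` matters for non-ASCII replies, since one UTF-16 code unit can occupy more than one byte in UTF-8.
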
package/packages/plugin/sandbox.mjs CHANGED

@@ -150,6 +150,29 @@ function createSandboxedAuthProviderHandle({
       workerData: {}
     });
     w.on("message", (raw) => {
+      // REPLY SIZE BOUND (CR2-003): bound host-side work BEFORE JSON.parse. The
+      // worker has an implicit heap cap (resourceLimits), but enforce the same
+      // maxMessageBytes ceiling on the INBOUND plugin→host reply that the OUTBOUND
+      // host→plugin credential message obeys — a hostile/buggy plugin can build a
+      // multi-MB reply and a synchronous JSON.parse would stall the host event loop
+      // (the per-call timeout cannot fire mid-parse). The reply is a STRING posted
+      // via JSON.stringify; measure its byte length and, if oversized, settle the
+      // pending call as an oversized DENY (mirroring the credential deny) WITHOUT
+      // parsing. The cid cannot be read without parsing, so an oversized reply
+      // settles the single live pending call (single-occupancy: at most one entry
+      // is ever live).
+      const replyBytes = typeof raw === "string"
+        ? Buffer.byteLength(raw, "utf8")
+        : Buffer.byteLength(String(raw), "utf8");
+      if (replyBytes > maxMessageBytes) {
+        // Single-occupancy: settle the one live pending call as oversized, never
+        // touching JSON.parse on the oversized payload.
+        for (const [cid, settle] of pending) {
+          pending.delete(cid);
+          settle({ __oversized: true });
+        }
+        return;
+      }
       let parsed;
       try {
         parsed = JSON.parse(typeof raw === "string" ? raw : String(raw));