npm - haechi - Versions diffs - 1.3.1 → 1.3.2 - Mend

haechi 1.3.1 → 1.3.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/README.ko.md +3 -3
package/README.md +3 -3
package/docs/current/code-review-risk-register-2026-06-16-round2.ko.md +142 -0
package/docs/current/code-review-risk-register-2026-06-16-round2.md +142 -0
package/docs/current/operations-runbook.ko.md +21 -1
package/docs/current/operations-runbook.md +22 -1
package/docs/current/release-process.ko.md +14 -6
package/docs/current/release-process.md +14 -6
package/docs/current/risk-register-release-gate.ko.md +22 -4
package/docs/current/risk-register-release-gate.md +22 -4
package/package.json +2 -1
package/packages/cli/bin/haechi.mjs +1 -1
package/packages/cli/runtime.mjs +5 -1
package/packages/plugin/process-sandbox.mjs +56 -1
package/packages/plugin/sandbox.mjs +23 -0
package/packages/proxy/index.mjs +128 -12
package/packages/token-vault/index.mjs +46 -5

package/README.ko.md CHANGED

@@ -90,7 +90,7 @@ node packages/cli/bin/haechi.mjs proxy --config haechi.config.json
 proxy는 기본적으로 loopback에 바인딩됩니다. `0.0.0.0`, `::`, 또는 그 밖의 non-loopback 호스트에 바인딩하려면 `--allow-remote-bind`를 명시적으로 전달해야 합니다. 이 플래그는 명시적인 네트워크 접근 통제가 있을 때만 사용하세요.
-`stream: true`인 스트리밍 요청은 기본적으로 차단됩니다. `streaming.requestMode`를 `inspect`로 설정하면 SSE/NDJSON 응답을 stream-filter합니다(bounded sliding buffer가 프레임에 걸쳐 나뉜 PII도 잡아냅니다. `streaming.maxMatchBytes` 참고). 호출자가 보호되지 않는 스트리밍을 명시적으로 감수하는 경우에만 `pass-through`로 설정하세요.
+`stream: true`인 스트리밍 요청은 기본적으로 차단됩니다. `streaming.requestMode`를 `inspect`로 설정하면 SSE/NDJSON 응답을 stream-filter합니다(JSON **델타 채널**을 훑는 bounded sliding buffer가 델타 프레임에 걸쳐 나뉜 PII도 `streaming.maxMatchBytes`까지 잡아냅니다. 델타가 아닌 leaf와 비JSON 프레임은 각 프레임 내에서 검사됩니다). 호출자가 보호되지 않는 스트리밍을 명시적으로 감수하는 경우에만 `pass-through`로 설정하세요.
 Ollama의 `/api/chat`과 `/api/generate`는 `stream` 필드가 없으면 기본적으로 스트리밍하므로, proxy는 `stream: false`가 명시되지 않는 한 이 요청들을 스트리밍으로 간주합니다.
@@ -153,7 +153,7 @@ stdio MCP 서버를 감싸 양방향 트래픽을 필터링합니다. MCP 클라
 }
 ```
-클라이언트→서버 요청은 `mcp.allowedMethods` allowlist와 params 보호를 거치고, 서버→클라이언트 결과는 params/result 보호와 injection 휴리스틱(아래 참고)을 적용받습니다. 거부된 요청은 클라이언트에 응답되고 서버에는 도달하지 않습니다. stderr과 exit code는 그대로 전달됩니다.
+클라이언트→서버 요청은 `mcp.allowedMethods` allowlist와 params 보호를 거치고, 서버→클라이언트 결과는 params/result 보호와 injection 휴리스틱(아래 참고)을 적용받습니다. 거부된 요청은 클라이언트에 응답되고 서버에는 도달하지 않습니다. exit code는 그대로 전달되며, 자식의 stderr은 기본적으로 줄 단위로 동일한 보호를 거쳐 **필터링됩니다**(`--stderr filter`) — 원시 전달은 `--stderr inherit`을, 폐기는 `--stderr drop`을 사용하세요(줄 단위 필터는 자식이 개행을 넘어 쪼갠 비밀을 잡지 못하므로, 고민감 도구에는 폐기를 권장합니다). `filter`는 `policy.mode: enforce`에서만 변환합니다.
 ## Injection Detection (Preview)
@@ -322,7 +322,7 @@ Haechi는 로컬 정책 부트스트래핑을 위한 지역별 기본 Privacy Pr
 0.4.0은 token round-trip(deterministic tokenization + 요청 스코프 응답 detokenization), `mcp-wrap` 양방향 MCP 필터, `status` 및 `audit-verify` 커맨드, report-only injection detection 휴리스틱을 추가하고, 0.6 인증을 위한 PII-safe `identity`/`authProvider` 계약을 예약합니다. `docs/current/release-0.4-implementation-scope.md` 참고.
-0.5.0은 SSE/NDJSON 스트리밍 응답 검사를 추가합니다. `streaming.requestMode: "inspect"`가 bounded sliding buffer로 응답을 stream-filter하여 프레임에 걸쳐 나뉜 PII도 잡아냅니다(`streaming.maxMatchBytes`). `docs/current/release-0.5-implementation-scope.md` 참고.
+0.5.0은 SSE/NDJSON 스트리밍 응답 검사를 추가합니다. `streaming.requestMode: "inspect"`가 JSON **델타 채널**을 훑는 bounded sliding buffer로 응답을 stream-filter하여 델타 프레임에 걸쳐 나뉜 PII도 잡아냅니다(`streaming.maxMatchBytes`). 델타가 아닌 leaf와 비JSON 콘텐츠 프레임은 각 프레임 내에서 검사됩니다. `docs/current/release-0.5-implementation-scope.md` 참고.
 0.6.0은 인증과 클라이언트별 통제를 추가합니다. 해시 기반 token 저장소와 `haechi auth` CLI를 갖춘 내장 bearer auth, identity scope/label로 바인딩되는 named policy profile, model allowlisting, identity별 rate limiting을 제공하며, audit 로그에는 PII-safe identity가 기록됩니다. `docs/current/release-0.6-implementation-scope.md` 참고.

package/README.md CHANGED

@@ -90,7 +90,7 @@ Point an existing HTTP JSON client at `http://localhost:11016` and set `target.u
 The proxy binds to loopback by default. Binding to `0.0.0.0`, `::`, or another non-loopback host fails unless `--allow-remote-bind` is provided. Use that flag only behind explicit network access controls.
-Streaming requests with `stream: true` are blocked by default. Set `streaming.requestMode` to `inspect` to stream-filter SSE/NDJSON responses (a bounded sliding buffer catches PII split across frames; see `streaming.maxMatchBytes`), or to `pass-through` only when the caller explicitly accepts unprotected streaming.
+Streaming requests with `stream: true` are blocked by default. Set `streaming.requestMode` to `inspect` to stream-filter SSE/NDJSON responses (a bounded sliding buffer over the JSON **delta channel** catches PII split across delta frames, up to `streaming.maxMatchBytes`; non-delta leaves and non-JSON frames are inspected within each frame), or to `pass-through` only when the caller explicitly accepts unprotected streaming.
 Ollama `/api/chat` and `/api/generate` stream by default when the `stream` field is omitted, so the proxy treats those requests as streaming unless `stream: false` is explicitly set.
@@ -153,7 +153,7 @@ Wrap any stdio MCP server so its traffic is filtered in both directions — chan
 }
 ```
-Client→server requests pass the `mcp.allowedMethods` allowlist and params protection; server→client results get params/result protection plus injection heuristics (see below). Rejections are answered to the client and never reach the server; stderr and exit codes pass through.
+Client→server requests pass the `mcp.allowedMethods` allowlist and params protection; server→client results get params/result protection plus injection heuristics (see below). Rejections are answered to the client and never reach the server. Exit codes pass through; the child's stderr is **filtered** through the same protection per line by default (`--stderr filter`) — use `--stderr inherit` for raw passthrough or `--stderr drop` to discard (recommended for high-sensitivity tools, since a per-line filter cannot catch a secret a child splits across a newline). `filter` transforms only under `policy.mode: enforce`.
 ## Injection Detection (Preview)
@@ -322,7 +322,7 @@ Set `privacy.profile` in `haechi.config.json` to apply the profile's default act
 0.4.0 adds the token round-trip (deterministic tokenization + request-scoped response detokenization), the `mcp-wrap` bidirectional MCP filter, `status` and `audit-verify` commands, report-only injection detection heuristics, and reserves the PII-safe `identity`/`authProvider` contracts for 0.6 auth. See `docs/current/release-0.4-implementation-scope.md`.
-0.5.0 adds SSE/NDJSON streaming response inspection: `streaming.requestMode: "inspect"` stream-filters responses with a bounded sliding buffer that catches PII split across frames (`streaming.maxMatchBytes`). See `docs/current/release-0.5-implementation-scope.md`.
+0.5.0 adds SSE/NDJSON streaming response inspection: `streaming.requestMode: "inspect"` stream-filters responses with a bounded sliding buffer over the JSON **delta channel** that catches PII split across delta frames (`streaming.maxMatchBytes`); non-delta leaves and non-JSON content frames are inspected within each frame. See `docs/current/release-0.5-implementation-scope.md`.
 0.6.0 adds authentication and per-client controls: built-in bearer auth with a hashed token store and `haechi auth` CLI, named policy profiles bound by identity scope/label, model allowlisting, and per-identity rate limiting — with PII-safe identity in the audit log. See `docs/current/release-0.6-implementation-scope.md`.

package/docs/current/code-review-risk-register-2026-06-16-round2.ko.md ADDED

@@ -0,0 +1,142 @@
+# 2026-06-16 코드리뷰 리스크 등록부 — Round 2 (CR2)
+상태: 보완 완료 및 `haechi@1.3.2`로 발행(P0/P1 없음; CR2-001..008 Resolved; CR2-009 won't-fix, CR2-010 accepted; G10 Pass, 2026-06-16)
+범위: `main`의 `36af9fd1eef2b1e19b19b2e0344faab0a7a3e83d`(post-1.3.1)
+검토일: 2026-06-16
+출처: 1.3.1 보완 컷 이후 진행한 2차 심층 리뷰, 각 항목을 현재 코드에 대해 적대적으로 검증
+이 문서는 `code-review-risk-register-2026-06-16.md`(round 1, 모두 Resolved이며 1.3.1로 발행)와 분리해 둔 **2차 라운드**다. round 1은 threat model과 코드가 인용하는 frozen resolution 기록이다. Round 2는 1.3.1 이후 제기되어 현재 트리에 대해 재검증된 항목을 담는다. 일부는 round-1 주장을 확장하거나 한정하므로, frozen 기록에 되쓰지 않고 상호 참조한다.
+## 릴리스 판단
+round-1 P0/P1은 모두 Resolved이며 `haechi@1.3.1`로 발행되었다. Round 2는 **P0도 P1도 발견하지 못했다**: 외부 리뷰에서 P1로 제기된 두 항목은 모두 검증 결과 **P2**로 내려갔다(둘 다 stored-plaintext leak도, auth/SSRF 우회도 아니다). 확인된 P2는 프록시 데이터 경로의 availability/resource leak, 호출자 제공 입력을 반영하는 audit-hygiene 공백, 그리고 무제한 plugin IPC reply다. 세 개의 P2와 P3 묶음(`CR2-001..008`)은 이제 **Resolved이며 `haechi@1.3.2`로 발행되었다**(2026-06-16, attested OIDC publish); `CR2-009`는 won't-fix(false positive), `CR2-010`은 accepted(문서화된 잔여 리스크)로 유지된다. **G10은 Pass다.** 운영자는 CR2 수정 사항을 반영하려면 `1.3.1`에서 `1.3.2`로 업그레이드해야 한다.
+## 심각도 기준
+- `P0`: 신뢰 경계를 넘어가는 직접적인 자격증명/데이터 유출, 또는 핵심 보안 약속을 깨는 우회.
+- `P1`: SSRF, 보호 우회, 서비스 거부, 보호 배포를 깨뜨릴 수 있는 프로토콜 동작.
+- `P2`: 넓은 채택 전 해결해야 하는 운영, 정확성, availability, hygiene 공백.
+- `P3`: 영향이 작은 하드닝, 유한 경계 robustness, 또는 문서 정확성.
+## 검증 노트
+아래 모든 항목은 독립 리뷰어가 보고자 진술을 그대로 믿지 않고 현재 코드에 대해 추적했다. 보고된 P1 두 건은 검증 후 P2로 하향됐다(부풀림 없음). 보고된 한 항목은 **false positive**였고 한 항목은 **이미 문서화된 수용 잔여 리스크**였다 — 둘 다 audit trail을 위해 여기 기록하며 코드 변경은 필요 없다.
+## 요약
+| ID | 심각도 | 영역 | 리스크 | 상태 |
+| --- | --- | --- | --- | --- |
+| CR2-001 | P2 | 프록시 availability | pass-through streaming이 downstream 클라이언트 disconnect 시 upstream reader를 절대 취소하지 않는다 — `await once(response,"drain")`이 영원히 park되어 upstream connection/task가 leak되고, 인증되지 않은 클라이언트가 반복적으로 disconnect해 dangling upstream connection을 누적할 수 있다. | Resolved |
+| CR2-002 | P2 | audit hygiene | token-vault reveal/purge 실패가 호출자 제공 raw `token`과 `error.message`(token을 interpolate함)를 audit event에 기록한다; `FORBIDDEN_KEYS`는 key 이름으로만 제거하므로, `tok_` id가 기대되는 자리에 secret을 넘기면 hash-chained 로그에 raw로 남는다. stored vault plaintext가 아니라 호출자 입력을 반영한다. | Resolved |
+| CR2-003 | P2 | plugin sandbox DoS | `maxMessageBytes`는 host→plugin credential 메시지만 제한한다; plugin→host reply는 무제한으로 수신·`JSON.parse`된다. process-isolated child에 heap cap이 없어, 적대적/버그 있는 signed plugin이 oversized reply를 반환 → host의 동기 parse가 event loop를 정지시키고 메모리가 급증한다. | Resolved |
+| CR2-004 | P3 | 프록시 헤더 | `sanitizeResponseHeaders`가 body가 변환됐을 때 body-coupled validator(`etag`/`content-md5`/`digest`/`last-modified`)를 유지해 stale 상태가 된다; 변경된 응답에 `cache-control: no-store`가 없다. | Resolved |
+| CR2-005 | P3 | 프록시 robustness | `maxBytes`를 초과하는 body에 대해 `readBody`는 reject하지만 socket 읽기/teardown을 멈추지 않아, 업로드가 (유한한) Node `requestTimeout`까지 read-and-discard된다. | Resolved |
+| CR2-006 | P3 | MCP wrap | `mcp-wrap --stderr filter`는 완성된 라인 단위로 보호하므로, 적대적 child가 의도적으로 newline에 걸쳐 분할한 secret은 anchored regex를 회피한다. 신뢰된 로컬 child의 진단 출력에 대한 라인 지향 필터링의 본질적 한계다; single-line secret은 잡히고 `--stderr drop`이 있다. 문서 전용. | Resolved |
+| CR2-007 | P3 | Docs | README는 MCP wrap이 "stderr and exit codes pass through"라고 하지만, 기본값은 이제 `--stderr filter`다(round-1 P2-CR-006). | Resolved |
+| CR2-008 | P3 | Docs | README의 streaming "split match" 주장이 범위 한정이 없다; cross-frame buffering은 JSON delta 채널에만 적용되며 임의의 non-JSON frame에는 적용되지 않는다. | Resolved |
+| CR2-009 | — | plugin sandbox | (보고된 P2) `keyMaterial`이 base credential 메시지의 `maxMessageBytes` 검사 이후에 append된다. **FALSE POSITIVE:** `keyMaterial`은 운영자 통제이며 fetcher의 `maxBytes`로 hard-bound된다; 공격자 증폭 없음. 수정 불필요(선택적 cosmetic re-assert만). | Won't fix |
+| CR2-010 | — | Streaming | (보고된 P2) 두 개의 NON-JSON SSE/NDJSON frame에 걸쳐 분할된 secret은 잡히지 않는다(per-frame 검사). **수용 잔여 리스크 — 이미 문서화됨**: round-1 P1-CR-005 resolution, `threat-model.md` exclusions, 그리고 in-code comment. 변경 없음. | Accepted |
+## 상세 항목
+### CR2-001: downstream disconnect 시 upstream reader 미취소
+심각도: P2(가장 시급한 CR2 항목)
+상태: Resolved (`haechi@1.3.2`로 발행, PR #96)
+영향 코드: `packages/proxy/index.mjs`의 `pipeUpstreamBodyBounded` / `forward`
+검증: pass-through streaming 경로에는 클라이언트 연결의 `close`/`aborted` listener가 없다; 클라이언트 socket이 죽은 뒤 `await once(response, "drain")`이 무기한 park된다(`drain`도 `error`도 발생하지 않음). 그래서 async task와 upstream connection이 leak된다. 전제 조건 없이 **인증되지 않은** 클라이언트가 도달 가능하다; 스트림 도중 반복 disconnect는 프록시와 그 upstream LLM 엔드포인트에 대한 dangling upstream connection을 누적한다.
+해소: per-request `AbortController`를 `forward()`에 전달하고(upstream fetch를 abort) upstream reader를 취소하는 one-shot 클라이언트 `close`/`aborted` listener를 등록한다; `drain` 대기를 `close`와 race시켜 backpressure 대기가 disconnect 시 unpark되게 한다; no-backpressure `reader.read()` parked 케이스도 다룬다. 회귀 테스트: 스트림 도중 disconnect하고 reader가 즉시 취소되는지 / upstream이 abort되는지 단언.
+### CR2-002: Token-Vault Reveal/Purge가 Raw Token + Error Text를 Audit에 기록
+심각도: P2(보고된 P1에서 하향 — stored vault plaintext가 아니라 호출자 제공 입력을 반영함)
+상태: Resolved (`haechi@1.3.2`로 발행, PR #97)
+영향 코드: `packages/token-vault/index.mjs`(reveal/purge throw + record), `packages/audit/index.mjs`(`FORBIDDEN_KEYS` / `sanitizeAudit`)
+검증: reveal은 `Unknown token: ${token}` / `Token expired: ${token}`을 throw하고 catch가 `reason: error.message`를 기록한다; raw `token` 인자도 `reveal_failed`/`reveal_denied`/`purge`에 그대로 기록된다. `sanitizeAudit`는 key 이름으로 필터링하고 `FORBIDDEN_KEYS`는 `reason`도 `token`도 포함하지 않으므로, 둘 다 기록된 hash-chained 레코드로 살아남는다. 정상 흐름에서 `token` 인자는 비민감 `tok_<type>_<hash>` id이므로, 누출은 호출자/운영자가 token id가 기대되는 자리에 raw secret을 넘길 때만 발생한다 — 그래도 round-1 `P1-SEC-017`과 `threat-model.md`의 "no plaintext / keyed-HMAC only" 표현과 모순된다.
+해소: (1) 일반화된 오류 메시지(raw token interpolation 없음); (2) raw `token`을 그대로 기록하는 것을 중단 — reveal/purge 레코드 이전에 (`subjectHash`/`issuerHash`처럼) keyed-HMAC하거나 인자를 `tok_<type>_<hash>` 형태에 대해 검증하고 아니면 redact; (3) free-text `reason: error.message`를 enum `reasonCode`로 교체; (4) `reveal_failed`/`purge` event가 `reason`/`token`에 호출자 제공 raw token을 절대 포함하지 않는다는 회귀 테스트; 문서의 불변식 표현을 정합화.
+### CR2-003: Plugin IPC Reply가 Size-Bound되지 않음; Process Child에 Heap Cap 없음
+심각도: P2
+상태: Resolved (`haechi@1.3.2`로 발행, PR #98)
+영향 코드: `packages/plugin/sandbox.mjs`, `packages/plugin/process-sandbox.mjs`, `packages/cli/runtime.mjs`
+검증: `maxMessageBytes`는 outbound host→plugin credential 메시지에만 강제된다; inbound reply는 두 sandbox 모두에서 size 검사 없이 수신·`JSON.parse`된다. worker에는 암묵적 경계가 있지만(필수 `resourceLimits` heap cap이 폭주 worker를 먼저 OOM시킴), process child는 `--max-old-space-size`를 설정하지 않으므로, 적대적/버그 있는 signed plugin이 child의 기본 V8 heap까지 reply를 만들어 `process.send`할 수 있고, host의 동기 `JSON.parse`가 event loop를 정지시킨다(per-call timeout이 parse 도중 발생할 수 없음). signed/semi-trusted-but-hostile plugin이 필요하다.
+해소: 두 sandbox 모두에서 parse 이전에 reply를 경계화한다(worker/child `message` 핸들러에서 byte length를 `maxMessageBytes` 또는 전용 `maxReplyBytes`와 대조하고, oversized를 `JSON.parse` 이전에 deny로 drop); 새 `resourceLimits`/`processMaxOldGenerationSizeMb` knob에서 파생한 `--max-old-space-size`로 process child에 heap cap을 부여한다. 회귀 테스트: oversized claims 객체를 반환하는 fixture plugin → 무제한 host 작업 없이 deny. 1.0/1.1 scope 문서에 경계가 BOTH 방향에 적용됨을 명시.
+### CR2-004: 변환된 응답의 Stale Body-Coupled Validator 헤더
+심각도: P3
+상태: Resolved (`haechi@1.3.2`로 발행, PR #96)
+영향 코드: `packages/proxy/index.mjs`의 `sanitizeResponseHeaders` / `transformedJsonHeaders`
+검증: hop-by-hop 헤더만 제거된다; `protectJson`이 body를 변경·재직렬화할 때 upstream `etag`/`content-md5`/`digest`/`last-modified`가 그대로 살아남고 `cache-control: no-store`가 설정되지 않는다. 문서화된 inference-upstream 타깃 집합(POST 응답, strong validator 없음, RFC 9111상 기본 비캐시; `content-length`는 재계산됨)에서는 실세계 영향이 작지만, 수용 잔여 리스크로 기록되어 있지 않다.
+해소: 모든 body-mutating 경로의 drop 집합에 `etag`/`content-md5`/`digest`/`last-modified`를 추가한다; 변환된 응답에 `cache-control: no-store`를 설정한다. 테스트: 변경된 응답이 upstream `ETag`를 더 이상 담지 않음.
+### CR2-005: 한도 초과 Request Body가 Drain/Teardown되지 않음
+심각도: P3
+상태: Resolved (`haechi@1.3.2`로 발행, PR #96)
+영향 코드: `packages/proxy/index.mjs`의 `readBody`
+검증: `maxBytes` 초과 시 `readBody`는 플래그를 세우고 reject하지만 request를 `pause()`/`destroy()`하지 않으며, 413 응답이 `Connection: close`를 보내지 않으므로 Node가 built-in `requestTimeout`(Node ≥22 기본 300000 ms)까지 업로드의 나머지를 read-and-discard한다. hold는 유한하다; `maxInFlight: 0`(기본값)은 동시에 hold되는 connection 수를 경계화하지 않는다.
+해소: 413 시 `request.pause()`/`request.destroy()`(또는 응답 이전에 `Connection: close`)로 socket을 즉시 해제한다. 낮은 우선순위: non-null 기본 `requestTimeoutMs`/`headersTimeoutMs`를 출하하고 `maxInFlight: 0`이 동시성을 무제한으로 둔다는 점을 문서화.
+### CR2-006: mcp-wrap `--stderr filter`가 Newline-Split Secret을 잡지 못함
+심각도: P3(doc)
+상태: Resolved (`haechi@1.3.2`로 발행, PR #99)
+영향 코드: `packages/cli/bin/haechi.mjs`의 `pipeFilteredStderr` / `protectStderrLine`
+검증: `filter`는 child stderr를 `\n`으로 분할하고 매 완성된 라인을 fresh single-shot protector로 보호하므로, 적대적 child가 의도적으로 newline에 걸쳐 분할 방출한 secret은 anchored full-secret regex를 회피한다. 좁은 범위: child는 운영자의 신뢰된 로컬 MCP server이고, single-line secret은 잡히며, `--stderr drop`이 있다. 이것은 라인 지향 텍스트 필터링의 본질적 속성이지 request/response 보호 경로의 익스플로잇 가능한 우회가 아니다.
+해소(doc): `COMMAND_HELP`와 이 등록부에 `filter`가 완성된 라인 단위로 보호하며 newline에 걸쳐 분할된 secret을 잡지 못한다는 한 문장을 명시; 고민감 도구에는 `--stderr drop`을 권장. 선택적 후속 코드 하드닝: stderr를 per-line `protectText` 대신 push/flush sliding-buffer 채널(`maxMatchBytes`)로 라우팅.
+### CR2-007: README mcp-wrap stderr Passthrough가 Stale
+심각도: P3(doc)
+상태: Resolved (`haechi@1.3.2`로 발행, PR #99)
+영향 코드: `README.md`
+검증: README는 "stderr and exit codes pass through"라고 하지만, 기본값은 이제 `--stderr filter`다(round-1 P2-CR-006); raw passthrough는 opt-in `inherit` 모드뿐이다. exit code는 실제로 pass through되므로 stderr 절만 stale하다; `COMMAND_HELP`는 이미 정확하다.
+해소(doc): README 줄을 `filter` 기본값을 반영하도록 수정(`inherit`은 raw, `drop`은 폐기; `filter`는 `policy.mode: enforce`에서만 변환); `README.ko.md` sibling 갱신.
+### CR2-008: README Streaming Split-Match 주장이 범위 한정 없음
+심각도: P3(doc)
+상태: Resolved (`haechi@1.3.2`로 발행, PR #99)
+영향 코드: `README.md`
+검증: README는 frame에 걸쳐 분할된 PII가 잡힌다고 주장하면서 이를 JSON delta 채널로 한정하지 않는다; non-JSON CONTENT frame은 single-shot per-frame `protectText`를 받는다(cross-frame buffer 없음). 이 주장은 `threat-model.md`와 scope 문서 대비 보장을 과장한다.
+해소(doc): README 두 구절을 모두 delta 채널로 한정(`maxMatchBytes`까지 frame에 걸쳐 분할된 delta-text PII; non-delta leaf와 non-JSON frame은 within-frame 검사); `README.ko.md` 갱신.
+### CR2-009: maxMessageBytes 검사 이후의 keyMaterial — FALSE POSITIVE
+심각도: —(보고된 P2; 취약점이 아님으로 검증)
+상태: Won't fix
+영향 코드: `packages/plugin/process-sandbox.mjs`, `packages/cli/runtime.mjs`
+검증: 구조적 관찰(`keyMaterial` append 이후 결합 메시지가 재검사되지 않음)은 정확하지만, 공격자가 익스플로잇할 수 없다. `keyMaterial`은 운영자 통제이고(host가 운영자 선언 HTTPS URL에서 fetch, TTL 캐시, 공격자 영향 credential과 독립) guarded fetcher의 `maxBytes`(기본 1 MiB)로 hard-bound된다; credential은 base 검사로 경계가 유지된다. 결합 wire는 두 운영자 설정 상수로 경계화되며 공격자 증폭이 없다; "`maxBytes`를 임의로 크게"는 운영자 자체 오설정이다. 선택적 cosmetic defense-in-depth만 가능(결합 size re-assert); 보안 수정 불필요.
+### CR2-010: Non-JSON Cross-Frame Split — 수용 잔여 리스크(문서화됨)
+심각도: —(보고된 P2; 이미 문서화된 잔여 리스크)
+상태: Accepted
+영향 코드: `packages/core/index.mjs` / `packages/stream-filter/index.mjs`
+검증: 1.3.1에서 실재한다(non-JSON CONTENT frame은 cross-frame buffer 없이 per-frame `protectText`를 받음). 하지만 round-1 `P1-CR-005` resolution, `threat-model.md` exclusions, in-code comment에 범위 외로 명시 문서화되어 있다. JSON delta 채널은 `maxMatchBytes`까지 cross-frame buffering을 한다. 코드 변경 불필요; 기껏해야 문서 다듬기 차원의 sibling exclusion 항목(CR2-008의 README scoping에 흡수).
+## 보완 순서
+1. `CR2-001`을 최우선으로 — 전제 조건 없이 인증되지 않은 클라이언트가 도달 가능한 유일한 항목(availability).
+2. `CR2-002`와 `CR2-003`을 병렬로 — 파일이 disjoint하다(token-vault+audit vs plugin sandbox).
+3. `CR2-004` + `CR2-005`를 함께(둘 다 `proxy/index.mjs`; CR2-001 이후 / 그 위에 rebase해 착륙).
+4. `CR2-006` + `CR2-007` + `CR2-008` — 문서/help-text 묶음, 아무 때나.
+5. `CR2-009` / `CR2-010`은 코드 변경 불필요(audit trail용 기록).
+## 종료 규칙
+항목은 코드/문서 보완이 merge되고, 집중 회귀 테스트 또는 명시적 non-test rationale이 기록되며, 릴리스 게이트 등록부(`G10`)가 증거를 링크할 때만 `Resolved`로 옮긴다. 1.3.2 컷이 resolved 항목과 `G10`을 함께 뒤집는다.
+## 추적 링크
+`docs/current/risk-register-release-gate.md`(§5.8 + `G10`)와 `docs/current/risk-register-release-gate.ko.md`에서 참조한다.

package/docs/current/code-review-risk-register-2026-06-16-round2.md ADDED

@@ -0,0 +1,142 @@
+# 2026-06-16 Code Review Risk Register — Round 2 (CR2)
+Status: remediation complete and shipped in `haechi@1.3.2` (no P0/P1; CR2-001..008 Resolved; CR2-009 won't-fix, CR2-010 accepted; G10 Pass, 2026-06-16)
+Scope: `main` at `36af9fd1eef2b1e19b19b2e0344faab0a7a3e83d` (post-1.3.1)
+Review date: 2026-06-16
+Source: a second deep review after the 1.3.1 remediation cut, with per-finding adversarial verification against the current code
+This is a **second round** kept separate from `code-review-risk-register-2026-06-16.md` (round 1, all Resolved + shipped in 1.3.1), which is a frozen resolution record cited by the threat model and code. Round 2 captures findings raised after 1.3.1 and re-verified against the current tree; a few of them extend or qualify round-1 claims, so they are cross-referenced rather than written back into the frozen record.
+## Release Decision
+The round-1 P0/P1 are all Resolved and shipped in `haechi@1.3.1`. Round 2 found **no P0 and no P1**: the two findings raised as P1 in the external review both verified down to **P2** (neither is a stored-plaintext leak or an auth/SSRF bypass). The confirmed P2s are an availability/resource leak on the proxy data path, an audit-hygiene gap that reflects caller-supplied input, and an unbounded plugin IPC reply. The three P2s and the P3 cluster (`CR2-001..008`) are now **Resolved and shipped in `haechi@1.3.2`** (2026-06-16, attested OIDC publish); `CR2-009` stays won't-fix (false positive) and `CR2-010` stays accepted (documented residual). **G10 is Pass.** Operators should upgrade from `1.3.1` to `1.3.2` to pick up the CR2 fixes.
+## Severity Policy
+- `P0`: direct credential/data leak across a trust boundary, or a bypass that defeats the core security promise.
+- `P1`: SSRF, protection bypass, denial-of-service, or protocol behavior that can break protected deployments.
+- `P2`: operational, correctness, availability, or hygiene gaps that should be resolved before broad adoption.
+- `P3`: low-impact hardening, finite-bound robustness, or documentation accuracy.
+## Verification note
+Every finding below was traced against the current code by an independent reviewer (not taken on the reporter's word). Two reported P1s were downgraded to P2 after verification (no inflation); one reported finding was a **false positive** and one is an **already-documented accepted residual** — both are recorded here for the audit trail and require no code change.
+## Summary
+| ID | Severity | Area | Risk | Status |
+| --- | --- | --- | --- | --- |
+| CR2-001 | P2 | Proxy availability | Pass-through streaming never cancels the upstream reader on downstream client disconnect — `await once(response,"drain")` parks forever, leaking the upstream connection/task; an unauthenticated client can disconnect repeatedly to accumulate dangling upstream connections. | Resolved |
+| CR2-002 | P2 | Audit hygiene | Token-vault reveal/purge failures write the raw caller-supplied `token` and `error.message` (which interpolates the token) into the audit event; `FORBIDDEN_KEYS` strips by key name only, so a secret passed where a `tok_` id is expected lands raw in the hash-chained log. Reflects caller input, not stored vault plaintext. | Resolved |
+| CR2-003 | P2 | Plugin sandbox DoS | `maxMessageBytes` bounds only the host→plugin credential message; the plugin→host reply is received and `JSON.parse`d unbounded. The process-isolated child has no heap cap, so a hostile/buggy signed plugin can return an oversized reply → synchronous host parse stalls the event loop + memory spike. | Resolved |
+| CR2-004 | P3 | Proxy headers | `sanitizeResponseHeaders` keeps body-coupled validators (`etag`/`content-md5`/`digest`/`last-modified`) when the body is transformed, so they become stale; no `cache-control: no-store` on a mutated response. | Resolved |
+| CR2-005 | P3 | Proxy robustness | On a body over `maxBytes`, `readBody` rejects but does not stop reading/teardown the socket, so an upload is read-and-discarded until the (finite) Node `requestTimeout`. | Resolved |
+| CR2-006 | P3 | MCP wrap | `mcp-wrap --stderr filter` protects per complete line, so a secret an adversarial child deliberately splits across a newline evades the anchored regex. Inherent to line-oriented filtering of a trusted local child's diagnostic output; a single-line secret IS caught and `--stderr drop` exists. Doc-only. | Resolved |
+| CR2-007 | P3 | Docs | README says MCP wrap "stderr and exit codes pass through", but the default is now `--stderr filter` (round-1 P2-CR-006). | Resolved |
+| CR2-008 | P3 | Docs | README's streaming "split match" claim is unscoped; cross-frame buffering applies to the JSON delta channel only, not arbitrary non-JSON frames. | Resolved |
+| CR2-009 | — | Plugin sandbox | (Reported P2) `keyMaterial` is appended after the `maxMessageBytes` check on the base credential message. **FALSE POSITIVE:** `keyMaterial` is operator-controlled and hard-bounded by the fetcher's `maxBytes`; no attacker amplification. No fix required (optional cosmetic re-assert only). | Won't fix |
+| CR2-010 | — | Streaming | (Reported P2) A secret split across two NON-JSON SSE/NDJSON frames is not caught (per-frame inspection). **ACCEPTED RESIDUAL — already documented** in round-1 P1-CR-005 resolution, `threat-model.md` exclusions, and an in-code comment. No change. | Accepted |
+## Detailed Findings
+### CR2-001: Upstream Reader Not Cancelled On Downstream Disconnect
+Severity: P2 (the most urgent CR2 item)
+Status: Resolved (shipped in `haechi@1.3.2`, PR #96)
+Affected code: `packages/proxy/index.mjs` `pipeUpstreamBodyBounded` / `forward`
+Verified: in the pass-through streaming path there is no `close`/`aborted` listener on the client connection; `await once(response, "drain")` parks indefinitely after the client socket dies (neither `drain` nor `error` fires), so the async task and the upstream connection leak. Reachable by an **unauthenticated** client with no preconditions; repeated mid-stream disconnects accumulate dangling upstream connections against the proxy and its upstream LLM endpoint.
+Resolution: pass a per-request `AbortController` into `forward()` (aborting the upstream fetch) and register a one-shot client `close`/`aborted` listener that cancels the upstream reader; race the `drain` wait against `close` so backpressure waits unpark on disconnect; cover the no-backpressure `reader.read()` parked case too. Regression test: disconnect mid-stream and assert prompt reader cancellation / upstream abort.
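The CR2-001 resolution can be sketched roughly as follows. This is an illustration only, not the code shipped in `packages/proxy/index.mjs`: the names `waitForDrainOrClose` and `pipeWithCancellation` are hypothetical, `reader` is assumed to be a WHATWG `ReadableStreamDefaultReader` over the upstream body, and `response` is assumed to behave like a Node `http.ServerResponse`.

```javascript
// Hypothetical helper (not haechi's actual code): settles with "drain" or
// "closed", whichever event fires first, and removes both listeners.
function waitForDrainOrClose(response) {
  return new Promise((resolve) => {
    const settle = (outcome) => () => {
      response.off("drain", onDrain);
      response.off("close", onClose);
      resolve(outcome);
    };
    const onDrain = settle("drain");
    const onClose = settle("closed");
    response.once("drain", onDrain);
    response.once("close", onClose);
  });
}

// Sketch of a pass-through pipe that stops when the client goes away.
// `controller` is the per-request AbortController whose signal is assumed
// to have been passed to the upstream fetch.
async function pipeWithCancellation(reader, response, controller) {
  const onClientClose = () => {
    controller.abort();              // abort the upstream fetch
    reader.cancel().catch(() => {}); // settle a read() parked without backpressure
  };
  if (response.destroyed) {
    onClientClose();
    return;
  }
  response.once("close", onClientClose);
  try {
    while (!controller.signal.aborted) {
      const { done, value } = await reader.read();
      if (done) break;
      // write() returning false signals backpressure; race drain against close.
      if (!response.write(value) && (await waitForDrainOrClose(response)) === "closed") break;
    }
  } catch (error) {
    // An aborted fetch may reject the pending read; that is the expected exit.
    if (!controller.signal.aborted) throw error;
  } finally {
    response.off("close", onClientClose);
  }
}
```

The point of the race is that a dead client socket emits `close` but never `drain`, so a bare wait on `drain` never settles.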
+### CR2-002: Token-Vault Reveal/Purge Writes Raw Token + Error Text To Audit
+Severity: P2 (downgraded from a reported P1 — it reflects caller-supplied input, not stored vault plaintext)
+Status: Resolved (shipped in `haechi@1.3.2`, PR #97)
+Affected code: `packages/token-vault/index.mjs` (reveal/purge throw + record), `packages/audit/index.mjs` (`FORBIDDEN_KEYS` / `sanitizeAudit`)
+Verified: reveal throws `Unknown token: ${token}` / `Token expired: ${token}` and the catch records `reason: error.message`; the raw `token` argument is also written verbatim on `reveal_failed`/`reveal_denied`/`purge`. `sanitizeAudit` filters by key name and `FORBIDDEN_KEYS` contains neither `reason` nor `token`, so both survive into the written, hash-chained record. In legitimate flows the `token` argument is a non-sensitive `tok_<type>_<hash>` id, so the leak only fires when a caller/operator passes a raw secret where a token id is expected — but it still contradicts the "no plaintext / keyed-HMAC only" wording in round-1 `P1-SEC-017` and `threat-model.md`.
+Resolution: (1) generic error messages (no raw token interpolation); (2) stop writing the raw `token` verbatim — keyed-HMAC it (as `subjectHash`/`issuerHash` are) or validate the argument against the `tok_<type>_<hash>` shape and redact otherwise, before reveal/purge records; (3) replace free-text `reason: error.message` with an enum `reasonCode`; (4) regression test that a `reveal_failed`/`purge` event never contains a raw caller-supplied token in `reason`/`token`; reconcile the invariant wording in the docs.
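As a rough illustration of points (2) and (3) of the CR2-002 resolution, the sketch below takes the shape-validation route rather than the keyed-HMAC one. Everything in it is an assumption made for the example: the `TOKEN_ID_SHAPE` pattern, the `REASON_CODES` values, and the function names are hypothetical and may not match what `packages/token-vault/index.mjs` does.

```javascript
// Assumed shape of a non-sensitive token id (tok_<type>_<hash>); the real
// pattern used by haechi may differ.
const TOKEN_ID_SHAPE = /^tok_[a-z0-9]+_[a-f0-9]{8,64}$/;

// Hypothetical enum that replaces free-text `reason: error.message`.
const REASON_CODES = Object.freeze({
  UNKNOWN: "unknown_token",
  EXPIRED: "token_expired",
});

// Only values that look like token ids reach the audit log; anything else
// (for example a raw secret passed by mistake) is redacted.
function auditSafeToken(token) {
  return typeof token === "string" && TOKEN_ID_SHAPE.test(token) ? token : "[redacted]";
}

function buildRevealFailedEvent(token, reasonCode) {
  if (!Object.values(REASON_CODES).includes(reasonCode)) {
    throw new TypeError("reasonCode must be one of REASON_CODES");
  }
  return { event: "reveal_failed", token: auditSafeToken(token), reasonCode };
}

// A well-formed id is kept; a raw secret never appears in the record.
console.log(buildRevealFailedEvent("tok_email_0123456789abcdef", REASON_CODES.EXPIRED));
console.log(buildRevealFailedEvent("sk-live-raw-secret", REASON_CODES.UNKNOWN));
```

Because the reason is an enum and the token is validated before it is recorded, the audit record no longer depends on key-name filtering to stay free of caller-supplied secrets.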
+### CR2-003: Plugin IPC Reply Not Size-Bounded; Process Child Has No Heap Cap
+Severity: P2
+Status: Resolved (shipped in `haechi@1.3.2`, PR #98)
+Affected code: `packages/plugin/sandbox.mjs`, `packages/plugin/process-sandbox.mjs`, `packages/cli/runtime.mjs`
+Verified: `maxMessageBytes` is enforced only on the outbound host→plugin credential message; the inbound reply is received and `JSON.parse`d with no size check in both sandboxes. The worker has an implicit bound (a required `resourceLimits` heap cap OOMs a runaway worker first), but the process child sets no `--max-old-space-size`, so a hostile/buggy signed plugin can build a reply up to the child's default V8 heap and `process.send` it; the host's synchronous `JSON.parse` stalls the event loop (the per-call timeout cannot fire mid-parse). Requires a signed/semi-trusted-but-hostile plugin.
+Resolution: bound the reply BEFORE parsing in both sandboxes (check byte length against `maxMessageBytes` or a dedicated `maxReplyBytes` in the worker/child `message` handler, drop oversized as a deny before `JSON.parse`); give the process child a heap cap via `--max-old-space-size` derived from a new `resourceLimits`/`processMaxOldGenerationSizeMb` knob. Regression test: a fixture plugin returning an oversized claims object → deny without unbounded host work. Update the 1.0/1.1 scope docs to state the bound applies in BOTH directions.
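The "bound before parse" half of the CR2-003 resolution might look something like the sketch below. It assumes the reply reaches the host as a serialized string, and `parseBoundedReply` and its result shape are hypothetical names for illustration, not the API of `packages/plugin/sandbox.mjs`.

```javascript
// Hypothetical guard: measure the serialized reply and deny it before any
// synchronous JSON.parse work happens on the host event loop.
function parseBoundedReply(raw, maxReplyBytes) {
  if (typeof raw !== "string") {
    return { ok: false, reason: "reply_not_string" };
  }
  // Buffer.byteLength counts UTF-8 bytes without copying the string.
  if (Buffer.byteLength(raw, "utf8") > maxReplyBytes) {
    return { ok: false, reason: "reply_too_large" };
  }
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch {
    return { ok: false, reason: "reply_invalid_json" };
  }
}

// An oversized claims object is denied without being parsed.
const oversized = JSON.stringify({ claims: "x".repeat(2048) });
console.log(parseBoundedReply(oversized, 1024).reason);
```

The size check only limits host-side work; the heap cap on the process child (via `--max-old-space-size`) is a separate control and is not shown here.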
+### CR2-004: Stale Body-Coupled Validator Headers On Transformed Responses
+Severity: P3
+Status: Resolved (shipped in `haechi@1.3.2`, PR #96)
+Affected code: `packages/proxy/index.mjs` `sanitizeResponseHeaders` / `transformedJsonHeaders`
+Verified: only hop-by-hop headers are stripped; when `protectJson` mutates and re-serializes the body, upstream `etag`/`content-md5`/`digest`/`last-modified` survive unchanged and no `cache-control: no-store` is set. Real-world impact is small for the documented inference-upstream target set (POST responses, no strong validators, not cacheable by default per RFC 9111; `content-length` IS recomputed), but it is not recorded as an accepted residual.
+Resolution: add `etag`/`content-md5`/`digest`/`last-modified` to the dropped set on every body-mutating path; set `cache-control: no-store` on a transformed response. Test: a mutated response no longer carries the upstream `ETag`.
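A minimal sketch of the header handling described in the CR2-004 resolution, assuming headers are available as a plain object. The function name `headersForTransformedBody` is hypothetical; the shipped logic lives in `sanitizeResponseHeaders` / `transformedJsonHeaders` and may be structured differently.

```javascript
// Validators that describe the upstream body and go stale once it is rewritten.
const BODY_COUPLED_HEADERS = new Set(["etag", "content-md5", "digest", "last-modified"]);

// Hypothetical helper: build response headers for a body the proxy has mutated.
function headersForTransformedBody(upstreamHeaders, bodyByteLength) {
  const out = {};
  for (const [name, value] of Object.entries(upstreamHeaders)) {
    const lower = name.toLowerCase();
    if (BODY_COUPLED_HEADERS.has(lower) || lower === "content-length") continue;
    out[lower] = value;
  }
  out["content-length"] = String(bodyByteLength); // recomputed for the new body
  out["cache-control"] = "no-store";              // a mutated response must not be cached
  return out;
}

console.log(
  headersForTransformedBody(
    { ETag: '"abc"', "Content-Type": "application/json", "Content-Length": "10" },
    42,
  ),
);
```

Setting `cache-control` after the copy loop means an upstream caching directive cannot override `no-store` on the transformed response.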
+### CR2-005: Over-Limit Request Body Not Drained/Torn Down
+Severity: P3
+Status: Resolved (shipped in `haechi@1.3.2`, PR #96)
+Affected code: `packages/proxy/index.mjs` `readBody`
+Verified: on exceeding `maxBytes`, `readBody` sets a flag and rejects but does not `pause()`/`destroy()` the request, and the 413 response sends no `Connection: close`, so Node reads-and-discards the rest of the upload until the built-in `requestTimeout` (Node ≥22 default 300000 ms). The hold is finite; `maxInFlight: 0` (default) does not bound simultaneous held connections.
+Resolution: on 413, `request.pause()`/`request.destroy()` (or `Connection: close` before the response) so the socket releases promptly. Lower priority: ship non-null default `requestTimeoutMs`/`headersTimeoutMs` and document that `maxInFlight: 0` leaves concurrency unbounded.
+### CR2-006: mcp-wrap `--stderr filter` Cannot Catch A Newline-Split Secret
+Severity: P3 (doc)
+Status: Resolved (shipped in `haechi@1.3.2`, PR #99)
+Affected code: `packages/cli/bin/haechi.mjs` `pipeFilteredStderr` / `protectStderrLine`
+Verified: `filter` splits child stderr on `\n` and protects each complete line with a fresh single-shot protector, so a secret an adversarial child deliberately emits split across a newline evades the anchored full-secret regex. Narrow: the child is the operator's trusted local MCP server, a single-line secret IS caught, and `--stderr drop` exists. This is an inherent property of line-oriented text filtering, not an exploitable bypass of the request/response protection path.
+Resolution (doc): one sentence in `COMMAND_HELP` and this register noting `filter` protects per complete line and cannot catch a secret split across a newline; recommend `--stderr drop` for high-sensitivity tools. Optional later code hardening: route stderr through the push/flush sliding-buffer channel (`maxMatchBytes`) instead of per-line `protectText`.
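The limitation recorded in CR2-006 is easy to reproduce with a toy per-line filter. The pattern and the function below are purely illustrative assumptions; they are not haechi's detectors or its `protectStderrLine` implementation.

```javascript
// Illustrative secret pattern: "sk-" followed by 16 alphanumerics.
const SECRET_PATTERN = /sk-[A-Za-z0-9]{16}/g;

// Toy line-oriented filter: each line is protected on its own, with no
// state carried across the newline boundary.
function filterStderrPerLine(text) {
  return text
    .split("\n")
    .map((line) => line.replace(SECRET_PATTERN, "[redacted]"))
    .join("\n");
}

// A secret on a single line is caught.
console.log(filterStderrPerLine("key=sk-ABCDEFGH12345678\n"));
// The same secret split across a newline passes through unchanged.
console.log(filterStderrPerLine("key=sk-ABCDEFGH\n12345678\n"));
```

This is why the register recommends `--stderr drop` for high-sensitivity tools: neither half of the split secret matches on its own, so a per-line filter has nothing to redact.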
+### CR2-007: README mcp-wrap stderr Passthrough Is Stale
+Severity: P3 (doc)
+Status: Resolved (shipped in `haechi@1.3.2`, PR #99)
+Affected code: `README.md`
+Verified: README says "stderr and exit codes pass through", but the default is now `--stderr filter` (round-1 P2-CR-006); raw passthrough is only the opt-in `inherit` mode. Exit codes do pass through, so only the stderr clause is stale; `COMMAND_HELP` is already accurate.
+Resolution (doc): correct the README line to reflect the `filter` default (`inherit` for raw, `drop` to discard; `filter` transforms only under `policy.mode: enforce`); update the `README.ko.md` sibling.
+### CR2-008: README Streaming Split-Match Claim Is Unscoped
+Severity: P3 (doc)
+Status: Resolved (shipped in `haechi@1.3.2`, PR #99)
+Affected code: `README.md`
+Verified: the README claims PII split across frames is caught without scoping it to the JSON delta channel; non-JSON CONTENT frames get single-shot per-frame `protectText` (no cross-frame buffer). The claim overstates the guarantee relative to `threat-model.md` and the scope docs.
+Resolution (doc): scope both README passages to the delta channel (delta-text PII split across frames up to `maxMatchBytes`; non-delta leaves and non-JSON frames inspected within-frame); update `README.ko.md`.
+### CR2-009: keyMaterial After the maxMessageBytes Check — FALSE POSITIVE
+Severity: — (reported P2; verified not a vulnerability)
+Status: Won't fix
+Affected code: `packages/plugin/process-sandbox.mjs`, `packages/cli/runtime.mjs`
+Verified: the structural observation (the combined message is not re-checked after `keyMaterial` is appended) is accurate, but it is not attacker-exploitable. `keyMaterial` is operator-controlled (fetched by the host from an operator-declared HTTPS URL, TTL-cached, independent of the attacker-influenced credential) and hard-bounded by the guarded fetcher's `maxBytes` (default 1 MiB); the credential stays bounded by the base check. The combined wire is bounded by two operator-set constants with no attacker amplification; "`maxBytes` arbitrarily large" is operator self-misconfiguration. Optional cosmetic defense-in-depth only (re-assert the combined size); no security fix required.
+### CR2-010: Non-JSON Cross-Frame Split — ACCEPTED RESIDUAL (documented)
+Severity: — (reported P2; an already-documented residual)
+Status: Accepted
+Affected code: `packages/core/index.mjs` / `packages/stream-filter/index.mjs`
+Verified: real in 1.3.1 (non-JSON CONTENT frames get per-frame `protectText` with no cross-frame buffer), but it is explicitly documented as out-of-scope in round-1 `P1-CR-005` resolution, the `threat-model.md` exclusions, and an in-code comment. The JSON delta channel DOES cross-frame buffer up to `maxMatchBytes`. No code change required; at most a documentation-polish sibling exclusion bullet (folded into CR2-008's README scoping).
+## Remediation Order
+1. `CR2-001` first — the only finding reachable by an unauthenticated client with no preconditions (availability).
+2. `CR2-002` and `CR2-003` in parallel — file-disjoint (token-vault+audit vs plugin sandbox).
+3. `CR2-004` + `CR2-005` together (both `proxy/index.mjs`; land after / rebased on CR2-001).
+4. `CR2-006` + `CR2-007` + `CR2-008` — documentation/help-text cluster, anytime.
+5. `CR2-009` / `CR2-010` need no code change (recorded for the audit trail).
+## Closure Rules
+An item moves to `Resolved` only when the code/doc remediation is merged, a focused regression test or explicit non-test rationale is recorded, and the release-gate register (`G10`) links the evidence. The 1.3.2 cut flips the resolved items and `G10` together.
+## Traceability
+Linked from `docs/current/risk-register-release-gate.md` (§5.8 + `G10`) and `docs/current/risk-register-release-gate.ko.md`.

package/docs/current/operations-runbook.ko.md CHANGED

@@ -140,7 +140,27 @@ HAECHI_BENCH_REQUESTS=5000 HAECHI_BENCH_CONCURRENCY=64 npm run bench:throughput
 > 네트워크/하드웨어 처리량 벤치마크가 **아니며** 보장 수치로 인용해서는 **안
 > 됩니다**. 이 벤치는 `release:preflight`에서 실행되지 않습니다.
-## 8. 빠른 참조
+## 8. 실 업스트림 검증 (real vLLM / Ollama)
+`local-inference` 통합 스위트는 요청을 **실제** OpenAI 호환(vLLM) 및/또는
+Ollama 업스트림으로 프록시하여, 프록시가 올바르게 왕복하는지 검증합니다(실제
+소켓 위에서의 adapter 라우팅 + 요청/응답 보호). 이 스위트는 env-gated되어, 백엔드를
+가리키지 않으면 **스킵**합니다 — CI는 프로토콜 스텁을 상대로 실행합니다(실제 vLLM은
+GPU가 필요하고 GitHub 호스팅 러너에서 도달할 수 없습니다). 도달 가능한 호스트에서
+본인의 백엔드를 상대로 검증하려면:
+```bash
+HAECHI_VLLM_URL=http://VLLM_HOST:8000  HAECHI_VLLM_MODEL=<served-model> \
+HAECHI_OLLAMA_URL=http://OLLAMA_HOST:11434  HAECHI_OLLAMA_MODEL=<pulled-model> \
+  npm run test:inference:live
+```
+보유한 백엔드만 설정하십시오 — 각 테스트는 해당 URL이 설정되지 않으면 스킵합니다.
+본인의 호스트/IP를 사용하십시오(커밋하지 마십시오). 지속적으로 구동되는 실제 백엔드
+게이트가 필요하면, 해당 네트워크에 self-hosted 러너를 등록하고 그곳에서 스위트를
+트리거하십시오. GitHub 호스팅 러너는 사설 LAN에 도달할 수 없습니다.
+## 9. 빠른 참조
 | 작업 | 커맨드 |
 |---|---|

package/docs/current/operations-runbook.md CHANGED

@@ -225,7 +225,28 @@ default 2000), `HAECHI_BENCH_CONCURRENCY` (default 32), `HAECHI_BENCH_WARMUP`
 > and must **not** be quoted as guarantees. The bench is not run by
 > `release:preflight`.
-## 8. Quick reference
+## 8. Live upstream validation (real vLLM / Ollama)
+The `local-inference` integration suite proxies a request through to a **real**
+OpenAI-compatible (vLLM) and/or Ollama upstream and asserts the proxy round-trips
+correctly (adapter routing + request/response protection over a real socket). It
+is env-gated, so it **skips** unless you point it at a backend — CI runs it
+against a protocol stub (a real vLLM needs a GPU and is not reachable from a
+GitHub-hosted runner). To validate against your own backend from a host that can
+reach it:
+```bash
+HAECHI_VLLM_URL=http://VLLM_HOST:8000  HAECHI_VLLM_MODEL=<served-model> \
+HAECHI_OLLAMA_URL=http://OLLAMA_HOST:11434  HAECHI_OLLAMA_MODEL=<pulled-model> \
+  npm run test:inference:live
+```
+Set only the backend(s) you have — each test skips when its URL is unset. Use
+your own host/IP (do not commit it). For a continuously-exercised real-backend
+gate, register a self-hosted runner on that network and trigger the suite there;
+GitHub-hosted runners cannot reach a private LAN.
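The env-gating described in this section can be sketched as follows. The helper `selectLiveBackends` is hypothetical and is not taken from the `local-inference` suite; only the four environment variable names come from the runbook.

```javascript
// Hypothetical helper: decide which live backends to exercise from the
// environment. A backend is skipped when its URL is unset.
function selectLiveBackends(env) {
  const candidates = [
    { name: "vllm", url: env.HAECHI_VLLM_URL, model: env.HAECHI_VLLM_MODEL },
    { name: "ollama", url: env.HAECHI_OLLAMA_URL, model: env.HAECHI_OLLAMA_MODEL },
  ];
  return candidates.map((backend) => ({ ...backend, skip: !backend.url }));
}

// Example: only the Ollama backend is configured, so vLLM is marked skipped.
const plan = selectLiveBackends({
  HAECHI_OLLAMA_URL: "http://OLLAMA_HOST:11434",
  HAECHI_OLLAMA_MODEL: "<pulled-model>",
});
console.log(plan.map((backend) => `${backend.name}: ${backend.skip ? "skip" : "run"}`));
```

With a runner such as `node:test`, the `skip` flag could be passed as the test's `skip` option so unconfigured backends are reported as skipped rather than failed.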
+## 9. Quick reference
 | Task | Command |
 |---|---|

package/docs/current/release-process.ko.md CHANGED

@@ -27,7 +27,9 @@ npm run release:preflight:npm
 1. ✅ npmjs.com에서: package settings → Trusted Publisher → `raeseoklee/haechi` 저장소와 `npm-publish.yml` workflow 연결 (2026-06-10).
 2. ✅ `.github/workflows/npm-publish.yml` OIDC 인증 전환 (2026-06-10): `NODE_AUTH_TOKEN`과 `registry-url` 제거, runner의 npm CLI를 `>= 11.5.1`로 업그레이드.
-3. ✅ `haechi@0.4.0`으로 검증 완료 (2026-06-10): `npm view haechi --json`에서 SLSA provenance v1 predicate를 가진 `dist.attestations` 확인. 로컬 패스키로 배포한 `haechi@0.3.2`만 비증명 상태로 남습니다.
+3. ✅ `haechi@0.4.0`으로 검증 완료 (2026-06-10): `npm view haechi --json`에서 SLSA provenance v1 predicate를 가진 `dist.attestations` 확인.
+**비증명 버전(로컬 패스키 첫 발행):** `haechi@0.3.2`와 `haechi-ratelimit-redis@0.1.0`(2026-06-16)은 각각 로컬 머신에서 `--provenance=false`로 배포되어 두 버전의 provenance 증명이 존재하지 않습니다 — 둘 다 아직 존재하지 않던 패키지의 **이름을 확보하는 첫 발행**이었기 때문입니다(Trusted Publisher가 완전히 새로운 이름을 부트스트랩할 수 없는 이유는 §5 참조). 각 패키지의 이후 모든 버전은 OIDC workflow로 증명됩니다.
 provenance 없이 수행한 publish는 release note에 갭을 명시적으로 기록해야 합니다(`CONTRIBUTING.md` 참조).
@@ -80,10 +82,16 @@ Satellite는 npm workspaces 모노레포의 `satellites/*`에 살며 core와 **
 **satellite별 부트스트랩 순서(첫 발행, org 불필요):**
-1. npmjs.com에서 (아직 미발행) unscoped 이름(예: `haechi-crypto-kms`)에 **Trusted Publisher 설정**: `raeseoklee/haechi` 저장소와 satellite의 **정확한 워크플로 파일명**(예: `crypto-kms-publish.yml`)을 연결합니다. npm은 아직 발행 전인 이름에도 Trusted Publisher 설정을 허용합니다.
-2. 접두사 태그를 push하고 GitHub Release를 발행하면(예: `crypto-kms-v0.1.0`) 워크플로의 OIDC publish가 provenance와 함께 `0.1.0`을 생성하고 첫 발행 시 이름을 확보합니다.
+아직 존재하지 않는 이름에는 Trusted Publisher를 설정할 **수 없습니다** — npm은 **이미 존재하는** 패키지의 설정 페이지에서만 Trusted Publisher 설정을 노출합니다. 따라서 완전히 새로운 unscoped 이름은 두 단계 부트스트랩을 거칩니다: 먼저 수동 첫 발행으로 이름을 *생성하고 확보*한 뒤, Trusted Publisher를 설정하여 이후 모든 버전이 OIDC로 증명되게 합니다.
-노트북에서의 수동 `npm publish`는 필요 없습니다. 이름이 unscoped이고 비어있으므로 org-membership 선행 요건이 없습니다.
+1. **수동 첫 발행(이름 확보; 로컬, provenance 없음).** satellite 디렉터리에서, 패스키/WebAuthn 계정이 터미널 OTP 없이 인증되도록 브라우저로 인증한 뒤 provenance를 끄고 발행합니다(로컬 머신에는 OIDC id-token이 없어 증명할 수 없습니다).
+   ```bash
+   npm login --auth-type=web
+   cd satellites/<name> && npm publish --auth-type=web --provenance=false
+   ```
+   각 satellite `package.json`의 `publishConfig.access: "public"`이 unscoped 패키지를 public으로 만듭니다. 이 첫 버전은 **비증명**입니다 — §2 / `CONTRIBUTING.md`에 따라 갭을 기록하세요.
+2. **이제 패키지가 존재하므로 → Trusted Publisher 설정**: npmjs.com에서 package settings → Trusted Publisher → `raeseoklee/haechi` 저장소와 satellite의 **정확한 워크플로 파일명**(예: `crypto-kms-publish.yml`)을 연결합니다.
+3. **이후 모든 버전은 OIDC로 증명됩니다.** satellite `package.json`을 bump하고, 접두사 태그를 push한 뒤, GitHub Release를 발행하면(예: `crypto-kms-v0.1.1`) 워크플로의 OIDC publish가 provenance와 함께 해당 버전을 발행합니다. 이 시점부터는 노트북도 OTP도 필요 없습니다. 이름이 unscoped이고 비어있으므로 org-membership 선행 요건이 없습니다.
 **태그 → 워크플로 → 패키지 매핑:**
@@ -104,9 +112,9 @@ npm view haechi-crypto-kms --json   # dist.attestations 존재 확인; access "p
 **의존성 노트:** `haechi-crypto-kms`는 core를 zero-dependency로 유지합니다 — `@aws-sdk/client-kms`는 **optional peer dependency**이며, 실제 AWS 클라이언트를 쓰고 주입하지 않을 때만 lazy import됩니다. in-memory 또는 주입형 클라이언트를 쓰는 소비자는 SDK를 설치하지 않습니다. 0.2.0의 `./gcp`(`@google-cloud/kms`)와 `./azure`(`@azure/keyvault-keys` + `@azure/identity`) 백엔드도 동일한 optional-peer/lazy-import 모델을 따르며, `./vault` 백엔드는 optional peer가 없습니다(`node:` `fetch` 전용).
-**0.9 satellite(새 unscoped 이름 — 첫 태그 *전에* Trusted Publisher 설정):** `haechi-dashboard`와 `haechi-auth-oidc`는 0.9에서 첫 발행되며 위의 satellite별 부트스트랩 순서를 동일하게 따릅니다. 0.8 satellite와 마찬가지로 unscoped 이름은 첫 OIDC publish 시 확보되므로, 각각의 npmjs.com Trusted Publisher를 첫 태그 **전에** 설정해야 합니다 — `raeseoklee/haechi` 저장소와 정확한 워크플로 파일명(`haechi-dashboard`는 `dashboard-publish.yml`, `haechi-auth-oidc`는 `auth-oidc-publish.yml`)을 연결한 뒤, 접두사 태그(`dashboard-v0.1.0`, `auth-oidc-v0.1.0`)를 push하고 GitHub Release를 발행합니다. 기존 두 satellite는 이미 부트스트랩된 태그/워크플로를 그대로 사용합니다: `haechi-auth-jwt@0.2.0`은 `auth-jwt-v<semver>`(`auth-jwt-publish.yml`), `haechi-crypto-kms@0.2.0`은 `crypto-kms-v<semver>`(`crypto-kms-publish.yml`) — 이 둘은 새 Trusted Publisher 설정이 필요 없습니다.
+**0.9 satellite(새 unscoped 이름):** `haechi-dashboard`와 `haechi-auth-oidc`는 0.9에서 위의 두 단계 부트스트랩으로 첫 발행되었습니다 — 수동 첫 발행으로 각 이름을 확보한 뒤 Trusted Publisher를 설정했고, 그 이후 태그 릴리스(`dashboard-v<semver>`, `auth-oidc-v<semver>`)는 OIDC로 발행됩니다. 0.8 satellite 두 개는 이미 존재하므로 이미 부트스트랩된 태그/워크플로를 그대로 사용합니다: `haechi-auth-jwt`는 `auth-jwt-v<semver>`(`auth-jwt-publish.yml`), `haechi-crypto-kms`는 `crypto-kms-v<semver>`(`crypto-kms-publish.yml`) — 이 둘은 새 Trusted Publisher 설정이 필요 없습니다.
-**`haechi-ratelimit-redis`(새 unscoped 이름 — 첫 태그 *전에* Trusted Publisher 설정):** 공유 저장소 rate-limiter satellite는 고유의 `ratelimit-redis-v<semver>` 태그에서 첫 발행되며 위의 satellite별 부트스트랩 순서를 동일하게 따릅니다. unscoped 이름은 첫 OIDC publish 시 확보되므로, npmjs.com Trusted Publisher를 첫 태그 **전에** 설정해야 합니다 — `raeseoklee/haechi` 저장소와 정확한 워크플로 파일명 `ratelimit-redis-publish.yml`을 연결한 뒤, 접두사 태그(`ratelimit-redis-v0.1.0`)를 push하고 GitHub Release를 발행합니다. `redis` 클라이언트는 **optional peer dependency**이며 번들된 Redis 어댑터를 쓰는 소비자만 import합니다(store/client는 주입됩니다). 따라서 core는 zero-dependency로 유지됩니다.
+**`haechi-ratelimit-redis`(부트스트랩 2026-06-16):** 공유 저장소 rate-limiter satellite는 위의 두 단계 부트스트랩을 따랐습니다. `0.1.0`은 이름을 확보한 **수동 첫 발행**(로컬 패스키 web 인증, `--provenance=false`)이므로 **비증명**입니다(§2에 기록). 이후 Trusted Publisher(`ratelimit-redis-publish.yml`)를 설정했고, `0.1.1`부터의 모든 버전은 `ratelimit-redis-v<semver>` 태그 → 워크플로로 provenance와 함께 발행됩니다. `redis` 클라이언트는 **optional peer dependency**이며 번들된 Redis 어댑터를 쓰는 소비자만 import합니다(store/client는 주입됩니다). 따라서 core는 zero-dependency로 유지됩니다.
 ## 6. 배포 차단 조건

package/docs/current/release-process.md CHANGED

@@ -27,7 +27,9 @@ The intended publish path is GitHub Actions trusted publishing: npm authenticate
 1. ✅ On npmjs.com: package settings → Trusted Publisher → linked the `raeseoklee/haechi` repository and the `npm-publish.yml` workflow (2026-06-10).
 2. ✅ `.github/workflows/npm-publish.yml` authenticates via OIDC (2026-06-10): `NODE_AUTH_TOKEN` and `registry-url` removed, npm CLI upgraded to `>= 11.5.1` in the runner.
-3. ✅ Verified with `haechi@0.4.0` (2026-06-10): `npm view haechi --json` shows `dist.attestations` with a SLSA provenance v1 predicate. Only `haechi@0.3.2` remains unattested (published via local passkey).
+3. ✅ Verified with `haechi@0.4.0` (2026-06-10): `npm view haechi --json` shows `dist.attestations` with a SLSA provenance v1 predicate.
+**Unattested versions (local passkey first publishes):** `haechi@0.3.2` and `haechi-ratelimit-redis@0.1.0` (2026-06-16) were each published from a local machine with `--provenance=false`, so no provenance attestation exists for those two versions — both were the **name-claiming first publish** of a package that did not yet exist (see §5 on why a Trusted Publisher cannot bootstrap a brand-new name). Every later version of each package is attested via the OIDC workflow.
 Any publish performed without provenance must record the gap explicitly in the release notes (see `CONTRIBUTING.md`).
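The attestation check in step 3 can be scripted over the parsed output of `npm view <package>@<version> --json`. This is a sketch under an assumption: the document only confirms that `dist.attestations` carries a SLSA provenance predicate, and the exact field path `dist.attestations.provenance.predicateType` used below reflects the registry metadata shape as assumed here, so verify it against real output. The helper name `hasSlsaProvenance` is hypothetical.

```javascript
// Hypothetical check over already-parsed `npm view --json` metadata.
function hasSlsaProvenance(metadata) {
  // Assumed field path; a version published with --provenance=false is
  // expected to have no `dist.attestations` entry at all.
  const predicateType = metadata?.dist?.attestations?.provenance?.predicateType;
  return (
    typeof predicateType === "string" &&
    predicateType.startsWith("https://slsa.dev/provenance/")
  );
}

// Illustrative inputs, not real registry responses.
const attested = {
  dist: { attestations: { provenance: { predicateType: "https://slsa.dev/provenance/v1" } } },
};
const unattested = { dist: { tarball: "https://registry.example/pkg.tgz" } };

console.log(hasSlsaProvenance(attested), hasSlsaProvenance(unattested));
```

A release script could run this against each published version and record any unattested result in the release notes, as this section requires.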
@@ -80,10 +82,16 @@ Satellites live under `satellites/*` in the npm workspaces monorepo and publish
 **Per-satellite bootstrap order (first publish, no org needed):**
-1. On npmjs.com, **configure a Trusted Publisher** for the (not-yet-published) unscoped name (e.g. `haechi-crypto-kms`): link the `raeseoklee/haechi` repository and the satellite's **exact workflow filename** (e.g. `crypto-kms-publish.yml`). npm allows configuring a Trusted Publisher for a name you have not published yet.
-2. Push the prefixed tag and publish a GitHub Release (e.g. `crypto-kms-v0.1.0`) → the workflow's OIDC publish creates `0.1.0` with provenance and claims the name on first publish.
+A Trusted Publisher **cannot** be configured for a name that does not exist yet — npm only exposes the Trusted Publisher setting on an **existing** package's settings page. So a brand-new unscoped name has a two-phase bootstrap: a manual first publish to *create and claim* the name, then Trusted-Publisher configuration so every later version is OIDC-attested.
-No manual `npm publish` from a laptop is needed. Because the names are unscoped and free, there is no org-membership prerequisite.
+1. **Manual first publish (claims the name; local, no provenance).** From the satellite directory, authenticate via the browser so a passkey/WebAuthn account needs no terminal OTP, then publish with provenance off (a local machine has no OIDC id-token, so it cannot attest):
+   ```bash
+   npm login --auth-type=web
+   cd satellites/<name> && npm publish --auth-type=web --provenance=false
+   ```
+   `publishConfig.access: "public"` in each satellite's `package.json` makes the unscoped package public. This first version is **unattested** — record the gap per §2 / `CONTRIBUTING.md`.
+2. **Now the package exists → configure a Trusted Publisher** on npmjs.com: package settings → Trusted Publisher → link the `raeseoklee/haechi` repository and the satellite's **exact workflow filename** (e.g. `crypto-kms-publish.yml`).
+3. **Every subsequent version is OIDC-attested.** Bump the satellite `package.json`, push the prefixed tag, and publish a GitHub Release (e.g. `crypto-kms-v0.1.1`) → the workflow's OIDC publish ships that version with provenance. No laptop and no OTP from here on. Because the names are unscoped and free, there is no org-membership prerequisite.
 **Tag → workflow → package mapping:**
@@ -104,9 +112,9 @@ npm view haechi-crypto-kms --json   # dist.attestations present; access "public"
 **Dependency note:** `haechi-crypto-kms` keeps core zero-dependency — `@aws-sdk/client-kms` is an **optional peer dependency**, imported lazily only when a real AWS client is used and not injected. Consumers who use the in-memory or an injected client never install the SDK. The 0.2.0 `./gcp` (`@google-cloud/kms`) and `./azure` (`@azure/keyvault-keys` + `@azure/identity`) backends follow the same optional-peer/lazy-import model; the `./vault` backend has zero optional peer (`node:` `fetch` only).
-**0.9 satellites (new unscoped names — configure Trusted Publisher *before* the first tag):** `haechi-dashboard` and `haechi-auth-oidc` are first-published in 0.9 and follow the same per-satellite bootstrap order above. As with the 0.8 satellites, the unscoped name is claimed on first OIDC publish, so the npmjs.com Trusted Publisher for each must be configured **before** its first tag — link `raeseoklee/haechi` and the exact workflow filename (`dashboard-publish.yml` for `haechi-dashboard`, `auth-oidc-publish.yml` for `haechi-auth-oidc`), then push the prefixed tag (`dashboard-v0.1.0`, `auth-oidc-v0.1.0`) and publish the GitHub Release. The two existing satellites ride their already-bootstrapped tags/workflows: `haechi-auth-jwt@0.2.0` on `auth-jwt-v<semver>` (`auth-jwt-publish.yml`) and `haechi-crypto-kms@0.2.0` on `crypto-kms-v<semver>` (`crypto-kms-publish.yml`) — no new Trusted Publisher configuration is required for those two.
+**0.9 satellites (new unscoped names):** `haechi-dashboard` and `haechi-auth-oidc` were first-published in 0.9 via the two-phase bootstrap above — a manual first publish to claim each name, then the Trusted Publisher, after which their tagged releases (`dashboard-v<semver>`, `auth-oidc-v<semver>`) publish via OIDC. The two 0.8 satellites already exist and ride their already-bootstrapped tags/workflows: `haechi-auth-jwt` on `auth-jwt-v<semver>` (`auth-jwt-publish.yml`) and `haechi-crypto-kms` on `crypto-kms-v<semver>` (`crypto-kms-publish.yml`) — no new Trusted Publisher configuration is required for those two.
-**`haechi-ratelimit-redis` (new unscoped name — configure Trusted Publisher *before* the first tag):** the shared-store rate-limiter satellite is first-published from its own `ratelimit-redis-v<semver>` tag and follows the same per-satellite bootstrap order above. The unscoped name is claimed on its first OIDC publish, so its npmjs.com Trusted Publisher must be configured **before** its first tag — link `raeseoklee/haechi` and the exact workflow filename `ratelimit-redis-publish.yml`, then push the prefixed tag (`ratelimit-redis-v0.1.0`) and publish the GitHub Release. The `redis` client is an **optional peer dependency**, imported only by consumers using the bundled Redis adapter (the store/client is injected), so core stays zero-dependency.
+**`haechi-ratelimit-redis` (bootstrapped 2026-06-16):** the shared-store rate-limiter satellite followed the two-phase bootstrap above. `0.1.0` was the **manual first publish** (local passkey web auth, `--provenance=false`) that claimed the name — so it is **unattested** (recorded in §2). The Trusted Publisher (`ratelimit-redis-publish.yml`) was then configured, and every version from `0.1.1` on is published via the `ratelimit-redis-v<semver>` tag → workflow with provenance. The `redis` client is an **optional peer dependency**, imported only by consumers using the bundled Redis adapter (the store/client is injected), so core stays zero-dependency.
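The tag → workflow → package convention used throughout this section can be sketched as a small resolver. The naming rule is inferred from the examples in the text (`crypto-kms-v<semver>` → `crypto-kms-publish.yml` → `haechi-crypto-kms`, and likewise for the other satellites); the helper `resolveSatelliteTag` is hypothetical and is not part of the repository's workflows.

```javascript
// Hypothetical resolver: map a prefixed satellite tag to its workflow file and
// package name, following the pattern shown in the examples above.
function resolveSatelliteTag(tag) {
  const match = /^([a-z0-9-]+)-v(\d+\.\d+\.\d+)$/.exec(tag);
  if (!match) return null; // e.g. a core tag such as "v1.3.2" has no prefix
  const [, prefix, version] = match;
  return {
    packageName: `haechi-${prefix}`,
    workflow: `${prefix}-publish.yml`,
    version,
  };
}

console.log(resolveSatelliteTag("ratelimit-redis-v0.1.1"));
console.log(resolveSatelliteTag("v1.3.2"));
```

The workflow filename matters because the Trusted Publisher configuration links the repository to the satellite's exact workflow filename.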
 ## 6. Deployment block conditions

package/docs/current/risk-register-release-gate.ko.md CHANGED

@@ -14,9 +14,9 @@ Haechi는 `1.x` stable 라인을 출시했습니다. developer preview 게이트
 | 구분 | 판단 | 이유 |
 |---|---|---|
 | GitHub public | 허용 | 보안 한계, threat model, shared responsibility가 문서화됨 |
-| GitHub release/tag | 허용 (`v1.3.1` 릴리스됨) | `v1.3.1` 보완 컷이 태깅·릴리스됨; §5.7 항목이 모두 Resolved이고 G9은 Pass |
-| npm stable | `haechi@1.3.1` publish됨 | 코드리뷰 보완이 `haechi@1.3.1` attested OIDC publish(2026-06-16)로 발행됨; 이전 `1.3.0`은 수정 이전 동작을 담고 있음 |
-| production use | 운영자 게이트; `1.3.1`로 업그레이드 | 운영자 네트워크 통제, 인가/인증, key custody가 있을 때만 지원; `haechi@1.3.0` 운영자는 민감한 제3자 업스트림 트래픽을 프록시로 라우팅하기 전에 프록시 헤더 경계 수정(P0-CR-001)을 반영하도록 `1.3.1`로 업그레이드해야 함 |
+| GitHub release/tag | 허용 (`v1.3.2` 릴리스됨) | `v1.3.2` CR2 보완 컷이 태깅·릴리스됨; §5.7 및 §5.8(`CR2-001..008`) 항목이 모두 Resolved이고 G9/G10은 Pass |
+| npm stable | `haechi@1.3.2` publish됨 | CR2 보완이 `haechi@1.3.2` attested OIDC publish(2026-06-16)로 발행됨; 이전 `1.3.1`은 CR2 수정 이전 동작을 담고 있음 |
+| production use | 운영자 게이트; `1.3.2`로 업그레이드 | 운영자 네트워크 통제, 인가/인증, key custody가 있을 때만 지원; `haechi@1.3.1` 운영자는 민감한 제3자 업스트림 트래픽을 프록시로 라우팅하기 전에 CR2 수정(특히 `CR2-001` 프록시 upstream-cancel과 `CR2-002` token-vault audit hygiene)을 반영하도록 `1.3.2`로 업그레이드해야 함 |
 ## 2. 릴리스 게이트
@@ -32,6 +32,7 @@ Haechi는 `1.x` stable 라인을 출시했습니다. developer preview 게이트
 | G7 | 1.2.0 신뢰성 강화 트랙 (WS1–WS6) | 탐지 품질 측정+강화(WS2: 라벨 코퍼스 precision/recall `bench:detection` 게이트, 자격증명+국제 PII 커버리지, 하드블록 타입 불변식이 적용된 `filters.minConfidence` / `filters.allowlist`, offset 무결성을 갖춘 NFKC 유니코드 회피 폴딩); WS3 주입 가능한 `rateLimiter` 시임 + bounded fixed-window map; WS4 운영성(`/__haechi/live`+`/ready` 분리, 주입 가능한 `/metrics`, 구조적 로그 + 요청별 `correlationId`, graceful drain, max-in-flight backpressure, env overlay, 하드닝 Dockerfile/compose/runbook, `configVersion`); WS6 proxy TLS / remote-bind 하드닝(`proxy.tls` / `proxy.trustForwardedProto`, fail-closed `assertSafeProxyTransport`) + OWASP-LLM/NIST 컨트롤 매핑 백서 + RFC 9116 `security.txt` + 취약점 공개 경로. 모든 변경은 1.1 동작을 보존하는 기본값 뒤의 additive(`tests/api-contract.test.mjs` 통과); no-plaintext-in-audit 불변식이 텔레메트리까지 확장; core는 zero runtime dependency 유지; core 1.2.0 bump(additive 마이너) | Pass |
 | G8 | 1.3.0 백엔드 + 탐지 커버리지 확장 | **Anthropic Messages API**(`/v1/messages`, content-block + SSE `delta.text`, `event:` 라인 보존 재직렬화)와 **Google Gemini API**(model-in-path `:generateContent`/`:streamGenerateContent`, 기존 정확-매칭 어댑터를 바이트 동일하게 두는 additive `:method`-suffix 라우트 매처) 프로토콜 어댑터 추가; 탐지 커버리지 확장 — 클라우드/SaaS provider 키(OpenAI/Anthropic/Google-OAuth/SendGrid/Twilio/npm/Azure, anchored)와 국제 PII(FR/ES/JP + IT/SG/IN/DE/NL 국가 ID, 체크섬 validator), 각 하드블록-대-dial-eligible 결정은 측정된 충돌률 기반(하드블록은 비숫자 앵커 또는 비현실적으로 드문 형태가 필요; 흔한 길이의 bare-digit run은 allowlist로 정리 가능 유지); `bench:throughput` proxy 부하 벤치; `haechi-ratelimit-redis` 공유 저장소 rate-limiter 위성(WS3 시임의 운영 소비자; proxy가 이제 `rateLimiter.allow`를 `await`); `haechi-dashboard`가 요청별 `correlationId` 노출. 모든 변경은 additive — 새 `target.type`/탐지타입/`privacy.profile` *값*이며 새 config 키가 아님(`configVersion`은 `1` 유지); `tests/api-contract.test.mjs` 통과; core는 zero runtime dependency 유지; core 1.3.0 bump(additive 마이너) | Pass |
 | G9 | 2026-06-16 전체 코드리뷰 보완 게이트 (1.3.1로 발행) | `P0-CR-001` 및 `P1-CR-002`부터 `P1-CR-005`까지 해결 또는 책임자 명시 수용; P2 항목은 해결 또는 명시적 non-blocking 근거와 일정 기록; 연결된 등록부 갱신. **13개 `P*-CR-*` 항목이 모두 Resolved이며(§5.7) `haechi@1.3.1`(2026-06-16, attested OIDC publish)로 발행되었습니다; core가 1.3.0 → 1.3.1로 bump(patch, 보완 전용 — API/config 표면 변경 없음, `configVersion`은 `1` 유지)되었습니다.** | Pass (`haechi@1.3.1`, 2026-06-16) |
+| G10 | 2026-06-16 코드리뷰 round 2 (CR2) 보완 게이트 | CR2 등록부(`code-review-risk-register-2026-06-16-round2.md`, §5.8)는 **P0/P1을 발견하지 못했습니다**; 세 개의 P2(`CR2-001` 프록시 upstream-cancel, `CR2-002` token-vault audit hygiene, `CR2-003` plugin IPC reply 경계)와 P3 묶음(`CR2-004..008`)이 모두 **Resolved이며 `haechi@1.3.2`로 발행되었고**(`CR2-009` won't-fix, `CR2-010` accepted) 연결된 등록부가 갱신되었습니다. | Pass (`haechi@1.3.2`, 2026-06-16) |
 ## 3. P0 배포 차단 리스크 상태
@@ -151,6 +152,23 @@ base64/인코딩 값 디코딩 검사, query string 검사, audit tail truncatio
 | P2-CR-012 | KMS vault IPv6 loopback carve-out의 IPv6 테스트 부족 | Resolved | `satellites/crypto-kms/vault.test.mjs`에 전용 IPv6 loopback 정책 테스트("…enforces the IPv6 loopback policy (::1, [::1], dotted + hex mapped) — P2-CR-012")를 추가해 bare `::1`, bracketed `[::1]`, dotted `::ffff:127.0.0.1`, hex `::ffff:7f00:1`/`::ffff:7f00:0001`(및 bracketed 변형)을 검증하고, 공인 mapped 주소(`::ffff:8.8.8.8`/`::ffff:808:808`)가 과차단되지 않음을 단언; 확장된 range table과 `ssrf-parity.test.mjs`가 auth-jwt와의 dotted+hex 일치를 고정 |
 | P2-CR-013 | SSE multi-line `data:` 필드를 newline separator 없이 합침 | Resolved | `parseFrame`이 여러 `data:` line을 `join("\n")`(스펙 separator)으로 합치고 line별 스펙 선행 공백 1개만 제거; multi-line JSON은 여전히 `JSON.parse`되고 multi-line plain text는 newline과 함께 재구성되어 검사되며 `serializeTextFrame`가 multi-line payload를 여러 `data:` line으로 재방출; `tests/stream-filter.test.mjs`가 multi-line JSON event와 PII 포함 multi-line plain-text event를 커버 |
+## 5.8 2026-06-16 코드리뷰 Round 2 (CR2) 상태 — 게이트 G10
+권위 있는 항목별 등록부는 `docs/current/code-review-risk-register-2026-06-16-round2.md`입니다; 이 절은 릴리스 게이트 요약입니다. 1.3.1 컷 이후 진행한 2차 심층 리뷰는 **P0도 P1도 발견하지 못했습니다**(외부에서 P1로 보고된 두 항목 모두 검증 결과 P2로 내려갔습니다 — 둘 다 stored-plaintext leak도, auth/SSRF 우회도 아닙니다). 세 개의 P2 + P3 묶음(`CR2-001..008`)은 **Resolved이며 `haechi@1.3.2`로 발행되었습니다**; 보고된 한 항목은 **false positive**(`CR2-009`, won't-fix)였고 한 항목은 **이미 문서화된 수용 잔여 리스크**(`CR2-010`, accepted)였습니다. **G10은 Pass입니다.**
+| ID | 리스크 | 상태 | 종료에 필요한 증거 |
+|---|---|---|---|
+| CR2-001 | pass-through streaming이 downstream disconnect 시 upstream reader를 절대 취소하지 않음(`pipeUpstreamBodyBounded`가 `drain`에서 영원히 park) — 인증되지 않은 resource leak | Resolved | per-request `AbortController` + upstream reader를 취소하고 fetch를 abort하는 클라이언트 `close`/`aborted` listener; `drain` 대기를 `close`와 race; 스트림 도중 disconnect가 reader를 즉시 취소하는 회귀 테스트 |
+| CR2-002 | token-vault reveal/purge가 호출자 제공 raw `token` + `error.message`(token interpolate됨)를 audit event에 기록; `FORBIDDEN_KEYS`는 key 이름으로만 제거 | Resolved | 일반화된 오류 메시지; 기록 이전에 `token`을 keyed-HMAC하거나 `tok_` 형태로 검증; `error.message` 대신 enum `reasonCode`; raw token이 `reason`/`token`에 도달하지 않는다는 회귀 테스트; 불변식 표현 정합화 |
+| CR2-003 | plugin IPC reply가 `JSON.parse` 이전에 size-bound되지 않음; process child에 heap cap 없음 → 적대적 signed plugin으로 인한 event-loop 정지 + 메모리 급증 | Resolved | 두 sandbox 모두에서 parse 이전 reply byte-length 검사(oversized를 deny로 drop); 새 `resourceLimits` knob을 통한 process child의 `--max-old-space-size` heap cap; oversized-reply fixture 회귀 테스트 |
+| CR2-004 | `sanitizeResponseHeaders`가 변환된 응답에 stale body-coupled validator(`etag`/`content-md5`/`digest`/`last-modified`)를 유지 | Resolved | 모든 body-mutating 경로에서 해당 헤더 drop + `cache-control: no-store`; 변경된 응답이 upstream `ETag`를 drop하는 테스트 |
+| CR2-005 | `maxBytes` 초과 request body가 (유한한) Node `requestTimeout`까지 read-and-discard됨 — socket teardown 없음 | Resolved | 413 경로에서 `request.pause()`/`destroy()`(또는 `Connection: close`); 선택적으로 non-null 기본 timeout |
+| CR2-006 | `mcp-wrap --stderr filter`가 라인 지향이라 newline-split secret이 회피함(본질적; single-line secret은 잡힘, `drop` 사용 가능) | Resolved | `COMMAND_HELP` + 등록부 노트; 고민감 도구에 `--stderr drop` 권장 |
+| CR2-007 | README가 mcp-wrap "stderr ... pass through"라고 하지만 기본값은 이제 `--stderr filter` | Resolved | README + `README.ko.md` 수정 |
+| CR2-008 | README streaming split-match 주장이 범위 한정 없음(cross-frame buffering은 delta 채널만) | Resolved | README 두 구절 + `README.ko.md`를 delta 채널로 한정 |
+| CR2-009 | (보고된 P2) credential `maxMessageBytes` 검사 이후 append된 `keyMaterial` | Won't fix (FALSE POSITIVE) | `keyMaterial`은 운영자 통제 + fetcher `maxBytes`로 hard-bound; 공격자 증폭 없음 — 선택적 cosmetic re-assert만 |
+| CR2-010 | (보고된 P2) 두 NON-JSON SSE frame에 걸쳐 분할된 secret 미포착 | Accepted (documented) | round-1 `P1-CR-005`, `threat-model.md`, in-code comment에 이미 범위 외; JSON delta 채널은 `maxMatchBytes`까지 buffering함 |
 ## 6. P2 제품/문서 리스크 상태
 | ID | 기존 리스크 | 상태 | 해소 증거 |
@@ -164,7 +182,7 @@ base64/인코딩 값 디코딩 검사, query string 검사, audit tail truncatio
 이 체크리스트는 `1.x` stable 라인의 모든 릴리스에 대한 상시 배포 전 템플릿이며, `0.3.2` developer preview에서 처음 적용되었습니다. 그 결과를 아래에 참조 기록으로 보존합니다.
-2026-06-16 현재 상태: G9은 `Pass`입니다 — 코드리뷰 보완이 `haechi@1.3.1`로 발행되었습니다. 이 체크리스트는 해당 컷에 대해 해제되었습니다.
+2026-06-16 현재 상태: G9은 `Pass`입니다(round-1 보완이 `haechi@1.3.1`로 발행됨). 게이트 **G10**(CR2, §5.8)은 이제 `Pass`입니다 — CR2 P2 + P3 묶음(`CR2-001..008`)이 Resolved이며 `haechi@1.3.2`로 발행되었으므로, 그 컷에 대해 이 체크리스트가 해제되었습니다.
 외부 npm 게이트 확인 결과(`0.3.2` developer preview, 2026-06-10, 배포 후)는 다음과 같습니다.

package/docs/current/risk-register-release-gate.md CHANGED

@@ -14,9 +14,9 @@ Haechi has shipped its `1.x` stable line. The developer-preview gate (G2, `haech
 | Category | Judgment | Rationale |
 |---|---|---|
 | GitHub public | Allowed | Security limitations, threat model, and shared responsibility are documented |
-| GitHub release/tag | Allowed (`v1.3.1` released) | The `v1.3.1` remediation cut is tagged and released; all §5.7 findings are Resolved and G9 is Pass |
-| npm stable | `haechi@1.3.1` published | The code-review remediation shipped in the `haechi@1.3.1` attested OIDC publish (2026-06-16); the prior `1.3.0` carries the pre-fix behavior |
-| Production use | Operator-gated; upgrade to `1.3.1` | Supported only with operator network controls, authz/authn, and key custody; operators on `haechi@1.3.0` should upgrade to `1.3.1` to pick up the proxy header-boundary fix (P0-CR-001) before routing sensitive third-party upstream traffic through the proxy |
+| GitHub release/tag | Allowed (`v1.3.2` released) | The `v1.3.2` CR2 remediation cut is tagged and released; all §5.7 and §5.8 (`CR2-001..008`) findings are Resolved and G9/G10 are Pass |
+| npm stable | `haechi@1.3.2` published | The CR2 remediation shipped in the `haechi@1.3.2` attested OIDC publish (2026-06-16); the prior `1.3.1` carries the pre-CR2-fix behavior |
+| Production use | Operator-gated; upgrade to `1.3.2` | Supported only with operator network controls, authz/authn, and key custody; operators on `haechi@1.3.1` should upgrade to `1.3.2` to pick up the CR2 fixes (notably the `CR2-001` proxy upstream-cancel and `CR2-002` token-vault audit hygiene) before routing sensitive third-party upstream traffic through the proxy |
 ## 2. Release Gates
@@ -32,6 +32,7 @@ Haechi has shipped its `1.x` stable line. The developer-preview gate (G2, `haech
 | G7 | 1.2.0 Reliability Hardening Track (WS1–WS6) | Detection quality measured + tightened (WS2: a labeled-corpus precision/recall `bench:detection` gate, credential + international-PII coverage, `filters.minConfidence` / `filters.allowlist` with the hard-block-types invariant, NFKC unicode-evasion folding with offset-integrity); WS3 injectable `rateLimiter` seam + bounded fixed-window map; WS4 operability (`/__haechi/live`+`/ready` split, injectable `/metrics`, structured logs + per-request `correlationId`, graceful drain, max-in-flight backpressure, env overlay, hardened Dockerfile/compose/runbook, `configVersion`); WS6 proxy TLS / remote-bind hardening (`proxy.tls` / `proxy.trustForwardedProto`, fail-closed `assertSafeProxyTransport`) + OWASP-LLM/NIST control-mapping whitepaper + RFC 9116 `security.txt` + vulnerability-disclosure path. Every change is additive behind 1.1-preserving defaults (`tests/api-contract.test.mjs` green); the no-plaintext-in-audit invariant extends to telemetry; core stays zero runtime dependency; core bumped to 1.2.0 (additive minor) | Pass |
 | G8 | 1.3.0 backend + detection coverage expansion | New protocol adapters for the **Anthropic Messages API** (`/v1/messages`, content-block + SSE `delta.text` with `event:`-line-preserving re-serialize) and the **Google Gemini API** (model-in-path `:generateContent`/`:streamGenerateContent` via an additive `:method`-suffix route matcher that leaves the exact-match adapters byte-identical); detection coverage expansion — cloud/SaaS provider keys (OpenAI/Anthropic/Google-OAuth/SendGrid/Twilio/npm/Azure, anchored) and international PII (FR/ES/JP + IT/SG/IN/DE/NL national IDs with checksum validators), each hard-block-vs-dial-eligible decision driven by measured collision rates (a non-numeric anchor or implausibly-rare shape is required for hard-block; a bare-digit run over a common length stays allowlist-clearable); a `bench:throughput` proxy load benchmark; the `haechi-ratelimit-redis` shared-store rate-limiter satellite (the WS3 seam's production consumer; the proxy now `await`s `rateLimiter.allow`); `haechi-dashboard` surfaces the per-request `correlationId`. Every change is additive — new `target.type`/detection-type/`privacy.profile` *values*, not new config keys (`configVersion` stays `1`); `tests/api-contract.test.mjs` green; core stays zero runtime dependency; core bumped to 1.3.0 (additive minor) | Pass |
 | G9 | 2026-06-16 full code-review remediation gate (shipped in 1.3.1) | `P0-CR-001` and `P1-CR-002` through `P1-CR-005` resolved or formally accepted; P2 items either resolved or scheduled with explicit non-blocking rationale; linked register updated. **All 13 `P*-CR-*` findings are Resolved (§5.7) and shipped in `haechi@1.3.1` (2026-06-16, attested OIDC publish); core bumped 1.3.0 → 1.3.1 (patch, remediation-only — no API/config surface change, `configVersion` stays `1`).** | Pass (`haechi@1.3.1`, 2026-06-16) |
+| G10 | 2026-06-16 code-review round 2 (CR2) remediation gate | The CR2 register (`code-review-risk-register-2026-06-16-round2.md`, §5.8) found **no P0/P1**; its three P2s (`CR2-001` proxy upstream-cancel, `CR2-002` token-vault audit hygiene, `CR2-003` plugin IPC reply bound) plus the P3 cluster (`CR2-004..008`) are all **Resolved and shipped in `haechi@1.3.2`** (`CR2-009` won't-fix, `CR2-010` accepted) and the linked register is updated. | Pass (`haechi@1.3.2`, 2026-06-16) |
 ## 3. P0 Distribution-Blocking Risk Status
@@ -159,6 +160,23 @@ The authoritative itemized register is `docs/current/code-review-risk-register-2
 | P2-CR-012 | KMS vault IPv6 loopback carve-out lacks IPv6-focused tests | Resolved | `satellites/crypto-kms/vault.test.mjs` adds a dedicated IPv6 loopback policy test ("…enforces the IPv6 loopback policy (::1, [::1], dotted + hex mapped) — P2-CR-012") covering bare `::1`, bracketed `[::1]`, dotted `::ffff:127.0.0.1`, and hex `::ffff:7f00:1`/`::ffff:7f00:0001` (plus bracketed variants), and asserts a public mapped address (`::ffff:8.8.8.8`/`::ffff:808:808`) is NOT over-blocked; the extended range table and `ssrf-parity.test.mjs` lock the dotted+hex agreement with auth-jwt |
 | P2-CR-013 | SSE multi-line `data:` fields are joined without newline separators | Resolved | `parseFrame` joins multiple `data:` lines with `join("\n")` (spec separator) and strips only the single spec leading space per line; multi-line JSON still `JSON.parse`s, multi-line plain text is reconstructed with newlines for inspection, and `serializeTextFrame` re-emits a multi-line payload as multiple `data:` lines; `tests/stream-filter.test.mjs` covers a multi-line JSON event and a multi-line plain-text event with PII |
+## 5.8 2026-06-16 Code Review Round 2 (CR2) Status — gate G10
+The authoritative itemized register is `docs/current/code-review-risk-register-2026-06-16-round2.md`; this is the release-gate summary. A second deep review after the 1.3.1 cut found **no P0 and no P1** (the two externally-reported P1s both verified down to P2 — neither is a stored-plaintext leak or an auth/SSRF bypass). The three P2s + the P3 cluster (`CR2-001..008`) are **Resolved and shipped in `haechi@1.3.2`**; one reported item was a **false positive** (`CR2-009`, won't-fix) and one is an **already-documented accepted residual** (`CR2-010`, accepted). **G10 is Pass.**
+| ID | Risk | Status | Required closure evidence |
+|---|---|---|---|
+| CR2-001 | Pass-through streaming never cancels the upstream reader on downstream disconnect (`pipeUpstreamBodyBounded` parks on `drain` forever) — unauthenticated resource leak | Resolved | Per-request `AbortController` + client `close`/`aborted` listener that cancels the upstream reader and aborts the fetch; `drain` wait raced against `close`; regression test that a mid-stream disconnect cancels the reader promptly |
+| CR2-002 | Token-vault reveal/purge writes the raw caller-supplied `token` + `error.message` (token-interpolated) into the audit event; `FORBIDDEN_KEYS` strips by key name only | Resolved | Generic error messages; keyed-HMAC or `tok_`-shape-validate the `token` before recording; enum `reasonCode` instead of `error.message`; regression test that no raw token reaches `reason`/`token`; reconcile the invariant wording |
+| CR2-003 | Plugin IPC reply not size-bounded before `JSON.parse`; process child has no heap cap → event-loop stall + memory spike from a hostile signed plugin | Resolved | Reply byte-length check before parse in both sandboxes (drop oversized as deny); `--max-old-space-size` heap cap on the process child via a new `resourceLimits` knob; regression test with an oversized-reply fixture |
+| CR2-004 | `sanitizeResponseHeaders` keeps stale body-coupled validators (`etag`/`content-md5`/`digest`/`last-modified`) on a transformed response | Resolved | Drop those headers on every body-mutating path + `cache-control: no-store`; test that a mutated response drops the upstream `ETag` |
+| CR2-005 | Over-`maxBytes` request body is read-and-discarded until the (finite) Node `requestTimeout` — no socket teardown | Resolved | `request.pause()`/`destroy()` (or `Connection: close`) on the 413 path; optionally non-null default timeouts |
+| CR2-006 | `mcp-wrap --stderr filter` is line-oriented, so a newline-split secret evades it (inherent; single-line secrets caught, `drop` available) | Resolved | `COMMAND_HELP` + register note; recommend `--stderr drop` for high-sensitivity tools |
+| CR2-007 | README says mcp-wrap "stderr ... pass through" but the default is now `--stderr filter` | Resolved | Correct README + `README.ko.md` |
+| CR2-008 | README streaming split-match claim is unscoped (cross-frame buffering is delta-channel only) | Resolved | Scope both README passages + `README.ko.md` to the delta channel |
+| CR2-009 | (reported P2) `keyMaterial` appended after the credential `maxMessageBytes` check | Won't fix (FALSE POSITIVE) | `keyMaterial` is operator-controlled + hard-bounded by the fetcher `maxBytes`; no attacker amplification — optional cosmetic re-assert only |
+| CR2-010 | (reported P2) secret split across two NON-JSON SSE frames not caught | Accepted (documented) | Already out-of-scope in round-1 `P1-CR-005`, `threat-model.md`, and an in-code comment; the JSON delta channel does buffer up to `maxMatchBytes` |
 ## 6. P2 Product/Documentation Risk Status
 | ID | Risk | Status | Resolution evidence |
@@ -172,7 +190,7 @@ The authoritative itemized register is `docs/current/code-review-risk-register-2
 This checklist is the standing pre-distribution template for every release on the `1.x` stable line; it was first exercised for the `0.3.2` developer preview, whose results are retained below as the reference record.
-Current 2026-06-16 status: G9 is `Pass` — the code-review remediation shipped in `haechi@1.3.1`. This checklist is cleared for that cut.
+Current 2026-06-16 status: G9 is `Pass` (round-1 remediation shipped in `haechi@1.3.1`). Gate **G10** (CR2, §5.8) is now `Pass` — the CR2 P2s + P3 cluster (`CR2-001..008`) are Resolved and shipped in `haechi@1.3.2`, so the checklist is cleared for that cut.
 External npm gate check results (`0.3.2` developer preview, 2026-06-10, post-publish):

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "haechi",
-  "version": "1.3.1",
+  "version": "1.3.2",
   "description": "Self-hosted AI context enforcement across LLM, MCP, vLLM, Ollama, and agent traffic — a stable, zero-dependency security gateway.",
   "license": "Apache-2.0",
   "type": "module",
@@ -66,6 +66,7 @@
   ],
   "scripts": {
     "test": "node --test",
+    "test:inference:live": "node --test tests/local-inference.integration.test.mjs",
     "check:types": "tsc -p jsconfig.json --noEmit",
     "pack:dry": "npm pack --dry-run",
     "scan:stale-names": "node scripts/stale-name-scan.mjs",

package/packages/cli/bin/haechi.mjs CHANGED

@@ -766,7 +766,7 @@ const COMMAND_HELP = {
   "mcp-wrap": {
     usage: "haechi mcp-wrap [--config haechi.config.json] [--stderr filter|drop|inherit] -- <command> [args...]",
     summary: "Wrap an MCP server with bidirectional stdio protection.",
-    detail: "Spawns <command>, applies the method allowlist + params protection client→server, and result protection + injection heuristics server→client. Drop-in for MCP client configs. --stderr controls the child's stderr: filter (default) protects each line with the same policy before re-emitting, drop discards it, inherit passes it through raw (an explicit, opt-in local-process boundary). filter follows the configured policy mode — in dry-run/report-only it detects but does not transform (like the rest of the pipeline), so set policy.mode=enforce for stderr redaction to take effect."
+    detail: "Spawns <command>, applies the method allowlist + params protection client→server, and result protection + injection heuristics server→client. Drop-in for MCP client configs. --stderr controls the child's stderr: filter (default) protects each line with the same policy before re-emitting, drop discards it, inherit passes it through raw (an explicit, opt-in local-process boundary). filter follows the configured policy mode — in dry-run/report-only it detects but does not transform (like the rest of the pipeline), so set policy.mode=enforce for stderr redaction to take effect. filter protects each COMPLETE line independently, so it cannot catch a secret a child deliberately splits across a newline; use drop for high-sensitivity tools."
   },
   auth: {
     usage: "haechi auth add --type user|service|agent [--scope k:v ...] [--label k=v ...]\n  haechi auth list [--config haechi.config.json]\n  haechi auth revoke <id> [--config haechi.config.json]",
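
The limitation added to the `mcp-wrap` help text above can be illustrated with a small sketch. This is not Haechi's filter: `SECRET_PATTERN` and `filterStderrLines` are hypothetical names, and the key shape is made up for the demonstration. It only shows why a filter that inspects each complete line independently cannot see a value that is split across a newline.

```javascript
// Hypothetical detector: a made-up "sk-" key shape, NOT a real Haechi rule.
const SECRET_PATTERN = /sk-[A-Za-z0-9]{16}/g;

// Protect each COMPLETE line on its own, the way a line-oriented filter does.
function filterStderrLines(text) {
  return text
    .split("\n")
    .map((line) => line.replace(SECRET_PATTERN, "[REDACTED]"))
    .join("\n");
}

// The whole key sits on one line, so the per-line pattern matches it.
const singleLine = filterStderrLines("token=sk-abcdefgh12345678 loaded");

// The same key split by a newline: neither half matches on its own.
const splitAcrossLines = filterStderrLines("token=sk-abcdefgh\n12345678 loaded");

console.log(singleLine);
console.log(JSON.stringify(splitAcrossLines));
```

In this sketch the split value passes through untouched, which is why the help text recommends `--stderr drop` for high-sensitivity tools.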

package/packages/cli/runtime.mjs CHANGED

@@ -1022,7 +1022,11 @@ function resolveAuthProvider(config, providers, cryptoProvider, auditSink) {
       return createProcessIsolatedAuthProviderSync({
         ...common,
         netEnforcement: plugin.netEnforcement ?? "require-permission",
-        keyMaterial: plugin.keyMaterial ?? null
+        keyMaterial: plugin.keyMaterial ?? null,
+        // CR2-003: reuse the worker's resourceLimits.maxOldGenerationSizeMb knob to
+        // cap the child heap. Optional for the process runtime (the sandbox defaults
+        // when absent), so pass it through whether or not the config supplied it.
+        resourceLimits: plugin.resourceLimits ?? null
       });
     }
     return createSandboxedAuthProviderSync({ ...common, resourceLimits: plugin.resourceLimits });

package/packages/plugin/process-sandbox.mjs CHANGED

@@ -44,8 +44,19 @@ import {
 // The child flags. `--permission` enables the deny-by-default Node permission
 // model; we pass NO --allow-* grant, so fs/child-process/worker/addons/wasi/net
 // are all kernel-denied. `--disable-proto=delete` removes Object.prototype.__proto__.
+// A `--max-old-space-size=<mb>` heap cap is appended PER-SPAWN (see spawnAndLoad):
+// unlike the worker (resourceLimits OOMs a runaway), a process child has NO heap
+// cap by default, so a hostile/buggy signed plugin could build a reply up to the
+// child's default V8 heap. The cap bounds the child; the host-side reply-size bound
+// (CR2-003) bounds the host regardless.
 const CHILD_FLAGS = Object.freeze(["--permission", "--disable-proto=delete"]);
+// Default child heap cap (MB) when a process-runtime config does not supply
+// resourceLimits.maxOldGenerationSizeMb. Non-breaking: the worker REQUIRES the
+// knob, but the process runtime defaults rather than throwing so an isolation:
+// process config without resourceLimits keeps working.
+const DEFAULT_MAX_OLD_GEN_MB = 128;
 // A CONSTANT bootstrap harness, passed via `node -e`. It is identical for every
 // plugin (the plugin bytes arrive over IPC, NOT on the command line — so there is
 // no ARG_MAX limit and the harness never varies). It runs as CommonJS under -e and
@@ -155,6 +166,10 @@ function createProcessIsolatedAuthProviderHandle({
   timeoutMs,
   maxPendingCalls = 8,
   maxMessageBytes = 16384,
+  // Child V8 heap cap. Reuses the worker's resourceLimits.maxOldGenerationSizeMb
+  // knob (CR2-003). Optional for the process runtime: a config that omits it falls
+  // back to DEFAULT_MAX_OLD_GEN_MB rather than throwing (non-breaking).
+  resourceLimits = null,
   coreVersion = null,
   now = Date.now,
   allowedLabelKeys,
@@ -201,6 +216,21 @@ function createProcessIsolatedAuthProviderHandle({
   if (!Number.isInteger(maxMessageBytes) || maxMessageBytes < 1) {
     throw new Error("maxMessageBytes must be a positive integer");
   }
+  // Resolve the child heap cap (MB). Optional for the process runtime; if supplied
+  // it must be a positive-integer maxOldGenerationSizeMb (same shape as the worker),
+  // else default to DEFAULT_MAX_OLD_GEN_MB (non-breaking — never throws on absence).
+  let maxOldGenerationSizeMb = DEFAULT_MAX_OLD_GEN_MB;
+  if (resourceLimits !== null && resourceLimits !== undefined) {
+    if (typeof resourceLimits !== "object" || Array.isArray(resourceLimits)) {
+      throw new Error("createProcessIsolatedAuthProvider resourceLimits must be an object");
+    }
+    if (resourceLimits.maxOldGenerationSizeMb !== undefined) {
+      if (!Number.isInteger(resourceLimits.maxOldGenerationSizeMb) || resourceLimits.maxOldGenerationSizeMb <= 0) {
+        throw new Error("createProcessIsolatedAuthProvider resourceLimits.maxOldGenerationSizeMb must be a positive integer");
+      }
+      maxOldGenerationSizeMb = resourceLimits.maxOldGenerationSizeMb;
+    }
+  }
   // Fail-closed network containment. PR1 supports only the "require-permission"
   // mode; if this Node cannot enforce --allow-net, refuse to construct rather than
   // run a plugin whose network egress is uncontained.
@@ -273,7 +303,11 @@ function createProcessIsolatedAuthProviderHandle({
   // any failure kills the child and throws → fail closed. NOTE the plugin source
   // crosses over IPC (not the command line) so there is no ARG_MAX limit.
   async function spawnAndLoad({ entrySource, pluginId: pid }) {
-    const c = spawn(execPath, [...CHILD_FLAGS, "-e", PROCESS_HARNESS], {
+    // Build the spawn args by spreading the frozen base flags + the per-spawn heap
+    // cap. `--max-old-space-size` composes with `--permission`/`--disable-proto=
+    // delete` and the data:-URL load (verified). The cap bounds a runaway child;
+    // the host-side reply-size bound bounds the host regardless of the child heap.
+    const c = spawn(execPath, [...CHILD_FLAGS, `--max-old-space-size=${maxOldGenerationSizeMb}`, "-e", PROCESS_HARNESS], {
       stdio: ["ignore", "ignore", "ignore", "ipc"],
       serialization: "json",
       env: scrubbedEnv(),
@@ -291,6 +325,27 @@ function createProcessIsolatedAuthProviderHandle({
     const failed = new Promise((_, reject) => { onFail = reject; });
     c.on("message", (raw) => {
+      // REPLY SIZE BOUND (CR2-003): bound host-side work BEFORE JSON.parse. Unlike
+      // the worker (resourceLimits OOMs a runaway), the child has only the
+      // --max-old-space-size cap, so it can still build a reply up to that heap and
+      // process.send it; a synchronous JSON.parse of a multi-MB string stalls the
+      // host event loop (the per-call timeout cannot fire mid-parse). The reply is a
+      // STRING (serialization:'json'); measure its byte length and, if it exceeds the
+      // SAME maxMessageBytes ceiling the outbound credential obeys, drop the frame as
+      // an oversized DENY WITHOUT parsing. The auth reply is the only attacker-sized
+      // frame (claims come from the plugin); the tiny ready/loaded/load-error control
+      // frames are always far under the ceiling, so the uniform bound never harms the
+      // handshake. Single-occupancy: settle the one live pending call as oversized.
+      const replyBytes = typeof raw === "string"
+        ? Buffer.byteLength(raw, "utf8")
+        : Buffer.byteLength(String(raw), "utf8");
+      if (replyBytes > maxMessageBytes) {
+        for (const [cid, settle] of pending) {
+          pending.delete(cid);
+          settle({ __oversized: true });
+        }
+        return;
+      }
       let parsed;
       try {
         parsed = JSON.parse(typeof raw === "string" ? raw : String(raw));
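
The heap-cap resolution in the hunks above can be read as a small pure function. The following is a minimal sketch of that logic, not the package's code: `resolveHeapCapMb` and `buildChildArgs` are hypothetical helper names introduced here, while the default of 128 MB, the base flags, and the validation rules are taken from the diff.

```javascript
// Default mirrors DEFAULT_MAX_OLD_GEN_MB in the diff above.
const DEFAULT_MAX_OLD_GEN_MB = 128;
const CHILD_FLAGS = Object.freeze(["--permission", "--disable-proto=delete"]);

// Hypothetical helper: resolve the heap cap, defaulting when the knob is absent.
function resolveHeapCapMb(resourceLimits = null) {
  if (resourceLimits === null || resourceLimits === undefined) {
    return DEFAULT_MAX_OLD_GEN_MB;
  }
  if (typeof resourceLimits !== "object" || Array.isArray(resourceLimits)) {
    throw new Error("resourceLimits must be an object");
  }
  const mb = resourceLimits.maxOldGenerationSizeMb;
  if (mb === undefined) {
    return DEFAULT_MAX_OLD_GEN_MB;
  }
  if (!Number.isInteger(mb) || mb <= 0) {
    throw new Error("resourceLimits.maxOldGenerationSizeMb must be a positive integer");
  }
  return mb;
}

// Hypothetical helper: the frozen base flags plus the per-spawn heap cap.
function buildChildArgs(resourceLimits, harnessSource) {
  const capFlag = `--max-old-space-size=${resolveHeapCapMb(resourceLimits)}`;
  return [...CHILD_FLAGS, capFlag, "-e", harnessSource];
}

const defaultArgs = buildChildArgs(null, "/* harness */");
const customArgs = buildChildArgs({ maxOldGenerationSizeMb: 256 }, "/* harness */");
console.log(defaultArgs);
console.log(customArgs);
```

Defaulting instead of throwing keeps an `isolation: process` config without `resourceLimits` working, which is the non-breaking behavior the comments in the diff describe.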

package/packages/plugin/sandbox.mjs CHANGED

@@ -150,6 +150,29 @@ function createSandboxedAuthProviderHandle({
       workerData: {}
     });
     w.on("message", (raw) => {
+      // REPLY SIZE BOUND (CR2-003): bound host-side work BEFORE JSON.parse. The
+      // worker has an implicit heap cap (resourceLimits), but enforce the same
+      // maxMessageBytes ceiling on the INBOUND plugin→host reply that the OUTBOUND
+      // host→plugin credential message obeys — a hostile/buggy plugin can build a
+      // multi-MB reply and a synchronous JSON.parse would stall the host event loop
+      // (the per-call timeout cannot fire mid-parse). The reply is a STRING posted
+      // via JSON.stringify; measure its byte length and, if oversized, settle the
+      // matched call as an oversized DENY (mirroring the credential deny) WITHOUT
+      // parsing. We must locate the pending settle WITHOUT parsing the cid, so an
+      // oversized reply settles the single live pending call (single-occupancy: at
+      // most one entry is ever live).
+      const replyBytes = typeof raw === "string"
+        ? Buffer.byteLength(raw, "utf8")
+        : Buffer.byteLength(String(raw), "utf8");
+      if (replyBytes > maxMessageBytes) {
+        // Single-occupancy: settle the one live pending call as oversized, never
+        // touching JSON.parse on the oversized payload.
+        for (const [cid, settle] of pending) {
+          pending.delete(cid);
+          settle({ __oversized: true });
+        }
+        return;
+      }
       let parsed;
       try {
         parsed = JSON.parse(typeof raw === "string" ? raw : String(raw));
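
The reply-size bound added to both sandboxes follows one pattern: measure the UTF-8 byte length of the raw reply and refuse to parse it when it exceeds the ceiling. A minimal sketch of that pattern is below. `parseBoundedReply` is a hypothetical name, and the `{ malformed: true }` result for bad JSON is an assumption made for the example, since the hunk does not show how the package handles a parse failure.

```javascript
// Hypothetical helper: returns { oversized: true } WITHOUT parsing when the
// reply exceeds the ceiling, { malformed: true } on bad JSON (an assumption),
// and { value } otherwise.
function parseBoundedReply(raw, maxMessageBytes = 16384) {
  const text = typeof raw === "string" ? raw : String(raw);
  // Byte length, not string length: multi-byte characters count in full.
  if (Buffer.byteLength(text, "utf8") > maxMessageBytes) {
    return { oversized: true };
  }
  try {
    return { value: JSON.parse(text) };
  } catch {
    return { malformed: true };
  }
}

const small = parseBoundedReply(JSON.stringify({ ok: true, claims: { sub: "u1" } }));
const huge = parseBoundedReply(JSON.stringify({ claims: "x".repeat(20000) }));
// 9000 two-byte characters: under 16384 characters but over 16384 bytes.
const multibyte = parseBoundedReply(JSON.stringify("é".repeat(9000)));

console.log(small, huge, multibyte);
```

Measuring bytes instead of characters matters for the third case: the string is shorter than the ceiling in characters, yet it is still rejected because its encoded size is larger.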

package/packages/proxy/index.mjs CHANGED

@@ -227,7 +227,8 @@ export function createHaechiProxy({ runtime, port = DEFAULT_PROXY_PORT, host = "
       const authContext = { identity, profile, policyEngine, correlationId };
       const body = await readBody(request, {
-        maxBytes: config.limits.maxRequestBytes
+        maxBytes: config.limits.maxRequestBytes,
+        response
       });
       const json = parseJsonBody(body);
@@ -268,13 +269,18 @@ export function createHaechiProxy({ runtime, port = DEFAULT_PROXY_PORT, host = "
             blocked: false
           });
           countDecision(metrics, { routeContext, mode, decision: "forwarded" });
+          // CR2-001 — a per-request AbortController whose signal is threaded into
+          // the upstream fetch; aborting it (on a downstream client disconnect)
+          // tears down the upstream request + body so neither leaks.
+          const streamAbort = new AbortController();
           const upstreamResponse = await forward({
             upstream: config.target.upstream,
             request,
             body,
             timeoutMs: config.limits.upstreamTimeoutMs,
             metrics,
-            forwardPolicy
+            forwardPolicy,
+            abortController: streamAbort
           });
           // P1-CR-003 — sanitize response headers (strip the upstream's
           // content-encoding/content-length/transfer/hop-by-hop) on this path
@@ -286,7 +292,9 @@ export function createHaechiProxy({ runtime, port = DEFAULT_PROXY_PORT, host = "
           await pipeUpstreamBodyBounded({
             upstreamResponse,
             response,
+            request,
             maxBytes: streamingPassThroughMaxBytes(config),
+            abortController: streamAbort,
             logger,
             metrics,
             correlationId
@@ -360,10 +368,16 @@ export function createHaechiProxy({ runtime, port = DEFAULT_PROXY_PORT, host = "
         });
         metrics.increment("haechi_internal_error_total");
       }
+      // CR2-005 — an over-limit request body teardown carries `Connection: close`
+      // so the socket releases once the 413 is delivered (readBody destroys the
+      // request on response finish/close).
+      const extraHeaders = error?.errorCode === "haechi_request_body_too_large"
+        ? { connection: "close" }
+        : null;
       writeJson(response, error.statusCode ?? 500, {
         error: error.errorCode ?? "haechi_proxy_error",
         message: expected ? error.message : "Internal proxy error"
-      });
+      }, extraHeaders);
     } finally {
       const elapsedSeconds = Number(process.hrtime.bigint() - startedAt) / 1e9;
       // route label is a bounded route id (or "unknown") — never an identity/value.
@@ -783,7 +797,7 @@ function streamingPassThroughMaxBytes(config) {
 // the client response so a long-lived or malicious stream cannot hold memory or
 // the connection open unbounded. Bytes already written cannot be retracted, so
 // this caps total memory/throughput, not the already-flushed prefix.
-async function pipeUpstreamBodyBounded({ upstreamResponse, response, maxBytes, logger = null, metrics = null, correlationId = null }) {
+async function pipeUpstreamBodyBounded({ upstreamResponse, response, request = null, maxBytes, abortController = null, logger = null, metrics = null, correlationId = null }) {
   if (!upstreamResponse.body) {
     response.end();
     return;
@@ -791,8 +805,50 @@ async function pipeUpstreamBodyBounded({ upstreamResponse, response, maxBytes, l
   const reader = upstreamResponse.body.getReader();
   let received = 0;
+  // CR2-001 — a ONE-SHOT teardown on a downstream client disconnect. Without it,
+  // a parked `await once(response, "drain")` (backpressure) or a parked
+  // `await reader.read()` (no backpressure, upstream idle) never unparks after the
+  // client socket dies — neither `drain` nor `error` fires — so the async task and
+  // the upstream connection leak. On `close`/`aborted` we cancel the upstream
+  // reader (interrupts a parked read) AND abort the upstream fetch (tears down the
+  // connection); the listeners are removed on normal completion so the happy path
+  // does not leak a handle.
+  let disconnected = false;
+  // A SINGLE promise resolved by the one-shot tearDown below, so the backpressure
+  // wait can race against the disconnect WITHOUT registering a fresh `close`
+  // listener every drain cycle (which would accumulate on a sustained
+  // backpressured stream and trip MaxListenersExceededWarning).
+  let signalDisconnected;
+  const disconnectedPromise = new Promise((resolve) => {
+    signalDisconnected = resolve;
+  });
+  const tearDown = () => {
+    if (disconnected) {
+      return;
+    }
+    disconnected = true;
+    signalDisconnected();
+    void cancelReader(reader);
+    abortController?.abort();
+  };
+  const disconnectSources = [response, request].filter(Boolean);
+  for (const source of disconnectSources) {
+    source.once("close", tearDown);
+    source.once("aborted", tearDown);
+  }
+  const cleanupListeners = () => {
+    for (const source of disconnectSources) {
+      source.removeListener("close", tearDown);
+      source.removeListener("aborted", tearDown);
+    }
+  };
   try {
     while (true) {
+      if (disconnected) {
+        return;
+      }
       const { done, value } = await reader.read();
       if (done) {
         break;
@@ -802,6 +858,7 @@ async function pipeUpstreamBodyBounded({ upstreamResponse, response, maxBytes, l
         // Over the cap: stop reading upstream and tear down the client write so
         // the oversize stream is bounded (fail-closed on size).
         void cancelReader(reader);
+        abortController?.abort();
         metrics?.increment("haechi_response_stream_truncated_total");
         logger?.error("proxy_stream_pass_through_too_large", {
           correlationId,
@@ -813,18 +870,29 @@ async function pipeUpstreamBodyBounded({ upstreamResponse, response, maxBytes, l
         return;
       }
       // Respect downstream backpressure: stop pulling upstream until the client
-      // socket has drained.
+      // socket has drained. CR2-001 — race the drain wait against `close` so a
+      // client disconnect mid-backpressure unparks the wait instead of hanging
+      // until the request timeout.
       const ok = response.write(Buffer.from(value));
-      if (!ok) {
-        await once(response, "drain");
+      if (!ok && !disconnected) {
+        await Promise.race([
+          once(response, "drain"),
+          disconnectedPromise
+        ]);
+        if (disconnected || response.writableEnded || response.destroyed) {
+          return;
+        }
       }
     }
     response.end();
   } catch (error) {
     void cancelReader(reader);
+    abortController?.abort();
     if (!response.writableEnded) {
       response.destroy();
     }
+  } finally {
+    cleanupListeners();
   }
 }
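
The one-shot teardown in `pipeUpstreamBodyBounded` above can be isolated into a small sketch. `createDisconnectGuard` and `cancelUpstream` are hypothetical names used only for illustration. The structure follows the diff: a guard flag, a single promise resolved on first disconnect, and an optional `AbortController`.

```javascript
// Hypothetical factory: builds a one-shot tearDown plus a single promise that
// resolves on the first disconnect, so waits can race against it.
function createDisconnectGuard({ cancelUpstream, abortController = null }) {
  let disconnected = false;
  let signalDisconnected;
  const disconnectedPromise = new Promise((resolve) => {
    signalDisconnected = resolve;
  });
  const tearDown = () => {
    if (disconnected) {
      return; // already torn down: later calls are no-ops
    }
    disconnected = true;
    signalDisconnected();
    cancelUpstream();
    abortController?.abort();
  };
  return { tearDown, disconnectedPromise, isDisconnected: () => disconnected };
}

let cancelCount = 0;
const controller = new AbortController();
const guard = createDisconnectGuard({
  cancelUpstream: () => { cancelCount += 1; },
  abortController: controller
});

// A wait that would otherwise park forever unparks once tearDown runs.
const neverDrains = new Promise(() => {});
Promise.race([neverDrains, guard.disconnectedPromise]).then(() => {
  console.log("wait released, disconnected =", guard.isDisconnected());
});

guard.tearDown(); // e.g. fired by a "close" event
guard.tearDown(); // second call does nothing
console.log(cancelCount, controller.signal.aborted);
```

Sharing one promise across all waits avoids registering a fresh `close` listener on every backpressure cycle, which is the listener-accumulation concern the diff comments raise.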
@@ -1051,14 +1119,24 @@ function restoreTokens(value, tokenValues) {
   return value;
 }
-async function forward({ upstream, request, body, timeoutMs = null, metrics = null, forwardPolicy = {} }) {
+async function forward({ upstream, request, body, timeoutMs = null, metrics = null, forwardPolicy = {}, abortController = null }) {
   const target = buildUpstreamUrl({ upstream, requestUrl: request.url });
+  // CR2-001 — combine the upstream timeout with a per-request AbortController so a
+  // downstream client disconnect (which aborts `abortController`) tears down the
+  // in-flight upstream fetch + its body, instead of leaking the connection.
+  const timeoutSignal = timeoutMs ? AbortSignal.timeout(timeoutMs) : null;
+  let signal;
+  if (abortController && timeoutSignal) {
+    signal = AbortSignal.any([abortController.signal, timeoutSignal]);
+  } else {
+    signal = abortController ? abortController.signal : timeoutSignal ?? undefined;
+  }
   try {
     return await fetch(target, {
       method: request.method,
       headers: filteredHeaders(request.headers, forwardPolicy),
       body: request.method === "GET" || request.method === "HEAD" ? undefined : body,
-      signal: timeoutMs ? AbortSignal.timeout(timeoutMs) : undefined
+      signal
     });
   } catch (error) {
     if (error?.name === "TimeoutError" || error?.name === "AbortError") {
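
The signal selection in `forward` above can be sketched as a standalone function. `selectUpstreamSignal` is a hypothetical name, and the sketch assumes a Node.js runtime that provides both `AbortSignal.timeout` and `AbortSignal.any`, as the diff itself does.

```javascript
// Hypothetical helper mirroring the selection logic in forward().
function selectUpstreamSignal({ abortController = null, timeoutMs = null }) {
  const timeoutSignal = timeoutMs ? AbortSignal.timeout(timeoutMs) : null;
  if (abortController && timeoutSignal) {
    // Either source aborting aborts the combined signal.
    return AbortSignal.any([abortController.signal, timeoutSignal]);
  }
  // Only one (or neither) source is present.
  return abortController ? abortController.signal : timeoutSignal ?? undefined;
}

const controller = new AbortController();
const combined = selectUpstreamSignal({ abortController: controller, timeoutMs: 60000 });
const timeoutOnly = selectUpstreamSignal({ timeoutMs: 60000 });
const none = selectUpstreamSignal({});

const abortedBefore = combined.aborted;
controller.abort(); // e.g. the downstream client disconnected
const abortedAfter = combined.aborted;

console.log(abortedBefore, abortedAfter, timeoutOnly.aborted, none);
```

Passing the combined signal to `fetch` lets a client disconnect cancel the upstream request without giving up the existing timeout.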
@@ -1195,7 +1273,7 @@ function appendHeader(target, key, value) {
   }
 }
-function readBody(request, { maxBytes }) {
+function readBody(request, { maxBytes, response = null }) {
   return new Promise((resolve, reject) => {
     const chunks = [];
     let received = 0;
@@ -1208,6 +1286,26 @@ function readBody(request, { maxBytes }) {
       received += chunk.byteLength;
       if (received > maxBytes) {
         rejected = true;
+        // CR2-005 — stop reading and release the socket PROMPTLY instead of
+        // reading-and-discarding the rest of the upload until Node's finite
+        // requestTimeout. pause() halts the flowing read immediately (no further
+        // data is consumed); the connection is then torn down — but only AFTER the
+        // 413 has been written, so the client still receives it. The 413 carries
+        // `Connection: close` and the socket is destroyed once the response
+        // finishes/closes (destroying before the response is sent would reset the
+        // socket and the client would get a transport error instead of the 413).
+        request.pause();
+        if (response) {
+          const destroyRequest = () => {
+            if (!request.destroyed) {
+              request.destroy();
+            }
+          };
+          response.once("finish", destroyRequest);
+          response.once("close", destroyRequest);
+        } else {
+          request.destroy();
+        }
         reject(proxyError({
           statusCode: 413,
           errorCode: "haechi_request_body_too_large",
@@ -1260,8 +1358,8 @@ function parseJsonBody(body) {
   }
 }
-function writeJson(response, status, body) {
-  response.writeHead(status, { "content-type": "application/json" });
+function writeJson(response, status, body, extraHeaders = null) {
+  response.writeHead(status, { "content-type": "application/json", ...(extraHeaders ?? {}) });
   response.end(`${JSON.stringify(body, null, 2)}\n`);
 }
@@ -1269,6 +1367,17 @@ function isJson(contentType = "") {
   return contentType.toLowerCase().includes("application/json");
 }
+// CR2-004 — body-coupled validator headers that describe the UPSTREAM body. On a
+// transformed (protected/redacted/re-serialized) response the body changed, so
+// these become stale and must be dropped (a client/proxy honoring the upstream's
+// etag/last-modified could otherwise serve or revalidate against the wrong body).
+const BODY_COUPLED_VALIDATOR_HEADERS = [
+  "etag",
+  "content-md5",
+  "digest",
+  "last-modified"
+];
 function transformedJsonHeaders(headers) {
   // P1-CR-003 — defensively strip the full hop-by-hop/compression set (the
   // caller already passes the sanitized headers, but the transformed JSON body
@@ -1277,6 +1386,13 @@ function transformedJsonHeaders(headers) {
   for (const name of RESPONSE_HOP_BY_HOP_HEADERS) {
     delete next[name];
   }
+  // CR2-004 — the body was MUTATED, so drop validators coupled to the upstream
+  // body and forbid caching the rewritten response. This path only (the raw
+  // pass-through path keeps its etag — its body is byte-unchanged so still valid).
+  for (const name of BODY_COUPLED_VALIDATOR_HEADERS) {
+    delete next[name];
+  }
+  next["cache-control"] = "no-store";
   return next;
 }
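
The CR2-004 change to `transformedJsonHeaders` reduces to dropping a fixed set of header names and forcing `cache-control: no-store`. A minimal sketch of only that step follows. `headersForMutatedBody` is a hypothetical name, and the sketch assumes header names are already lower-cased. It omits the hop-by-hop stripping that the real function also performs.

```javascript
const BODY_COUPLED_VALIDATOR_HEADERS = ["etag", "content-md5", "digest", "last-modified"];

// Hypothetical helper: headers for a response whose body was rewritten.
// Assumes lower-cased header names; does not mutate the input object.
function headersForMutatedBody(headers) {
  const next = { ...headers };
  for (const name of BODY_COUPLED_VALIDATOR_HEADERS) {
    delete next[name];
  }
  // The rewritten body must not be cached under the upstream's validators.
  next["cache-control"] = "no-store";
  return next;
}

const upstreamHeaders = {
  "content-type": "application/json",
  etag: "\"abc123\"",
  "last-modified": "Tue, 16 Jun 2026 00:00:00 GMT",
  "cache-control": "max-age=3600"
};
const rewritten = headersForMutatedBody(upstreamHeaders);
console.log(rewritten);
```

The input object is left untouched, so the same upstream headers could still be used for a byte-unchanged pass-through response, where the validators remain valid.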

package/packages/token-vault/index.mjs CHANGED

@@ -4,6 +4,12 @@ import { createHash, randomBytes, randomUUID } from "node:crypto";
 import { setTimeout as delay } from "node:timers/promises";
 const DETERMINISTIC_DOMAIN = "haechi:token-vault:deterministic:v1";
+const AUDIT_ID_DOMAIN = "haechi:token-vault:audit-id:v1";
+// Opaque vault token ids are `tok_<type>_<hexhash>` (random: 16 hex via
+// shortHash; deterministic: 32 hex from hmac). Anything that does not match
+// this shape is treated as a misused raw value and never written verbatim.
+const VAULT_TOKEN_SHAPE = /^tok_[a-z0-9_]+_[a-f0-9]{16,}$/;
 export function createLocalTokenVault({
   path,
@@ -41,6 +47,30 @@ export function createLocalTokenVault({
     return mutation;
   }
+  // The audit `token` field must never carry a raw secret. A legitimate token
+  // id is a non-sensitive opaque `tok_<type>_<hexhash>` — recorded verbatim for
+  // correlation. A caller who misuses the API and passes a raw value where a
+  // token id is expected would otherwise leak that value into the hash-chained
+  // log (sanitizeAudit strips by key name only). For non-matching inputs we
+  // record a keyed-HMAC under a dedicated domain, or a fixed redaction marker
+  // if no hmac is available — never the raw value.
+  async function safeAuditToken(token) {
+    if (token == null) {
+      return null;
+    }
+    if (typeof token === "string" && VAULT_TOKEN_SHAPE.test(token)) {
+      return token;
+    }
+    if (typeof cryptoProvider.hmac === "function") {
+      const digest = await cryptoProvider.hmac({
+        data: typeof token === "string" ? token : String(token),
+        domain: AUDIT_ID_DOMAIN
+      });
+      return `nontoken_${digest.slice(0, 32)}`;
+    }
+    return "[REDACTED:non-token]";
+  }
   // Reveal/purge governance events must be auditable. Events carry token ids
   // and decision metadata only — never plaintext values.
   async function recordVaultEvent({ operation, decision, token = null, tokenType = null, reason = null, count = null }) {
@@ -58,7 +88,7 @@ export function createLocalTokenVault({
       blocked: decision.endsWith("_denied"),
       decision,
       reason,
-      token,
+      token: await safeAuditToken(token),
       tokenType,
       count,
       revealPolicy,
@@ -132,17 +162,28 @@ export function createLocalTokenVault({
         });
         throw new Error("Token reveal is disabled by tokenVault.revealPolicy");
       }
+      // Failure branches carry a stable reasonCode (never error.message / raw
+      // token); the message itself never interpolates the token argument.
+      let reasonCode = "reveal_error";
       try {
         const vault = await readVault(path);
         const record = vault.tokens[token];
         if (!record) {
-          throw new Error(`Unknown token: ${token}`);
+          reasonCode = "unknown_token";
+          throw new Error("Unknown token");
         }
         if (record.expiresAt && Date.parse(record.expiresAt) < Date.now()) {
-          throw new Error(`Token expired: ${token}`);
+          reasonCode = "token_expired";
+          throw new Error("Token expired");
         }
         const aad = context ? { ...record.aad, context } : record.aad;
-        const plaintext = await cryptoProvider.decrypt({ envelope: record.envelope, aad });
+        let plaintext;
+        try {
+          plaintext = await cryptoProvider.decrypt({ envelope: record.envelope, aad });
+        } catch {
+          reasonCode = "decrypt_failed";
+          throw new Error("Token decrypt failed");
+        }
         await recordVaultEvent({
           operation: "token-vault:reveal",
           decision: "reveal_allowed",
@@ -159,7 +200,7 @@ export function createLocalTokenVault({
           operation: "token-vault:reveal",
           decision: "reveal_failed",
           token,
-          reason: error.message
+          reason: reasonCode
         });
         throw error;
       }
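
The audit-token rule from `safeAuditToken` above can be sketched in a simplified, synchronous form. This is not the package's function: `auditSafeToken`, `hmacHex`, and `fakeHmacHex` are hypothetical names, and the fake digest below is a fixed placeholder, not a real HMAC. The regular expression and the two fallback formats are copied from the diff.

```javascript
const VAULT_TOKEN_SHAPE = /^tok_[a-z0-9_]+_[a-f0-9]{16,}$/;

// Simplified, synchronous variant. `hmacHex` is a hypothetical stand-in for the
// vault's keyed-HMAC provider: it takes a string and returns a hex digest.
function auditSafeToken(token, hmacHex = null) {
  if (token == null) {
    return null;
  }
  if (typeof token === "string" && VAULT_TOKEN_SHAPE.test(token)) {
    return token; // opaque token id: safe to record verbatim
  }
  if (typeof hmacHex === "function") {
    return `nontoken_${hmacHex(String(token)).slice(0, 32)}`;
  }
  return "[REDACTED:non-token]";
}

// Placeholder digest for the demo only; NOT a real HMAC.
const fakeHmacHex = () => "ab".repeat(32);

const validId = auditSafeToken("tok_email_0123456789abcdef");
const rawNoHmac = auditSafeToken("alice@example.com");
const rawWithHmac = auditSafeToken("alice@example.com", fakeHmacHex);

console.log(validId, rawNoHmac, rawWithHmac);
```

In every branch the raw, non-token input never appears in the returned value, which is the property the CR2-002 regression test is described as checking.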