npm - haechi - Versions diffs - 1.1.2 → 1.3.0 - Mend

haechi 1.1.2 → 1.3.0


Files changed (39)

package/README.ko.md +46 -11
package/README.md +46 -11
package/SECURITY.md +7 -1
package/docs/README.md +2 -0
package/docs/current/compliance-mapping.ko.md +53 -0
package/docs/current/compliance-mapping.md +53 -0
package/docs/current/config-version.ko.md +30 -0
package/docs/current/config-version.md +51 -0
package/docs/current/configuration.ko.md +165 -9
package/docs/current/configuration.md +165 -9
package/docs/current/operations-runbook.ko.md +155 -0
package/docs/current/operations-runbook.md +241 -0
package/docs/current/release-process.ko.md +5 -1
package/docs/current/release-process.md +5 -1
package/docs/current/risk-register-release-gate.ko.md +5 -3
package/docs/current/risk-register-release-gate.md +13 -3
package/docs/current/security-whitepaper.ko.md +102 -0
package/docs/current/security-whitepaper.md +102 -0
package/docs/current/shared-responsibility.ko.md +2 -2
package/docs/current/shared-responsibility.md +2 -2
package/docs/current/threat-model.ko.md +4 -2
package/docs/current/threat-model.md +4 -2
package/examples/local-proxy-demo/README.md +51 -0
package/examples/local-proxy-demo/demo.mjs +144 -0
package/examples/local-proxy-demo/demo.tape +19 -0
package/examples/local-proxy-demo/live-demo.mjs +121 -0
package/examples/local-proxy-demo/live-demo.tape +25 -0
package/haechi.config.example.json +20 -3
package/package.json +7 -2
package/packages/audit/index.mjs +26 -2
package/packages/cli/bin/haechi.mjs +57 -10
package/packages/cli/runtime.mjs +402 -10
package/packages/core/index.mjs +143 -8
package/packages/filter/index.mjs +975 -12
package/packages/metrics/index.mjs +181 -0
package/packages/privacy-profiles/index.mjs +72 -3
package/packages/protocol-adapters/index.mjs +99 -1
package/packages/proxy/index.mjs +525 -40
package/packages/stream-filter/index.mjs +69 -7

package/docs/current/operations-runbook.ko.md ADDED

@@ -0,0 +1,155 @@
+# Haechi Operations Runbook (Day-2)
+- Status: Living document (tracks core 1.3.x)
+A practical guide to running Haechi in production: deployment, configuration via the env-var overlay, health/readiness/metrics monitoring, graceful shutdown, backpressure tuning, and audit log rotation that does not break the hash chain.
+This document is an operations guide, not a compliance guarantee. See [`configuration.ko.md`](./configuration.ko.md) for the full configuration reference and [`threat-model.ko.md`](./threat-model.ko.md) for the trust boundary.
+## 1. Deploy
+Haechi is a Node `>=22` package with zero runtime dependencies. The reference [`Dockerfile`](../../Dockerfile), [`docker-compose.yml`](../../docker-compose.yml), and [`.dockerignore`](../../.dockerignore) at the repository root build a hardened image (these files are repository deploy assets and are **not** included in the npm tarball). The image:
+- pins a Node 22 slim base (matching `engines: ">=22"`),
+- runs as the non-root `node` user,
+- copies only the runtime files (excluding `.haechi` secrets, tests, and docs sources),
+- declares a writable `/app/.haechi` volume for the audit chain / key file / token vault and runs the rest of the tree read-only,
+- provides a `HEALTHCHECK` against `/__haechi/live`.
+```bash
+docker compose up -d        # build + run the reference stack
+docker compose logs -f haechi
+```
+**Front it with TLS + auth.** Haechi has no TLS of its own. Publish the port only to a reverse proxy (nginx / Caddy / Traefik / an API gateway) that terminates TLS and authenticates, and never expose the raw Haechi port on a public interface. The compose example publishes only to host loopback (`127.0.0.1:11016`) for exactly this reason.
+**Binding beyond loopback.** Inside a container, Haechi must bind `0.0.0.0` for the mapped port to be reachable, which requires `--allow-remote-bind` (the reference `CMD` passes it). On a host, prefer the default loopback bind and reach Haechi through the reverse proxy. See [Binding beyond loopback](./configuration.ko.md).
+## 2. Configuration via the env-var overlay
+For container / 12-factor deployments, a **fixed allowlist of non-secret operational keys** may be overridden from environment variables. Environment values **take precedence over the config file** and are validated fail-closed: an invalid value (bad port, unknown mode) makes the process **fail to start** rather than silently weakening it.
+| Env var | Config key | Type / values | Example |
+|---|---|---|---|
+| `HAECHI_PROXY_PORT` | `proxy.port` | integer 0–65535 | `11016` |
+| `HAECHI_PROXY_HOST` | `proxy.host` | non-empty string | `0.0.0.0` |
+| `HAECHI_UPSTREAM` | `target.upstream` | URL string | `http://llm:8000` |
+| `HAECHI_MODE` | `mode` | `dry-run` \| `report-only` \| `enforce` | `enforce` |
+| `HAECHI_LOG_FORMAT` | `logging.format` | `text` \| `json` | `json` |
+**Secrets are not overlayable, by design.** There is **no** `HAECHI_*` variable for `keys.*` (the local key file or an external key path), the auth token store, or any token/secret. Secrets stay in the mounted config file or are supplied via **injected providers** (`createRuntime(config, { cryptoProvider, authProvider, … })`). Putting a secret in the process environment risks leaking it through `/proc`, crash dumps, orchestrator inspect output, and child processes, so secrets are excluded from the overlay allowlist entirely.
+The overlay is applied in `loadConfig()` after the file is read and before `normalizeConfig()`, so overlaid values pass the same validation as file-set values.
+## 3. Health, readiness, and metrics scraping
+Four unauthenticated routes under the reserved `/__haechi/*` prefix are checked before auth/body-read and never proxy upstream (full reference: [Operability endpoints](./configuration.ko.md#운영-엔드포인트)):
+| Endpoint | Use |
+|---|---|
+| `GET /__haechi/live` | **Liveness**: restart probe. Cheap; 200 while the event loop is serving. |
+| `GET /__haechi/ready` | **Readiness**: traffic gate. **503 when the audit sink is not writable** (a gateway that cannot audit is not ready). Point your load balancer / orchestrator readiness probe here. |
+| `GET /__haechi/health` | Back-compat (`ok` + `mode` + `version`). |
+| `GET /__haechi/metrics` | Prometheus text exposition. `404` when `metrics.enabled: false`. |
+**Scrape `/metrics`** with Prometheus (or an OpenMetrics-compatible scraper):
+```yaml
+scrape_configs:
+  - job_name: haechi
+    metrics_path: /__haechi/metrics
+    static_configs:
+      - targets: ["haechi:11016"]
+```
+Key signals: `haechi_requests_total{route,mode,decision}`, `haechi_blocks_total`, `haechi_auth_denied_total`, `haechi_rate_limited_total`, `haechi_overloaded_total` (backpressure 503s), `haechi_upstream_timeout_total`, `haechi_upstream_error_total`, `haechi_response_unprotected_total`, `haechi_internal_error_total`, and the `haechi_request_duration_seconds{route}` histogram.
+**No-PII-in-telemetry invariant.** Every metric name and **every label value** is a bounded enum (route id / mode / decision class), never an identity, token, or detected value. The same invariant applies to structured logs: with `logging.format: json` (or `HAECHI_LOG_FORMAT=json`), startup/shutdown/error logs carry only a `correlationId` and an error class name, never a payload. The `correlationId` also appears on the request's audit events, so a logged error can be joined to its audit trail.
+## 4. Graceful shutdown
+On `SIGINT`/`SIGTERM` the CLI calls the proxy's `close()`, which **drains gracefully**:
+1. stops accepting new connections (`server.close()`),
+2. immediately closes idle keep-alive sockets (`closeIdleConnections()`),
+3. waits for in-flight requests to finish,
+4. after a grace period (`limits.shutdownGraceMs`, default 10000ms) force-closes any remaining sockets (`closeAllConnections()`) so a stuck keep-alive cannot hold shutdown open indefinitely.
+`close()` resolves once in-flight requests drain or the grace period elapses. Set your orchestrator's `terminationGracePeriodSeconds` (Kubernetes) / `stop_grace_period` (compose) **above** `limits.shutdownGraceMs` so the platform does not SIGKILL mid-drain. Tune `limits.shutdownGraceMs` to your longest acceptable in-flight request.
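+## 5. Backpressure tuning
+`limits.maxInFlight` is a global ceiling on the number of requests processed concurrently.
+- `0` (default) disables the ceiling, which is the unchanged 1.1 behavior.
+- `> 0`: when the current in-flight count reaches the ceiling, a **new** request is rejected with `503`, a `Retry-After` header (seconds, derived from `limits.shutdownGraceMs`), and a `{ "error": "haechi_overloaded" }` body, **before** auth and body-read. Each rejection increments `haechi_overloaded_total`.
+- The `/__haechi/*` observability routes are **exempt** from the ceiling, so liveness and `/metrics` stay scrapable under saturation and you can still see *why* load is being shed.
+Set `maxInFlight` near the concurrency your upstream + host can sustain (watch `haechi_request_duration_seconds` and upstream saturation), leaving headroom so the gateway sheds load with a clean 503 instead of collapsing. Pair it with a tuned `limits.upstreamTimeoutMs` so a slow upstream cannot occupy slots indefinitely.
+### Tuned timeouts
+`limits.requestTimeoutMs` and `limits.headersTimeoutMs` map to the Node HTTP server's `requestTimeout` / `headersTimeout`. Both default to `null`, which leaves Node's server defaults untouched (behavior is unchanged unless you opt in). Set a number to cap slow-loris-style slow request/header delivery; `0` disables that timeout (Node semantics).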
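+## 6. Chain-aware audit log rotation & retention
+The audit log is a **SHA-256 hash chain** (`audit.path`): each event's `auditIntegrity.previousHash` links to the prior event's hash, so any insert, delete, edit, or reorder is detected by `haechi audit-verify` / `verifyAuditChain`. An optional **anchor stream** (`audit.anchor`) records the chain head on separate append-only media, which also catches tail truncation (deleting the newest events). See [`audit` concepts](./configuration.ko.md#audit) and the threat model.
+**Never truncate or rewrite a chain mid-stream.** Truncating `audit.jsonl` in place or rewriting earlier lines **breaks the chain** and makes verification fail (or, worse, silently destroys tamper evidence). Rotate by **starting a new segment** and preserving prior segments:
+1. **Stop or quiesce** the writer (graceful shutdown, or rotate during a maintenance window). The default JSONL sink appends, so the point is to avoid rotating a file that is held open.
+2. **Move the current segment aside, keeping it intact**: `mv .haechi/audit.jsonl .haechi/audit-2026-06-12.jsonl` (and the matching anchor: `mv .haechi/audit.anchor.jsonl .haechi/audit-2026-06-12.anchor.jsonl`).
+3. **Start a new segment** by restarting Haechi (or pointing `audit.path` / `audit.anchor.path` at new files). The new chain begins with `previousHash: null` and is independently verifiable. This is expected: each segment is its own verifiable chain, and you do **not** chain across the rotation boundary.
+4. **Verify each retained segment independently** with its own anchor: `haechi audit-verify --audit .haechi/audit-2026-06-12.jsonl --anchor .haechi/audit-2026-06-12.anchor.jsonl`.
+5. **Retain prior segments** for your retention window so the full history stays verifiable. Where possible, archive to append-only / WORM storage instead of deleting. The anchor's defense assumes the anchor lives on separate, append-only media.
+**Retention:** keep each rotated segment (and its anchor) for the required audit-retention period, then expire whole segments; never delete some of the lines within a segment. Token-vault retention is independent (`tokenVault.retentionDays`), and audit rotation does not purge tokens.
+Do **not** compress/encrypt a segment in a way that prevents later re-verification unless you keep a verification step in your archival pipeline. A rotated segment is useful as evidence only while it still verifies.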
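+## 7. Benchmarking proxy throughput
+`npm run bench:throughput` (`scripts/bench-throughput.mjs`) measures the per-request
+overhead the proxy adds under concurrent load. It stands up a deterministic local **stub**
+OpenAI-compatible upstream (an instant canned reply, no real model) with the
+**real** Haechi proxy in front of it, drives load with concurrent `fetch` calls from a
+fixed-size worker pool, and reports **req/s** plus **p50/p95/p99/max** latency (nearest-rank
+percentiles over a sorted sample). It runs three scenarios:
+1. **throughput + latency** at a fixed concurrency (the warmup batch is excluded from the reported stats
+   because JIT/connection warmup skews the first requests),
+2. **enforce vs dry-run overhead**: the same load is run in both modes and the latency/throughput
+   **delta** is reported, so the cost of protection is a measured number rather than a guess,
+3. **backpressure**: a low `limits.maxInFlight` is saturated by a burst, and the ratio of `503 + Retry-After` to
+   `200` responses is reported (observing live responses to show that the ceiling sheds load).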
+```bash
+npm run bench:throughput
+HAECHI_BENCH_REQUESTS=5000 HAECHI_BENCH_CONCURRENCY=64 npm run bench:throughput
+```
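+Knobs (env, printed at the top of every run): `HAECHI_BENCH_REQUESTS` (total requests, default 2000),
+`HAECHI_BENCH_CONCURRENCY` (default 32), `HAECHI_BENCH_WARMUP` (warmup count to exclude, default
+100), `HAECHI_BENCH_PAYLOAD_KB` (default 1), `HAECHI_BENCH_MAXINFLIGHT` (the backpressure
+scenario's ceiling, default 4).
+> **The numbers are machine-relative.** This is a **loopback, single-process, stub-upstream
+> micro-benchmark**: the stub, the proxy, and the load generator all run in one
+> Node process on `127.0.0.1`, so there is no real network and no real model. The numbers measure only
+> the overhead Haechi adds, and they vary by machine, Node version, and load.
+> They are **not** a network/hardware throughput benchmark and must **not** be quoted as
+> guarantees. The bench is not run by `release:preflight`.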
+## 8. Quick reference
+| Task | Command |
+|---|---|
+| Start (compose) | `docker compose up -d` |
+| Liveness | `curl localhost:11016/__haechi/live` |
+| Readiness | `curl localhost:11016/__haechi/ready` |
+| Metrics | `curl localhost:11016/__haechi/metrics` |
+| Throughput bench | `npm run bench:throughput` |
+| Verify a segment | `haechi audit-verify --audit <seg>.jsonl --anchor <seg>.anchor.jsonl` |
+| Graceful stop | `docker compose stop` (SIGTERM → drain) |
+See also: [`config-version.ko.md`](./config-version.ko.md) for the `configVersion` stamp and upgrade notes.

package/docs/current/operations-runbook.md ADDED

@@ -0,0 +1,241 @@
+# Haechi Operations Runbook (Day-2)
+- Status: Living document (tracks core 1.3.x)
+A practical guide to running Haechi in production: deploy, configure via the
+env-var overlay, monitor with health/readiness/metrics, shut down gracefully,
+tune backpressure, and rotate the audit log without breaking its hash chain.
+This is an operability guide, not a compliance guarantee. See
+[`configuration.md`](./configuration.md) for the full config reference and
+[`threat-model.md`](./threat-model.md) for the trust boundary.
+## 1. Deploy
+Haechi is a zero-runtime-dependency Node `>=22` package. The reference
+[`Dockerfile`](../../Dockerfile), [`docker-compose.yml`](../../docker-compose.yml),
+and [`.dockerignore`](../../.dockerignore) at the repo root build a hardened
+image (these files are **not** shipped in the npm tarball — they are repo deploy
+assets). The image:
+- pins a Node 22 slim base (matches `engines: ">=22"`),
+- runs as the non-root `node` user,
+- copies only the runtime files (no `.haechi` secrets, no tests, no docs sources),
+- declares a writable `/app/.haechi` volume for the audit chain / key file / token
+  vault and runs the rest of the tree read-only,
+- ships a `HEALTHCHECK` against `/__haechi/live`.
+```bash
+docker compose up -d        # build + run the reference stack
+docker compose logs -f haechi
+```
+**Front it with TLS + auth.** Haechi has no TLS of its own. Publish its port only
+to a TLS-terminating, authenticating reverse proxy (nginx / Caddy / Traefik / an
+API gateway); never expose the raw Haechi port on a public interface. The compose
+example publishes to host loopback (`127.0.0.1:11016`) for exactly this reason.
+**Binding beyond loopback.** Inside a container Haechi must bind `0.0.0.0` for the
+mapped port to be reachable, which requires `--allow-remote-bind` (the reference
+`CMD` passes it). On a host, prefer the default loopback bind and reach Haechi
+through the reverse proxy. See [Binding beyond loopback](./configuration.md#binding-beyond-loopback).
+## 2. Configuration via the env-var overlay
+For container / 12-factor deploys, a **fixed allowlist of NON-SECRET operational
+keys** may be overridden from the environment. The env value **wins over the
+config file** and is validated fail-closed — an invalid value (bad port, unknown
+mode) makes the process **fail to start** rather than degrade silently.
+| Env var | Config key | Type / values | Example |
+|---|---|---|---|
+| `HAECHI_PROXY_PORT` | `proxy.port` | integer 0–65535 | `11016` |
+| `HAECHI_PROXY_HOST` | `proxy.host` | non-empty string | `0.0.0.0` |
+| `HAECHI_UPSTREAM` | `target.upstream` | URL string | `http://llm:8000` |
+| `HAECHI_MODE` | `mode` | `dry-run` \| `report-only` \| `enforce` | `enforce` |
+| `HAECHI_LOG_FORMAT` | `logging.format` | `text` \| `json` | `json` |
+**Secrets are NOT overlayable — by design.** There is **no** `HAECHI_*` variable
+for `keys.*` (the local key file or an external key path), the auth token store,
+or any token/secret. Secrets stay in the mounted config file or are supplied via
+**injected providers** (`createRuntime(config, { cryptoProvider, authProvider, … })`).
+Putting a secret in a process environment invites leaking it through `/proc`,
+crash dumps, orchestrator inspect output, and child processes — so the overlay
+allowlist excludes them outright.
+The overlay is applied in `loadConfig()` after reading the file and before
+`normalizeConfig()`, so an overlaid value passes the same validation as a
+file-set one.
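The overlay described above could be sketched roughly as follows. This is an illustration only, not Haechi's actual `loadConfig()` code: the `OVERLAY` table, the `applyEnvOverlay` name, and the error messages are hypothetical, and only the variable names, config keys, and allowed values are taken from the table above.

```javascript
// Hypothetical sketch of a fail-closed env overlay. Names are illustrative.
function oneOf(...allowed) {
  return (v) => {
    if (!allowed.includes(v)) throw new Error(`invalid value: ${v}`);
    return v;
  };
}

const OVERLAY = {
  HAECHI_PROXY_PORT: {
    path: ["proxy", "port"],
    parse: (v) => {
      const n = Number(v);
      if (!/^\d+$/.test(v) || n > 65535) throw new Error(`invalid port: ${v}`);
      return n;
    },
  },
  HAECHI_PROXY_HOST: {
    path: ["proxy", "host"],
    parse: (v) => {
      if (v === "") throw new Error("proxy host must be non-empty");
      return v;
    },
  },
  HAECHI_UPSTREAM: {
    path: ["target", "upstream"],
    parse: (v) => {
      new URL(v); // throws a TypeError on an unparsable URL
      return v;
    },
  },
  HAECHI_MODE: { path: ["mode"], parse: oneOf("dry-run", "report-only", "enforce") },
  HAECHI_LOG_FORMAT: { path: ["logging", "format"], parse: oneOf("text", "json") },
};

// Returns a new config in which allowlisted env values win over file values.
// Anything not in OVERLAY (secrets included) is ignored; bad values throw.
function applyEnvOverlay(fileConfig, env) {
  const config = structuredClone(fileConfig);
  for (const [name, { path, parse }] of Object.entries(OVERLAY)) {
    if (env[name] === undefined) continue;
    const value = parse(env[name]);
    let node = config;
    for (const key of path.slice(0, -1)) node = node[key] ??= {};
    node[path.at(-1)] = value;
  }
  return config;
}
```

Throwing from `parse` is what makes the sketch fail closed: a caller that lets the exception propagate at startup never runs with a half-applied overlay.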
+## 3. Health, readiness, and metrics scraping
+Four unauthenticated routes under the reserved `/__haechi/*` prefix, checked
+before auth/body-read, never proxying upstream (full reference:
+[Operability endpoints](./configuration.md#operability-endpoints)):
+| Endpoint | Use |
+|---|---|
+| `GET /__haechi/live` | **Liveness** — restart probe. Cheap; 200 while the event loop serves. |
+| `GET /__haechi/ready` | **Readiness** — traffic gate. **503 when the audit sink is not writable** (a gateway that cannot audit is not ready). Point your load balancer / orchestrator readiness probe here. |
+| `GET /__haechi/health` | Back-compat (`ok` + `mode` + `version`). |
+| `GET /__haechi/metrics` | Prometheus text exposition. `404` when `metrics.enabled: false`. |
+**Scrape `/metrics`** with Prometheus (or any OpenMetrics-compatible scraper):
+```yaml
+scrape_configs:
+  - job_name: haechi
+    metrics_path: /__haechi/metrics
+    static_configs:
+      - targets: ["haechi:11016"]
+```
+Key signals: `haechi_requests_total{route,mode,decision}`, `haechi_blocks_total`,
+`haechi_auth_denied_total`, `haechi_rate_limited_total`, `haechi_overloaded_total`
+(backpressure 503s), `haechi_upstream_timeout_total`, `haechi_upstream_error_total`,
+`haechi_response_unprotected_total`, `haechi_internal_error_total`, and the
+`haechi_request_duration_seconds{route}` histogram.
+**No-PII-in-telemetry invariant.** Every metric name and **every label value** is
+a bounded enum (route id / mode / decision class) — never an identity, token, or
+detected value. The same invariant covers structured logs: with
+`logging.format: json` (or `HAECHI_LOG_FORMAT=json`), startup/shutdown/error logs
+carry a `correlationId` and an error class name only, never a payload. The
+`correlationId` also appears on the request's audit events, so you can join a
+logged error to its audit trail.
+## 4. Graceful shutdown
+On `SIGINT`/`SIGTERM` the CLI calls the proxy's `close()`, which **drains
+gracefully**:
+1. stops accepting new connections (`server.close()`),
+2. immediately closes idle keep-alive sockets (`closeIdleConnections()`),
+3. waits for in-flight requests to finish,
+4. after a grace period (`limits.shutdownGraceMs`, default 10000ms) force-closes
+   any lingering socket (`closeAllConnections()`) so a stuck keep-alive cannot
+   hold shutdown open forever.
+`close()` resolves once in-flight requests drain or the grace elapses. Set your
+orchestrator's `terminationGracePeriodSeconds` (Kubernetes) / `stop_grace_period`
+(compose) **above** `limits.shutdownGraceMs` so the platform does not SIGKILL
+mid-drain. Tune `limits.shutdownGraceMs` to your longest acceptable in-flight
+request.
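The four drain steps can be sketched against the Node `http.Server` methods named above (`close`, `closeIdleConnections`, `closeAllConnections`). The `drain` helper below is a hypothetical illustration, not the proxy's real `close()` implementation.

```javascript
// Hypothetical drain helper: resolves when the server has fully closed,
// either because in-flight requests finished or because the grace period
// expired and the remaining sockets were force-closed.
function drain(server, graceMs) {
  return new Promise((resolve) => {
    // Step 4: after the grace period, destroy whatever is still connected.
    const timer = setTimeout(() => server.closeAllConnections(), graceMs);
    // Steps 1 and 3: stop accepting; the callback fires once all connections end.
    server.close(() => {
      clearTimeout(timer);
      resolve();
    });
    // Step 2: drop idle keep-alive sockets right away.
    server.closeIdleConnections();
  });
}
```

A signal handler could then call something like `drain(server, 10000).then(() => process.exit(0))`, with the orchestrator's own grace period set higher than the value passed here.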
+## 5. Backpressure tuning
+`limits.maxInFlight` is a global ceiling on concurrently-processing requests.
+- `0` (default) disables the ceiling — unchanged 1.1 behavior.
+- `> 0`: when the live in-flight count is at the ceiling, a **new** request is
+  rejected `503` with a `Retry-After` header (seconds, derived from
+  `limits.shutdownGraceMs`) and a `{ "error": "haechi_overloaded" }` body, **before**
+  auth and body-read. Each rejection increments `haechi_overloaded_total`.
+- The `/__haechi/*` observability routes are **exempt** from the ceiling, so
+  liveness and `/metrics` stay scrapable under saturation — you can still see
+  *why* you are shedding load.
+Set `maxInFlight` near the concurrency your upstream + host can sustain (watch
+`haechi_request_duration_seconds` and upstream saturation), leaving headroom so
+the gateway sheds load with a clean 503 instead of collapsing. Pair it with a
+tuned `limits.upstreamTimeoutMs` so a slow upstream cannot pin slots indefinitely.
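A minimal sketch of the admission check described in this section. The `createInFlightGate` name and return shape are hypothetical, and the exact way `Retry-After` is derived from `limits.shutdownGraceMs` is not stated here, so rounding up to whole seconds is an assumption.

```javascript
// Hypothetical in-flight ceiling. A real proxy would call admit() before
// auth/body-read and release() when the response finishes.
function createInFlightGate({ maxInFlight, shutdownGraceMs }) {
  let inFlight = 0;
  // Assumption: Retry-After is the grace period rounded up to whole seconds.
  const retryAfter = String(Math.max(1, Math.ceil(shutdownGraceMs / 1000)));
  return {
    admit(pathname) {
      // Observability routes are exempt and do not consume a slot.
      if (pathname.startsWith("/__haechi/")) return { admitted: true, counted: false };
      // maxInFlight === 0 disables the ceiling entirely.
      if (maxInFlight > 0 && inFlight >= maxInFlight) {
        return {
          admitted: false,
          status: 503,
          headers: { "Retry-After": retryAfter },
          body: { error: "haechi_overloaded" },
        };
      }
      inFlight += 1;
      return { admitted: true, counted: true };
    },
    release() {
      if (inFlight > 0) inFlight -= 1;
    },
  };
}
```

Only results with `counted: true` need a matching `release()`, which keeps the exempt routes from skewing the count.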
+### Tuned timeouts
+`limits.requestTimeoutMs` and `limits.headersTimeoutMs` map to the Node HTTP
+server's `requestTimeout` / `headersTimeout`. Both default to `null` = leave
+Node's server defaults untouched (behavior unchanged unless you opt in). Set a
+number to cap slow-loris-style slow request/header delivery; `0` disables that
+specific timeout (Node semantics).
+## 6. Chain-aware audit log rotation & retention
+The audit log is a **SHA-256 hash chain** (`audit.path`): each event's
+`auditIntegrity.previousHash` links to the prior event's hash, so any insert,
+delete, edit, or reorder is detectable by `haechi audit-verify` /
+`verifyAuditChain`. An optional **anchor stream** (`audit.anchor`) appends the
+chain head to separate append-only media so even tail truncation (deleting the
+newest events) is caught. See [`audit` concepts](./configuration.md#audit) and the
+threat model.
+**Never truncate or rewrite a chain mid-stream.** Rotating by truncating
+`audit.jsonl` in place, or rewriting earlier lines, **breaks the chain** and makes
+verification fail (or, worse, silently destroys tamper evidence). Rotate by
+**starting a new segment**, preserving prior segments:
+1. **Stop or quiesce** the writer (graceful shutdown, or rotate at a maintenance
+   window). The default JSONL sink appends; rotating a file it holds open is what
+   you are avoiding.
+2. **Move the current segment aside**, keeping it intact:
+   `mv .haechi/audit.jsonl .haechi/audit-2026-06-12.jsonl` (and the matching
+   anchor: `mv .haechi/audit.anchor.jsonl .haechi/audit-2026-06-12.anchor.jsonl`).
+3. **Start a fresh segment** by restarting Haechi (or pointing `audit.path` /
+   `audit.anchor.path` at the new files). The new chain begins with
+   `previousHash: null` — a fresh, independently-verifiable chain. This is
+   expected: each segment is its own verifiable chain; you do **not** chain across
+   the rotation boundary.
+4. **Verify each retained segment independently** with its own anchor:
+   `haechi audit-verify --audit .haechi/audit-2026-06-12.jsonl --anchor .haechi/audit-2026-06-12.anchor.jsonl`.
+5. **Retain prior segments** for your retention window so the full history stays
+   verifiable. Archive (don't delete) to append-only / WORM storage where you can;
+   the anchor's defense assumes the anchor lives on separate, append-only media.
+**Retention:** keep each rotated segment (and its anchor) for your required
+audit-retention period, then expire whole segments — never partial lines within a
+segment. Token-vault retention is independent (`tokenVault.retentionDays`); audit
+rotation does not purge tokens.
+**Do not** compress/encrypt a segment in a way that prevents later
+re-verification unless you keep the verification step in your archival pipeline. A
+rotated segment is only useful as evidence if it still verifies.
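To make the chain property concrete, here is a toy hash chain. It is not Haechi's on-disk format: the `hash` field name, the `payload` wrapper, and hashing `previousHash` plus the JSON payload are assumptions for illustration; only `auditIntegrity.previousHash` and the `null` starting value come from the text above. It is written in CommonJS style for brevity.

```javascript
const { createHash } = require("node:crypto");

// Assumed scheme: SHA-256 over the previous hash followed by the JSON payload.
function hashEvent(previousHash, payload) {
  return createHash("sha256")
    .update(String(previousHash))
    .update(JSON.stringify(payload))
    .digest("hex");
}

// Appends an event whose previousHash points at the current chain head.
function appendEvent(chain, payload) {
  const previousHash = chain.length ? chain[chain.length - 1].auditIntegrity.hash : null;
  const hash = hashEvent(previousHash, payload);
  chain.push({ payload, auditIntegrity: { previousHash, hash } });
  return chain;
}

// Returns the index of the first event that fails verification,
// or -1 when the whole segment verifies.
function firstBrokenIndex(chain) {
  let expectedPrevious = null;
  for (let i = 0; i < chain.length; i++) {
    const { payload, auditIntegrity } = chain[i];
    if (auditIntegrity.previousHash !== expectedPrevious) return i;
    if (auditIntegrity.hash !== hashEvent(expectedPrevious, payload)) return i;
    expectedPrevious = auditIntegrity.hash;
  }
  return -1;
}
```

In this toy model an edited or deleted middle event is caught, but a segment with its newest events removed still verifies on its own, which is the gap the separate anchor stream is described as closing.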
+## 7. Benchmarking proxy throughput
+`npm run bench:throughput` (`scripts/bench-throughput.mjs`) measures the proxy's
+added per-request overhead under concurrency. It stands up a deterministic local
+**stub** OpenAI-compatible upstream (an instant canned reply — no real model) and
+the **real** Haechi proxy in front of it, drives a configurable load with a
+fixed-size worker pool of in-flight `fetch`es, and reports **req/s** plus
+**p50/p95/p99/max** latency (percentiles by nearest-rank over a sorted sample). It
+runs three scenarios:
+1. **throughput + latency** at a fixed concurrency (a warmup batch is excluded
+   from the reported stats — JIT/connection warmup skews the first requests),
+2. **enforce vs dry-run overhead** — the same load run in both modes, reporting
+   the latency/throughput **delta** so the cost of protection is a measured number,
+3. **backpressure** — a low `limits.maxInFlight` saturated by a burst, reporting
+   how many requests got `503 + Retry-After` vs `200` (observed live, proving the
+   ceiling sheds load).
+```bash
+npm run bench:throughput
+HAECHI_BENCH_REQUESTS=5000 HAECHI_BENCH_CONCURRENCY=64 npm run bench:throughput
+```
+Knobs (env, printed at the top of every run): `HAECHI_BENCH_REQUESTS` (total,
+default 2000), `HAECHI_BENCH_CONCURRENCY` (default 32), `HAECHI_BENCH_WARMUP`
+(excluded warmup count, default 100), `HAECHI_BENCH_PAYLOAD_KB` (default 1),
+`HAECHI_BENCH_MAXINFLIGHT` (the backpressure scenario's ceiling, default 4).
+> **The numbers are machine-relative.** This is a **loopback, single-process,
+> stub-upstream micro-benchmark**: the stub, the proxy, and the load generator all
+> run in one Node process on `127.0.0.1`, so there is no real network and no real
+> model. The numbers measure Haechi's added overhead only, and vary by machine,
+> Node version, and load. They are **not** a network/hardware throughput benchmark
+> and must **not** be quoted as guarantees. The bench is not run by
+> `release:preflight`.
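For readers unfamiliar with the nearest-rank method mentioned above, a small sketch follows. The `percentile` function is illustrative and is not taken from `scripts/bench-throughput.mjs`.

```javascript
// Nearest-rank percentile: sort ascending, take the value at rank
// ceil(p/100 * n), counting ranks from 1. No interpolation is performed,
// so the result is always one of the observed samples.
function percentile(samples, p) {
  if (samples.length === 0) throw new RangeError("no samples");
  if (!(p > 0 && p <= 100)) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to avoid floating-point drift in p / 100.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}
```

With latencies of `[50, 15, 40, 20, 35]` ms, `percentile(samples, 50)` picks rank 3 of the sorted list, which is `35`.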
+## 8. Quick reference
+| Task | Command |
+|---|---|
+| Start (compose) | `docker compose up -d` |
+| Liveness | `curl localhost:11016/__haechi/live` |
+| Readiness | `curl localhost:11016/__haechi/ready` |
+| Metrics | `curl localhost:11016/__haechi/metrics` |
+| Throughput bench | `npm run bench:throughput` |
+| Verify a segment | `haechi audit-verify --audit <seg>.jsonl --anchor <seg>.anchor.jsonl` |
+| Graceful stop | `docker compose stop` (SIGTERM → drain) |
+See also: [`config-version.md`](./config-version.md) for the `configVersion`
+stamp and upgrade notes.

package/docs/current/release-process.ko.md CHANGED

@@ -1,6 +1,6 @@
 # Haechi Release Process
-- Document status: Living document (tracks core 1.1.x)
+- Document status: Living document (tracks core 1.3.x)
 - Date: 2026-06-10
 ## 1. Local Release Verification
@@ -70,6 +70,7 @@ npm audit signatures
 | `.github/workflows/auth-jwt-publish.yml` | `haechi-auth-jwt` | `auth-jwt-v<semver>` | satellite publish, same signed-artifacts path |
 | `.github/workflows/dashboard-publish.yml` | `haechi-dashboard` | `dashboard-v<semver>` | satellite publish, same signed-artifacts path |
 | `.github/workflows/auth-oidc-publish.yml` | `haechi-auth-oidc` | `auth-oidc-v<semver>` | satellite publish, same signed-artifacts path |
+| `.github/workflows/ratelimit-redis-publish.yml` | `haechi-ratelimit-redis` | `ratelimit-redis-v<semver>` | satellite publish, same signed-artifacts path |
 Each publish workflow triggers on `release: published` but is **guarded** so the two never cross-fire. The core job runs only for tags starting with `v` (and re-validates `^v[0-9]+\.[0-9]+\.[0-9]+$`); the satellite job runs only for `crypto-kms-v…` (and re-validates `^crypto-kms-v[0-9]+\.[0-9]+\.[0-9]+$` **and** that the tag version matches the satellite's `package.json` version). The npmjs.com Trusted Publisher for each package is bound to its **specific workflow filename**, so renaming a workflow file breaks OIDC publish until the npm configuration is updated.
@@ -92,6 +93,7 @@ Satellites live in `satellites/*` of the npm workspaces monorepo and, with respect to core, **
 | `haechi-auth-jwt` | `auth-jwt-v<semver>` | `auth-jwt-publish.yml` | `satellites/auth-jwt/package.json` |
 | `haechi-dashboard` | `dashboard-v<semver>` | `dashboard-publish.yml` | `satellites/dashboard/package.json` |
 | `haechi-auth-oidc` | `auth-oidc-v<semver>` | `auth-oidc-publish.yml` | `satellites/auth-oidc/package.json` |
+| `haechi-ratelimit-redis` | `ratelimit-redis-v<semver>` | `ratelimit-redis-publish.yml` | `satellites/ratelimit-redis/package.json` |
 **Verify a satellite release** (same trust anchors as core):
@@ -104,6 +106,8 @@ npm view haechi-crypto-kms --json   # dist.attestations present; access "p
 **0.9 satellites (new unscoped names; configure Trusted Publisher *before* the first tag):** `haechi-dashboard` and `haechi-auth-oidc` are first published in 0.9 and follow the same per-satellite bootstrap order above. As with the 0.8 satellites, the unscoped name is claimed on the first OIDC publish, so the npmjs.com Trusted Publisher for each must be configured **before** its first tag: link the `raeseoklee/haechi` repository and the exact workflow filename (`dashboard-publish.yml` for `haechi-dashboard`, `auth-oidc-publish.yml` for `haechi-auth-oidc`), then push the prefixed tag (`dashboard-v0.1.0`, `auth-oidc-v0.1.0`) and publish the GitHub Release. The two existing satellites keep using their already-bootstrapped tags/workflows: `haechi-auth-jwt@0.2.0` on `auth-jwt-v<semver>` (`auth-jwt-publish.yml`) and `haechi-crypto-kms@0.2.0` on `crypto-kms-v<semver>` (`crypto-kms-publish.yml`). Those two need no new Trusted Publisher configuration.
+**`haechi-ratelimit-redis` (new unscoped name; configure Trusted Publisher *before* the first tag):** the shared-store rate-limiter satellite is first published from its own `ratelimit-redis-v<semver>` tag and follows the same per-satellite bootstrap order above. The unscoped name is claimed on the first OIDC publish, so the npmjs.com Trusted Publisher must be configured **before** the first tag: link the `raeseoklee/haechi` repository and the exact workflow filename `ratelimit-redis-publish.yml`, then push the prefixed tag (`ratelimit-redis-v0.1.0`) and publish the GitHub Release. The `redis` client is an **optional peer dependency**, imported only by consumers that use the bundled Redis adapter (the store/client is injected), so core stays zero-dependency.
 ## 6. Deployment block conditions
 npm publish is not performed if any of the following fail.

package/docs/current/release-process.md CHANGED

@@ -1,6 +1,6 @@
 # Haechi Release Process
-- Status: Living document (tracks core 1.1.x)
+- Status: Living document (tracks core 1.3.x)
 - Date: 2026-06-10
 ## 1. Local Release Verification
@@ -70,6 +70,7 @@ npm audit signatures
 | `.github/workflows/auth-jwt-publish.yml` | `haechi-auth-jwt` | `auth-jwt-v<semver>` | satellite publish, same signed-artifacts path |
 | `.github/workflows/dashboard-publish.yml` | `haechi-dashboard` | `dashboard-v<semver>` | satellite publish, same signed-artifacts path |
 | `.github/workflows/auth-oidc-publish.yml` | `haechi-auth-oidc` | `auth-oidc-v<semver>` | satellite publish, same signed-artifacts path |
+| `.github/workflows/ratelimit-redis-publish.yml` | `haechi-ratelimit-redis` | `ratelimit-redis-v<semver>` | satellite publish, same signed-artifacts path |
 Each publish workflow triggers on `release: published` but is **guarded** so the two never cross-fire: the core job runs only for tags starting `v` (and re-validates `^v[0-9]+\.[0-9]+\.[0-9]+$`); the satellite job runs only for `crypto-kms-v…` (and re-validates `^crypto-kms-v[0-9]+\.[0-9]+\.[0-9]+$` **and** that the tag version equals the satellite's `package.json` version). The npmjs.com Trusted Publisher for each package is bound to its **specific workflow filename** — renaming a workflow file breaks its OIDC publish until the npm config is updated.
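The tag guards described in this paragraph can be illustrated with a small classifier. The two regular expressions are the ones quoted above, with a capture group added to the satellite pattern so the version can be compared; the `publishTarget` function and its return values are hypothetical and do not reproduce the workflow YAML.

```javascript
const CORE_TAG = /^v[0-9]+\.[0-9]+\.[0-9]+$/;
// Capture group added for illustration so the version can be extracted.
const CRYPTO_KMS_TAG = /^crypto-kms-v([0-9]+\.[0-9]+\.[0-9]+)$/;

// Returns which publish job (if any) a release tag would be allowed to run.
// satellitePackageVersion stands in for the satellite's package.json version.
function publishTarget(tag, satellitePackageVersion) {
  if (CORE_TAG.test(tag)) return "core";
  const match = CRYPTO_KMS_TAG.exec(tag);
  if (match && match[1] === satellitePackageVersion) return "crypto-kms";
  return null; // any other tag publishes nothing from these two jobs
}
```

Because both patterns are anchored with `^` and `$`, a satellite tag such as `crypto-kms-v0.2.0` can never satisfy the core pattern, and a pre-release tag such as `v1.3.0-rc.1` matches neither.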
@@ -92,6 +93,7 @@ No manual `npm publish` from a laptop is needed. Because the names are unscoped
 | `haechi-auth-jwt` | `auth-jwt-v<semver>` | `auth-jwt-publish.yml` | `satellites/auth-jwt/package.json` |
 | `haechi-dashboard` | `dashboard-v<semver>` | `dashboard-publish.yml` | `satellites/dashboard/package.json` |
 | `haechi-auth-oidc` | `auth-oidc-v<semver>` | `auth-oidc-publish.yml` | `satellites/auth-oidc/package.json` |
+| `haechi-ratelimit-redis` | `ratelimit-redis-v<semver>` | `ratelimit-redis-publish.yml` | `satellites/ratelimit-redis/package.json` |
 **Verify a satellite release** (same anchors as core):
@@ -104,6 +106,8 @@ npm view haechi-crypto-kms --json   # dist.attestations present; access "public"
 **0.9 satellites (new unscoped names — configure Trusted Publisher *before* the first tag):** `haechi-dashboard` and `haechi-auth-oidc` are first-published in 0.9 and follow the same per-satellite bootstrap order above. As with the 0.8 satellites, the unscoped name is claimed on first OIDC publish, so the npmjs.com Trusted Publisher for each must be configured **before** its first tag — link `raeseoklee/haechi` and the exact workflow filename (`dashboard-publish.yml` for `haechi-dashboard`, `auth-oidc-publish.yml` for `haechi-auth-oidc`), then push the prefixed tag (`dashboard-v0.1.0`, `auth-oidc-v0.1.0`) and publish the GitHub Release. The two existing satellites ride their already-bootstrapped tags/workflows: `haechi-auth-jwt@0.2.0` on `auth-jwt-v<semver>` (`auth-jwt-publish.yml`) and `haechi-crypto-kms@0.2.0` on `crypto-kms-v<semver>` (`crypto-kms-publish.yml`) — no new Trusted Publisher configuration is required for those two.
+**`haechi-ratelimit-redis` (new unscoped name — configure Trusted Publisher *before* the first tag):** the shared-store rate-limiter satellite is first-published from its own `ratelimit-redis-v<semver>` tag and follows the same per-satellite bootstrap order above. The unscoped name is claimed on its first OIDC publish, so its npmjs.com Trusted Publisher must be configured **before** its first tag — link `raeseoklee/haechi` and the exact workflow filename `ratelimit-redis-publish.yml`, then push the prefixed tag (`ratelimit-redis-v0.1.0`) and publish the GitHub Release. The `redis` client is an **optional peer dependency**, imported only by consumers using the bundled Redis adapter (the store/client is injected), so core stays zero-dependency.
 ## 6. Deployment block conditions
 npm publish is not performed if any of the following fail.

package/docs/current/risk-register-release-gate.ko.md CHANGED

@@ -1,8 +1,8 @@
 # Haechi Risk Register and Release Gate
-- Document status: Living document (tracks core 1.1.x)
+- Document status: Living document (tracks core 1.3.x)
 - Date: 2026-06-11
-- Baseline version: 1.1.x
+- Baseline version: 1.3.x
 - Baseline branch: `main`
 ## 1. Current assessment
@@ -23,10 +23,12 @@ Haechi has shipped the `1.x` stable line. The developer preview gate
 | G0 | GitHub source release | Tests pass, security limitations documented, no plaintext audit leak | Pass |
 | G1 | GitHub pre-release | P0 code risks resolved, no production-ready wording | Pass |
 | G2 | npm developer preview | P0 resolved, preflight/SBOM/provenance path ready, npm auth verified | Pass (`haechi@0.3.2` published 2026-06-10) |
-| G3 | npm stable | P1 operations reference, stream-aware enforcement, strengthened API stability | Blocked |
+| G3 | npm stable | P1 operations reference, stream-aware enforcement, strengthened API stability | Pass (achieved at the 1.0.0 stable cut: streaming inspection shipped in 0.5 and the API freeze in 1.0.0; see G5. Superseded by G5–G7.) |
 | G4 | 0.9.0 observability + interactive-auth satellite cut | P1-SEC-026 / P1-OPS-009 mitigated and P2-CRYPTO-001 accepted; `haechi-dashboard` + `haechi-auth-oidc` + `haechi-crypto-kms@0.2.0` tests pass; satellite tarballs zero-dep; core 0.9.0 bump (additive FORBIDDEN_KEYS audit hardening only) | Pass |
 | G5 | 1.0.0 stable API contract + signed-plugin sandbox | P1-SEC-024 / P1-SEC-025 mitigated, P2-API-001 / P2-OPS-006 resolved; API freeze + deprecation policy + `tests/api-contract.test.mjs` pass; Ed25519 signed-plugin contract + `assertAuthProviderConformance` + worker-isolated `authProvider` sandbox tests pass; PR0 satellite peer ranges widened to `>=0.8.0 <2.0.0` and the `check-satellite-peer-ranges.mjs` preflight gate passes; core stays zero runtime dependency; core 1.0.0 bump | Pass |
 | G6 | 1.1.0 plugin capability enforcement (`process-isolated`) | P1-SEC-027 / P1-SEC-028 mitigated; `process-isolated` runtime (child under `--permission`, zero grants, loaded from a `data:` URL, stdio ignored, JSON-string IPC) + fail-closed `--allow-net` feature detection (`netEnforcement:"require-permission"`) + core `haechi/ssrf` guard + host-brokered key material + spawn-storm circuit breaker; fs/net/stdio red-team + SSRF + config tests pass (the behavioral suite runs on a Node with `--allow-net`, otherwise it skips fail-closed); API freeze still passes (additive `./ssrf` export + additive config keys); core stays zero runtime dependency; core 1.1.0 bump (additive + opt-in minor) | Pass |
+| G7 | 1.2.0 reliability hardening track (WS1–WS6) | Detection quality measured + hardened (WS2: labeled-corpus precision/recall `bench:detection` gate, credential + international PII coverage, `filters.minConfidence` / `filters.allowlist` with the hard-block type invariant applied, NFKC Unicode evasion folding with offset integrity); WS3 injectable `rateLimiter` seam + bounded fixed-window map; WS4 operability (`/__haechi/live` + `/ready` split, injectable `/metrics`, structured logs + per-request `correlationId`, graceful drain, max-in-flight backpressure, env overlay, hardened Dockerfile/compose/runbook, `configVersion`); WS6 proxy TLS / remote-bind hardening (`proxy.tls` / `proxy.trustForwardedProto`, fail-closed `assertSafeProxyTransport`) + OWASP-LLM/NIST control-mapping whitepaper + RFC 9116 `security.txt` + vulnerability disclosure path. All changes are additive behind defaults that preserve 1.1 behavior (`tests/api-contract.test.mjs` passes); the no-plaintext-in-audit invariant is extended to telemetry; core stays zero runtime dependency; core 1.2.0 bump (additive minor) | Pass |
+| G8 | 1.3.0 backend + detection coverage expansion | Protocol adapters added for the **Anthropic Messages API** (`/v1/messages`, content-block + SSE `delta.text`, re-serialization that preserves `event:` lines) and the **Google Gemini API** (model-in-path `:generateContent`/`:streamGenerateContent`, an additive `:method`-suffix route matcher that leaves the existing exact-match adapters byte-identical); detection coverage expanded with cloud/SaaS provider keys (OpenAI/Anthropic/Google-OAuth/SendGrid/Twilio/npm/Azure, anchored) and international PII (FR/ES/JP + IT/SG/IN/DE/NL national IDs, checksum validators), with each hard-block vs. dial-eligible decision based on measured collision rates (a hard-block needs a non-numeric anchor or an implausibly rare shape; bare-digit runs of common lengths remain clearable via allowlist); `bench:throughput` proxy load bench; `haechi-ratelimit-redis` shared-store rate-limiter satellite (the operational consumer of the WS3 seam; the proxy now `await`s `rateLimiter.allow`); `haechi-dashboard` exposes the per-request `correlationId`. All changes are additive: new `target.type` / detection-type / `privacy.profile` *values*, not new config keys (`configVersion` stays `1`); `tests/api-contract.test.mjs` passes; core stays zero runtime dependency; core 1.3.0 bump (additive minor) | Pass |
 ## 3. P0 배포 차단 리스크 상태

package/docs/current/risk-register-release-gate.md

@@ -1,8 +1,8 @@
 # Haechi Risk Register and Release Gates
-- Status: Living document (tracks core 1.1.x)
+- Status: Living document (tracks core 1.3.x)
 - Date: 2026-06-11
-- Target version: 1.1.x
+- Target version: 1.3.x
 - Branch: `main`
 ## 1. Current Assessment
@@ -23,10 +23,12 @@ Haechi has shipped its `1.x` stable line. The developer-preview gate (G2, `haech
 | G0 | GitHub source publication | Tests pass, security limitations documented, no plaintext audit leak | Pass |
 | G1 | GitHub pre-release | P0 code risks resolved, no production-ready language | Pass |
 | G2 | npm developer preview | P0 resolved, preflight/SBOM/provenance paths ready, npm auth confirmed | Pass (`haechi@0.3.2` published 2026-06-10) |
-| G3 | npm stable | P1 production reference, stream-aware enforcement, API stability hardened | Blocked |
+| G3 | npm stable | P1 production reference, stream-aware enforcement, API stability hardened | Pass (achieved at the 1.0.0 stable cut — streaming inspection shipped in 0.5, the API freeze in 1.0.0; see G5. Superseded by G5–G7.) |
 | G4 | 0.9.0 observability + interactive-auth satellite cut | P1-SEC-026 / P1-OPS-009 mitigated and P2-CRYPTO-001 accepted; `haechi-dashboard` + `haechi-auth-oidc` + `haechi-crypto-kms@0.2.0` tests green; satellite tarballs zero-dep; core bumped to 0.9.0 (only an additive FORBIDDEN_KEYS audit hardening) | Pass |
 | G5 | 1.0.0 stable API contract + signed-plugin sandbox | P1-SEC-024 / P1-SEC-025 mitigated, P2-API-001 / P2-OPS-006 resolved; the API freeze + deprecation policy + `tests/api-contract.test.mjs` green; the Ed25519 signed-plugin contract + `assertAuthProviderConformance` + the worker-isolated `authProvider` sandbox tests green; PR0 satellite peer-ranges widened to `>=0.8.0 <2.0.0` and the `check-satellite-peer-ranges.mjs` preflight gate green; core stays zero runtime dependency; core bumped to 1.0.0 | Pass |
 | G6 | 1.1.0 plugin capability enforcement (`process-isolated`) | P1-SEC-027 / P1-SEC-028 mitigated; the `process-isolated` runtime (child under `--permission`, zero grants, `data:`-URL load, stdio-ignored, JSON-string IPC) + the fail-closed `--allow-net` feature detection (`netEnforcement:"require-permission"`) + the core `haechi/ssrf` guard + host-mediated key material + the spawn-storm circuit breaker; the fs/net/stdio red-team + SSRF + config tests green (the behavioral suite runs on a `--allow-net` Node and skips fail-closed otherwise); the API freeze stays green (additive `./ssrf` export + additive config keys); core stays zero runtime dependency; core bumped to 1.1.0 (additive + opt-in minor) | Pass |
+| G7 | 1.2.0 Reliability Hardening Track (WS1–WS6) | Detection quality measured + tightened (WS2: a labeled-corpus precision/recall `bench:detection` gate, credential + international-PII coverage, `filters.minConfidence` / `filters.allowlist` with the hard-block-types invariant, NFKC unicode-evasion folding with offset-integrity); WS3 injectable `rateLimiter` seam + bounded fixed-window map; WS4 operability (`/__haechi/live`+`/ready` split, injectable `/metrics`, structured logs + per-request `correlationId`, graceful drain, max-in-flight backpressure, env overlay, hardened Dockerfile/compose/runbook, `configVersion`); WS6 proxy TLS / remote-bind hardening (`proxy.tls` / `proxy.trustForwardedProto`, fail-closed `assertSafeProxyTransport`) + OWASP-LLM/NIST control-mapping whitepaper + RFC 9116 `security.txt` + vulnerability-disclosure path. Every change is additive behind 1.1-preserving defaults (`tests/api-contract.test.mjs` green); the no-plaintext-in-audit invariant extends to telemetry; core stays zero runtime dependency; core bumped to 1.2.0 (additive minor) | Pass |
+| G8 | 1.3.0 backend + detection coverage expansion | New protocol adapters for the **Anthropic Messages API** (`/v1/messages`, content-block + SSE `delta.text` with `event:`-line-preserving re-serialize) and the **Google Gemini API** (model-in-path `:generateContent`/`:streamGenerateContent` via an additive `:method`-suffix route matcher that leaves the exact-match adapters byte-identical); detection coverage expansion — cloud/SaaS provider keys (OpenAI/Anthropic/Google-OAuth/SendGrid/Twilio/npm/Azure, anchored) and international PII (FR/ES/JP + IT/SG/IN/DE/NL national IDs with checksum validators), each hard-block-vs-dial-eligible decision driven by measured collision rates (a non-numeric anchor or implausibly-rare shape is required for hard-block; a bare-digit run over a common length stays allowlist-clearable); a `bench:throughput` proxy load benchmark; the `haechi-ratelimit-redis` shared-store rate-limiter satellite (the WS3 seam's production consumer; the proxy now `await`s `rateLimiter.allow`); `haechi-dashboard` surfaces the per-request `correlationId`. Every change is additive — new `target.type`/detection-type/`privacy.profile` *values*, not new config keys (`configVersion` stays `1`); `tests/api-contract.test.mjs` green; core stays zero runtime dependency; core bumped to 1.3.0 (additive minor) | Pass |
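The G8 row pairs national-ID patterns with checksum validators. As a rough illustration of why a checksum lowers the collision rate of a plain digit run, the sketch below validates a Spanish DNI (eight digits plus a control letter selected by the number modulo 23). The function name `isValidSpanishDni` is hypothetical, and this register does not state which ES identifier or algorithm the Haechi filter actually implements, so treat it as an example of the technique rather than the package's code.

```javascript
// Control letters for the Spanish DNI, indexed by (number mod 23).
const DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE";

// Hypothetical helper (not Haechi's API): returns true when `candidate` is
// eight digits followed by the control letter matching the number modulo 23.
function isValidSpanishDni(candidate) {
  const normalized = String(candidate).trim().toUpperCase();
  const match = /^(\d{8})([A-Z])$/.exec(normalized);
  if (match === null) return false; // wrong shape: not 8 digits + 1 letter
  const expected = DNI_LETTERS[Number(match[1]) % 23];
  return match[2] === expected;
}

console.log(isValidSpanishDni("12345678Z")); // true  (12345678 % 23 === 14 → "Z")
console.log(isValidSpanishDni("12345678A")); // false (control letter mismatch)
```

Only one of the 23 possible control letters passes for a given number, so a random eight-digit-plus-letter string is rejected most of the time. That is the kind of measured reduction the row's hard-block-vs-dial-eligible decisions appear to rely on.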
 ## 3. P0 Distribution-Blocking Risk Status
@@ -126,6 +128,14 @@ These IDs are scoped to the 1.0.0 stable cut (the API freeze + the Ed25519 signe
 | P1-SEC-027 | Plugin capability *enforcement*: the 1.0 `worker_threads` sandbox is memory/crash isolation only, so a malicious signed plugin can use `fs`/`net` and exfiltrate the credential. **Strengthens P1-SEC-024's accepted worker residual** — 1.1 adds real enforcement for a new opt-in runtime | Mitigated | `packages/plugin/process-sandbox.mjs` `createProcessIsolatedAuthProvider`/`…Sync` (PR #54): a signed `authProvider` runs in a child `node` under `--permission` with **zero grants** (no fs/child-process/worker/addons/wasi, no `--allow-net`), loaded from a `data:` URL (no fs grant → no TOCTOU/symlink surface), `stdio:['ignore','ignore','ignore','ipc']` (no stdout/stderr/fd leak channel), scrubbed env, JSON-string-only IPC + the shared null-proto sanitizer + host-side keyed-HMAC identity. **Empirically validated on Node 26**: the plugin's `fs`/`net`/`fetch`/`dns`/`child_process`/`worker` and the `process.binding('tcp_wrap')` bypass are all `ERR_ACCESS_DENIED`. Network containment is the **kernel `--allow-net` denial**, not a deletable JS harness; the default `netEnforcement:"require-permission"` **fails closed** (behavior-probed feature detection; PR #54) on a Node that cannot enforce it. A spawn-storm circuit breaker (PR #56) bounds respawns. Lifecycle audit gains host-computed/enum-only `isolation`/`grants`/`netEnforcement` (PR #56). Config: `auth.plugin.isolation:"process"` wired fail-closed (PR #56). Tests: the fs/net/stdio red-team (skipped on a Node without `--allow-net`, where the runtime fails closed instead) + the always-run fail-closed contract + the config matrix. **Residual:** a Node without `--allow-net` (fail-closed, not contained); a `networkEgress`-granted plugin; credential/key material in child memory (core-dump/swap); a V8/Node escape (a runtime control, not an OS sandbox) |
 | P1-SEC-028 | Host-mediated key material + SSRF: a custom-credential plugin needing key material could be a plugin-driven SSRF vector, and core had no SSRF guard (the satellites' copies are unreachable from core) | Mitigated | A new node:-only, zero-dependency **`haechi/ssrf`** core module (PR #55): `isBlockedAddress` (private/loopback/link-local/metadata), `guardedFetch` (https-only, post-DNS re-check, `redirect:"error"`, bounded body + timeout), `createGuardedKeyFetcher` (TTL cache + cooldown). The `process-isolated` runtime's optional `keyMaterial:{url}` is fetched by the **host** from the **operator-declared** URL through this guard and injected over the IPC — the plugin never names a URL (no plugin-driven SSRF), and the kid-refetch cooldown bounds the outbound rate; a blocked-address URL fails closed. Tests: the canonical `isBlockedAddress` vector table + a core-vs-`auth-jwt` parity guard, `guardedFetch` SSRF refusal/bounding, the cooldown fail-closed, and the runtime key-injection + no-SSRF tests. **Residual:** the satellites keep their DELIBERATE local copies (a crypto/auth package must not runtime-depend on core-ssrf; `crypto-kms/ssrf-parity.test.mjs`) — the core re-import is deferred and the drift is guarded by parity, not eliminated; the guard's DNS-rebinding window (resolve-then-connect) is accepted for an operator-declared URL |
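The address classes that P1-SEC-028 lists for `isBlockedAddress` (private, loopback, link-local, metadata) can be sketched for IPv4 as follows. This is a simplified stand-in, not the `haechi/ssrf` implementation: the names `parseIPv4` and `isBlockedIPv4Sketch` are hypothetical, IPv6 and IPv4-mapped forms are deliberately out of scope, and the choice to treat unparseable input as blocked is an assumption made to mirror the fail-closed posture described above.

```javascript
// Hypothetical helper: parse a dotted-quad string into four octets, or null.
function parseIPv4(text) {
  const parts = String(text).split(".");
  if (parts.length !== 4) return null;
  const octets = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const value = Number(part);
    if (value > 255) return null;
    octets.push(value);
  }
  return octets;
}

// Hypothetical sketch of an IPv4-only blocked-address check.
// Assumption: anything that does not parse is treated as blocked (fail closed).
function isBlockedIPv4Sketch(address) {
  const octets = parseIPv4(address);
  if (octets === null) return true;
  const [a, b] = octets;
  return (
    a === 0 ||                          // 0.0.0.0/8 "this network"
    a === 10 ||                         // 10.0.0.0/8 private
    a === 127 ||                        // 127.0.0.0/8 loopback
    (a === 169 && b === 254) ||         // 169.254.0.0/16 link-local, incl. 169.254.169.254 metadata
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12 private
    (a === 192 && b === 168)            // 192.168.0.0/16 private
  );
}

console.log(isBlockedIPv4Sketch("169.254.169.254")); // true
console.log(isBlockedIPv4Sketch("8.8.8.8"));         // false
```

A check like this only helps if it runs on the address that is actually connected to, which is why the row above describes a post-DNS re-check and still records the resolve-then-connect window as an accepted residual.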
+## 5.6 Reliability Hardening Track — Horizontal-scale & State Safety (WS3)
+Additive, accumulating on `main` toward a later `1.2.0` minor; the seam + honest docs, never a built-in distributed store (track §3 non-goal).
+| ID | Risk | Status | Resolution evidence |
+|---|---|---|---|
+| P1-OPS-010 | Proxy rate limiter is single-process and **not injectable**, and its fixed-window `Map` is **never pruned** — a one-shot identity's slot lingers forever, so a high-cardinality identity stream is unbounded memory growth keyed by identity; and a multi-replica deployment silently weakens the limit (per-process throughput multiplies by the replica count) with no replaceable seam | Mitigated | The rate limiter is now an **injectable collaborator** mirroring `cryptoProvider`/`auditSink`/`tokenVault`: `createRuntime(config, { rateLimiter })` (`packages/cli/runtime.mjs`) supplies it, `assertProvider("rateLimiter", …, ["allow"])` fails closed at construction if it lacks `allow()`, and it is exposed on the returned runtime object; the proxy consults `runtime.rateLimiter` (`packages/proxy/index.mjs`, with a backward-compatible local-default fallback for a hand-built runtime). The default per-process in-memory fixed-window limiter (the documented default; `allow(key, limit) -> boolean`, 429 semantics unchanged) is **self-bounding**: a lazy, amortized sweep evicts fully-expired window slots once the `Map` crosses a size threshold — **no background timer** (so `node --test` does not hang). A multi-replica operator injects a shared-store implementation (e.g. Redis) satisfying the same contract, or enforces the limit at a shared front door. Docs: `configuration.md`(+ko) "Rate limiter injection" seam, `shared-responsibility.md`(+ko) §4. Tests: `tests/rate-limiter.test.mjs` — an injected limiter is the one consulted (deny→429, allow→pass-through), fail-closed on a missing `allow()`, the default limiter prunes aged-out one-shot identities (bounded `Map` via `_size()`), and the fixed-window limit/isolation semantics are unchanged; the existing `tests/proxy-auth.test.mjs` 429 test stays green. **Residual:** core ships **no** built-in distributed limiter (track non-goal §5) — a shared-store implementation is the operator's injection or a future satellite; the default's per-process scope is the documented honest default |
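The self-bounding default described in P1-OPS-010 can be pictured with a minimal fixed-window limiter that sweeps lazily instead of running a timer. This is a sketch under stated assumptions, not the code in `packages/cli/runtime.mjs`: the factory name `createFixedWindowLimiter` and its options (`windowMs`, `sweepThreshold`, `now`) are hypothetical, and only the `allow(key, limit) -> boolean` contract and the `_size()` test hook come from the row above.

```javascript
// Hypothetical factory: a per-process fixed-window limiter with a lazy sweep.
function createFixedWindowLimiter({ windowMs = 60_000, sweepThreshold = 1024, now = Date.now } = {}) {
  const slots = new Map(); // key -> { windowStart, count }
  let nextSweepAt = sweepThreshold;

  // Evict every slot whose window has fully expired.
  function sweep(t) {
    for (const [key, slot] of slots) {
      if (t - slot.windowStart >= windowMs) slots.delete(key);
    }
  }

  return {
    // Contract from the register: allow(key, limit) -> boolean.
    allow(key, limit) {
      const t = now();
      // Lazy sweep: runs on the request path only once the map is large, and
      // the trigger is raised afterwards so the cost stays amortized.
      if (slots.size >= nextSweepAt) {
        sweep(t);
        nextSweepAt = Math.max(sweepThreshold, slots.size * 2);
      }
      let slot = slots.get(key);
      if (slot === undefined || t - slot.windowStart >= windowMs) {
        slot = { windowStart: t, count: 0 }; // start a fresh window
        slots.set(key, slot);
      }
      if (slot.count >= limit) return false; // over the limit: caller answers 429
      slot.count += 1;
      return true;
    },
    // Test hook mirroring the `_size()` mentioned in the row above.
    _size() {
      return slots.size;
    },
  };
}

// Usage with a fake clock so the behavior is deterministic.
let fakeTime = 0;
const limiter = createFixedWindowLimiter({ windowMs: 1000, sweepThreshold: 3, now: () => fakeTime });
console.log(limiter.allow("client-a", 2)); // true
console.log(limiter.allow("client-a", 2)); // true
console.log(limiter.allow("client-a", 2)); // false
```

Because no `setInterval` is involved, nothing keeps the event loop alive after the last request, which matches the "no background timer" point in the row. An object with this shape is what the row describes passing as `createRuntime(config, { rateLimiter })`. A shared-store variant would return a promise from `allow`, which fits the G8 note that the proxy now `await`s `rateLimiter.allow`.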
 ## 6. P2 Product/Documentation Risk Status
 | ID | Risk | Status | Resolution evidence |