npm - haechi - Versions diffs - 1.1.1 → 1.2.0 - Mend

haechi 1.1.1 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (34)

package/README.ko.md +97 -97
package/README.md +2 -2
package/SECURITY.md +19 -11
package/docs/README.md +2 -0
package/docs/current/api-stability.ko.md +26 -26
package/docs/current/compliance-mapping.ko.md +53 -0
package/docs/current/compliance-mapping.md +53 -0
package/docs/current/config-version.ko.md +30 -0
package/docs/current/config-version.md +51 -0
package/docs/current/configuration.ko.md +242 -102
package/docs/current/configuration.md +149 -9
package/docs/current/operations-runbook.ko.md +121 -0
package/docs/current/operations-runbook.md +204 -0
package/docs/current/release-process.ko.md +19 -20
package/docs/current/release-process.md +1 -2
package/docs/current/reliability-hardening-track.ko.md +77 -0
package/docs/current/reliability-hardening-track.md +77 -0
package/docs/current/risk-register-release-gate.ko.md +26 -27
package/docs/current/risk-register-release-gate.md +27 -20
package/docs/current/security-whitepaper.ko.md +102 -0
package/docs/current/security-whitepaper.md +102 -0
package/docs/current/shared-responsibility.ko.md +33 -24
package/docs/current/shared-responsibility.md +12 -3
package/docs/current/threat-model.ko.md +12 -12
package/docs/current/threat-model.md +3 -3
package/haechi.config.example.json +19 -3
package/package.json +6 -2
package/packages/audit/index.mjs +26 -2
package/packages/cli/bin/haechi.mjs +54 -8
package/packages/cli/runtime.mjs +398 -10
package/packages/core/index.mjs +189 -15
package/packages/filter/index.mjs +299 -9
package/packages/metrics/index.mjs +181 -0
package/packages/proxy/index.mjs +535 -41

package/docs/current/configuration.md CHANGED

@@ -1,7 +1,6 @@
 # Haechi Configuration Reference
-- Status: Living document
-- Target version: 0.6.0
+- Status: Living document (tracks core 1.2.x)
 `haechi init` writes `haechi.config.json`; a non-secret template is at `haechi.config.example.json`. Every command reads it with `--config <path>` (default `haechi.config.json`). Configuration is **validated fail-closed**: unknown providers, out-of-range numbers, and malformed values throw at load time rather than degrading silently. `haechi config` prints this reference; `haechi status` prints the *effective* state of a given config.
@@ -9,18 +8,21 @@
 ```json
 {
+  "configVersion": 1,
   "mode": "dry-run",
   "target": { "type": "llm-http", "adapter": "openai-compatible", "upstream": "http://127.0.0.1:9999" },
-  "proxy": { "host": "127.0.0.1", "port": 11016 },
+  "proxy": { "host": "127.0.0.1", "port": 11016, "tls": null, "trustForwardedProto": false },
   "responseProtection": { "enabled": false, "mode": "enforce", "failureMode": "fail-closed", "allowNonJson": false, "allowCompressed": false, "maxBytes": 1048576 },
   "streaming": { "requestMode": "block" },
-  "limits": { "maxRequestBytes": 1048576, "upstreamTimeoutMs": 120000 },
+  "limits": { "maxRequestBytes": 1048576, "upstreamTimeoutMs": 120000, "maxNestingDepth": 256, "maxInFlight": 0, "shutdownGraceMs": 10000, "requestTimeoutMs": null, "headersTimeoutMs": null },
   "policy": { "mode": "dry-run", "presets": ["korean-pii", "secrets-only", "llm-redact"], "defaultAction": "redact", "actions": { "card": "block" } },
   "filters": { "customRules": [] },
   "keys": { "provider": "local", "keyFile": ".haechi/dev.keys.json" },
   "audit": { "sink": "jsonl", "path": ".haechi/audit.jsonl" },
   "tokenVault": { "provider": "local", "path": ".haechi/token-vault.json", "revealPolicy": "disabled", "retentionDays": 30, "deterministic": false, "deterministicTypes": null, "detokenizeResponses": false },
   "privacy": { "profile": null },
+  "logging": { "format": "text" },
+  "metrics": { "enabled": true },
   "mcp": { "allowedMethods": ["initialize", "tools/call", "resources/read", "prompts/get"], "protectParams": true, "protectResults": true, "requireJsonRpc": true }
 }
 ```
@@ -29,6 +31,7 @@
 | Key | Type / values | Default | Notes |
 |---|---|---|---|
+| `configVersion` | positive integer | `1` | Config schema version stamp. Absent = treated as the current version. A value **newer** than this build understands **fails closed** at load; a non-positive/non-integer value throws. See [`config-version.md`](./config-version.md). |
 | `mode` | `dry-run` \| `report-only` \| `enforce` | `dry-run` | Global enforcement mode. `dry-run`/`report-only` detect + audit only; `enforce` transforms/blocks. Overridden by `policy.mode` when set. |
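The `configVersion` rule in the table above can be sketched as a small load-time check. This is an illustration only, not the package's validator: `SUPPORTED_CONFIG_VERSION` and `resolveConfigVersion` are hypothetical names, and the error messages are invented.

```javascript
// Hypothetical constant: the newest schema version this sketch "understands".
const SUPPORTED_CONFIG_VERSION = 1;

// Returns the effective version, or throws (fail-closed) as the table row describes.
function resolveConfigVersion(raw) {
  // Absent is treated as the current version.
  if (raw === undefined) return SUPPORTED_CONFIG_VERSION;
  // Non-integer or non-positive values are rejected outright.
  if (!Number.isInteger(raw) || raw <= 0) {
    throw new Error("configVersion must be a positive integer");
  }
  // A version newer than this build understands is refused rather than guessed at.
  if (raw > SUPPORTED_CONFIG_VERSION) {
    throw new Error(`configVersion ${raw} is newer than this build supports`);
  }
  return raw;
}
```

The point of the newer-than-supported branch is that an older build should not silently ignore keys it cannot interpret.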
 ## `target`
@@ -45,6 +48,8 @@
 |---|---|---|---|
 | `proxy.host` | non-empty string | `127.0.0.1` | Bind address. Non-loopback hosts require the `--allow-remote-bind` CLI flag — config alone will not start (see [Binding beyond loopback](#binding-beyond-loopback)). |
 | `proxy.port` | integer 0–65535 | `11016` | Listen port (`0` = ephemeral). Override per-run with `--port`. |
+| `proxy.tls` | `null` or `{ keyFile, certFile }` / `{ pfxFile, passphrase? }` | `null` | TLS material loaded from **file paths** at startup into a TLS context. When present, Haechi terminates TLS itself (serves `https`). Required (or `trustForwardedProto`) for a remote bind — see [Binding beyond loopback](#binding-beyond-loopback). Fail-closed: a non-null value that does not resolve to usable material `((key && cert) or pfx)`, mixes `pfxFile` with `keyFile`/`certFile`, or names an unreadable file throws at load. |
+| `proxy.trustForwardedProto` | boolean | `false` | Operator acknowledgement that a **trusted reverse proxy terminates TLS** in front of Haechi. When `true`, a remote bind may stay plain `http`, but Haechi then **refuses any request whose `X-Forwarded-Proto` is not `https`** (checked before auth/body; the `/__haechi/*` liveness routes are exempt). Never a substitute for real TLS when Haechi is itself internet-facing. |
 ## `responseProtection`
@@ -74,6 +79,11 @@ Inspects upstream JSON responses (off by default — turn on to protect what com
 |---|---|---|---|
 | `limits.maxRequestBytes` | positive integer | `1048576` | Request body cap; over the limit returns `413`. Enforced incrementally (the body is not fully buffered first). |
 | `limits.upstreamTimeoutMs` | positive integer | `120000` | Upstream request timeout; on expiry returns `504 haechi_upstream_timeout`. Connection failure returns `502 haechi_upstream_unreachable`. |
+| `limits.maxNestingDepth` | positive integer | `256` | Max JSON nesting depth walked during detection. A more deeply nested body is rejected `413 haechi_request_too_deeply_nested` (fail-closed, before upstream), guarding the recursive payload walk against a stack overflow. Bounds container descent; leaves at the limit are still inspected. (Separately, a non-UTF-8 request body is rejected fail-closed: `400 haechi_request_body_not_utf8`.) |
+| `limits.maxInFlight` | non-negative integer | `0` | Global max-in-flight backpressure ceiling. `0` disables it (no ceiling — 1.1 behavior). When `> 0` and the live in-flight count is at the ceiling, a **new** request is rejected `503` with a `Retry-After` header and `{ "error": "haechi_overloaded" }`, **before** auth/body-read. The `/__haechi/*` observability routes are **exempt** (liveness + metrics stay scrapable under saturation). Each rejection increments `haechi_overloaded_total`. See the [operations runbook](./operations-runbook.md#5-backpressure-tuning). |
+| `limits.shutdownGraceMs` | non-negative integer (ms) | `10000` | Graceful-shutdown grace period. On `SIGINT`/`SIGTERM` the proxy stops accepting connections, closes idle keep-alive sockets immediately, waits for in-flight requests to drain, then after this grace force-closes any lingering socket so a stuck keep-alive cannot hold shutdown open forever. Also seeds the backpressure `Retry-After` seconds. Set your orchestrator's termination grace **above** this value. |
+| `limits.requestTimeoutMs` | `null` \| non-negative integer (ms) | `null` | Maps to the Node HTTP server `requestTimeout`. `null` leaves Node's default untouched (behavior unchanged). Set a number to cap slow whole-request delivery; `0` disables the timeout (Node semantics). |
+| `limits.headersTimeoutMs` | `null` \| non-negative integer (ms) | `null` | Maps to the Node HTTP server `headersTimeout`. `null` leaves Node's default untouched. Set a number to cap slow header delivery (slow-loris); `0` disables it. |
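A minimal sketch of the `limits.maxInFlight` behavior described above, assuming a plain counter in front of the request handler. `createInFlightGate` and its return shape are hypothetical; how the real proxy derives the `Retry-After` seconds from `shutdownGraceMs` is not shown in this diff, so the value is passed in as a parameter here.

```javascript
// Hypothetical admission gate: a ceiling of 0 disables it, /__haechi/* is exempt.
function createInFlightGate(maxInFlight, retryAfterSeconds) {
  let inFlight = 0;
  return {
    // Call before auth/body-read; returns an admission decision.
    tryEnter(path) {
      if (path.startsWith("/__haechi/")) return { admitted: true, counted: false };
      if (maxInFlight > 0 && inFlight >= maxInFlight) {
        return {
          admitted: false,
          status: 503,
          headers: { "Retry-After": String(retryAfterSeconds) },
          body: { error: "haechi_overloaded" },
        };
      }
      inFlight += 1;
      return { admitted: true, counted: true };
    },
    // Call once when a counted request finishes (success or failure).
    leave() {
      if (inFlight > 0) inFlight -= 1;
    },
    get inFlight() {
      return inFlight;
    },
  };
}
```

A caller would pair each counted `tryEnter` with exactly one `leave`, typically in a `finally` block, so that a failed request does not leak a slot.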
 ## `policy`
@@ -93,6 +103,19 @@ The detect→decide core. See [Detection types & actions](#detection-types--acti
 | Key | Type / values | Default | Notes |
 |---|---|---|---|
 | `filters.customRules` | array of rule objects | `[]` | Extra detection rules: `{ id, type, pattern, flags?, confidence? }`. Patterns are ReDoS-screened (≤500 chars, no nested quantifiers, no backreferences) and rejected at load if unsafe. |
+| `filters.minConfidence` | number in `[0, 1]` | `0` | Precision dial. Each rule carries a `confidence` (0.6–0.95); a detection whose confidence is **below** this threshold is dropped before the policy decides. `0` (the default) gates nothing, preserving prior behavior. **Hard-block exemption:** a hard-block type (`secret`, `api_key`, `kr_rrn`, `card`) is **never** dropped on confidence — `minConfidence` trims only the precision-risky soft types (e.g. `phone`, `email`, `injection`), so a low-confidence credential/PII leak is still acted on (fail-closed). |
+| `filters.allowlist` | array of strings and/or `{ value?, path? }` | `[]` | Operator false-positive exceptions. A detection whose matched **value** equals a string/`value` entry, or whose PII-safe JSON **path** (the hashed `pathText`, as shown in the audit) equals a `path` entry, is suppressed before the policy decides (when an entry sets both `value` and `path`, **both** must match). **Hard-block exemption:** an entry that would suppress a hard-block type (`secret`/`api_key`/`kr_rrn`/`card`) is **ignored** and the detection still fires — the allowlist can only clear a benign **soft-type** FP, never silence a credential/PII leak. Every suppression and every `minConfidence` drop is **audited** by count and type (`summary.suppressedByType` / `summary.droppedByType` / `suppressedCount` / `droppedCount`) — never the raw value. Use this to clear one benign FP without deleting a whole rule. |
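The interaction between `minConfidence`, `allowlist`, and the hard-block exemption can be sketched as a filter over detections. This is an assumption-laden illustration: the detection shape (`type`, `value`, `pathText`, `confidence`), the function name `gateDetections`, and the order of the two checks are all hypothetical; only the rules themselves come from the table above.

```javascript
const HARD_BLOCK_TYPES = new Set(["secret", "api_key", "kr_rrn", "card"]);

// Returns the surviving detections plus per-type counts (never raw values).
function gateDetections(detections, { minConfidence = 0, allowlist = [] } = {}) {
  const kept = [];
  const droppedByType = {};
  const suppressedByType = {};
  const bump = (bag, type) => { bag[type] = (bag[type] ?? 0) + 1; };

  for (const d of detections) {
    // Hard-block types are never dropped or suppressed.
    if (HARD_BLOCK_TYPES.has(d.type)) { kept.push(d); continue; }
    if (d.confidence < minConfidence) { bump(droppedByType, d.type); continue; }

    const suppressed = allowlist.some((entry) => {
      if (typeof entry === "string") return entry === d.value;
      const hasValue = entry.value !== undefined;
      const hasPath = entry.path !== undefined;
      if (!hasValue && !hasPath) return false;
      // When both are set, both must match.
      return (!hasValue || entry.value === d.value) && (!hasPath || entry.path === d.pathText);
    });
    if (suppressed) { bump(suppressedByType, d.type); continue; }
    kept.push(d);
  }
  return { kept, droppedByType, suppressedByType };
}
```

Note that the summary carries only types and counts, which mirrors the audited `droppedByType` / `suppressedByType` fields.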
+### Detection benchmark
+Detection precision/recall is measured, not assumed. A labeled corpus of synthetic test fixtures (`tests/fixtures/detection-corpus.json` — positive samples per type plus benign hard-negatives) drives a per-type scorer:
+```bash
+npm run bench:detection   # print the per-type TP/FP/FN + precision/recall table
+npm run scan:detection    # CI regression gate: fail if any type regresses below baseline
+```
+`bench:detection` (`scripts/bench-detection.mjs`) runs the default filter engine over each corpus case and reports true/false positives and false negatives per type. `scan:detection` compares the live scores against the pinned baseline (`scripts/detection-baseline.json`) and **fails only on a regression** — a precision or recall drop below the recorded numbers. The baseline deliberately bakes in the current imperfect state (the audit-reproduced false positives on `phone`/`card`/`secret`, and the known coverage-gap misses for AWS/GitHub/Google/Slack keys, JWT, and PEM headers), so the gate passes today and trips only when a change makes detection worse. It runs in `release:preflight` after the doc-freshness gate. Regenerate the baseline after an intentional rule change with `node scripts/bench-detection.mjs --write-baseline` and review the diff. Closing the recorded gaps and false positives is WS2b/WS2c of the reliability-hardening track.
 ## `keys`
@@ -126,6 +149,69 @@ The detect→decide core. See [Detection types & actions](#detection-types--acti
 |---|---|---|---|
 | `privacy.profile` | `null` \| `kr-pipa` \| `eu-gdpr` \| `us-general` | `null` | Applies a regional baseline action set before enforcement. Profiles may **strengthen** but never weaken your explicit actions. Engineering defaults, not legal advice. |
+## `logging`
+| Key | Type / values | Default | Notes |
+|---|---|---|---|
+| `logging.format` | `text` \| `json` | `text` | `text` keeps the human-readable startup/shutdown/error lines (unchanged). `json` emits one single-line JSON object per event. Fail-closed: any other value throws. |
+In `json` mode the proxy's internal-error log is a single line `{ "level": "error", "event": "proxy_internal_error", "correlationId", "errorName", "statusCode" }`, and startup/shutdown emit `proxy_listening` / `proxy_shutdown` (plus `*_warn` events for remote-bind / non-enforce-mode / response-protection-disabled). **No log field ever carries a request/response payload, header, token, or any PII** — error logs carry the error *class name* and the request `correlationId` only.
+## `metrics`
+| Key | Type / values | Default | Notes |
+|---|---|---|---|
+| `metrics.enabled` | boolean | `true` | Gates the `GET /__haechi/metrics` route. When `false`, that route returns `404`. Fail-closed: a non-boolean throws. |
+The metrics collector is also an **injectable collaborator** (`createRuntime(config, { metrics })`); see [Operability endpoints](#operability-endpoints) for the contract and the no-PII guarantee.
+## Operability endpoints
+The proxy serves four unauthenticated endpoints under the reserved `/__haechi/*` prefix, checked **before** auth and body-read. They never proxy upstream.
+| Endpoint | Status | Body | Purpose |
+|---|---|---|---|
+| `GET /__haechi/live` | `200` | `{ ok: true, version }` | Cheap process liveness. |
+| `GET /__haechi/ready` | `200` / `503` | `{ ready, version, checks }` | Readiness. **Fail-closed**: a gateway that cannot append to its audit log is **not** ready (`503`). The default JSONL sink's `checks.auditWritable` confirms its audit directory/file is writable without writing an event; a sink lacking a `ready()`/`healthCheck()` method is treated as ready. |
+| `GET /__haechi/health` | `200` | `{ ok: true, mode, version }` | Back-compat (the original health endpoint, now with `version`). |
+| `GET /__haechi/metrics` | `200` / `404` | Prometheus text | Telemetry (see below). `404` when `metrics.enabled: false`. |
+`version` is the running package version (`package.json`).
+### Telemetry (`/__haechi/metrics`)
+The endpoint renders the **Prometheus text exposition format** (`# HELP` / `# TYPE` + `name{label="..."} value`), `Content-Type: text/plain`. Counters: `haechi_requests_total{route,mode,decision}` plus `haechi_blocks_total`, `haechi_auth_denied_total`, `haechi_rate_limited_total`, `haechi_upstream_timeout_total`, `haechi_upstream_error_total`, `haechi_response_unprotected_total`, `haechi_internal_error_total`; one histogram `haechi_request_duration_seconds{route}`.
+**No-PII-in-telemetry invariant.** Every metric name and **every label value** is a bounded enum — a route id, a policy mode, or a fixed decision class (`forwarded` / `blocked` / `auth_denied` / `rate_limited` / `model_not_allowed` / …). A metric label is **never** an identity id/subject, a token, or a detected value: there is no per-identity or per-value label cardinality. This is the no-plaintext-in-audit invariant extended to telemetry; the metrics module additionally length-caps and charset-sanitizes label values as defence in depth.
+### `providers.metrics` injection seam
+The metrics collector is supplied programmatically through `createRuntime(config, providers)` — the same seam as `cryptoProvider`/`authProvider`/`rateLimiter`. It is **not** a JSON config key.
+```js
+const runtime = createRuntime(config, { metrics });
+```
+An injected `metrics` must implement `increment(name, labels?, amount?)`, `observe(name, value, labels?)`, and `render() -> string`; `createRuntime` fails closed at construction if it does not. The **default** is a zero-dependency in-memory collector that renders the Prometheus text above. A multi-replica operator injects a shared/remote collector satisfying the same contract.
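To make the three-method contract concrete, here is a deliberately simplified collector that satisfies `increment`, `observe`, and `render`. It is a sketch, not the shipped default: `createSketchCollector` is a hypothetical name, observations are rendered as `_sum`/`_count` pairs instead of histogram buckets, there are no `# HELP`/`# TYPE` lines, and label values are not escaped or length-capped.

```javascript
// Renders a label object as {a="1",b="2"} with keys sorted for stable series names.
function labelText(labels) {
  const keys = Object.keys(labels ?? {}).sort();
  if (keys.length === 0) return "";
  return "{" + keys.map((k) => `${k}="${labels[k]}"`).join(",") + "}";
}

function createSketchCollector() {
  const counters = new Map();     // series text -> running total
  const observations = new Map(); // series text -> { name, text, sum, count }
  return {
    increment(name, labels, amount = 1) {
      const key = name + labelText(labels);
      counters.set(key, (counters.get(key) ?? 0) + amount);
    },
    observe(name, value, labels) {
      const text = labelText(labels);
      const entry = observations.get(name + text) ?? { name, text, sum: 0, count: 0 };
      entry.sum += value;
      entry.count += 1;
      observations.set(name + text, entry);
    },
    render() {
      const lines = [];
      for (const [series, total] of counters) lines.push(`${series} ${total}`);
      for (const { name, text, sum, count } of observations.values()) {
        lines.push(`${name}_sum${text} ${sum}`, `${name}_count${text} ${count}`);
      }
      return lines.join("\n") + "\n";
    },
  };
}
```

An object of this shape is what would be passed as `metrics` in `createRuntime(config, { metrics })`; a shared or remote collector would keep the same three methods and change only where the numbers are stored.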
+### `correlationId` (audit + logs)
+The proxy generates a per-**request** `correlationId` (a UUID). It is threaded into the protect context, so each request's request- and response-direction audit events carry the same additive top-level `correlationId` field, and into the proxy's internal-error log line — letting an operator join a logged error to its audit trail. It is `null` for non-proxy `protectJson()` calls (preserving prior behavior). The id is a UUID and is **never** a payload/identity/PII value.
+## Env-var configuration overlay (deploy)
+For container / 12-factor deploys, a **fixed allowlist of NON-SECRET operational keys** can be overridden from the environment. The env value **wins over the config file** and is validated **fail-closed** — an invalid value makes the process fail to start. Applied in `loadConfig()` after reading the file and before validation.
+| Env var | Config key | Type / values |
+|---|---|---|
+| `HAECHI_PROXY_PORT` | `proxy.port` | integer 0–65535 |
+| `HAECHI_PROXY_HOST` | `proxy.host` | non-empty string |
+| `HAECHI_UPSTREAM` | `target.upstream` | URL string |
+| `HAECHI_MODE` | `mode` | `dry-run` \| `report-only` \| `enforce` |
+| `HAECHI_LOG_FORMAT` | `logging.format` | `text` \| `json` |
+**Secrets are NOT overlayable — by design.** There is **no** `HAECHI_*` variable for `keys.*`, the auth token store, or any token/secret. Secrets stay in the config file or are supplied via injected providers (`createRuntime(config, { cryptoProvider, authProvider, … })`). Putting a secret in a process environment risks leaking it through `/proc`, crash dumps, and orchestrator inspect output, so the overlay allowlist excludes them. See the [operations runbook](./operations-runbook.md#2-configuration-via-the-env-var-overlay).
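The overlay rules above (fixed allowlist, env wins, invalid value fails startup) can be sketched as follows. `applyEnvOverlay` is a hypothetical helper written for illustration; the real step lives inside `loadConfig()` and its exact parsing rules are not shown in this diff.

```javascript
const MODES = ["dry-run", "report-only", "enforce"];
const LOG_FORMATS = ["text", "json"];

// Returns a new config with allowlisted env values applied; throws on any invalid value.
function applyEnvOverlay(config, env) {
  const out = structuredClone(config);

  if (env.HAECHI_PROXY_PORT !== undefined) {
    const raw = env.HAECHI_PROXY_PORT;
    const port = Number(raw);
    if (!/^\d+$/.test(raw) || port > 65535) throw new Error("invalid HAECHI_PROXY_PORT");
    out.proxy = { ...out.proxy, port };
  }
  if (env.HAECHI_PROXY_HOST !== undefined) {
    if (env.HAECHI_PROXY_HOST.trim() === "") throw new Error("invalid HAECHI_PROXY_HOST");
    out.proxy = { ...out.proxy, host: env.HAECHI_PROXY_HOST };
  }
  if (env.HAECHI_UPSTREAM !== undefined) {
    new URL(env.HAECHI_UPSTREAM); // the URL constructor throws on a malformed value
    out.target = { ...out.target, upstream: env.HAECHI_UPSTREAM };
  }
  if (env.HAECHI_MODE !== undefined) {
    if (!MODES.includes(env.HAECHI_MODE)) throw new Error("invalid HAECHI_MODE");
    out.mode = env.HAECHI_MODE;
  }
  if (env.HAECHI_LOG_FORMAT !== undefined) {
    if (!LOG_FORMATS.includes(env.HAECHI_LOG_FORMAT)) throw new Error("invalid HAECHI_LOG_FORMAT");
    out.logging = { ...out.logging, format: env.HAECHI_LOG_FORMAT };
  }
  // No other HAECHI_* variable is read: secrets are deliberately outside the allowlist.
  return out;
}
```

In use, the call would look like `applyEnvOverlay(fileConfig, process.env)`, with the result then passed to the normal validation step.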
 ## `mcp`
 Applies to `mcp-stdio` and `mcp-wrap`.
@@ -173,15 +259,54 @@ Per-client controls layered on top of the base `policy`. See [Named profiles](#n
 | `policy.profiles` | `{ <name>: { presets?, actions?, modelAllowlist?, rate? } }` | `{}` | Named profiles; each overrides the base policy. |
 | `policy.profileBinding` | `{ byScope?, byLabel?, default }` | unset | Maps identity scopes/labels (`"k=v"` for labels) to profile names. `default` is **required** when `profiles` is set and should be the strictest profile (fail-closed). |
 | `policy.modelAllowlist` | string array | unset | Allowed `model` values (base level; also settable per profile). A disallowed model → `403`. Empty/absent = allow all. |
-| `policy.rate` | `{ requestsPerMinute }` | unset | Per-identity request rate limit (base level or per profile). Over the limit → `429`. In-memory, per-process. |
+| `policy.rate` | `{ requestsPerMinute }` | unset | Per-identity request rate limit (base level or per profile). Over the limit → `429`. In-memory, per-process; see [Rate limiter injection](#rate-limiter-injection) for the multi-replica seam. |
 ### Named profiles
 When an identity authenticates, its profile resolves in order **scope → label → `default`**; scope precedes label and the first match wins. Without `profiles`, or under `auth.provider: none`, the base policy applies. The resolved profile's policy engine, `modelAllowlist`, and `rate` govern that request.
+### Rate limiter injection
+The rate limiter is an **injectable collaborator**, supplied programmatically through the `providers` argument of `createRuntime(config, providers)` — the same seam as the external `cryptoProvider`/`authProvider`. It is **not** a JSON config key.
+```js
+const runtime = createRuntime(config, { rateLimiter });
+```
+An injected `rateLimiter` must implement `allow(key, limit) -> boolean` (where `key` is the per-identity bucket and `limit` is the resolved `requestsPerMinute`); `createRuntime` fails closed at construction if it does not. The proxy consults `runtime.rateLimiter` for every rate-governed request.
+The **default** is a per-process, in-memory fixed-window counter: it resets on restart and is **not shared across replicas**, so total throughput multiplies by the replica count behind a load balancer. Its window map is self-bounding (a lazy, amortized sweep evicts aged-out one-shot identities — no background timer). For a multi-replica deployment, enforce a per-identity limit at a shared front door **or** inject a shared-store implementation (e.g. Redis-backed) that satisfies the same `allow(key, limit)` contract. See [Shared responsibility §4](./shared-responsibility.md#4-horizontal-scale--multiple-replicas).
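As an illustration of the `allow(key, limit) -> boolean` contract, the following is a minimal per-process fixed-window limiter. It is a sketch under assumptions: `createFixedWindowLimiter` and its `windowMs`/`now` options are hypothetical, and the lazy eviction sweep mentioned above is omitted for brevity, so this version's map grows with the number of distinct keys.

```javascript
// Fixed-window counter: each key gets `limit` requests per window.
function createFixedWindowLimiter({ windowMs = 60_000, now = Date.now } = {}) {
  const windows = new Map(); // key -> { start, count }
  return {
    allow(key, limit) {
      const t = now();
      const w = windows.get(key);
      // Start a fresh window for a new key or an expired window.
      if (!w || t - w.start >= windowMs) {
        windows.set(key, { start: t, count: 1 });
        return limit >= 1;
      }
      if (w.count >= limit) return false;
      w.count += 1;
      return true;
    },
  };
}
```

A shared-store implementation (for example one backed by Redis) would expose the same `allow(key, limit)` method and move the counter out of process memory, which is what makes the limit hold across replicas.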
 ## Detection types & actions
-Built-in detection `type` values: `email`, `phone`, `kr_rrn`, `card`, `api_key`, `secret`, and `injection` (response-direction heuristic, report-only by default). Custom rules may introduce new types.
+Built-in detection `type` values: `email`, `phone`, `kr_rrn`, `card`, `api_key`, `secret`, `us_ssn`, `iban`, and `injection` (response-direction heuristic, report-only by default). Custom rules may introduce new types.
+### Supported credential & PII matrix
+Detection is regex + optional validator (no ML). Every rule is **anchored tightly** to keep precision high; precision is prioritized over recall, and the corpus (`tests/fixtures/detection-corpus.json`) carries a hard-negative for each rule. The KR phone rule and the US SSN/IBAN validators reject look-alike ids/timestamps.
+| Type | Detects | Anchor / validator | Notes |
+|---|---|---|---|
+| `email` | RFC-style addresses | local + domain + TLD | — |
+| `phone` | KR mobile (`01[016789]`, `+82`) | bare separator-less runs must be `0`-led | KR landlines out of scope. |
+| `phone` | E.164 international | **leading `+` required** (`+[1-9]` + 6–14 digits) | A bare digit run is never matched (collides with ids/timestamps). |
+| `phone` | US/NANP national | **separators required** (`(NXX) NXX-XXXX` or `NXX-NXX-XXXX`) | A separator-less 10-digit run is not matched. |
+| `kr_rrn` | KR resident registration number | check-digit validator | Shape-valid but checksum-invalid → rejected. |
+| `card` | Payment card (PAN) | Luhn validator, 13–19 digits | — |
+| `us_ssn` | US Social Security Number | `AAA-GG-SSSS` + SSA-range validator (rejects area `000`/`666`/`900-999`, group `00`, serial `0000`) | Separators required; a bare 9-digit id is not an SSN. |
+| `iban` | International Bank Account Number | **mod-97 checksum** validator | The checksum is the precision guard — IBAN-shaped non-97-valid strings are rejected. |
+| `api_key` | OpenAI-style (`sk_`/`rk_`/`pk_`) | prefix + ≥24 chars | — |
+| `api_key` | AWS access key id | `AKIA`/`ASIA` + exactly 16 uppercase-alnum | — |
+| `api_key` | Google API key | `AIza` + 35 URL-safe chars | — |
+| `secret` | `Bearer <token>` | `Bearer` + ≥16 chars | — |
+| `secret` | Assignment `<key> = <value>` | key vocabulary: `api_key`, `api_secret`, `secret`, `secret_key`, `aws_secret_access_key`, `client_secret`, `private_key`, `access_token`, `refresh_token`, `token`, `password` | Catches bare-base64 secrets (e.g. AWS secret access key) via the assignment form. |
+| `secret` | GitHub token | `gh[pousr]_` + ≥36 base64-ish chars | pat/oauth/user/server/refresh variants. |
+| `secret` | Slack token | `xox[baprs]-` + ≥10-char body | bot/user/refresh/legacy variants. |
+| `secret` | JWT | three base64url segments, first starts `eyJ` (the base64 of `{"`) | The `eyJ` anchor rejects arbitrary dotted tokens. |
+| `secret` | PEM private key | `-----BEGIN … PRIVATE KEY-----` armor header | The header presence is the signal; prose mentioning "private key" is not matched. |
+| `injection` | prompt-injection heuristics | response-direction only, `allow` by default | See [Action strength](#action-strength); report-only. |
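Two of the validators named in the matrix are standard public algorithms and can be sketched directly: the Luhn check used for `card` and the mod-97 check used for `iban`. These are generic textbook versions written for illustration, not the package's code; the function names and the separator handling are assumptions.

```javascript
// Luhn: double every second digit from the right, subtract 9 if above 9, sum must be 0 mod 10.
function luhnValid(input) {
  const digits = input.replace(/[ -]/g, "");
  if (!/^\d{13,19}$/.test(digits)) return false;
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = digits.charCodeAt(digits.length - 1 - i) - 48;
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

// IBAN mod-97: move the first four characters to the end, map A..Z to 10..35,
// and require the resulting number to be 1 modulo 97.
function ibanValid(input) {
  const s = input.replace(/\s+/g, "").toUpperCase();
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(s)) return false;
  const rearranged = s.slice(4) + s.slice(0, 4);
  let remainder = 0;
  for (const ch of rearranged) {
    const code = ch.charCodeAt(0);
    const expanded = code >= 65 ? String(code - 55) : ch;
    // Fold digit by digit so the running value never exceeds safe integer range.
    for (const digit of expanded) {
      remainder = (remainder * 10 + (digit.charCodeAt(0) - 48)) % 97;
    }
  }
  return remainder === 1;
}
```

Both checks reject a shape-valid string with a single mistyped digit, which is why the matrix treats the checksum as the precision guard.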
+Detection covers string values, JSON number leaves (request direction), and object keys. Each **string leaf is NFKC-normalized before matching**, so Unicode-evasion forms (full-width digits `４２４２…`, full-width `＠`, mathematical/enclosed alphanumerics) are folded to their ASCII compatibility form and still detected. When the fold preserves UTF-16 length the exact evaded span is redacted/blocked; when it changes length (e.g. mathematical digits, ligatures) detection fails closed and the whole leaf is redacted/blocked. Base64/percent-encoded values (after decoding) and URL query strings remain documented exclusions (see `docs/current/threat-model.md`). On the response direction, Haechi's own transform markers and bare JSON number leaves are skipped (request direction is always full-scan).
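The NFKC fold and the length-preservation distinction described above can be demonstrated with the built-in `String.prototype.normalize`. The helper name `foldLeaf` and its return shape are hypothetical; only the normalization call itself is standard JavaScript.

```javascript
// Folds a string leaf to NFKC and reports whether UTF-16 length was preserved.
// When length is preserved, match offsets in the folded text map 1:1 onto the original.
function foldLeaf(text) {
  const folded = text.normalize("NFKC");
  return { folded, lengthPreserved: folded.length === text.length };
}

// Full-width digits (U+FF10..U+FF19) fold to ASCII with the same UTF-16 length.
const fullWidth = foldLeaf("\uFF14\uFF12\uFF14\uFF12");

// Mathematical bold digits (U+1D7CE..U+1D7D7) are surrogate pairs, so the fold shortens the string.
const mathBold = foldLeaf("\u{1D7D2}\u{1D7D0}");

// The "fi" ligature (U+FB01) expands to two characters, so the fold lengthens the string.
const ligature = foldLeaf("\uFB01");
```

The second and third cases are the ones where an exact span cannot be mapped back, which corresponds to the whole-leaf fail-closed behavior described in the paragraph above.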
 Actions (weakest → strongest):
@@ -240,17 +365,32 @@ When a preset and an override (or a privacy profile) disagree, the **stronger**
 ## Binding beyond loopback
-The proxy refuses non-loopback hosts unless the CLI flag is passed explicitly — `proxy.host: "0.0.0.0"` in config alone will not start, by design:
+The proxy refuses non-loopback hosts unless the CLI flag is passed explicitly — `proxy.host: "0.0.0.0"` in config alone will not start, by design. A remote bind **additionally requires TLS**: either Haechi terminates TLS itself (`proxy.tls`), or you explicitly acknowledge a fronting TLS terminator (`proxy.trustForwardedProto`). A remote bind with neither **throws at startup** — Haechi will not serve bearer tokens and payloads in plaintext on a non-loopback listener.
+**Option A — Haechi terminates TLS** (serves `https`):
+```jsonc
+// haechi.config.json
+"proxy": { "host": "0.0.0.0", "tls": { "keyFile": "/etc/haechi/tls/key.pem", "certFile": "/etc/haechi/tls/cert.pem" } }
+// or PKCS#12: "tls": { "pfxFile": "/etc/haechi/tls/server.pfx", "passphrase": "…" }
+```
 ```bash
 haechi proxy --config haechi.config.json --host 0.0.0.0 --allow-remote-bind
+# → Haechi proxy listening on https://0.0.0.0:11016
+```
+**Option B — a trusted reverse proxy terminates TLS** in front of Haechi (Haechi stays plain `http` on a private network behind the hop):
+```jsonc
+"proxy": { "host": "0.0.0.0", "trustForwardedProto": true }
 ```
+With `trustForwardedProto: true`, Haechi **refuses any request whose `X-Forwarded-Proto` is not `https`** (a plaintext request that bypassed the hop) with a fail-closed `403`, checked before auth and body-read. The `/__haechi/*` liveness/metrics routes are exempt so a loopback sidecar can still scrape them. Only the trusted terminator may set `X-Forwarded-Proto` — do not enable this if untrusted clients can reach the Haechi port directly.
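A minimal sketch of the `X-Forwarded-Proto` gate described above, assuming a Node-style request object whose header names are already lower-cased. `checkForwardedProto` is a hypothetical name, and treating anything other than the exact string `https` as a refusal (including comma-separated lists from chained proxies) is an assumption of this sketch.

```javascript
// Runs before auth and body-read; returns { ok: true } or a fail-closed 403 decision.
function checkForwardedProto(req, { trustForwardedProto }) {
  // When the option is off, this gate does nothing.
  if (!trustForwardedProto) return { ok: true };
  // Liveness/metrics routes are exempt so a loopback sidecar can still scrape them.
  if (req.url.startsWith("/__haechi/")) return { ok: true };
  if (req.headers["x-forwarded-proto"] === "https") return { ok: true };
  return { ok: false, status: 403 };
}
```

Because the header is client-controllable, the check is only meaningful when the trusted terminator is the sole path to the Haechi port, as the paragraph above warns.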
-**The proxy has no client authentication yet** (planned for 0.6): anyone who can reach the port can use your upstream and the token round-trip path. Use `--allow-remote-bind` only behind explicit network controls — bind `0.0.0.0` inside a container and restrict the host port mapping (`-p 127.0.0.1:11016:11016`), or front it with a firewall/VPN/authenticating reverse proxy.
+**The proxy ships bearer client authentication** (`auth.provider: bearer`, shipped in 0.6): a hashed token store, per-identity policy profiles, a model allowlist, and a per-identity rate limit (see [`auth`](#auth) and [Named profiles](#named-profiles)). The default `auth.provider: none` leaves the proxy unauthenticated — with `none`, anyone who can reach the port can use your upstream and the token round-trip path. The built-in rate limit is single-process (in-memory, per-process); front multiple replicas with a shared limiter. Use `--allow-remote-bind` only behind explicit network controls regardless — bind `0.0.0.0` inside a container and restrict the host port mapping (`-p 127.0.0.1:11016:11016`), or front it with a firewall/VPN/authenticating reverse proxy.
 ## Validation cheatsheet
-These throw at load (fail-closed): unknown `keys.provider`; empty `proxy.host`; out-of-range `proxy.port`; non-`jsonl` `audit.sink`; non-`local` `tokenVault.provider`; bad `revealPolicy`; non-positive `retentionDays`; non-boolean `deterministic`/`detokenizeResponses`; empty/non-string `deterministicTypes`; empty/non-string `mcp.allowedMethods`; non-boolean `mcp.*` flags; unknown `privacy.profile`; bad `responseProtection.failureMode`; non-positive `responseProtection.maxBytes`; non-boolean `responseProtection.scanNumbers`; bad `streaming.requestMode`/`streaming.responseMode`; non-positive `streaming.maxMatchBytes`; bad `auth.provider`; empty `auth.store`; non-string `auth.allowedLabelKeys`; non-object `policy.profiles`; `policy.profileBinding` without a valid `default`; non-string `policy.modelAllowlist`; non-positive `policy.rate.requestsPerMinute`; non-positive `limits.*`; unknown `target.type`/`adapter`; unsafe custom regex; weakening action without `allowUnsafeOverrides`.
+These throw at load (fail-closed): unknown `keys.provider`; empty `proxy.host`; out-of-range `proxy.port`; non-boolean `proxy.trustForwardedProto`; a `proxy.tls` that is non-`null` but not an object, sets `keyFile` without `certFile` (or vice-versa), mixes `pfxFile` with `keyFile`/`certFile`, names an unreadable file, or does not resolve to usable material `((key && cert) or pfx)`; non-`jsonl` `audit.sink`; non-`local` `tokenVault.provider`; bad `revealPolicy`; non-positive `retentionDays`; non-boolean `deterministic`/`detokenizeResponses`; empty/non-string `deterministicTypes`; empty/non-string `mcp.allowedMethods`; non-boolean `mcp.*` flags; unknown `privacy.profile`; bad `responseProtection.failureMode`; non-positive `responseProtection.maxBytes`; non-boolean `responseProtection.scanNumbers`; bad `streaming.requestMode`/`streaming.responseMode`; non-positive `streaming.maxMatchBytes`; bad `auth.provider`; empty `auth.store`; non-string `auth.allowedLabelKeys`; non-object `policy.profiles`; `policy.profileBinding` without a valid `default`; non-string `policy.modelAllowlist`; non-positive `policy.rate.requestsPerMinute`; non-positive `limits.maxRequestBytes`/`limits.upstreamTimeoutMs`/`limits.maxNestingDepth`; negative or non-integer `limits.maxInFlight`/`limits.shutdownGraceMs`; non-`null`/negative/non-integer `limits.requestTimeoutMs`/`limits.headersTimeoutMs`; non-positive-integer or **newer-than-supported** `configVersion`; unknown `target.type`/`adapter`; unsafe custom regex; weakening action without `allowUnsafeOverrides`; non-`text`/`json` `logging.format`; non-boolean `metrics.enabled`; an invalid `HAECHI_*` env overlay value (bad `HAECHI_PROXY_PORT`, unknown `HAECHI_MODE`, malformed `HAECHI_UPSTREAM`, …).
 # Satellite operator configuration (0.9)

package/docs/current/operations-runbook.ko.md ADDED

@@ -0,0 +1,121 @@
+# Haechi Operations Runbook (Day-2)
+- Status: Living document (tracks core 1.2.x)
+A practical guide to operating Haechi in production: deployment, configuration via the env-var overlay, health/readiness/metrics monitoring, graceful shutdown, backpressure tuning, and audit-log rotation that does not break the hash chain.
+This document is an operations guide, not a compliance guarantee. For the full configuration reference see [`configuration.ko.md`](./configuration.ko.md); for trust boundaries see [`threat-model.ko.md`](./threat-model.ko.md).
+## 1. Deployment
+Haechi is a Node `>=22` package with zero runtime dependencies. The reference [`Dockerfile`](../../Dockerfile), [`docker-compose.yml`](../../docker-compose.yml), and [`.dockerignore`](../../.dockerignore) at the repository root build a hardened image (these files are repository deployment assets and are **not** included in the npm tarball). The image:
+- pins a Node 22 slim base (matching `engines: ">=22"`),
+- runs as the non-root `node` user,
+- copies only runtime files (excluding `.haechi` secrets, tests, and doc sources),
+- declares a writable `/app/.haechi` volume for the audit chain / key file / token vault and runs the rest of the tree read-only, and
+- provides a `HEALTHCHECK` against `/__haechi/live`.
+```bash
+docker compose up -d        # build + run the reference stack
+docker compose logs -f haechi
+```
+**Front it with TLS + authentication.** Haechi has no TLS of its own. Publish the port only to a reverse proxy (nginx / Caddy / Traefik / an API gateway) that terminates TLS and authenticates, and never expose the raw Haechi port on a public interface. The compose example publishes only to host loopback (`127.0.0.1:11016`) for exactly this reason.
+**Binding beyond loopback.** Inside a container, Haechi must bind `0.0.0.0` for the mapped port to be reachable, which requires `--allow-remote-bind` (the reference `CMD` passes it). On a host, prefer the default loopback bind and reach Haechi through the reverse proxy. See [Binding beyond loopback](./configuration.ko.md).
+## 2. Configuration via the env-var overlay
+For container / 12-factor deploys, a **fixed allowlist of non-secret operational keys** can be overridden from the environment. The env value **wins over the config file** and is validated fail-closed: an invalid value (a bad port, an unknown mode) makes the process **fail to start** rather than silently weakening it.
+| Env var | Config key | Type / values | Example |
+|---|---|---|---|
+| `HAECHI_PROXY_PORT` | `proxy.port` | integer 0–65535 | `11016` |
+| `HAECHI_PROXY_HOST` | `proxy.host` | non-empty string | `0.0.0.0` |
+| `HAECHI_UPSTREAM` | `target.upstream` | URL string | `http://llm:8000` |
+| `HAECHI_MODE` | `mode` | `dry-run` \| `report-only` \| `enforce` | `enforce` |
+| `HAECHI_LOG_FORMAT` | `logging.format` | `text` \| `json` | `json` |
+**Secrets are not overlayable, by design.** There is **no** `HAECHI_*` variable for `keys.*` (the local key file or external key paths), the auth token store, or any token/secret. Secrets stay in the mounted config file or are supplied via **injected providers** (`createRuntime(config, { cryptoProvider, authProvider, … })`). Putting a secret in the process environment risks leaking it through `/proc`, crash dumps, orchestrator inspect output, and child processes, so secrets are excluded from the overlay allowlist entirely.
+The overlay is applied in `loadConfig()` after the file is read and before `normalizeConfig()`, so overlaid values go through the same validation as values set in the file.
+## 3. Scraping health, readiness, and metrics
+Four unauthenticated routes sit under the reserved `/__haechi/*` prefix. They are checked before auth/body-read and never proxy upstream (full reference: [Operability endpoints](./configuration.ko.md#운영-엔드포인트)):
+| Endpoint | Purpose |
+|---|---|
+| `GET /__haechi/live` | **Liveness**: the restart probe. Cheap; returns 200 as long as the event loop is serving. |
+| `GET /__haechi/ready` | **Readiness**: the traffic gate. **503 if the audit sink is not writable** (a gateway that cannot audit is not ready). Point the load balancer/orchestrator readiness probe here. |
+| `GET /__haechi/health` | Back-compat (`ok` + `mode` + `version`). |
+| `GET /__haechi/metrics` | Prometheus text exposition. `404` when `metrics.enabled: false`. |
+**Scrape `/metrics`** with Prometheus (or an OpenMetrics-compatible scraper):
+```yaml
+scrape_configs:
+  - job_name: haechi
+    metrics_path: /__haechi/metrics
+    static_configs:
+      - targets: ["haechi:11016"]
+```
+Key signals: `haechi_requests_total{route,mode,decision}`, `haechi_blocks_total`, `haechi_auth_denied_total`, `haechi_rate_limited_total`, `haechi_overloaded_total` (backpressure 503s), `haechi_upstream_timeout_total`, `haechi_upstream_error_total`, `haechi_response_unprotected_total`, `haechi_internal_error_total`, and the `haechi_request_duration_seconds{route}` histogram.
+**No-PII-in-telemetry invariant.** Every metric name and **every label value** is a bounded enum (route id / mode / decision class), never an identity, token, or detected value. The same invariant applies to structured logs: with `logging.format: json` (or `HAECHI_LOG_FORMAT=json`), startup/shutdown/error logs carry only a `correlationId` and an error class name, never a payload. The `correlationId` also appears on the request's audit events, so a logged error can be joined to its audit trail.
+## 4. Graceful shutdown
+On `SIGINT`/`SIGTERM` the CLI calls the proxy's `close()`, which **drains gracefully**:
+1. stops accepting new connections (`server.close()`),
+2. immediately closes idle keep-alive sockets (`closeIdleConnections()`),
+3. waits for in-flight requests to finish,
+4. after a grace period (`limits.shutdownGraceMs`, default 10000ms) force-closes the remaining sockets (`closeAllConnections()`) so a stuck keep-alive cannot hold shutdown open forever.
+`close()` resolves once in-flight requests drain or the grace period elapses. Set your orchestrator's `terminationGracePeriodSeconds` (Kubernetes) / `stop_grace_period` (compose) **above** `limits.shutdownGraceMs` so the platform does not SIGKILL mid-drain. Tune `limits.shutdownGraceMs` to your longest acceptable in-flight request.
+## 5. Backpressure tuning
+`limits.maxInFlight` is a global ceiling on the number of concurrently processing requests.
+- `0` (default) disables the ceiling, which is the unchanged 1.1 behavior.
+- `> 0`: when the live in-flight count reaches the ceiling, a **new** request is rejected with `503`, a `Retry-After` header (seconds, derived from `limits.shutdownGraceMs`), and a `{ "error": "haechi_overloaded" }` body, **before** auth and body-read. Each rejection increments `haechi_overloaded_total`.
+- The `/__haechi/*` observability routes are **exempt** from the ceiling, so liveness and `/__haechi/metrics` stay scrapable under saturation, and you can still see *why* you are shedding load.
+Set `maxInFlight` near the concurrency your upstream + host can sustain (watch `haechi_request_duration_seconds` and upstream saturation), leaving headroom so the gateway sheds load with a clean 503 instead of collapsing. Pair it with a tuned `limits.upstreamTimeoutMs` so a slow upstream cannot pin slots indefinitely.
+### Tuned timeouts
+`limits.requestTimeoutMs` and `limits.headersTimeoutMs` map to the Node HTTP server's `requestTimeout` / `headersTimeout`. Both default to `null`, which leaves Node's server defaults untouched (behavior is unchanged unless you opt in). Set a number to cap slowloris-style slow request/header delivery; `0` disables that specific timeout (Node semantics).
+## 6. Chain-aware audit log rotation & retention
+The audit log is a **SHA-256 hash chain** (`audit.path`): each event's `auditIntegrity.previousHash` links to the prior event's hash, so any insert, delete, edit, or reorder is detected by `haechi audit-verify` / `verifyAuditChain`. An optional **anchor stream** (`audit.anchor`) records the chain head on separate append-only media so that even tail truncation (deleting the newest events) is caught. See [`audit` concepts](./configuration.ko.md#audit) and the threat model.
+**Never truncate or rewrite a chain mid-stream.** Truncating `audit.jsonl` in place or rewriting earlier lines **breaks the chain** and makes verification fail (or, worse, silently destroys tamper evidence). Rotate by **starting a new segment** and preserving prior segments:
+1. **Stop or quiesce** the writer (graceful shutdown, or rotate at a maintenance window). The default JSONL sink appends, so the goal is to avoid rotating a file it still holds open.
+2. **Move the current segment aside, keeping it intact**: `mv .haechi/audit.jsonl .haechi/audit-2026-06-12.jsonl` (and the matching anchor: `mv .haechi/audit.anchor.jsonl .haechi/audit-2026-06-12.anchor.jsonl`).
+3. **Start a fresh segment** by restarting Haechi (or pointing `audit.path` / `audit.anchor.path` at the new files). The new chain begins with `previousHash: null`, a fresh, independently verifiable chain. This is expected: each segment is its own verifiable chain, and you do **not** chain across the rotation boundary.
+4. **Verify each retained segment independently** with its own anchor: `haechi audit-verify --audit .haechi/audit-2026-06-12.jsonl --anchor .haechi/audit-2026-06-12.anchor.jsonl`.
+5. **Retain prior segments** for your retention window so the full history stays verifiable. Where possible, archive to append-only / WORM storage instead of deleting. The anchor's defense assumes the anchor lives on separate, append-only media.
+**Retention:** keep each rotated segment (and its anchor) for your required audit-retention period, then expire whole segments; never delete individual lines within a segment. Token-vault retention is independent (`tokenVault.retentionDays`); audit rotation does not purge tokens.
+**Do not** compress/encrypt a segment in a way that prevents later re-verification unless you keep the verification step in your archival pipeline. A rotated segment is only useful as evidence if it still verifies.
+## 7. Quick reference
+| Task | Command |
+|---|---|
+| Start (compose) | `docker compose up -d` |
+| Liveness | `curl localhost:11016/__haechi/live` |
+| Readiness | `curl localhost:11016/__haechi/ready` |
+| Metrics | `curl localhost:11016/__haechi/metrics` |
+| Verify a segment | `haechi audit-verify --audit <seg>.jsonl --anchor <seg>.anchor.jsonl` |
+| Graceful stop | `docker compose stop` (SIGTERM → drain) |
+See also: [`config-version.ko.md`](./config-version.ko.md) for the `configVersion` stamp and upgrade notes.

package/docs/current/operations-runbook.md ADDED

@@ -0,0 +1,204 @@
+# Haechi Operations Runbook (Day-2)
+- Status: Living document (tracks core 1.2.x)
+A practical guide to running Haechi in production: deploy, configure via the
+env-var overlay, monitor with health/readiness/metrics, shut down gracefully,
+tune backpressure, and rotate the audit log without breaking its hash chain.
+This is an operability guide, not a compliance guarantee. See
+[`configuration.md`](./configuration.md) for the full config reference and
+[`threat-model.md`](./threat-model.md) for the trust boundary.
+## 1. Deploy
+Haechi is a zero-runtime-dependency Node `>=22` package. The reference
+[`Dockerfile`](../../Dockerfile), [`docker-compose.yml`](../../docker-compose.yml),
+and [`.dockerignore`](../../.dockerignore) at the repo root build a hardened
+image (these files are **not** shipped in the npm tarball — they are repo deploy
+assets). The image:
+- pins a Node 22 slim base (matches `engines: ">=22"`),
+- runs as the non-root `node` user,
+- copies only the runtime files (no `.haechi` secrets, no tests, no docs sources),
+- declares a writable `/app/.haechi` volume for the audit chain / key file / token
+  vault and runs the rest of the tree read-only,
+- ships a `HEALTHCHECK` against `/__haechi/live`.
+```bash
+docker compose up -d        # build + run the reference stack
+docker compose logs -f haechi
+```
+**Front it with TLS + auth.** Haechi has no TLS of its own. Publish its port only
+to a TLS-terminating, authenticating reverse proxy (nginx / Caddy / Traefik / an
+API gateway); never expose the raw Haechi port on a public interface. The compose
+example publishes to host loopback (`127.0.0.1:11016`) for exactly this reason.
+**Binding beyond loopback.** Inside a container Haechi must bind `0.0.0.0` for the
+mapped port to be reachable, which requires `--allow-remote-bind` (the reference
+`CMD` passes it). On a host, prefer the default loopback bind and reach Haechi
+through the reverse proxy. See [Binding beyond loopback](./configuration.md#binding-beyond-loopback).
+## 2. Configuration via the env-var overlay
+For container / 12-factor deploys, a **fixed allowlist of NON-SECRET operational
+keys** may be overridden from the environment. The env value **wins over the
+config file** and is validated fail-closed — an invalid value (bad port, unknown
+mode) makes the process **fail to start** rather than degrade silently.
+| Env var | Config key | Type / values | Example |
+|---|---|---|---|
+| `HAECHI_PROXY_PORT` | `proxy.port` | integer 0–65535 | `11016` |
+| `HAECHI_PROXY_HOST` | `proxy.host` | non-empty string | `0.0.0.0` |
+| `HAECHI_UPSTREAM` | `target.upstream` | URL string | `http://llm:8000` |
+| `HAECHI_MODE` | `mode` | `dry-run` \| `report-only` \| `enforce` | `enforce` |
+| `HAECHI_LOG_FORMAT` | `logging.format` | `text` \| `json` | `json` |
+**Secrets are NOT overlayable — by design.** There is **no** `HAECHI_*` variable
+for `keys.*` (the local key file or an external key path), the auth token store,
+or any token/secret. Secrets stay in the mounted config file or are supplied via
+**injected providers** (`createRuntime(config, { cryptoProvider, authProvider, … })`).
+Putting a secret in a process environment invites leaking it through `/proc`,
+crash dumps, orchestrator inspect output, and child processes — so the overlay
+allowlist excludes them outright.
+The overlay is applied in `loadConfig()` after reading the file and before
+`normalizeConfig()`, so an overlaid value passes the same validation as a
+file-set one.
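The overlay rules above can be illustrated with a minimal sketch. This is not Haechi's implementation: `applyEnvOverlay`, the `OVERLAY` table, and the parser helpers are hypothetical names, and the real validation lives in `loadConfig()` / `normalizeConfig()`. It only shows the shape of an allowlisted, fail-closed overlay in which env values win over file values.

```javascript
// Hypothetical sketch of a fail-closed env overlay; not Haechi's actual code.
const MODES = ["dry-run", "report-only", "enforce"];
const LOG_FORMATS = ["text", "json"];

// Only these variables are consulted; anything else in the environment is ignored.
const OVERLAY = {
  HAECHI_PROXY_PORT: { path: ["proxy", "port"], parse: parsePort },
  HAECHI_PROXY_HOST: { path: ["proxy", "host"], parse: parseNonEmpty },
  HAECHI_UPSTREAM: { path: ["target", "upstream"], parse: parseUrl },
  HAECHI_MODE: { path: ["mode"], parse: oneOf(MODES) },
  HAECHI_LOG_FORMAT: { path: ["logging", "format"], parse: oneOf(LOG_FORMATS) },
};

function parsePort(raw) {
  if (!/^\d+$/.test(raw)) throw new Error(`invalid port: ${raw}`);
  const port = Number(raw);
  if (port > 65535) throw new Error(`port out of range: ${raw}`);
  return port;
}

function parseNonEmpty(raw) {
  if (raw.trim() === "") throw new Error("value must be a non-empty string");
  return raw;
}

function parseUrl(raw) {
  new URL(raw); // throws TypeError on a malformed URL
  return raw;
}

function oneOf(allowed) {
  return (raw) => {
    if (!allowed.includes(raw)) throw new Error(`expected one of: ${allowed.join(", ")}`);
    return raw;
  };
}

// Returns a copy of fileConfig with allowlisted env values applied on top.
// Any invalid value throws, so the caller fails to start instead of degrading.
function applyEnvOverlay(fileConfig, env) {
  const config = structuredClone(fileConfig);
  for (const [name, { path, parse }] of Object.entries(OVERLAY)) {
    if (env[name] === undefined) continue;
    let value;
    try {
      value = parse(env[name]);
    } catch (err) {
      throw new Error(`${name}: ${err.message}`);
    }
    let node = config;
    for (const key of path.slice(0, -1)) node = node[key] ??= {};
    node[path.at(-1)] = value;
  }
  return config;
}
```

Calling `applyEnvOverlay(fileConfig, process.env)` before validation mirrors the ordering described above; because the table has no entry for keys or tokens, a secret cannot be overlaid even by accident.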
+## 3. Health, readiness, and metrics scraping
+Haechi serves four unauthenticated routes under the reserved `/__haechi/*` prefix.
+They are checked before auth/body-read and never proxy upstream (full reference:
+[Operability endpoints](./configuration.md#operability-endpoints)):
+| Endpoint | Use |
+|---|---|
+| `GET /__haechi/live` | **Liveness** — restart probe. Cheap; 200 while the event loop serves. |
+| `GET /__haechi/ready` | **Readiness** — traffic gate. **503 when the audit sink is not writable** (a gateway that cannot audit is not ready). Point your load balancer / orchestrator readiness probe here. |
+| `GET /__haechi/health` | Back-compat (`ok` + `mode` + `version`). |
+| `GET /__haechi/metrics` | Prometheus text exposition. `404` when `metrics.enabled: false`. |
+**Scrape `/__haechi/metrics`** with Prometheus (or any OpenMetrics-compatible scraper):
+```yaml
+scrape_configs:
+  - job_name: haechi
+    metrics_path: /__haechi/metrics
+    static_configs:
+      - targets: ["haechi:11016"]
+```
+Key signals: `haechi_requests_total{route,mode,decision}`, `haechi_blocks_total`,
+`haechi_auth_denied_total`, `haechi_rate_limited_total`, `haechi_overloaded_total`
+(backpressure 503s), `haechi_upstream_timeout_total`, `haechi_upstream_error_total`,
+`haechi_response_unprotected_total`, `haechi_internal_error_total`, and the
+`haechi_request_duration_seconds{route}` histogram.
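For a quick check without a Prometheus server, the counters above can be summed straight from the text exposition. The helper below is a hypothetical sketch (`sumMetric` is not part of Haechi), and the label values in the usage comment are illustrative only. It handles plain counter and gauge sample lines and skips comment lines.

```javascript
// Hypothetical helper: sum all samples of one metric in Prometheus text exposition.
function sumMetric(expositionText, metricName) {
  let total = 0;
  for (const line of expositionText.split("\n")) {
    // Skip blank lines and "# HELP" / "# TYPE" comment lines.
    if (line.trim() === "" || line.startsWith("#")) continue;
    // Sample line shape: name{label="value",...} value [timestamp]
    const match = line.match(/^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)/);
    if (match && match[1] === metricName) total += Number(match[3]);
  }
  return total;
}

// Usage (illustrative sample text, not real Haechi output):
// const text = await (await fetch("http://localhost:11016/__haechi/metrics")).text();
// sumMetric(text, "haechi_overloaded_total");
```

Summing across label sets is a deliberate simplification; an alert on shed load would normally use a rate over `haechi_overloaded_total` in the monitoring system itself.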
+**No-PII-in-telemetry invariant.** Every metric name and **every label value** is
+a bounded enum (route id / mode / decision class) — never an identity, token, or
+detected value. The same invariant covers structured logs: with
+`logging.format: json` (or `HAECHI_LOG_FORMAT=json`), startup/shutdown/error logs
+carry a `correlationId` and an error class name only, never a payload. The
+`correlationId` also appears on the request's audit events, so you can join a
+logged error to its audit trail.
+## 4. Graceful shutdown
+On `SIGINT`/`SIGTERM` the CLI calls the proxy's `close()`, which **drains
+gracefully**:
+1. stops accepting new connections (`server.close()`),
+2. immediately closes idle keep-alive sockets (`closeIdleConnections()`),
+3. waits for in-flight requests to finish,
+4. after a grace period (`limits.shutdownGraceMs`, default 10000ms) force-closes
+   any lingering socket (`closeAllConnections()`) so a stuck keep-alive cannot
+   hold shutdown open forever.
+`close()` resolves once in-flight requests drain or the grace period elapses. Set
+your orchestrator's `terminationGracePeriodSeconds` (Kubernetes) /
+`stop_grace_period` (compose) **above** `limits.shutdownGraceMs` so the platform
+does not SIGKILL mid-drain. Tune `limits.shutdownGraceMs` to your longest
+acceptable in-flight request.
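The four drain steps can be sketched against the `node:http` server API. This is an illustration under assumptions, not the proxy's actual `close()`: the function name `drain` is hypothetical, and `server` is any object exposing `close()`, `closeIdleConnections()`, and `closeAllConnections()` as `http.Server` does (Node 18.2+).

```javascript
// Hypothetical sketch of a graceful drain; Haechi's real close() may differ.
function drain(server, graceMs = 10000) {
  return new Promise((resolve) => {
    // Step 4: once the grace period elapses, force-close whatever is left.
    const timer = setTimeout(() => server.closeAllConnections(), graceMs);

    // Step 1: stop accepting new connections. The callback fires once every
    // connection has ended, either naturally or after the forced close.
    server.close(() => {
      clearTimeout(timer);
      resolve();
    });

    // Step 2: drop idle keep-alive sockets immediately.
    // Step 3 is implicit: sockets with an in-flight request stay open.
    server.closeIdleConnections();
  });
}

// Usage with a real server (not run here):
// process.on("SIGTERM", () => drain(server, 10000).then(() => process.exit(0)));
```

The promise resolves on whichever comes first, a clean drain or the forced close, which is why the orchestrator's kill timeout needs to sit above `graceMs`.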
+## 5. Backpressure tuning
+`limits.maxInFlight` is a global ceiling on concurrently-processing requests.
+- `0` (default) disables the ceiling — unchanged 1.1 behavior.
+- `> 0`: when the live in-flight count is at the ceiling, a **new** request is
+  rejected `503` with a `Retry-After` header (seconds, derived from
+  `limits.shutdownGraceMs`) and a `{ "error": "haechi_overloaded" }` body, **before**
+  auth and body-read. Each rejection increments `haechi_overloaded_total`.
+- The `/__haechi/*` observability routes are **exempt** from the ceiling, so
+  liveness and `/__haechi/metrics` stay scrapable under saturation, and you can
+  still see *why* you are shedding load.
+Set `maxInFlight` near the concurrency your upstream + host can sustain (watch
+`haechi_request_duration_seconds` and upstream saturation), leaving headroom so
+the gateway sheds load with a clean 503 instead of collapsing. Pair it with a
+tuned `limits.upstreamTimeoutMs` so a slow upstream cannot pin slots indefinitely.
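As a rough model of the ceiling described above, the sketch below admits or rejects a request before any other work is done. `createInFlightGate` and its return shape are hypothetical. The runbook only says `Retry-After` is derived from `limits.shutdownGraceMs`, so the sketch takes the seconds value as a parameter instead of guessing the derivation.

```javascript
// Hypothetical sketch of an in-flight ceiling; not Haechi's actual code.
function createInFlightGate(maxInFlight, retryAfterSeconds) {
  let inFlight = 0;
  return {
    get inFlight() {
      return inFlight;
    },
    // Call first for every request, before auth and body-read.
    admit(path) {
      // Observability routes are exempt and are not counted.
      if (path.startsWith("/__haechi/")) return { admitted: true, release() {} };
      // 0 disables the ceiling; otherwise reject once the count reaches it.
      if (maxInFlight > 0 && inFlight >= maxInFlight) {
        return {
          admitted: false,
          status: 503,
          headers: { "Retry-After": String(retryAfterSeconds) },
          body: { error: "haechi_overloaded" },
        };
      }
      inFlight += 1;
      let released = false;
      return {
        admitted: true,
        // Call exactly once when the response finishes; extra calls are ignored.
        release() {
          if (released) return;
          released = true;
          inFlight -= 1;
        },
      };
    },
  };
}
```

Making `release()` idempotent is a design choice for the sketch: response `finish` and `close` events can both fire, and a double decrement would let the gate admit more than the ceiling.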
+### Tuned timeouts
+`limits.requestTimeoutMs` and `limits.headersTimeoutMs` map to the Node HTTP
+server's `requestTimeout` / `headersTimeout`. Both default to `null`, which leaves
+Node's server defaults untouched (behavior is unchanged unless you opt in). Set a
+number to cap slowloris-style slow request/header delivery; `0` disables that
+specific timeout (Node semantics).
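The `null` versus `0` distinction is easy to get wrong, so here is a small sketch of the mapping. `applyTimeouts` is a hypothetical name; `requestTimeout` and `headersTimeout` are the real `http.Server` properties, and `server` can be any object carrying them.

```javascript
// Hypothetical sketch: map limits.* onto an http.Server-like object.
function applyTimeouts(server, limits) {
  // null (or absent) means "do not touch Node's default".
  if (limits.requestTimeoutMs != null) {
    server.requestTimeout = limits.requestTimeoutMs; // 0 disables this timeout
  }
  if (limits.headersTimeoutMs != null) {
    server.headersTimeout = limits.headersTimeoutMs; // 0 disables this timeout
  }
  return server;
}
```

The loose `!= null` check is intentional: it treats both `null` and `undefined` as "not set" while still letting `0` through.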
+## 6. Chain-aware audit log rotation & retention
+The audit log is a **SHA-256 hash chain** (`audit.path`): each event's
+`auditIntegrity.previousHash` links to the prior event's hash, so any insert,
+delete, edit, or reorder is detectable by `haechi audit-verify` /
+`verifyAuditChain`. An optional **anchor stream** (`audit.anchor`) appends the
+chain head to separate append-only media so even tail truncation (deleting the
+newest events) is caught. See [`audit` concepts](./configuration.md#audit) and the
+threat model.
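To make the chain property concrete, here is a minimal sketch of how a linked hash chain detects edits and deletions. It is not `verifyAuditChain`: the event shape (`payload`, `auditIntegrity.hash`), the `JSON.stringify` serialization, and the function names are assumptions for illustration, and the digest is passed in as `sha256Hex` so the sketch stays free of imports.

```javascript
// Hypothetical sketch of hash-chain append and verify; Haechi's real format may differ.
// sha256Hex: (string) => hex digest string, supplied by the caller.

function appendEvent(events, payload, sha256Hex) {
  // A fresh segment starts with previousHash: null.
  const previousHash = events.length ? events.at(-1).auditIntegrity.hash : null;
  const hash = sha256Hex(JSON.stringify({ previousHash, payload }));
  return [...events, { payload, auditIntegrity: { previousHash, hash } }];
}

function verifyChain(events, sha256Hex) {
  let previousHash = null;
  for (let i = 0; i < events.length; i++) {
    const { payload, auditIntegrity } = events[i];
    // A deleted, inserted, or reordered event breaks the link to its predecessor.
    if (auditIntegrity.previousHash !== previousHash) {
      return { ok: false, index: i, reason: "broken link" };
    }
    // An edited event no longer matches its recorded hash.
    const expected = sha256Hex(JSON.stringify({ previousHash, payload }));
    if (auditIntegrity.hash !== expected) {
      return { ok: false, index: i, reason: "hash mismatch" };
    }
    previousHash = auditIntegrity.hash;
  }
  // head is what an anchor stream would record for later comparison.
  return { ok: true, head: previousHash };
}
```

In Node, a suitable digest is `(s) => createHash("sha256").update(s).digest("hex")` with `createHash` from `node:crypto`. Note that dropping events from the tail still yields `ok: true` here; only comparing `head` against a separately stored anchor reveals that kind of truncation.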
+**Never truncate or rewrite a chain mid-stream.** Rotating by truncating
+`audit.jsonl` in place, or rewriting earlier lines, **breaks the chain** and makes
+verification fail (or, worse, silently destroys tamper evidence). Rotate by
+**starting a new segment**, preserving prior segments:
+1. **Stop or quiesce** the writer (graceful shutdown, or rotate at a maintenance
+   window). The default JSONL sink appends, so the goal is to avoid rotating a
+   file it still holds open.
+2. **Move the current segment aside**, keeping it intact:
+   `mv .haechi/audit.jsonl .haechi/audit-2026-06-12.jsonl` (and the matching
+   anchor: `mv .haechi/audit.anchor.jsonl .haechi/audit-2026-06-12.anchor.jsonl`).
+3. **Start a fresh segment** by restarting Haechi (or pointing `audit.path` /
+   `audit.anchor.path` at the new files). The new chain begins with
+   `previousHash: null` — a fresh, independently-verifiable chain. This is
+   expected: each segment is its own verifiable chain; you do **not** chain across
+   the rotation boundary.
+4. **Verify each retained segment independently** with its own anchor:
+   `haechi audit-verify --audit .haechi/audit-2026-06-12.jsonl --anchor .haechi/audit-2026-06-12.anchor.jsonl`.
+5. **Retain prior segments** for your retention window so the full history stays
+   verifiable. Archive (don't delete) to append-only / WORM storage where you can;
+   the anchor's defense assumes the anchor lives on separate, append-only media.
+**Retention:** keep each rotated segment (and its anchor) for your required
+audit-retention period, then expire whole segments — never partial lines within a
+segment. Token-vault retention is independent (`tokenVault.retentionDays`); audit
+rotation does not purge tokens.
+**Do not** compress/encrypt a segment in a way that prevents later
+re-verification unless you keep the verification step in your archival pipeline. A
+rotated segment is only useful as evidence if it still verifies.
+## 7. Quick reference
+| Task | Command |
+|---|---|
+| Start (compose) | `docker compose up -d` |
+| Liveness | `curl localhost:11016/__haechi/live` |
+| Readiness | `curl localhost:11016/__haechi/ready` |
+| Metrics | `curl localhost:11016/__haechi/metrics` |
+| Verify a segment | `haechi audit-verify --audit <seg>.jsonl --anchor <seg>.anchor.jsonl` |
+| Graceful stop | `docker compose stop` (SIGTERM → drain) |
+See also: [`config-version.md`](./config-version.md) for the `configVersion`
+stamp and upgrade notes.