npm - haechi - Versions diffs - 0.4.0 → 0.5.0 - Mend

haechi 0.4.0 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23) hide show

package/README.ko.md +227 -0
package/README.md +13 -4
package/docs/README.md +3 -6
package/docs/current/api-stability.ko.md +2 -1
package/docs/current/api-stability.md +1 -0
package/docs/current/configuration.ko.md +210 -0
package/docs/current/configuration.md +210 -0
package/docs/current/release-0.5-implementation-scope.ko.md +69 -0
package/docs/current/release-0.5-implementation-scope.md +69 -0
package/docs/current/release-process.ko.md +2 -2
package/docs/current/release-process.md +2 -2
package/docs/current/risk-register-release-gate.ko.md +2 -2
package/docs/current/risk-register-release-gate.md +2 -2
package/docs/current/threat-model.ko.md +6 -4
package/docs/current/threat-model.md +5 -3
package/haechi.config.example.json +3 -1
package/package.json +3 -2
package/packages/cli/bin/haechi.mjs +163 -22
package/packages/cli/runtime.mjs +10 -2
package/packages/core/index.mjs +110 -1
package/packages/protocol-adapters/index.mjs +33 -14
package/packages/proxy/index.mjs +108 -1
package/packages/stream-filter/index.mjs +194 -0

package/README.ko.md ADDED Viewed

@@ -0,0 +1,227 @@
+# Haechi
+[![npm](https://img.shields.io/npm/v/haechi)](https://www.npmjs.com/package/haechi)
+[![CI](https://github.com/raeseoklee/haechi/actions/workflows/ci.yml/badge.svg)](https://github.com/raeseoklee/haechi/actions/workflows/ci.yml)
+[![license](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](LICENSE)
+[![node](https://img.shields.io/node/v/haechi)](https://nodejs.org)
+[![status](https://img.shields.io/badge/status-developer%20preview-orange)](docs/current/risk-register-release-gate.md)
+[English](README.md) | **한국어**
+Haechi는 LLM, MCP, vLLM, Ollama, 그리고 에이전트 payload가 모델, 도구, 로그, 또는 proxy에 도달하기 전에 보호하기 위한 자체 호스팅 AI 컨텍스트 집행 레이어의 실험적 개발자 프리뷰이다.
+이름은 분별력과 보호를 상징하는 한국의 수호 신수 해치에서 유래했다.
+이 저장소는 로컬 개발, 보안 설계 검토, 자체 호스팅 통합 실험을 위한 것이다. 운영 환경에 바로 사용할 수 있는 상태가 아니며, 컴플라이언스를 보장하지 않는다.
+현재 개발자 프리뷰 범위는 로컬 도입에 초점을 맞추고 있다:
+- `haechi init`: 로컬 키, 샘플 설정, audit 경로를 생성한다
+- `haechi protect`: OpenAI 호환 JSON payload를 검사하고 보호한다
+- `haechi report`: 원시 payload 없이 audit 이벤트를 요약한다
+- `haechi proxy`: 기존 LLM 호출을 위한 로컬 HTTP JSON proxy를 실행한다
+- `haechi status`: 현재 설정 하에서 보호되는 항목과 그렇지 않은 항목을 표시한다
+- `haechi audit-verify`: audit hash chain을 검증하고 head hash를 출력한다
+- `haechi mcp-wrap -- <command>`: MCP 서버를 양방향 stdio 보호로 래핑한다
+## 설치
+```bash
+npm install -g haechi
+haechi init
+```
+설치 없이 실행하려면:
+```bash
+npx haechi init
+```
+## 빠른 시작
+이 저장소를 클론한 후:
+```bash
+npm test
+npm run demo:init
+npm run demo:protect
+npm run demo:report
+```
+기본 설정은 `dry-run` 모드로 실행된다. 민감한 값을 탐지하고 audit 메타데이터를 기록하지만, 정책 모드를 변경하기 전까지는 아웃바운드 payload를 수정하지 않는다.
+`npm run demo:init`은 `haechi.config.json`과 `.haechi/dev.keys.json`을 로컬에 생성한다. 생성된 키 파일은 로컬 개발 전용이다. Haechi 0.3.x는 운영 환경용 KMS/HSM/Vault 키 provider를 포함하지 않는다. 비밀 정보를 포함하지 않는 템플릿은 `haechi.config.example.json`에서 확인할 수 있다.
+## Local Proxy
+```bash
+node packages/cli/bin/haechi.mjs proxy --config haechi.config.json
+```
+기존 HTTP JSON 클라이언트를 `http://localhost:1016`으로 지정하고, `haechi.config.json`에서 `target.upstream`을 설정한다. 다른 로컬 포트를 사용하려면 설정에서 `proxy.port`를 변경하거나 `--port`를 전달한다.
+proxy는 기본적으로 loopback에 바인드된다. `0.0.0.0`, `::`, 또는 다른 non-loopback 호스트에 바인딩하려면 `--allow-remote-bind`를 명시적으로 제공해야 한다. 이 플래그는 명시적인 네트워크 접근 통제 하에서만 사용한다.
+`stream: true`인 스트리밍 요청은 기본적으로 차단된다. `streaming.requestMode`를 `inspect`로 설정하면 SSE/NDJSON 응답을 stream-filter한다(bounded sliding buffer가 프레임에 걸쳐 쪼개진 PII도 잡는다; `streaming.maxMatchBytes` 참고). 또는 호출자가 보호되지 않는 스트리밍을 명시적으로 인정하는 경우에만 `pass-through`로 설정한다.
+Ollama의 `/api/chat`과 `/api/generate`는 `stream` 필드가 생략되면 기본적으로 스트리밍하므로, proxy는 `stream: false`가 명시적으로 설정되지 않으면 해당 요청을 스트리밍으로 간주한다.
+upstream 요청은 `limits.upstreamTimeoutMs`(기본값 120000) 이후 타임아웃되며 `504 haechi_upstream_timeout`으로 실패한다.
+## Local Inference Servers
+Haechi 0.3은 OpenAI 호환 서버, vLLM, Ollama, llama.cpp를 위한 프로토콜 adapter 프리셋을 포함한다.
+```json
+{
+  "target": {
+    "type": "vllm-openai",
+    "upstream": "http://127.0.0.1:8000"
+  },
+  "policy": {
+    "mode": "enforce",
+    "presets": ["local-inference"]
+  },
+  "responseProtection": {
+    "enabled": true,
+    "mode": "enforce",
+    "failureMode": "fail-closed"
+  }
+}
+```
+그런 다음 OpenAI 호환 클라이언트를 `http://127.0.0.1:1016/v1`으로 지정한다. Ollama 네이티브 API의 경우 `target.adapter: "ollama"`를 사용하고 proxy를 통해 `/api/chat` 또는 `/api/generate`를 호출한다.
+## 토큰 왕복
+tokenization을 사용하면 모델은 안정적인 token을 받고 호출자는 평문을 돌려받는다:
+```json
+{
+  "policy": { "mode": "enforce", "actions": { "email": "tokenize" } },
+  "responseProtection": { "enabled": true, "mode": "enforce" },
+  "tokenVault": {
+    "deterministic": true,
+    "detokenizeResponses": true
+  }
+}
+```
+- `tokenVault.deterministic` (기본값 `false`): 동일한 값이 항상 동일한 token으로 매핑된다(로컬 키에서 파생된 도메인 분리 키로 HMAC — 원시 AES 키가 아님). 재전송된 히스토리가 동일한 token으로 재tokenize되므로 멀티턴 채팅에 필요하다. **트레이드오프:** 동일한 값이 요청 간에 연결 가능해진다. `deterministicTypes`(예: `["email"]`)는 determinism을 선택한 type에만 제한한다.
+- `tokenVault.detokenizeResponses` (기본값 `false`): 해당 요청을 보호하는 동안 발급된 token**만** 해당 요청의 응답에서 복원한다. 다른 클라이언트나 요청의 token은 복원되지 않는다. `revealPolicy`와 독립적이며, 모든 복원은 개수 단위로 audit 기록되고 값은 기록되지 않는다. `responseProtection.enabled`가 필요하다.
+## MCP Wrap
+stdio MCP 서버를 래핑하여 양방향 트래픽을 필터링한다 — MCP 클라이언트 설정에서 커맨드만 변경하면 된다:
+```json
+{
+  "mcpServers": {
+    "some-server": {
+      "command": "npx",
+      "args": ["-y", "haechi", "mcp-wrap", "--config", "/path/haechi.config.json", "--", "npx", "some-mcp-server"]
+    }
+  }
+}
+```
+클라이언트→서버 요청은 `mcp.allowedMethods` allowlist와 params 보호를 통과하고, 서버→클라이언트 결과는 params/result 보호와 injection 휴리스틱(아래 참고)을 적용받는다. 거부된 요청은 클라이언트에게 응답되며 서버에 도달하지 않는다. stderr과 exit code는 그대로 전달된다.
+## Injection Detection (Preview)
+응답 및 tool result 텍스트는 간접 prompt injection(지시문 재정의, 역할 재할당, prompt 마커, 사용자에게 숨기기 표현, 은밀한 tool 유도)에 대한 휴리스틱 규칙으로 검사된다. `injection` type은 **기본적으로 report-only**이다: 탐지 결과는 audit 로그에 기록되지만 수정하거나 차단하지 않는다. 신호를 신뢰할 수 있게 되면 명시적으로 격상한다:
+```json
+{ "policy": { "actions": { "injection": "block" } } }
+```
+이 휴리스틱은 prompt injection에 대한 완전한 방어책이 아니다. `docs/current/threat-model.md`를 참고하라.
+## 설정
+`haechi init`은 `haechi.config.json`을 생성하며, 비밀 정보를 포함하지 않는 템플릿은 `haechi.config.example.json`에 있다. 모든 키는 fail-closed 방식으로 검증된다 — 알 수 없거나 잘못된 형식의 값은 시작을 거부한다.
+| 키 | 기본값 | 설명 |
+|---|---|---|
+| `mode` / `policy.mode` | `dry-run` | `dry-run`과 `report-only`는 탐지 및 audit만 수행하고, `enforce`는 변환/차단을 적용한다. `policy.mode`가 `mode`보다 우선한다 |
+| `target.type` / `target.adapter` | `llm-http` / `openai-compatible` | upstream 프로토콜: `openai-compatible`, `vllm-openai`, `ollama`, `llama-cpp`. 알 수 없는 type은 fail-closed로 처리된다 |
+| `target.upstream` | `http://127.0.0.1:9999` | proxy가 요청을 전달하는 유일한 upstream (절대 URL 요청 대상은 거부된다) |
+| `proxy.host` / `proxy.port` | `127.0.0.1` / `1016` | proxy 바인드 주소. 아래의 remote 바인딩 참고 |
+| `responseProtection.enabled` | `false` | upstream JSON 응답을 검사한다. `failureMode: fail-closed`는 비JSON/압축/대용량 응답을 거부한다 |
+| `responseProtection.maxBytes` | `1048576` | 응답 크기의 상한 — `failureMode: allow` 상태에서도 적용된다 |
+| `streaming.requestMode` | `block` | `block`은 스트리밍을 501 차단; `inspect`는 SSE/NDJSON 응답을 stream-filter; `pass-through`는 검사 없이 전달(audit 기록). Ollama chat/generate는 `stream: false`가 없으면 스트리밍으로 간주된다 |
+| `streaming.responseMode` | `enforce` | 검사된 스트림에 적용되는 enforcement 모드(`dry-run`/`report-only`/`enforce`) |
+| `streaming.maxMatchBytes` | `256` | cross-frame 매칭 윈도우; 이보다 긴 단일 매치는 프레임에 걸쳐 쪼개질 수 있음 |
+| `limits.maxRequestBytes` | `1048576` | 요청 바디 상한 (한도 초과 시 413) |
+| `limits.upstreamTimeoutMs` | `120000` | upstream 타임아웃 (만료 시 504) |
+| `policy.presets` | `korean-pii`, `secrets-only`, `llm-redact` | 병합되는 프리셋 action; 병합은 강화는 가능하지만 약화는 불가 |
+| `policy.actions` | `card: block` | type별 action: `allow`/`redact`/`mask`/`tokenize`/`encrypt`/`block` |
+| `filters.customRules` | `[]` | 추가 정규식 규칙 (ReDoS 검사: 중첩 한정자/역참조 없음) |
+| `keys.provider` / `keys.keyFile` | `local` / `.haechi/dev.keys.json` | 개발 전용 소프트웨어 키 (0600). `external`은 프로그래밍 방식으로 crypto provider를 주입해야 한다 |
+| `audit.path` | `.haechi/audit.jsonl` | hash chain JSONL audit 로그; `haechi audit-verify`로 검증한다 |
+| `tokenVault.revealPolicy` | `disabled` | 수동 reveal 게이트 (`local-dev`로 활성화; 모든 결정이 audit 기록됨) |
+| `tokenVault.retentionDays` | `30` | 만료된 token은 vault 쓰기 시 또는 `haechi token-purge --expired`로 삭제된다 |
+| `tokenVault.deterministic` / `deterministicTypes` / `detokenizeResponses` | `false` / `null` / `false` | 토큰 왕복 (위 참고) |
+| `privacy.profile` | `null` | `kr-pipa`, `eu-gdpr`, `us-general` 기준 action (강화 전용) |
+| `mcp.allowedMethods` | `initialize`, `tools/call`, `resources/read`, `prompts/get` | `mcp-stdio`/`mcp-wrap`에서 클라이언트가 호출할 수 있는 method allowlist |
+위 표는 빠른 참고용이다. 키별 전체 레퍼런스 — 타입, 검증 규칙, 프리셋, action 강도, 일반적인 설정 — 는 [`docs/current/configuration.md`](docs/current/configuration.md)에 있으며, CLI에서 축약 버전을 출력한다:
+```bash
+haechi config        # 설정 가이드
+haechi help          # 모든 커맨드
+haechi help proxy    # 특정 커맨드
+haechi status        # 현재 설정의 실제 적용 상태
+```
+### loopback 밖으로 바인딩 (0.0.0.0)
+proxy는 CLI 플래그를 명시적으로 전달하지 않으면 non-loopback 호스트를 거부한다 — 설정 파일에 `proxy.host: "0.0.0.0"`을 지정해도 의도적으로 시작되지 않는다(설정 파일을 복사해도 게이트웨이가 자동으로 노출되지 않도록):
+```bash
+haechi proxy --config haechi.config.json --host 0.0.0.0 --allow-remote-bind
+```
+**proxy는 아직 클라이언트 인증을 제공하지 않는다** (0.6 계획): 포트에 접근할 수 있는 누구든 upstream과 token round-trip 경로를 사용할 수 있다. `--allow-remote-bind`는 명시적인 네트워크 통제 하에서만 사용한다:
+- **컨테이너**: 컨테이너 내에서 `0.0.0.0`으로 바인딩하는 것은 일반적인 패턴이다 — 포트 매핑에서 노출을 제한한다(예: `-p 127.0.0.1:1016:1016`)
+- **LAN/원격**: 방화벽, VPN(예: Tailscale), 또는 인증 reverse proxy를 앞에 둔다
+## Privacy Profiles
+Haechi는 로컬 정책 부트스트래핑을 위한 기본 지역별 Privacy Profiles를 포함한다:
+- `kr-pipa`
+- `eu-gdpr`
+- `us-general`
+`haechi.config.json`에서 `privacy.profile`을 설정하면 집행 전에 해당 프로필의 기본 action이 적용된다. 이 프로필은 엔지니어링 기본값이며, 법적 자문이 아니다.
+## 보안 노트
+- 이 프로젝트는 컴플라이언스를 보장하지 않는다.
+- 0.1 crypto provider는 Node `crypto`와 AES-256-GCM 및 로컬 소프트웨어 키를 사용한다.
+- Audit 이벤트에는 원시 prompt, tool result, secret, 또는 PII 값이 포함되어서는 안 된다.
+- 알 수 없거나 잘못된 정책/설정 오류는 집행 경로에서 fail-closed로 처리되어야 한다.
+- Response protection은 명시적인 allow 정책이 설정되지 않는 한 비JSON, 잘못된 JSON, 압축, 또는 대용량 응답에 대해 fail-closed로 처리된다.
+- Token reveal 및 purge 결정은 audit 로그에 기록된다(token id와 결정만 기록되며, 평문은 기록되지 않는다). 만료된 token은 vault 변경 시 또는 `haechi token-purge --expired`로 제거된다.
+- `haechi init --force`는 로컬 키를 교체한다: 기존 키는 `retired` 상태로 보관되어 기존 암호문과 token vault 레코드를 `kid`로 복호화할 수 있다.
+- Privacy profile은 명시적으로 더 엄격한 사용자 action을 강화할 수는 있지만 약화할 수는 없다.
+- 탐지는 문자열 값, JSON 숫자(예: 카드 번호), 객체 키 이름을 검사한다. Base64/URL 인코딩된 값과 URL 쿼리 스트링은 검사되지 않는다.
+- 이 패키지는 개발자 프리뷰이다. 인터넷에 노출된 운영 LLM 게이트웨이로 사용하지 않는다.
+## 현재 범위
+0.1 빠른 시작 범위는 `docs/current/mvp-0.1-implementation-scope.md`에 설명되어 있다.
+0.2는 로컬 TokenVault, 서명된 policy bundle 커맨드, 플러그인 매니페스트 검증, MCP stdio JSON-RPC 라인 필터 스켈레톤을 추가한다. `docs/current/release-0.2-implementation-scope.md` 참고.
+0.3은 로컬 inference 프로토콜 adapter, 선택적 JSON 응답 보호, npm 패키지 메타데이터, 배포 준비 export를 추가한다. `docs/current/release-0.3-implementation-scope.md` 참고.
+0.3.1은 릴리스 안전 게이트, 응답 fail-closed 동작, audit hash chaining, token reveal 거버넌스, provider injection, privacy profile, CI/SBOM/provenance 워크플로 스캐폴딩, 그리고 전용 위협/공유 책임/API 안정성 문서를 추가한다.
+0.3.2는 보안 강화 릴리스이자 첫 번째 npm 개발자 프리뷰 대상이다: Ollama 암묵적 스트리밍 fail-closed 처리, 감사된 token reveal/purge, 보존 기간 purge, kid 기반 키 교체, 도메인 분리 policy bundle 서명, JSON 숫자/객체 키 탐지, upstream 타임아웃, stale lock 복구, 그리고 non-enforcing 모드 경고. `docs/current/release-0.3.2-hardening-scope.md` 참고.
+0.4.0은 token round-trip(deterministic tokenization + 요청 스코프 응답 detokenization), `mcp-wrap` 양방향 MCP 필터, `status` 및 `audit-verify` 커맨드, report-only injection detection 휴리스틱을 추가하고, 0.6 인증을 위한 PII-safe `identity`/`authProvider` 계약을 예약한다. `docs/current/release-0.4-implementation-scope.md` 참고.
+0.5.0은 SSE/NDJSON 스트리밍 응답 검사를 추가한다: `streaming.requestMode: "inspect"`가 bounded sliding buffer로 응답을 stream-filter하여 프레임에 걸쳐 쪼개진 PII도 잡는다(`streaming.maxMatchBytes`). `docs/current/release-0.5-implementation-scope.md` 참고.

package/README.md CHANGED Viewed

@@ -6,6 +6,8 @@
 [![node](https://img.shields.io/node/v/haechi)](https://nodejs.org)
 [![status](https://img.shields.io/badge/status-developer%20preview-orange)](docs/current/risk-register-release-gate.md)
+**English** | [한국어](README.ko.md)
 Haechi is an experimental developer preview of a self-hosted AI context enforcement layer for protecting LLM, MCP, vLLM, Ollama, and agent payloads before they reach models, tools, logs, or proxies.
 The name comes from Haechi, a Korean guardian figure associated with discernment and protection.
@@ -60,7 +62,7 @@ Point an existing HTTP JSON client at `http://localhost:1016` and set `target.up
 The proxy binds to loopback by default. Binding to `0.0.0.0`, `::`, or another non-loopback host fails unless `--allow-remote-bind` is provided. Use that flag only behind explicit network access controls.
-Streaming requests with `stream: true` are blocked by default. Haechi 0.3.x does not inspect SSE or NDJSON streams. Set `streaming.requestMode` to `pass-through` only when the caller explicitly accepts that streaming payloads are not protected by Haechi.
+Streaming requests with `stream: true` are blocked by default. Set `streaming.requestMode` to `inspect` to stream-filter SSE/NDJSON responses (a bounded sliding buffer catches PII split across frames; see `streaming.maxMatchBytes`), or to `pass-through` only when the caller explicitly accepts unprotected streaming.
 Ollama `/api/chat` and `/api/generate` stream by default when the `stream` field is omitted, so the proxy treats those requests as streaming unless `stream: false` is explicitly set.
@@ -147,7 +149,9 @@ These heuristics are not a complete defense against prompt injection; see `docs/
 | `proxy.host` / `proxy.port` | `127.0.0.1` / `1016` | Proxy bind address. See remote binding below |
 | `responseProtection.enabled` | `false` | Inspect upstream JSON responses. `failureMode: fail-closed` rejects non-JSON/compressed/oversized responses |
 | `responseProtection.maxBytes` | `1048576` | Hard response size cap — enforced even in `failureMode: allow` |
-| `streaming.requestMode` | `block` | `stream: true` requests get 501 unless `pass-through` (uninspected, audited). Ollama chat/generate count as streaming unless `stream: false` |
+| `streaming.requestMode` | `block` | `block` 501s streaming; `inspect` stream-filters SSE/NDJSON responses; `pass-through` forwards uninspected (audited). Ollama chat/generate count as streaming unless `stream: false` |
+| `streaming.responseMode` | `enforce` | Enforcement mode for inspected streams (`dry-run`/`report-only`/`enforce`) |
+| `streaming.maxMatchBytes` | `256` | Cross-frame match window; a single match longer than this may split across frames |
 | `limits.maxRequestBytes` | `1048576` | Request body cap (413 over the limit) |
 | `limits.upstreamTimeoutMs` | `120000` | Upstream timeout (504 on expiry) |
 | `policy.presets` | `korean-pii`, `secrets-only`, `llm-redact` | Merged preset actions; merges can strengthen but never weaken |
@@ -161,10 +165,13 @@ These heuristics are not a complete defense against prompt injection; see `docs/
 | `privacy.profile` | `null` | `kr-pipa`, `eu-gdpr`, `us-general` baseline actions (strengthen-only) |
 | `mcp.allowedMethods` | `initialize`, `tools/call`, `resources/read`, `prompts/get` | Client-callable method allowlist for `mcp-stdio`/`mcp-wrap` |
-Check the effective state at any time:
+The table above is a quick reference. The full per-key reference — types, validation rules, presets, action strength, and common setups — is in [`docs/current/configuration.md`](docs/current/configuration.md), and the CLI prints a condensed version:
 ```bash
-haechi status
+haechi config        # configuration guide
+haechi help          # all commands
+haechi help proxy    # one command
+haechi status        # effective state of the current config
 ```
 ### Binding beyond loopback (0.0.0.0)
@@ -216,3 +223,5 @@ Set `privacy.profile` in `haechi.config.json` to apply the profile's default act
 0.3.2 is a security-hardening release and the first npm developer preview target: Ollama implicit-streaming fail-closed handling, audited token reveal/purge, retention purge, kid-based key rotation, domain-separated policy bundle signing, JSON number/object key detection, upstream timeouts, stale lock recovery, and non-enforcing-mode warnings. See `docs/current/release-0.3.2-hardening-scope.md`.
 0.4.0 adds the token round-trip (deterministic tokenization + request-scoped response detokenization), the `mcp-wrap` bidirectional MCP filter, `status` and `audit-verify` commands, report-only injection detection heuristics, and reserves the PII-safe `identity`/`authProvider` contracts for 0.6 auth. See `docs/current/release-0.4-implementation-scope.md`.
+0.5.0 adds SSE/NDJSON streaming response inspection: `streaming.requestMode: "inspect"` stream-filters responses with a bounded sliding buffer that catches PII split across frames (`streaming.maxMatchBytes`). See `docs/current/release-0.5-implementation-scope.md`.

package/docs/README.md CHANGED Viewed

@@ -15,20 +15,17 @@ English is the primary documentation language. Korean translations are maintaine
 - `docs/current/release-0.3-implementation-scope.md`: 0.3 vLLM/Ollama/llama.cpp adapters, response protection, npm publish readiness scope
 - `docs/current/release-0.3.2-hardening-scope.md`: 0.3.2 security hardening release; first npm developer preview target
 - `docs/current/release-0.4-implementation-scope.md`: 0.4 token round-trip, mcp-wrap, audit-verify/status, identity/authProvider contract reservation
+- `docs/current/release-0.5-implementation-scope.md`: 0.5 SSE/NDJSON streaming response inspection with bounded cross-frame buffer
+- `docs/current/configuration.md`: full configuration reference (every key, defaults, validation, presets, common setups)
 - `docs/current/risk-register-release-gate.md`: release-blocking risks, security/operational risk status, npm release gates (0.3.2 baseline)
 - `docs/current/threat-model.md`: Haechi 0.3.2 trust boundaries, protected assets, key threats and controls
 - `docs/current/shared-responsibility.md`: responsibility split between Haechi and users/operators in self-hosted deployments
 - `docs/current/api-stability.md`: developer preview API stability and migration note criteria
 - `docs/current/release-process.md`: release preflight, SBOM, npm provenance publish procedure
-## Archive
-- `docs/archive/2026-06-08-initial/research-summary.md`: initial research summary for the general-purpose modular segment-encryption concept
-- `docs/archive/2026-06-08-initial/`: PRD, SRS, and security review drafts for the general-purpose segment-encryption concept (Korean, historical record)
 ## Direction Change Record
-Early documents covered a general-purpose modular segment-encryption layer spanning HTTP/HTTPS/socket/gRPC/A2A. The current direction is a specialized protection solution for prompts, context, tool calls, resources, artifacts, and streaming messages across AI, LLM, MCP, A2A, and agent platforms.
+Early drafts (now removed; see git history before this commit) covered a general-purpose modular segment-encryption layer spanning HTTP/HTTPS/socket/gRPC/A2A. The current direction is a specialized protection solution for prompts, context, tool calls, resources, artifacts, and streaming messages across AI, LLM, MCP, A2A, and agent platforms.
 Open-source/self-hosted security infrastructure takes priority over commercial SaaS. Current documents therefore center on the self-hosted SDK/CLI/proxy, replaceable `CryptoProvider`, `PolicyEngine`, `FilterEngine`, `KeyProvider`, `AuditSink`, and plugin conformance tests, rather than a hosted control plane.

package/docs/current/api-stability.ko.md CHANGED Viewed

@@ -2,7 +2,7 @@
 - 문서 상태: Draft 0.1
 - 작성일: 2026-06-10
-- 기준 버전: 0.4.0
+- 기준 버전: 0.5.0
 ## 1. 버전 해석
@@ -41,6 +41,7 @@
 - `injection` detection type과 휴리스틱 룰
 - `identity` audit 필드와 `authProvider` 계약 (0.4 예약, 0.6 구현 — 그 전까지 형태 변경 가능)
 - `status` / `audit-verify` CLI 출력 형태
+- `haechi/stream-filter` (`inspectResponseStream`, path helpers) 및 `createStreamProtector` (스트리밍 검사 내부 구현)
 ## 4. Migration note 기준

package/docs/current/api-stability.md CHANGED Viewed

@@ -41,6 +41,7 @@ The following exports are treated as preview in 0.4.0.
 - `injection` detection type and its heuristic rules
 - `identity` audit field and the `authProvider` contract (reserved in 0.4, implemented in 0.6 — shape may change until then)
 - `status` / `audit-verify` CLI output shapes
+- `haechi/stream-filter` (`inspectResponseStream`, path helpers) and `createStreamProtector` (streaming inspection internals)
 ## 4. Migration note criteria

package/docs/current/configuration.ko.md ADDED Viewed

@@ -0,0 +1,210 @@
+# Haechi 설정 레퍼런스
+- 문서 상태: Living document
+- 기준 버전: 0.5.0
+`haechi init`은 `haechi.config.json`을 생성하며, 비밀 정보를 포함하지 않는 템플릿은 `haechi.config.example.json`에 있다. 모든 커맨드는 `--config <path>`로 설정 파일을 읽는다(기본값: `haechi.config.json`). 설정은 **fail-closed 방식으로 검증**된다: 알 수 없는 provider, 범위를 벗어난 숫자, 잘못된 형식의 값은 자동으로 무시되지 않고 로드 시점에 오류를 발생시킨다. `haechi config`는 이 레퍼런스를 출력하며, `haechi status`는 특정 설정 파일의 *실제 적용* 상태를 출력한다.
+## 전체 기본값
+```json
+{
+  "mode": "dry-run",
+  "target": { "type": "llm-http", "adapter": "openai-compatible", "upstream": "http://127.0.0.1:9999" },
+  "proxy": { "host": "127.0.0.1", "port": 1016 },
+  "responseProtection": { "enabled": false, "mode": "enforce", "failureMode": "fail-closed", "allowNonJson": false, "allowCompressed": false, "maxBytes": 1048576 },
+  "streaming": { "requestMode": "block" },
+  "limits": { "maxRequestBytes": 1048576, "upstreamTimeoutMs": 120000 },
+  "policy": { "mode": "dry-run", "presets": ["korean-pii", "secrets-only", "llm-redact"], "defaultAction": "redact", "actions": { "card": "block" } },
+  "filters": { "customRules": [] },
+  "keys": { "provider": "local", "keyFile": ".haechi/dev.keys.json" },
+  "audit": { "sink": "jsonl", "path": ".haechi/audit.jsonl" },
+  "tokenVault": { "provider": "local", "path": ".haechi/token-vault.json", "revealPolicy": "disabled", "retentionDays": 30, "deterministic": false, "deterministicTypes": null, "detokenizeResponses": false },
+  "privacy": { "profile": null },
+  "mcp": { "allowedMethods": ["initialize", "tools/call", "resources/read", "prompts/get"], "protectParams": true, "protectResults": true, "requireJsonRpc": true }
+}
+```
+## 최상위
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `mode` | `dry-run` \| `report-only` \| `enforce` | `dry-run` | 전역 집행 모드. `dry-run`/`report-only`는 탐지 및 audit만 수행하며, `enforce`는 변환/차단을 적용한다. `policy.mode`가 설정된 경우 해당 값이 우선한다. |
+## `target`
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `target.type` | `llm-http` \| `openai-compatible` \| `vllm-openai` \| `ollama` \| `llama-cpp` | `llm-http` | 프로토콜 adapter를 선택한다. `llm-http`는 `openai-compatible`의 별칭이다. 알 수 없는 값은 로드 시 **fail-closed**로 처리된다. |
+| `target.adapter` | 동일한 값 집합 | `openai-compatible` | adapter를 명시적으로 지정한다. 보통은 설정하지 않고 `type`이 결정하도록 두면 된다. |
+| `target.upstream` | URL 문자열 | `http://127.0.0.1:9999` | proxy가 요청을 전달하는 유일한 upstream. 요청 대상은 origin-form 경로여야 하며, 절대 URL 대상은 거부된다(SSRF 방어). |
+## `proxy`
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `proxy.host` | 비어 있지 않은 문자열 | `127.0.0.1` | 바인드 주소. loopback이 아닌 host를 사용하려면 `--allow-remote-bind` CLI 플래그가 필요하다 — 설정 파일만으로는 시작되지 않는다([loopback 밖으로 바인딩](#binding-beyond-loopback) 참고). |
+| `proxy.port` | 정수 0–65535 | `1016` | 리슨 포트(`0` = 임시 포트). `--port`로 실행 시마다 덮어쓸 수 있다. |
+## `responseProtection`
+upstream JSON 응답을 검사한다(기본적으로 꺼져 있음 — 모델로부터 *돌아오는* 내용을 보호하려면 활성화한다).
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `responseProtection.enabled` | boolean | `false` | 마스터 스위치. `detokenizeResponses`가 작동하려면 반드시 활성화되어 있어야 한다. |
+| `responseProtection.mode` | `dry-run` \| `report-only` \| `enforce` | `enforce` | 응답 방향의 집행 모드. |
+| `responseProtection.failureMode` | `fail-closed` \| `allow` | `fail-closed` | *검사 불가능한* 응답(비JSON, 잘못된 JSON, 압축)에 대한 처리 방식. `fail-closed`는 502를 반환하고, `allow`는 통과시킨다(audit 기록됨). |
+| `responseProtection.allowNonJson` | boolean | `false` | 비JSON 응답을 검사 없이 통과시킨다. |
+| `responseProtection.allowCompressed` | boolean | `false` | 압축 응답을 검사 없이 통과시킨다. |
+| `responseProtection.maxBytes` | 양의 정수 | `1048576` | 응답 크기의 상한. `failureMode: allow` 상태에서도 적용되며, 크기를 초과한 응답은 항상 거부된다. |
+## `streaming`
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `streaming.requestMode` | `block` \| `pass-through` \| `inspect` | `block` | `block`은 스트리밍 요청에 `501`을 반환한다; `inspect`는 bounded cross-frame 버퍼로 SSE/NDJSON 응답을 stream-filter한다; `pass-through`는 검사 없이 전달한다(감사됨). Ollama의 `/api/chat`과 `/api/generate`는 `stream: false`가 명시되지 않으면 streaming으로 간주된다. |
+| `streaming.responseMode` | `dry-run` \| `report-only` \| `enforce` | `enforce` | 검사된 스트림에 적용되는 집행 모드(요청 방향과 독립적). |
+| `streaming.maxMatchBytes` | 양의 정수 | `256` | inspect 시 cross-frame 매칭 윈도우. 이 크기의 tail을 보유하여 프레임에 걸친 탐지를 방출 전에 포착할 수 있다; 이 값보다 긴 단일 매칭은 프레임에 걸쳐 분할될 수 있다. |
+## `limits`
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `limits.maxRequestBytes` | 양의 정수 | `1048576` | 요청 바디 크기 상한. 초과 시 `413`을 반환한다. 바디를 전부 버퍼링하지 않고 증분 방식으로 적용된다. |
+| `limits.upstreamTimeoutMs` | 양의 정수 | `120000` | upstream 요청 타임아웃. 만료 시 `504 haechi_upstream_timeout`을 반환한다. 연결 실패 시에는 `502 haechi_upstream_unreachable`을 반환한다. |
+## `policy`
+탐지→결정의 핵심. [Detection type과 action](#detection-types--actions) 참고.
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `policy.mode` | `dry-run` \| `report-only` \| `enforce` | `dry-run` | 실제 적용되는 집행 모드(`policy.mode ?? mode`). |
+| `policy.presets` | preset 이름 배열 | `["korean-pii", "secrets-only", "llm-redact"]` | 순서대로 병합되는 내장 action 집합. [Presets](#presets) 참고. |
+| `policy.defaultAction` | action | `redact` | 명시적 매핑이 없는 탐지 type에 적용되는 action. |
+| `policy.actions` | `{ <type>: <action> }` | `{ "card": "block" }` | type별 개별 재정의. 병합 시 **강화**는 가능하지만 약화는 불가([Action strength](#action-strength) 참고). `injection`은 설정하지 않으면 기본적으로 `allow`이다. |
+| `policy.allowUnsafeOverrides` | boolean | `false` | 더 약한 action이 더 강한 action을 덮어쓰는 것을 허용한다. 기본적으로 꺼져 있으며, 활성화하면 안전 장치가 제거된다. |
+| `policy.bundlePath` | 경로 | 미설정 | 인라인 정책 대신 서명된 policy bundle을 로드한다(`keys.keyFile`에 대해 검증됨). |
+## `filters`
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `filters.customRules` | 규칙 객체 배열 | `[]` | 추가 탐지 규칙: `{ id, type, pattern, flags?, confidence? }`. 패턴은 ReDoS 검사를 통과해야 하며(≤500자, 중첩 한정자 없음, 역참조 없음), 안전하지 않으면 로드 시 거부된다. |
+## `keys`
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `keys.provider` | `local` \| `external` | `local` | `local`은 소프트웨어 AES-256-GCM 키 파일을 사용한다(개발 전용). `external`은 키 자료를 포함하지 않으며, `createRuntime(config, { cryptoProvider })`를 통해 crypto provider를 주입해야 한다. |
+| `keys.keyFile` | 경로 | `.haechi/dev.keys.json` | 로컬 키 파일(모드 `0600`). `haechi init --force`는 키를 교체하며, 기존 키는 `kid`로 기존 암호문/token이 복호화 가능하도록 퇴역 상태로 보관된다. |
+## `audit`
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `audit.sink` | `jsonl` | `jsonl` | `jsonl`만 지원된다. |
+| `audit.path` | 경로 | `.haechi/audit.jsonl` | SHA-256 hash chain 로그. `haechi audit-verify`로 검증한다. 평문/PII를 포함하지 않는다. |
+## `tokenVault`
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `tokenVault.provider` | `local` | `local` | `local`만 지원된다. |
+| `tokenVault.path` | 경로 | `.haechi/token-vault.json` | 암호화된 token 저장소(원자적 쓰기, 파일 락). |
+| `tokenVault.revealPolicy` | `disabled` \| `local-dev` | `disabled` | **수동** reveal(`token-reveal`)을 허용할지 결정한다. 모든 reveal/purge 결정은 audit 기록된다. detokenization과는 독립적이다. |
+| `tokenVault.retentionDays` | 양의 수 | `30` | Token TTL. 만료된 token은 vault 쓰기 시 또는 `token-purge --expired`로 삭제된다. |
+| `tokenVault.deterministic` | boolean | `false` | 동일한 `(type, value)` → 동일한 token(도메인 분리된 파생 키로 HMAC). 멀티턴에 필요하다. **트레이드오프:** 동일한 값이 연결 가능해진다. |
+| `tokenVault.deterministicTypes` | `null` \| 비어 있지 않은 문자열 배열 | `null` | `null`이면 deterministic 활성화 시 모든 type에 적용. 그렇지 않으면 열거된 type에만 determinism을 제한한다(예: `["email"]`). |
+| `tokenVault.detokenizeResponses` | boolean | `false` | 해당 요청을 처리하며 발급한 token을 응답에서 복원한다. 동일 요청을 보호하며 발급된 token만 복원되며, `responseProtection.enabled`가 필요하다. 개수 단위로 audit 기록된다. |
+## `privacy`
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `privacy.profile` | `null` \| `kr-pipa` \| `eu-gdpr` \| `us-general` | `null` | 집행 전에 지역별 기준 action 집합을 적용한다. 프로필은 명시적 action을 **강화**할 수는 있지만 약화할 수는 없다. 엔지니어링 기본값이며, 법적 자문이 아니다. |
+## `mcp`
+`mcp-stdio`와 `mcp-wrap`에 적용된다.
+| 키 | 타입 / 값 | 기본값 | 설명 |
+|---|---|---|---|
+| `mcp.allowedMethods` | 비어 있지 않은 문자열 배열 | `["initialize", "tools/call", "resources/read", "prompts/get"]` | 클라이언트가 호출할 수 있는 method allowlist(`"*"`는 전체 허용). 서버가 먼저 시작하는 요청은 allowlist를 우회하지만 params는 여전히 보호된다. |
+| `mcp.protectParams` | boolean | `true` | 요청 `params`를 보호한다. |
+| `mcp.protectResults` | boolean | `true` | 응답 `result`를 보호한다(injection 휴리스틱도 실행). |
+| `mcp.requireJsonRpc` | boolean | `true` | `jsonrpc: "2.0"`을 요구하며, 규격에 맞지 않는 메시지는 거부된다. |
+## Detection type과 action
+내장 탐지 `type` 값: `email`, `phone`, `kr_rrn`, `card`, `api_key`, `secret`, `injection`(응답 방향 휴리스틱, 기본 report-only). 커스텀 규칙으로 새로운 type을 추가할 수 있다.
+Action(약한 것 → 강한 것 순):
+| Action | 효과 |
+|---|---|
+| `allow` | 변경 없음(탐지 및 audit은 기록됨). |
+| `redact` | `[REDACTED:<type>]`으로 교체한다. |
+| `mask` | 부분 마스킹한다(값이 8자 이하이면 전체 마스킹). |
+| `tokenize` | vault token으로 교체한다. token vault를 통해 복원 가능하다. |
+| `encrypt` | 인라인 AES-256-GCM 봉투로 교체한다. |
+| `block` | 전체 payload를 거부한다(`403`/`-32001`/exit 3). |
+### Action strength
+preset과 override(또는 privacy profile)가 충돌할 경우 **강한** action이 우선하며, `policy.allowUnsafeOverrides`가 `true`가 아니면 더 강한 action을 약화하려 할 경우 오류가 발생한다. 강도 순: `allow`(0) < `redact`/`mask`(1) < `tokenize`/`encrypt`(2) < `block`(3).
+### Presets
+| Preset | 효과 |
+|---|---|
+| `llm-redact` | 기본 `redact`; `email: redact`, `phone: mask` |
+| `korean-pii` | `kr_rrn: block`, `phone: mask`, `email: redact` |
+| `secrets-only` | `api_key: block`, `secret: block` |
+| `strict-block` | 기본 `block` |
+| `mcp-basic` | 기본 `redact`; `api_key`/`secret`/`kr_rrn: block` |
+| `local-inference` | 기본 `redact`; `email: tokenize`, `phone: mask`, secrets/`kr_rrn: block` |
+| `local-only` | 전송을 외부 전송이 아닌 것으로 표시(메타데이터) |
+## 자주 쓰는 설정
+**enforce 모드에서 요청 보호(최소 설정):**
+```json
+{ "mode": "enforce", "policy": { "mode": "enforce" } }
+```
+**로컬 inference, response protection + token round-trip:**
+```json
+{
+  "mode": "enforce",
+  "target": { "type": "vllm-openai", "upstream": "http://127.0.0.1:8000" },
+  "policy": { "mode": "enforce", "presets": ["local-inference"] },
+  "responseProtection": { "enabled": true, "mode": "enforce" },
+  "tokenVault": { "deterministic": true, "detokenizeResponses": true }
+}
+```
+**EU 프로필, secret 차단, injection 플래그:**
+```json
+{
+  "mode": "enforce",
+  "privacy": { "profile": "eu-gdpr" },
+  "policy": { "mode": "enforce", "actions": { "injection": "redact" } },
+  "responseProtection": { "enabled": true }
+}
+```
+## loopback 밖으로 바인딩
+proxy는 CLI 플래그를 명시적으로 전달하지 않으면 loopback이 아닌 host를 거부한다 — 설정 파일에 `proxy.host: "0.0.0.0"`을 지정해도 의도적으로 시작되지 않는다:
+```bash
+haechi proxy --config haechi.config.json --host 0.0.0.0 --allow-remote-bind
+```
+**proxy는 아직 클라이언트 인증을 제공하지 않는다**(0.6 계획): 포트에 접근할 수 있는 누구든 upstream과 token round-trip 경로를 사용할 수 있다. `--allow-remote-bind`는 명시적인 네트워크 통제 하에서만 사용해야 한다 — 컨테이너 내에서 `0.0.0.0`으로 바인드하고 host 포트 매핑을 제한하거나(`-p 127.0.0.1:1016:1016`), 방화벽/VPN/인증 reverse proxy 뒤에 두어야 한다.
+## 검증 요약
+다음은 로드 시 오류(fail-closed)를 발생시킨다: 알 수 없는 `keys.provider`; 빈 `proxy.host`; 범위를 벗어난 `proxy.port`; `jsonl`이 아닌 `audit.sink`; `local`이 아닌 `tokenVault.provider`; 잘못된 `revealPolicy`; 양수가 아닌 `retentionDays`; boolean이 아닌 `deterministic`/`detokenizeResponses`; 비어 있거나 문자열이 아닌 `deterministicTypes`; 비어 있거나 문자열이 아닌 `mcp.allowedMethods`; boolean이 아닌 `mcp.*` 플래그; 알 수 없는 `privacy.profile`; 잘못된 `responseProtection.failureMode`; 양수가 아닌 `responseProtection.maxBytes`; 잘못된 `streaming.requestMode`; 잘못된 `streaming.responseMode`; 양수가 아닌 `streaming.maxMatchBytes`; 양수가 아닌 `limits.*`; 알 수 없는 `target.type`/`adapter`; 안전하지 않은 커스텀 정규식; `allowUnsafeOverrides` 없이 action을 약화하려는 시도.