haechi 1.2.0 → 1.3.1

Files changed (35)
  1. package/README.ko.md +57 -11
  2. package/README.md +57 -11
  3. package/docs/current/code-review-risk-register-2026-06-16.ko.md +377 -0
  4. package/docs/current/code-review-risk-register-2026-06-16.md +377 -0
  5. package/docs/current/config-version.ko.md +2 -2
  6. package/docs/current/config-version.md +2 -2
  7. package/docs/current/configuration.ko.md +28 -11
  8. package/docs/current/configuration.md +28 -11
  9. package/docs/current/operations-runbook.ko.md +36 -2
  10. package/docs/current/operations-runbook.md +39 -2
  11. package/docs/current/release-process.ko.md +5 -1
  12. package/docs/current/release-process.md +5 -1
  13. package/docs/current/risk-register-release-gate.ko.md +34 -8
  14. package/docs/current/risk-register-release-gate.md +34 -8
  15. package/docs/current/shared-responsibility.ko.md +12 -3
  16. package/docs/current/shared-responsibility.md +12 -3
  17. package/docs/current/threat-model.ko.md +7 -3
  18. package/docs/current/threat-model.md +7 -3
  19. package/examples/local-proxy-demo/README.md +51 -0
  20. package/examples/local-proxy-demo/demo.mjs +144 -0
  21. package/examples/local-proxy-demo/demo.tape +19 -0
  22. package/examples/local-proxy-demo/live-demo.mjs +121 -0
  23. package/examples/local-proxy-demo/live-demo.tape +25 -0
  24. package/haechi.config.example.json +2 -1
  25. package/package.json +3 -1
  26. package/packages/cli/bin/haechi.mjs +95 -5
  27. package/packages/cli/runtime.mjs +61 -1
  28. package/packages/core/index.mjs +15 -0
  29. package/packages/crypto/index.mjs +42 -20
  30. package/packages/filter/index.mjs +679 -6
  31. package/packages/privacy-profiles/index.mjs +72 -3
  32. package/packages/protocol-adapters/index.mjs +99 -1
  33. package/packages/proxy/index.mjs +270 -29
  34. package/packages/ssrf/index.mjs +60 -4
  35. package/packages/stream-filter/index.mjs +194 -17
@@ -1,6 +1,6 @@
1
1
  # Haechi Shared Responsibility
2
2
 
3
- - Status: Living document (tracks core 1.2.x)
3
+ - Status: Living document (tracks core 1.3.x)
4
4
  - Date: 2026-06-10
5
5
 
6
6
  ## 1. Responsibility Matrix
@@ -9,7 +9,7 @@
9
9
  |---|---|---|
10
10
  | Local development | CLI, default config, dev key generation | Do not reuse dev keys in production or shared environments |
11
11
  | Policy enforcement | redact/mask/tokenize/encrypt/block pipeline | Select actions appropriate to regulatory and organizational policy |
12
- | HTTP proxy | Loopback default, remote bind guard, body/response limits | Authentication, TLS termination, firewall, upstream auth |
12
+ | HTTP proxy | Loopback default, remote bind guard, body/response limits, default-drop upstream header allowlist (gateway-client auth separated from upstream-provider auth) | Authentication, TLS termination, firewall, upstream auth; supply the upstream provider key intentionally (client `Authorization` is forwarded only with `auth.provider: none`; otherwise set provider key headers like `x-api-key` or list extras in `target.forwardHeaders`) |
13
13
  | Streaming | Blocked by default | Accept the risk of no protection when using pass-through |
14
14
  | TokenVault | Encrypted storage, reveal blocked by default, purge | Reveal approval workflow, DSAR/retention operations |
15
15
  | Audit | Plaintext removal, hash chain | Append-only storage, backup, retention period, external signing |
@@ -41,7 +41,16 @@
41
41
 
42
42
  Haechi's stateful controls are single-process by design. Running 2+ replicas behind a load balancer **silently weakens** them unless the operator supplies shared infrastructure:
43
43
 
44
- - **Rate limit** is per-process and in-memory — total throughput multiplies by the replica count. Enforce a per-identity limit at a shared front door, or inject a shared-store `rateLimiter` via `createRuntime(config, { rateLimiter })` (the seam satisfies the `allow(key, limit)` contract; see [`configuration.md` → Rate limiter injection](./configuration.md#rate-limiter-injection)). The default per-process limiter also bounds its window map (no unbounded memory growth keyed by identity).
44
+ - **Rate limit** is per-process and in-memory — total throughput multiplies by the replica count. Enforce a per-identity limit at a shared front door, or inject a shared-store `rateLimiter` via `createRuntime(config, { rateLimiter })` (the seam satisfies the `allow(key, limit)` contract, which may return `boolean` or `Promise<boolean>`; see [`configuration.md` → Rate limiter injection](./configuration.md#rate-limiter-injection)). The [`haechi-ratelimit-redis`](https://github.com/raeseoklee/haechi/tree/main/satellites/ratelimit-redis) satellite is the reference shared-store (Redis-backed) implementation — a fixed-window counter over an injected client. The default per-process limiter also bounds its window map (no unbounded memory growth keyed by identity).
45
45
  - **Audit hash chain + anchor** are single-writer. Give each replica its **own** `audit.path` (and anchor path); never share one audit file across replicas, or the chain forks into an unverifiable state.
46
46
  - **TokenVault and the auth store** are whole-file local stores — correct for one host, but not a shared multi-writer store. For multi-replica tokenization, inject a shared `tokenVault`.
47
47
  - File locking relies on `O_EXCL` + atomic rename, which do not hold on NFS / shared filesystems — keep these stores on local disk.
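To illustrate the `allow(key, limit)` seam described in the rate-limit bullet, the following is a minimal sketch of a fixed-window limiter over an injected counter client. Only the `allow(key, limit)` contract (returning `boolean` or `Promise<boolean>`) and the `createRuntime(config, { rateLimiter })` call come from the text above. `createFixedWindowLimiter`, `incrementWithTtl`, and the in-memory stand-in client are hypothetical names chosen for this example, not the `haechi-ratelimit-redis` API.

```javascript
// Hypothetical sketch: a fixed-window limiter that satisfies allow(key, limit).
// The client interface (incrementWithTtl) is an assumption made for illustration.
function createFixedWindowLimiter({ client, windowMs = 60000, now = Date.now }) {
  return {
    // Resolves to true while the caller is within `limit` for the current window.
    async allow(key, limit) {
      const windowId = Math.floor(now() / windowMs);
      const count = await client.incrementWithTtl(`rl:${key}:${windowId}`, windowMs);
      return count <= limit;
    },
  };
}

// In-memory stand-in for a shared store. A real shared store would perform the
// increment atomically across replicas and expire the counter after ttlMs.
function createInMemoryClient() {
  const counters = new Map();
  return {
    async incrementWithTtl(counterKey, _ttlMs) {
      const next = (counters.get(counterKey) ?? 0) + 1;
      counters.set(counterKey, next);
      return next;
    },
  };
}
```

Under these assumptions the limiter would be handed to the runtime as `createRuntime(config, { rateLimiter: limiter })`. Because every replica increments the same counter key, the limit applies per identity rather than per process.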
48
+
49
+ ## 5. Gateway auth vs upstream auth (header forwarding)
50
+
51
+ Haechi keeps **gateway-client authentication** and **upstream-provider authentication** separate. The proxy does NOT forward arbitrary client headers to the model upstream; it applies a default-drop allowlist (P0-CR-001):
52
+
53
+ - When `auth.provider` is `bearer`/`external`/`plugin`, the client's `Authorization` is the **gateway credential** Haechi consumed and is **never forwarded** upstream. Supply the upstream provider key out-of-band — set the provider key header (`x-api-key`, `x-goog-api-key`, etc., all on the allowlist) on the client request, or front the upstream with your own credential injection.
54
+ - When `auth.provider` is `none`, the client's `Authorization` is treated as the **upstream provider key** and is forwarded (the OpenAI-compatible pass-through pattern).
55
+ - `Cookie`, `Set-Cookie`, `Proxy-Authorization`, and hop-by-hop headers are always dropped; any non-allowlisted header is dropped by default. Use `target.forwardHeaders` (lowercase names) to widen the allowlist for an unusual upstream — it cannot re-enable an always-dropped credential/hop-by-hop header.
56
+ - **Operator responsibility:** confirm your upstream actually receives the credential header it needs (the gateway no longer relays the client `Authorization` under gateway auth), and treat `target.forwardHeaders` as a reviewed allowlist, not a catch-all.
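The forwarding rules in section 5 can be sketched as a small pure function. This is an illustrative reading of the rules, not Haechi's implementation: `buildUpstreamHeaders`, `resolveAllowlist`, and the constant names are hypothetical. The base allowlist is the header set listed in the threat model, and the hop-by-hop names are the usual HTTP ones. Treating `Authorization` as governed only by `auth.provider` is an assumption of this sketch.

```javascript
// Hypothetical sketch of a default-drop upstream header allowlist.
const BASE_ALLOWLIST = new Set([
  "x-api-key", "anthropic-version", "anthropic-beta", "x-goog-api-key",
  "openai-organization", "openai-beta", "accept", "accept-language",
  "user-agent", "content-type",
]);

// Credential and hop-by-hop headers that can never be forwarded.
const ALWAYS_DROPPED = new Set([
  "cookie", "set-cookie", "proxy-authorization", "proxy-authenticate",
  "connection", "keep-alive", "te", "trailer", "transfer-encoding", "upgrade",
]);

// Widen the allowlist with operator-supplied names; fail closed at config time
// if an entry tries to re-enable an always-dropped header.
function resolveAllowlist(forwardHeaders = []) {
  const allow = new Set(BASE_ALLOWLIST);
  for (const name of forwardHeaders) {
    const lower = String(name).toLowerCase();
    if (ALWAYS_DROPPED.has(lower)) {
      throw new Error(`forwardHeaders cannot re-enable "${lower}"`);
    }
    allow.add(lower);
  }
  return allow;
}

function buildUpstreamHeaders(clientHeaders, { authProvider = "none", forwardHeaders = [] } = {}) {
  const allow = resolveAllowlist(forwardHeaders);
  const out = {};
  for (const [rawName, value] of Object.entries(clientHeaders)) {
    const name = rawName.toLowerCase();
    if (ALWAYS_DROPPED.has(name)) continue;
    if (name === "authorization") {
      // Gateway credential under gateway auth; upstream key only with provider "none".
      if (authProvider === "none") out[name] = value;
      continue;
    }
    if (allow.has(name)) out[name] = value;
  }
  return out;
}
```

With `authProvider: "bearer"`, a request carrying `Authorization`, `Cookie`, `X-Api-Key`, and an unlisted `X-Custom` header would reach the upstream with only `x-api-key`.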
@@ -1,6 +1,6 @@
1
1
  # Haechi Threat Model
2
2
 
3
- - 문서 상태: Living document(core 1.2.x 추적)
3
+ - 문서 상태: Living document(core 1.3.x 추적)
4
4
  - 작성일: 2026-06-10
5
5
 
6
6
  ## 1. 보호 대상
@@ -33,6 +33,9 @@ Haechi가 보호하려는 주요 자산은 다음과 같습니다.
33
33
  | 위협 | 영향 | 현재 통제 |
34
34
  |---|---|---|
35
35
  | 인터넷 노출 proxy | 인증 없는 LLM gateway | non-loopback bind 기본 실패 |
36
+ | gateway credential의 upstream 전달 | Haechi가 소비한 gateway 토큰인 클라이언트 `Authorization`, `Cookie`, `Proxy-Authorization`가 모델 제공자로 전달되어 gateway 비밀이 신뢰 경계를 넘어 유출됩니다 (P0-CR-001) | **기본 차단 upstream 헤더 허용목록.** proxy는 명시적인 제공자/어댑터 헤더 집합(`x-api-key`, `anthropic-version`, `anthropic-beta`, `x-goog-api-key`, `openai-organization`, `openai-beta`, `accept`, `accept-language`, `user-agent`, `content-type`)만 전달합니다. `Cookie`/`Set-Cookie`/`Proxy-Authorization`와 hop-by-hop 헤더는 항상 폐기됩니다. `Authorization`은 `auth.provider !== none`이면(gateway credential이므로) 폐기되고, `auth.provider: none`일 때만(upstream 제공자 키이므로) 전달됩니다. `target.forwardHeaders`는 허용목록을 추가로 넓히지만 항상 폐기되는 헤더를 다시 켤 수는 없습니다(설정 시점 fail-closed) |
37
+ | 압축 해제된 바디에 잔존하는 압축 헤더 | Node `fetch()`가 gzip/br/deflate를 자동 해제하지만, upstream의 `content-encoding`/`content-length`를 유지한 채 전달하면 downstream 클라이언트가 평문 바이트에 `content-encoding: gzip`을 보고 "incorrect header check"로 실패합니다 (P1-CR-003) | **모든 응답 경로에서 중앙화된 `sanitizeResponseHeaders`**(pass-through, 전달/미보호, 보호, streaming): `content-encoding`, `content-length`, `transfer-encoding`, hop-by-hop 헤더를 제거하고, 완전 버퍼링된 바디에 한해 올바른 `content-length`만 다시 설정합니다 |
38
+ | 무제한 streaming pass-through | `streaming.requestMode: "pass-through"`가 크기 제한 없이 전체 upstream 바디를 버퍼링해, 장수명·악의적 스트림이 메모리/연결 자원을 무한정 점유할 수 있었습니다 (P1-CR-004) | **진정한 경계 streaming pass-through**: upstream 바디를 도착하는 대로 실행 바이트 카운트(`responseProtection.maxBytes`)와 함께 클라이언트로 파이핑하며, 한도를 초과하면 upstream 읽기를 취소하고 클라이언트 쓰기를 종료합니다(크기 기준 fail-closed). 동일한 한도가 미보호/전달 버퍼링 읽기에도 적용됩니다(한도 초과 시 502) |
36
39
  | streaming 우회 | SSE/NDJSON 평문 유출 | `inspect` 모드는 SSE/NDJSON을 stream-filter합니다. `block`(기본값)은 거부하고, `pass-through`는 명시적으로 감사된 opt-out입니다 |
37
40
  | Ollama 암묵 streaming 우회 | `stream` 생략 시 NDJSON 평문 유출 | `/api/chat`·`/api/generate`는 `stream: false`를 명시하지 않으면 streaming으로 간주해 기본 차단합니다 |
38
41
  | 비JSON/압축/대용량 응답 | responseProtection 우회 | fail-closed response policy |
@@ -46,7 +49,8 @@ Haechi가 보호하려는 주요 자산은 다음과 같습니다.
46
49
  | 행 걸린 upstream | proxy 연결 고갈 | `limits.upstreamTimeoutMs` 기본 120s, 초과 시 504 fail |
47
50
  | signing/encryption 키 혼용 | key separation 위반 | policy bundle 서명 키를 domain-separated 파생 키로 분리 |
48
51
  | JSON number/object key 은닉 | 카드번호 등 비문자열 leaf 미탐지 | number leaf와 object key도 detection/transform 대상 |
49
- | 모든 정규식 규칙을 우회하는 유니코드 난독화 | card/RRN/phone/email/secret을 시각·의미상 동등한 비ASCII 유니코드 형태(전각 숫자 `4242…`, 전각 `@`, 수학·원문자 영숫자)로 보내 모든 탐지 규칙을 무력화 | **매칭 전 각 string leaf의 NFKC 정규화**(WS2d)입니다. 정규화가 무변환인 경우(leaf의 약 99%) 탐지는 이전과 바이트 단위로 동일합니다. 접힘이 **위치 안정적**인 경우(모든 코드포인트가 같은 UTF-16 길이로 접히고 코드포인트별 접힘이 전체 정규화를 그대로 재구성) 정규화 사본에서 탐지하고 원본의 정확한 구간을 redact/block하며, 기록되는 값은 접힌 형태가 아니라 원본 바이트입니다. 그 외 — 길이가 달라지거나(수학 숫자·합자) 총 길이는 같지만 내부 offset을 이동시키는 수축+확장 보상 — 의 경우 offset을 원본에 매핑할 수 없으므로 탐지가 **fail closed**되어 leaf 전체를 덮는 단일 탐지로 처리합니다(leaf 전체 redact/block — 우회 시도를 과도 redact하는 것이 안전한 실패입니다). `String.prototype.normalize` 빌트인을 사용하므로 새 의존성은 없습니다. **수용된 잔여:** base64/percent-encoded 페이로드는 여전히 디코딩 후 재검사하지 않습니다(§4 참조) |
52
+ | 모든 정규식 규칙을 우회하는 유니코드 난독화 | card/RRN/phone/email/secret을 시각·의미상 동등한 비ASCII 유니코드 형태(전각 숫자 `4242…`, 전각 `@`, 수학·원문자 영숫자)로 보내 모든 탐지 규칙을 무력화 | **매칭 전 각 string leaf의 NFKC 정규화**(WS2d)입니다. 정규화가 무변환인 경우(leaf의 약 99%) 탐지는 이전과 바이트 단위로 동일합니다. 접힘이 **위치 안정적**인 경우(모든 코드포인트가 같은 UTF-16 길이로 접히고 코드포인트별 접힘이 전체 정규화를 그대로 재구성) 정규화 사본에서 탐지하고 원본의 정확한 구간을 redact/block하며, 기록되는 값은 접힌 형태가 아니라 원본 바이트입니다. 그 외 — 길이가 달라지거나(수학 숫자·합자) 총 길이는 같지만 내부 offset을 이동시키는 수축+확장 보상 — 의 경우 offset을 원본에 매핑할 수 없으므로 탐지가 **fail closed**되어 leaf 전체를 덮는 단일 탐지로 처리합니다(leaf 전체 redact/block — 우회 시도를 과도 redact하는 것이 안전한 실패입니다). `String.prototype.normalize` 빌트인을 사용하므로 새 의존성은 없습니다. **잔여는 이제 opt-in 통제입니다:** base64/percent-encoded 페이로드는 `filters.decodeAndRescan`이 활성화된 경우에만 디코딩 후 재검사합니다(다음 행 및 §4 참조) |
53
+ | 모든 정규식 규칙을 우회하는 base64/percent-encoded 페이로드 | 전송 전 base64·percent로 인코딩된 card/RRN/secret은 모든 규칙을 통과합니다(Haechi는 NFKC 텍스트에서 매칭하지만 디코딩하지 않습니다) | **opt-in `filters.decodeAndRescan`**입니다(기본 OFF → 이전과 바이트 단위로 동일). ON일 때, 일반 NFKC 스캔 이후 base64/base64url로 **보이는** string leaf(고정 알파벳, 유효한 길이, `16…8192` 바이트 범위, 같은 leaf로 round-trip, `node:buffer` `isUtf8`로 **유효한 UTF-8** 디코딩)이거나 `%XX` 이스케이프를 포함하는 leaf(try/catch 안의 `decodeURIComponent`)를 디코딩하여 같은 규칙·validator로 재검사합니다. **offset 처리는 fail closed입니다:** 디코딩된 매칭은 인코딩된 leaf에 유효한 offset이 없으므로, 원본 인코딩 leaf 전체를 덮는 **WHOLE-LEAF** 탐지(`start:0, end:leaf.length`)를 발생시킵니다 — transform이 leaf 전체를 redact/block하며, 디코딩된 offset을 원본으로 되돌려 매핑하지 않습니다. **정밀도 가드:** 디코딩된 매칭은 **validator 기반이거나 하드 블록 타입**일 때만 발생합니다(Luhn 통과 `card`, 체크섬 `kr_rrn`/`us_ssn`, IBAN mod-97, 또는 앵커된 규칙의 `secret`/`api_key`). validator 없는 디코딩된 소프트 타입 매칭(맨 전화번호 형태 등)은 발생하지 **않으므로** 무작위 base64는 오탐하지 않습니다. 새 의존성은 없습니다(`node:buffer` Buffer + `decodeURIComponent` 빌트인). **수용된 잔여:** Haechi가 디코딩하지 않는 인코딩(gzip, hex, 중첩/이중 인코딩, 커스텀 알파벳), 그리고 양성 텍스트 안에 Luhn-유효 16자리 런으로 디코딩되도록 의도적으로 조작된 평문(이에 발생하는 것은 오탐이 아니라 올바른 동작) |
50
54
  | 인증 없는 멀티 클라이언트 접근 | 로컬 프로세스가 upstream / token round-trip 경로를 무단 사용 | 선택적 bearer auth (`auth.provider: bearer`); 없거나 잘못된 경우 → 바디 읽기 전 401; identity별 rate limit 및 model allowlist |
51
55
  | Audit tail truncation | 꼬리 audit 레코드의 무음 삭제 | 추가 전용/별도 미디어의 `audit.anchor` head-hash anchoring으로 마지막 anchor까지의 절단 탐지 (0.7) |
52
56
  | Local dev key in production | 소프트웨어 키의 운영 custody 오용 | `assertCryptoProviderConformance`를 통한 외부 `cryptoProvider` 주입; reference KMS adapter (envelope 암호화) |
@@ -87,7 +91,7 @@ Haechi는 다음을 보장하지 않습니다.
87
91
  - 법적 컴플라이언스 인증
88
92
  - 모델 hallucination, prompt injection 완전 방어
89
93
  - 외부 MCP server의 OAuth/resource binding 검증
90
- - base64/percent-encoded 값의 **디코딩 후** 검사 — Haechi는 NFKC 정규화 텍스트에서 매칭하지만(§3의 유니코드 난독화 행 참조) base64/URL 디코딩 후 재검사는 하지 **않습니다**. 전송 전 base64·percent로 인코딩된 값은 검사되지 않습니다. (WS2d는 디코딩-후-재검사 패스를 보류했습니다. 상시 디코딩은 오탐이 많고, recall-safe한 opt-in을 범위 내에서 precision-neutral하게 만들 없어 문서화된 제외로 남깁니다.)
94
+ - base64/percent-encoded 값의 **기본** 디코딩 검사 — Haechi는 NFKC 정규화 텍스트에서 매칭하며(§3의 유니코드 난독화 행 참조) opt-in `filters.decodeAndRescan`(기본 OFF)을 활성화하지 않는 한 base64/URL 디코딩 후 재검사는 하지 **않습니다**. OFF이면 전송 전 base64·percent로 인코딩된 값은 검사되지 않습니다. ON이면 §3에 설명된 정밀도 가드(validator 기반 / 하드 블록 매칭만, WHOLE-LEAF fail-closed)와 함께 디코딩-후-재검사 패스가 동작합니다. WS2d는 *상시* 디코딩을 보류했고(오탐이 많고 범위 내에서 precision-neutral하지 않음), opt-in 통제는 트레이드오프를 수용하는 운영자를 위해 그 잔여를 닫습니다. 다른 인코딩(gzip/hex/중첩/커스텀 알파벳)은 여전히 범위 밖입니다.
91
95
  - URL query string 내 민감값 검사 (JSON body만 검사)
92
96
  - 마지막 anchor 이후의 audit tail truncation — `audit.anchor`(0.7)는 anchor가 추가 전용/별도 미디어에 있을 때 마지막 anchor까지의 레코드 삭제를 탐지합니다. 마지막 anchor 이후 기록된 레코드와 동일 파일시스템 anchor는 대상에서 제외됩니다
93
97
  - JSON-RPC batch 메시지 처리 (MCP stdio filter는 batch를 fail-closed로 거부)
@@ -1,6 +1,6 @@
1
1
  # Haechi Threat Model
2
2
 
3
- - Status: Living document (tracks core 1.2.x)
3
+ - Status: Living document (tracks core 1.3.x)
4
4
  - Date: 2026-06-10
5
5
 
6
6
  ## 1. Assets Under Protection
@@ -33,6 +33,9 @@ The primary assets Haechi protects are:
33
33
  | Threat | Impact | Current Control |
34
34
  |---|---|---|
35
35
  | Internet-exposed proxy | Unauthenticated LLM gateway | Non-loopback bind fails by default |
36
+ | Gateway credential forwarded upstream | The client `Authorization` (the gateway token Haechi consumed), `Cookie`, or `Proxy-Authorization` is forwarded to the model provider, leaking a gateway secret across the trust boundary (P0-CR-001) | **Default-drop upstream header allowlist.** The proxy forwards only an explicit provider/adapter header set (`x-api-key`, `anthropic-version`, `anthropic-beta`, `x-goog-api-key`, `openai-organization`, `openai-beta`, `accept`, `accept-language`, `user-agent`, `content-type`). `Cookie`/`Set-Cookie`/`Proxy-Authorization` and hop-by-hop headers are always dropped. `Authorization` is dropped when `auth.provider !== none` (it is the gateway credential), and forwarded only when `auth.provider: none` (it is the upstream provider key). `target.forwardHeaders` widens the allowlist additively but cannot re-enable an always-dropped header (fail-closed at config time) |
37
+ | Decompressed body with stale compression headers | Node `fetch()` auto-decompresses gzip/br/deflate, but a forwarded response that keeps the upstream `content-encoding`/`content-length` makes a downstream client see e.g. `content-encoding: gzip` on plain bytes and fail with "incorrect header check" (P1-CR-003) | **Centralized `sanitizeResponseHeaders` on every response path** (pass-through, forwarded/unprotected, protected, streaming): strips `content-encoding`, `content-length`, `transfer-encoding`, and hop-by-hop headers; a correct `content-length` is re-set only for a fully-buffered body |
38
+ | Unbounded streaming pass-through | `streaming.requestMode: "pass-through"` buffered the full upstream body with no size cap, so a long-lived or malicious stream could hold memory/connection resources indefinitely (P1-CR-004) | **True bounded streaming pass-through**: the upstream body is piped to the client as it arrives with a running byte cap (`responseProtection.maxBytes`); exceeding the cap cancels the upstream read and tears down the client write (fail-closed on size). The same cap applies to the unprotected/forwarded buffered-body read (502 over the cap) |
36
39
  | Streaming bypass | SSE/NDJSON plaintext leak | `inspect` mode stream-filters SSE/NDJSON; `block` (default) refuses; `pass-through` is an explicit audited opt-out |
37
40
  | Ollama implicit streaming bypass | NDJSON plaintext leak when `stream` is omitted | `/api/chat` and `/api/generate` are treated as streaming unless `stream: false` is explicit; blocked by default |
38
41
  | Non-JSON / compressed / oversized response | responseProtection bypass | Fail-closed response policy |
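The stale-compression-header row (P1-CR-003) describes a header rewrite that can be sketched in a few lines. The function below is a hypothetical illustration and not the package's `sanitizeResponseHeaders`: its name, its plain-object input, and the optional `bufferedBody` argument are assumptions made for this example.

```javascript
// Hypothetical sketch: strip headers that no longer describe the body after
// fetch() has decompressed it, plus the usual hop-by-hop headers.
const STRIPPED_RESPONSE_HEADERS = new Set([
  "content-encoding", "content-length", "transfer-encoding",
  "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
  "te", "trailer", "upgrade",
]);

function sanitizeResponseHeadersSketch(upstreamHeaders, bufferedBody) {
  const out = {};
  for (const [name, value] of Object.entries(upstreamHeaders)) {
    const lower = name.toLowerCase();
    if (!STRIPPED_RESPONSE_HEADERS.has(lower)) out[lower] = value;
  }
  // Only a fully buffered body has a known length; a streamed body gets none.
  if (bufferedBody !== undefined) {
    out["content-length"] = String(Buffer.byteLength(bufferedBody));
  }
  return out;
}
```

Calling it without `bufferedBody` models the streaming paths, where no `content-length` is emitted. Passing the buffered body models the protected and forwarded paths, where the length is recomputed from the bytes actually sent.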
@@ -46,7 +49,8 @@ The primary assets Haechi protects are:
46
49
  | Hung upstream | Proxy connection exhaustion | `limits.upstreamTimeoutMs` default 120 s; 504 fail on timeout |
47
50
  | Signing/encryption key conflation | Key separation violation | Policy bundle signing key isolated as a domain-separated derived key |
48
51
  | JSON number / object key concealment | Undetected non-string leaves such as card numbers | Number leaves and object keys included in detection/transform scope |
49
- | Unicode-obfuscation evasion of every regex rule | A card/RRN/phone/email/secret sent in a visually/semantically equivalent non-ASCII Unicode form (full-width digits `4242…`, full-width `@`, mathematical/enclosed alphanumerics) defeats every detection rule | **NFKC normalization of each string leaf before matching** (WS2d). When the normalization is a no-op (~99% of leaves) detection is byte-identical to before. When the fold is **position-stable** (every codepoint folds to the same UTF-16 length and the per-codepoint folds reconstruct the whole normalization), detection runs on the normalized copy and the exact original span is redacted/blocked (the recorded value is the original bytes, never the fold). Otherwise — a length change (mathematical digits/ligatures) **or** a compensating contraction+expansion that keeps the total length equal while shifting interior offsets — offsets cannot map back, so detection **fails closed** to a single whole-leaf detection (the entire leaf is redacted/blocked — over-redacting an evasion attempt is the safe failure). Uses the `String.prototype.normalize` builtin (no new dependency). **Accepted residual:** base64/percent-encoded payloads are still not decoded-and-rescanned (see §4) |
52
+ | Unicode-obfuscation evasion of every regex rule | A card/RRN/phone/email/secret sent in a visually/semantically equivalent non-ASCII Unicode form (full-width digits `4242…`, full-width `@`, mathematical/enclosed alphanumerics) defeats every detection rule | **NFKC normalization of each string leaf before matching** (WS2d). When the normalization is a no-op (~99% of leaves) detection is byte-identical to before. When the fold is **position-stable** (every codepoint folds to the same UTF-16 length and the per-codepoint folds reconstruct the whole normalization), detection runs on the normalized copy and the exact original span is redacted/blocked (the recorded value is the original bytes, never the fold). Otherwise — a length change (mathematical digits/ligatures) **or** a compensating contraction+expansion that keeps the total length equal while shifting interior offsets — offsets cannot map back, so detection **fails closed** to a single whole-leaf detection (the entire leaf is redacted/blocked — over-redacting an evasion attempt is the safe failure). Uses the `String.prototype.normalize` builtin (no new dependency). **Residual now an opt-in control:** base64/percent-encoded payloads are decoded-and-rescanned only when `filters.decodeAndRescan` is enabled (see the next row and §4) |
53
+ | Base64/percent-encoded payload evades every regex rule | A card/RRN/secret base64- or percent-encoded before sending passes every rule (Haechi matches the NFKC text but does not decode) | **Opt-in `filters.decodeAndRescan`** (default OFF → byte-identical to before). When ON, after the normal NFKC scan a string leaf that LOOKS base64/base64url (anchored alphabet, valid length, within `16…8192` bytes, round-trips to the same leaf, decodes to **valid UTF-8** via `node:buffer` `isUtf8`) or contains a `%XX` escape (`decodeURIComponent` in try/catch) is decoded and rescanned with the same rules + validators. **Offset handling fails closed:** a decoded hit has no offset in the encoded leaf, so it emits a **WHOLE-LEAF** detection of the original encoded leaf (`start:0, end:leaf.length`) — the transform redacts/blocks the entire leaf; a decoded offset is never mapped back. **Precision guard:** a decoded hit only fires when it is **validator-backed or a hard-block type** (a Luhn-passing `card`, a checksum `kr_rrn`/`us_ssn`, an IBAN mod-97, or a `secret`/`api_key` on its anchored rule). A decoded soft-type-without-validator match (a bare phone-shaped run) does **not** fire, so random base64 does not false-positive. Zero new dependency (`node:buffer` Buffer + the `decodeURIComponent` builtin). **Accepted residual:** an encoding Haechi does not decode (gzip, hex, nested/double-encoding, a custom alphabet), and a deliberately contrived plaintext that decodes to a Luhn-valid 16-digit run inside benign text (firing on it is correct, not a false positive) |
50
54
  | Unauthenticated multi-client access | Any local process uses the upstream / token round-trip | Optional bearer auth (`auth.provider: bearer`); missing/invalid → 401 before body read; per-identity rate limit and model allowlist |
51
55
  | Audit tail truncation | Silent deletion of trailing audit records | `audit.anchor` head-hash anchoring on append-only/separate media detects truncation back to the last anchor (0.7) |
52
56
  | Local dev key in production | Software key misused as production custody | External `cryptoProvider` injection with `assertCryptoProviderConformance`; reference KMS adapter (envelope encryption) |
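The position-stable test described in the Unicode-obfuscation row can be sketched as follows. This is a reading of the rule as stated in the table, not the package's code: `foldLeafSketch`, `findDigitRuns`, and the toy `\d{4,}` rule are hypothetical and stand in for the real detection rules.

```javascript
// Hypothetical sketch: decide whether NFKC folding keeps offsets stable.
function foldLeafSketch(leaf) {
  const normalized = leaf.normalize("NFKC");
  if (normalized === leaf) return { mode: "unchanged", text: leaf };
  let rebuilt = "";
  for (const codepoint of leaf) { // string iteration yields whole code points
    const folded = codepoint.normalize("NFKC");
    // A fold that changes UTF-16 length would shift every later offset.
    if (folded.length !== codepoint.length) return { mode: "whole-leaf", text: normalized };
    rebuilt += folded;
  }
  // Per-code-point folds must also reconstruct the full normalization.
  return rebuilt === normalized
    ? { mode: "position-stable", text: normalized }
    : { mode: "whole-leaf", text: normalized };
}

// Toy detector: runs of 4+ ASCII digits, matched on the folded copy.
function findDigitRuns(leaf) {
  const { mode, text } = foldLeafSketch(leaf);
  const spans = [];
  for (const match of text.matchAll(/\d{4,}/g)) {
    // Offsets cannot be mapped back: fail closed to one whole-leaf detection.
    if (mode === "whole-leaf") return [{ start: 0, end: leaf.length, value: leaf }];
    const start = match.index;
    const end = start + match[0].length;
    // Record the original characters at that span, never the folded form.
    spans.push({ start, end, value: leaf.slice(start, end) });
  }
  return spans;
}
```

Full-width digits fold one UTF-16 unit to one unit, so the detected span maps directly onto the original string. Mathematical bold digits are surrogate pairs that fold to a single unit, so the sketch falls back to covering the whole leaf.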
@@ -87,7 +91,7 @@ Haechi does not guarantee:
87
91
  - Legal compliance certification
88
92
  - Complete defense against model hallucination or prompt injection
89
93
  - OAuth/resource binding validation for external MCP servers
90
- - Inspection of base64/percent-encoded values **after decoding** — Haechi matches on the NFKC-normalized text (see the Unicode-evasion row in §3) but does **not** base64/URL-decode-and-rescan. A value that is base64- or percent-encoded before sending is not inspected. (WS2d deferred a decode-and-rescan pass: an always-on decode is false-positive-prone, and a recall-safe opt-in could not be made precision-neutral within scope; it remains a documented exclusion.)
94
+ - Inspection of base64/percent-encoded values **after decoding by default**: Haechi matches on the NFKC-normalized text (see the Unicode-evasion row in §3) and does **not** base64/URL-decode-and-rescan unless the opt-in `filters.decodeAndRescan` is enabled (default OFF). With it OFF, a value that is base64- or percent-encoded before sending is not inspected. With it ON, the decode-and-rescan pass runs with the precision guard described in §3 (validator-backed / hard-block hits only, whole-leaf fail-closed). WS2d deferred an *always-on* decode (false-positive-prone, not precision-neutral within scope); the opt-in control closes that residual for operators who accept the trade-off. Other encodings (gzip/hex/nested/custom-alphabet) remain out of scope.
91
95
  - Detection of sensitive values in URL query strings (JSON body only)
92
96
  - Audit tail truncation beyond the last anchor — `audit.anchor` (0.7) detects deletion of records back to the last anchor when the anchor is on append-only/separate media; records written after the last anchor, and same-filesystem anchors, are not covered
93
97
  - JSON-RPC batch message processing (the MCP stdio filter rejects batches fail-closed)
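The opt-in decode-and-rescan pass and its precision guard can be illustrated with a reduced sketch. It is deliberately narrower than the behavior described in §3 and is not the package's code. It handles standard base64 only (not base64url), rescans for Luhn-valid cards only, and reads the `16…8192` bound as the length of the encoded leaf. It also swaps in a fatal `TextDecoder` for the `node:buffer` `isUtf8` check so the block needs no imports. All function names are hypothetical.

```javascript
// Hypothetical sketch of decode-and-rescan with a validator-backed guard.
function luhnValid(digits) {
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) { d *= 2; if (d > 9) d -= 9; } // double every second digit from the right
    sum += d;
  }
  return digits.length > 0 && sum % 10 === 0;
}

function tryDecodeBase64(leaf) {
  if (leaf.length < 16 || leaf.length > 8192) return null;
  if (leaf.length % 4 !== 0 || !/^[A-Za-z0-9+/]+={0,2}$/.test(leaf)) return null;
  const bytes = Buffer.from(leaf, "base64");
  if (bytes.toString("base64") !== leaf) return null; // must round-trip exactly
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes); // valid UTF-8 only
  } catch {
    return null;
  }
}

function tryDecodePercent(leaf) {
  if (!/%[0-9A-Fa-f]{2}/.test(leaf)) return null;
  try { return decodeURIComponent(leaf); } catch { return null; }
}

function containsLuhnValidCard(text) {
  const runs = text.match(/\b\d(?:[ -]?\d){15}\b/g) ?? [];
  return runs.some((run) => luhnValid(run.replace(/\D/g, "")));
}

// A decoded hit has no offset in the encoded leaf, so the detection covers
// the entire original leaf (start 0, end leaf.length).
function decodeAndRescanSketch(leaf) {
  const decodedForms = [tryDecodeBase64(leaf), tryDecodePercent(leaf)].filter((d) => d !== null);
  for (const decoded of decodedForms) {
    if (containsLuhnValidCard(decoded)) {
      return [{ type: "card", start: 0, end: leaf.length }];
    }
  }
  return [];
}
```

Because a hit requires a passing Luhn check, a base64 string that merely decodes to a 16-digit run with a bad checksum produces no detection, which is the precision guard in miniature.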
@@ -0,0 +1,51 @@
1
+ # Local end-to-end demo
2
+
3
+ A self-contained, **reproducible** walkthrough of Haechi — no remote model required.
4
+ It stands up a tiny OpenAI-compatible *stub* upstream and the **real** Haechi proxy
5
+ in front of it (in `enforce` mode), then narrates what happens to a payload carrying
6
+ an email, a phone number, an API key, and a card number.
7
+
8
+ ```bash
9
+ node examples/local-proxy-demo/demo.mjs
10
+ # or, from the repo root:
11
+ npm run demo
12
+ ```
13
+
14
+ What it shows, in order:
15
+
16
+ 1. **The model only sees protected values** — the proxy detects and transforms the
17
+ payload *before* forwarding, so the stub (standing in for the model) receives
18
+ `[TOKEN:…]` for the email, a masked phone, and `[REDACTED:api_key]` for the key.
19
+ 2. **The token round-trip** — because the email was *tokenized* (reversible), the
20
+ caller gets `minji.kim@example.com` back, while the masked phone and redacted
21
+ secret stay protected. The model's own leaked secret in its reply is
22
+ response-protected too.
23
+ 3. **The audit log** carries detection metadata and is hash-chained — and never any
24
+ plaintext email/phone/key.
25
+ 4. **Day-2 operability** — the live `/__haechi/ready` readiness probe and the
26
+ Prometheus `/__haechi/metrics` surface.
27
+ 5. **A card number is blocked outright** (`403`, fail-closed) — it never reaches the
28
+ model.
29
+
30
+ Zero dependencies (only `node:` builtins + the in-repo `haechi` packages). The demo
31
+ is programmatic for reproducibility; for the real CLI invocation see the
32
+ [Quickstart](../../README.md#quickstart) and
33
+ [`docs/current/configuration.md`](../../docs/current/configuration.md).
34
+
35
+ ## Live demo against a real model
36
+
37
+ `live-demo.mjs` runs the same flow against a **real** upstream (vLLM / Ollama / any
38
+ OpenAI-compatible server) instead of the stub. It asks the model to repeat the phone
39
+ number it was given — and the model can only return the *masked* form, because the
40
+ real number never reached it. This is the run recorded in the README GIF
41
+ (`demo.tape` records the stub demo; `live-demo.tape` records this one).
42
+
43
+ ```bash
44
+ HAECHI_LIVE_UPSTREAM=http://127.0.0.1:8000 \
45
+ HAECHI_LIVE_MODEL="Qwen/Qwen3.6-35B-A3B-FP8" \
46
+ node examples/local-proxy-demo/live-demo.mjs
47
+ ```
48
+
49
+ `HAECHI_LIVE_TYPE` (default `vllm-openai`) and `HAECHI_LIVE_MODEL` override the target.
50
+ For Qwen3-style reasoning servers the request sets `chat_template_kwargs.enable_thinking
51
+ = false` so the reply is a terse line; non-reasoning servers ignore it.
@@ -0,0 +1,144 @@
1
+ #!/usr/bin/env node
2
+ // Self-contained, reproducible Haechi demo — no remote model required.
3
+ //
4
+ // It stands up a tiny OpenAI-compatible *stub* upstream and the REAL Haechi proxy
5
+ // in front of it, then walks through what Haechi does to a payload that carries an
6
+ // email, a phone number, an API key, and a card:
7
+ // 1. the model only ever sees redacted/tokenized values (proven by echoing the
8
+ // exact body the stub received),
9
+ // 2. the caller gets the original email back (the token round-trip),
10
+ // 3. the audit log carries no plaintext,
11
+ // 4. the live /__haechi/metrics + /__haechi/ready operability surface,
12
+ // 5. a card is blocked outright (fail-closed).
13
+ //
14
+ // Run: node examples/local-proxy-demo/demo.mjs (or: npm run demo)
15
+ // Zero dependencies — only node: builtins and the in-repo haechi packages.
16
+
17
+ import { createServer } from "node:http";
18
+ import { mkdtemp, readFile } from "node:fs/promises";
19
+ import { tmpdir } from "node:os";
20
+ import { join } from "node:path";
21
+
22
+ import { createRuntime } from "../../packages/cli/runtime.mjs";
23
+ import { createHaechiProxy } from "../../packages/proxy/index.mjs";
24
+ import { initLocalKeyFile } from "../../packages/crypto/index.mjs";
25
+
26
+ const B = "\x1b[1m", D = "\x1b[2m", G = "\x1b[32m", Y = "\x1b[33m", C = "\x1b[36m", R = "\x1b[31m", X = "\x1b[0m";
27
+ const rule = () => console.log(D + "─".repeat(64) + X);
28
+ const scene = (n, t) => { console.log(); rule(); console.log(`${B}${C} ${n}. ${t}${X}`); rule(); };
29
+ const pause = (ms) => new Promise((r) => setTimeout(r, ms));
30
+
31
+ // A minimal OpenAI-compatible stub. It records the EXACT body it receives (which is
32
+ // whatever the proxy forwarded, i.e. the protected payload) and replies with a
33
+ // canned assistant message that itself leaks a secret, to exercise response protection.
34
+ function startStubUpstream() {
35
+ let lastReceived = null;
36
+ const server = createServer((req, res) => {
37
+ let body = "";
38
+ req.on("data", (c) => (body += c));
39
+ req.on("end", () => {
40
+ lastReceived = body;
41
+ // Echo the (already-protected) user content back so the response exercises the
42
+ // token round-trip, and append a leaked secret so response protection fires.
43
+ let echoed = "";
44
+ try { echoed = JSON.parse(body).messages.at(-1).content; } catch { /* ignore */ }
45
+ res.writeHead(200, { "content-type": "application/json" });
46
+ res.end(JSON.stringify({
47
+ id: "chatcmpl-demo",
48
+ object: "chat.completion",
49
+ choices: [{ index: 0, message: { role: "assistant",
50
+ content: `Noted — I will follow up. You wrote: "${echoed}" (our ref: token=DEMOleak9876543210notRealzyxwvu)` } }]
51
+ }));
52
+ });
53
+ });
54
+ return new Promise((resolve) => {
55
+ server.listen(0, "127.0.0.1", () => resolve({ server, url: `http://127.0.0.1:${server.address().port}`, received: () => lastReceived }));
56
+ });
57
+ }
58
+
59
+ async function main() {
60
+ console.log(`\n${B}🛡 Haechi — local end-to-end demo${X} ${D}(stub upstream, real proxy, enforce mode)${X}`);
61
+
62
+ const dir = await mkdtemp(join(tmpdir(), "haechi-demo-"));
63
+ const keyFile = join(dir, ".haechi", "dev.keys.json");
64
+ const auditPath = join(dir, ".haechi", "audit.jsonl");
65
+ await initLocalKeyFile(keyFile, { force: true });
66
+ const stub = await startStubUpstream();
67
+
68
+ const runtime = createRuntime({
69
+ mode: "enforce",
70
+ target: { type: "openai-compatible", upstream: stub.url },
71
+ policy: {
72
+ mode: "enforce",
73
+ presets: ["llm-redact"],
74
+ actions: { email: "tokenize", phone: "mask", secret: "redact", api_key: "redact", card: "block" }
75
+ },
76
+ tokenVault: { detokenizeResponses: true },
77
+ responseProtection: { enabled: true, mode: "enforce", failureMode: "fail-closed" },
78
+ keys: { keyFile },
79
+ audit: { path: auditPath }
80
+ });
81
+ const proxy = createHaechiProxy({ runtime, port: 0 });
82
+ const addr = await proxy.listen();
83
+ const base = `http://127.0.0.1:${addr.port}`;
84
+
85
+ // ── Scene 1 ───────────────────────────────────────────────────────────────
86
+ scene(1, "A prompt with an email, a phone number, and a deploy secret");
87
+ const userText = "Contact minji.kim@example.com or 010-1234-5678. Deploy api_key=DEMOkey0123456789notARealSecretabcdef.";
88
+ console.log(`${Y}you send →${X} ${userText}`);
89
+ await pause(700);
90
+ const r1 = await fetch(`${base}/v1/chat/completions`, {
91
+ method: "POST", headers: { "content-type": "application/json" },
92
+ body: JSON.stringify({ model: "demo", messages: [{ role: "user", content: userText }] })
93
+ });
94
+ const out1 = await r1.json();
95
+
96
+ scene(2, "What the MODEL actually received (the proxy protected it first)");
97
+ const forwarded = JSON.parse(stub.received());
98
+ console.log(`${G}model sees →${X} ${forwarded.messages[0].content}`);
99
+ console.log(`${D} (email → [TOKEN:…], phone → masked, secret → [REDACTED])${X}`);
100
+ await pause(700);
101
+
102
+ scene(3, "What YOU get back — the email token is restored (round-trip)");
103
+ console.log(`${G}you receive →${X} ${out1.choices[0].message.content}`);
104
+ console.log(`${D} (email restored from its token; phone stays masked; keys stay redacted both ways)${X}`);
105
+ await pause(700);
106
+
107
+ // ── Scene 4 ───────────────────────────────────────────────────────────────
108
+ scene(4, "The audit log — tamper-evident, and never any plaintext");
109
+ const audit = (await readFile(auditPath, "utf8")).trim().split("\n");
110
+ const ev = JSON.parse(audit[0]);
111
+ console.log(`${D}detections:${X} ${ev.detections.map((d) => `${d.type}→${d.action}`).join(" ")}`);
112
+ console.log(`${D}leaks the email/secret/phone?${X} ${audit.join("").match(/minji\.kim@|DEMOkey0123|010-1234-5678/) ? R + "YES" + X : G + "no — clean" + X}`);
113
+ await pause(700);
114
+
115
+ // ── Scene 5 ───────────────────────────────────────────────────────────────
116
+ scene(5, "Day-2 operability — live health + Prometheus metrics");
117
+ const ready = await (await fetch(`${base}/__haechi/ready`)).json();
118
+ console.log(`${D}/__haechi/ready →${X} ${ready.ready ? G + "ready" : R + "not ready"}${X} ${D}(audit writable: ${ready.checks?.auditWritable})${X}`);
119
+ const metrics = await (await fetch(`${base}/__haechi/metrics`)).text();
120
+ for (const line of metrics.split("\n").filter((l) => /^haechi_requests_total\{|^haechi_blocks_total /.test(l)).slice(0, 4)) {
121
+ console.log(`${D}metric:${X} ${line}`);
122
+ }
123
+ await pause(700);
124
+
125
+ // ── Scene 6 ───────────────────────────────────────────────────────────────
126
+ scene(6, "A card number is blocked outright (fail-closed)");
127
+ const r2 = await fetch(`${base}/v1/chat/completions`, {
128
+ method: "POST", headers: { "content-type": "application/json" },
129
+ body: JSON.stringify({ model: "demo", messages: [{ role: "user", content: "charge card 4242 4242 4242 4242 now" }] })
130
+ });
131
+ console.log(`${Y}you send →${X} "charge card 4242 4242 4242 4242 now"`);
132
+ console.log(`${G}proxy →${X} HTTP ${r2.status} ${r2.status === 403 ? R + B + "BLOCKED" + X : ""} ${D}(the card never reaches the model)${X}`);
133
+
134
+ console.log();
135
+ rule();
136
+ console.log(`${B}${G} ✓ done${X} ${D}— detection → redact/tokenize/block → forward → audit, all local.${X}`);
137
+ rule();
138
+ console.log(`${D} config reference: haechi.config.example.json · docs/current/configuration.md${X}\n`);
139
+
140
+ await proxy.close();
141
+ stub.server.close();
142
+ }
143
+
144
+ main().then(() => process.exit(0)).catch((e) => { console.error("demo failed:", e); process.exit(1); });
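Scene 4 of demo.mjs checks the audit log for plaintext with a single inline regex. A minimal sketch of the same check as a reusable helper follows. `findPlaintextLeaks` and its arguments are hypothetical names, not part of the haechi packages. The sample records only borrow the `detections`, `type`, and `action` fields that the demo reads; the `note` field is made up, and none of this reflects Haechi's full audit schema.

```javascript
// Hypothetical helper (not a haechi API): report which sensitive values
// appear verbatim anywhere in an audit log's raw JSONL text.
function findPlaintextLeaks(auditText, needles) {
  const leaks = [];
  const lines = auditText.split("\n");
  lines.forEach((line, index) => {
    for (const needle of needles) {
      // Plain substring match: no regex escaping to get wrong.
      if (line.includes(needle)) {
        leaks.push({ line: index + 1, needle });
      }
    }
  });
  return leaks;
}

// Illustrative records only; the "note" field is invented for this sketch.
const cleanLog = [
  '{"detections":[{"type":"email","action":"tokenize"}]}',
  '{"detections":[{"type":"phone","action":"mask"}]}'
].join("\n");
const leakyLog = cleanLog + '\n{"note":"call 010-1234-5678"}';

const needles = ["minji.kim@example.com", "010-1234-5678"];
console.log(findPlaintextLeaks(cleanLog, needles).length); // 0
console.log(findPlaintextLeaks(leakyLog, needles));
```

Returning the line number along with the matched value makes a failing check easier to trace than a bare YES/no. Note that substring matching on raw text would miss a value that was JSON-escaped before being written.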
@@ -0,0 +1,19 @@
1
+ # VHS tape for the Haechi local end-to-end demo.
2
+ # Regenerate the README GIF with: vhs examples/local-proxy-demo/demo.tape
3
+ # (run from the repo root; requires vhs + ttyd + ffmpeg)
4
+
5
+ Output docs/assets/haechi-demo.gif
6
+
7
+ Set Shell "bash"
8
+ Set FontSize 15
9
+ Set Width 1180
10
+ Set Height 840
11
+ Set Padding 18
12
+ Set Theme "Catppuccin Mocha"
13
+ Set TypingSpeed 55ms
14
+
15
+ Sleep 500ms
16
+ Type "node examples/local-proxy-demo/demo.mjs"
17
+ Sleep 600ms
18
+ Enter
19
+ Sleep 9s
@@ -0,0 +1,121 @@
1
+ #!/usr/bin/env node
2
+ // Live end-to-end demo against a REAL upstream model (vLLM / Ollama / any
3
+ // OpenAI-compatible server). Unlike demo.mjs (which uses a deterministic stub),
4
+ // this proves protection against an actual model: it asks the model to repeat the
5
+ // phone number it was given, and the model can only return the *masked* form —
6
+ // the real number never reached it.
7
+ //
8
+ // HAECHI_LIVE_UPSTREAM=http://127.0.0.1:8000 \
9
+ // HAECHI_LIVE_MODEL="Qwen/Qwen3.6-35B-A3B-FP8" \
10
+ // node examples/local-proxy-demo/live-demo.mjs
11
+ //
12
+ // Defaults: type=vllm-openai, model=Qwen/Qwen3.6-35B-A3B-FP8. Override with HAECHI_LIVE_TYPE and HAECHI_LIVE_MODEL.
13
+ // Zero dependencies — only node: builtins + the in-repo haechi packages.
14
+
15
+ import { mkdtemp, readFile } from "node:fs/promises";
16
+ import { tmpdir } from "node:os";
17
+ import { join } from "node:path";
18
+
19
+ import { createRuntime } from "../../packages/cli/runtime.mjs";
20
+ import { createHaechiProxy } from "../../packages/proxy/index.mjs";
21
+ import { initLocalKeyFile } from "../../packages/crypto/index.mjs";
22
+
23
+ const B = "\x1b[1m", D = "\x1b[2m", G = "\x1b[32m", Y = "\x1b[33m", C = "\x1b[36m", R = "\x1b[31m", X = "\x1b[0m";
24
+ const rule = () => console.log(D + "─".repeat(64) + X);
25
+ const scene = (n, t) => { console.log(); rule(); console.log(`${B}${C} ${n}. ${t}${X}`); rule(); };
26
+ const pause = (ms) => new Promise((r) => setTimeout(r, ms));
27
+
28
+ const UPSTREAM = process.env.HAECHI_LIVE_UPSTREAM;
29
+ const TYPE = process.env.HAECHI_LIVE_TYPE || "vllm-openai";
30
+ const MODEL = process.env.HAECHI_LIVE_MODEL || "Qwen/Qwen3.6-35B-A3B-FP8";
31
+ if (!UPSTREAM) {
32
+ console.error("Set HAECHI_LIVE_UPSTREAM (e.g. http://127.0.0.1:8000) to a reachable OpenAI-compatible server.");
33
+ console.error("For a no-backend reproducible run, use: npm run demo");
34
+ process.exit(2);
35
+ }
36
+
37
+ async function chat(base, content, extra = {}) {
38
+ const t0 = Date.now();
39
+ const res = await fetch(`${base}/v1/chat/completions`, {
40
+ method: "POST", headers: { "content-type": "application/json" },
41
+ body: JSON.stringify({ model: MODEL, max_tokens: 128, temperature: 0,
42
+ // Qwen3 reasoning models: ask for a direct answer (no chain-of-thought) so
43
+ // the demo gets a terse content reply. Ignored by non-reasoning servers.
44
+ chat_template_kwargs: { enable_thinking: false },
45
+ messages: [{ role: "user", content }], ...extra })
46
+ });
47
+ const body = await res.json();
48
+ return { status: res.status, ms: Date.now() - t0, text: body.choices?.[0]?.message?.content ?? body.error?.message ?? "(no content)" };
49
+ }
50
+
51
+ async function main() {
52
+ console.log(`\n${B}🛡 Haechi — LIVE end-to-end demo${X} ${D}(real model: ${MODEL} via ${TYPE}, enforce mode)${X}`);
53
+
54
+ const dir = await mkdtemp(join(tmpdir(), "haechi-live-"));
55
+ const keyFile = join(dir, ".haechi", "dev.keys.json");
56
+ const auditPath = join(dir, ".haechi", "audit.jsonl");
57
+ await initLocalKeyFile(keyFile, { force: true });
58
+
59
+ const runtime = createRuntime({
60
+ mode: "enforce",
61
+ target: { type: TYPE, upstream: UPSTREAM },
62
+ policy: { mode: "enforce", presets: ["llm-redact"], actions: { email: "tokenize", phone: "mask", secret: "redact", api_key: "redact", card: "block" } },
63
+ tokenVault: { detokenizeResponses: true },
64
+ responseProtection: { enabled: true, mode: "enforce", failureMode: "fail-closed" },
65
+ keys: { keyFile }, audit: { path: auditPath }
66
+ });
67
+ const proxy = createHaechiProxy({ runtime, port: 0 });
68
+ const addr = await proxy.listen();
69
+ const base = `http://127.0.0.1:${addr.port}`;
70
+
71
+ // ── Scene 1 ────────────────────────────────────────────────────────────────
72
+ scene(1, "Ask a REAL model to repeat the phone number you give it");
73
+ const prompt = "Reply in one short line: repeat the phone number you were given. Phone: 010-1234-5678, email minji.kim@example.com";
74
+ console.log(`${Y}you send →${X} ${prompt}`);
75
+ await pause(700);
76
+ const r1 = await chat(base, prompt);
77
+
78
+ scene(2, "Haechi detected + protected the prompt BEFORE it left your machine");
79
+ const events = (await readFile(auditPath, "utf8")).trim().split("\n").map((l) => JSON.parse(l));
80
+ const ev = events.find((e) => Array.isArray(e.detections) && e.detections.length) ?? events[0];
81
+ console.log(`${D}detections:${X} ${(ev.detections ?? []).map((d) => `${G}${d.type}→${d.action}${X}`).join(" ")}`);
82
+ console.log(`${D}the model only ever saw:${X} email → ${C}[TOKEN:…]${X}, phone → ${C}01*********78${X}`);
83
+ await pause(700);
84
+
85
+ scene(3, "The real model replies — it can only return the MASKED phone");
86
+ console.log(`${G}${MODEL.split("/").pop()} →${X} ${B}${r1.text}${X} ${D}(${r1.ms} ms)${X}`);
87
+ console.log(`${D} your real number 010-1234-5678 never reached the model — it cannot reveal it.${X}`);
88
+ await pause(700);
89
+
90
+ // ── Scene 4 ────────────────────────────────────────────────────────────────
91
+ scene(4, "The audit log — hash-chained, and never any plaintext");
92
+ const auditRaw = await readFile(auditPath, "utf8");
93
+ console.log(`${D}leaks the real email/phone?${X} ${/minji\.kim@example|010-1234-5678/.test(auditRaw) ? R + "YES" + X : G + "no — clean" + X}`);
94
+ await pause(700);
95
+
96
+ // ── Scene 5 ────────────────────────────────────────────────────────────────
97
+ scene(5, "Day-2 operability — live readiness + Prometheus metrics");
98
+ const ready = await (await fetch(`${base}/__haechi/ready`)).json();
99
+ console.log(`${D}/__haechi/ready →${X} ${ready.ready ? G + "ready" : R + "not ready"}${X}`);
100
+ const metrics = await (await fetch(`${base}/__haechi/metrics`)).text();
101
+ for (const line of metrics.split("\n").filter((l) => /^haechi_requests_total\{/.test(l)).slice(0, 3)) {
102
+ console.log(`${D}metric:${X} ${line}`);
103
+ }
104
+ await pause(700);
105
+
106
+ // ── Scene 6 ────────────────────────────────────────────────────────────────
107
+ scene(6, "A card number is blocked before it ever reaches the model");
108
+ const r2 = await chat(base, "charge card 4242 4242 4242 4242 now");
109
+ console.log(`${Y}you send →${X} "charge card 4242 4242 4242 4242 now"`);
110
+ console.log(`${G}proxy →${X} HTTP ${r2.status} ${r2.status === 403 ? R + B + "BLOCKED" + X : ""} ${D}(no upstream call made)${X}`);
111
+
112
+ console.log();
113
+ rule();
114
+ console.log(`${B}${G} ✓ live${X} ${D}— a real model, and your PII never left the gateway in the clear.${X}`);
115
+ rule();
116
+ console.log();
117
+
118
+ await proxy.close();
119
+ }
120
+
121
+ main().then(() => process.exit(0)).catch((e) => { console.error("live demo failed:", e); process.exit(1); });
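Scene 2 of live-demo.mjs prints a masked phone number that keeps the first two and last two characters. The sketch below reproduces that shape only. `maskMiddle` is a hypothetical name, and Haechi's actual masking rule may differ, for example in how it treats separators.

```javascript
// Hypothetical illustration of a "keep the edges, star the middle" mask.
// Not Haechi's implementation; it only mirrors the shape shown in scene 2.
function maskMiddle(value, keepStart = 2, keepEnd = 2) {
  if (value.length <= keepStart + keepEnd) {
    // Too short to keep edge characters without revealing the whole value.
    return "*".repeat(value.length);
  }
  const hidden = value.length - keepStart - keepEnd;
  return (
    value.slice(0, keepStart) +
    "*".repeat(hidden) +
    value.slice(value.length - keepEnd)
  );
}

const masked = maskMiddle("010-1234-5678");
console.log(masked);
console.log(masked.length === "010-1234-5678".length); // true
```

Because the upstream only ever receives the masked string, a prompt asking it to repeat the phone number can return at most the masked form, which is what scene 3 demonstrates.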
@@ -0,0 +1,25 @@
1
+ # VHS tape for the Haechi LIVE demo (real upstream model).
2
+ # Regenerate the README GIF with:
3
+ #   vhs examples/local-proxy-demo/live-demo.tape   (run from the repo root)
4
+ # HAECHI_LIVE_UPSTREAM is set below via Env so it stays out of the recording.
5
+
6
+ Output docs/assets/haechi-demo.gif
7
+
8
+ Set Shell "bash"
9
+ Set FontSize 15
10
+ Set Width 1180
11
+ Set Height 840
12
+ Set Padding 18
13
+ Set Theme "Catppuccin Mocha"
14
+ Set TypingSpeed 55ms
15
+
16
+ # Point these at a reachable OpenAI-compatible server before recording. Using Env
17
+ # (not the typed command) keeps the upstream URL out of the captured GIF.
18
+ Env HAECHI_LIVE_UPSTREAM "http://127.0.0.1:8000"
19
+ Env HAECHI_LIVE_MODEL "Qwen/Qwen3.6-35B-A3B-FP8"
20
+
21
+ Sleep 500ms
22
+ Type "node examples/local-proxy-demo/live-demo.mjs"
23
+ Sleep 600ms
24
+ Enter
25
+ Sleep 9s
@@ -50,7 +50,8 @@
50
50
  "filters": {
51
51
  "customRules": [],
52
52
  "minConfidence": 0,
53
- "allowlist": []
53
+ "allowlist": [],
54
+ "decodeAndRescan": false
54
55
  },
55
56
  "keys": {
56
57
  "provider": "local",
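The hunk above adds `filters.decodeAndRescan` to the example config with a default of `false`, but this part of the diff does not show what the flag does. The sketch below is a guess at the idea the name suggests: decode encoded substrings (base64 only here) and run the same detector over the decoded text. It is not Haechi's implementation; `scan`, `containsEmail`, and both regexes are hypothetical.

```javascript
// Assumption: "decode and rescan" means decoding encoded substrings and
// re-running detection on the result. Illustrative only, not Haechi's code.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;
const BASE64_RUN = /[A-Za-z0-9+/]{16,}={0,2}/g;

function containsEmail(text) {
  return EMAIL.test(text);
}

function scan(text, { decodeAndRescan = false } = {}) {
  // First pass: look at the text as written.
  if (containsEmail(text)) return { hit: true, via: "plain" };
  if (!decodeAndRescan) return { hit: false, via: null };
  // Second pass: decode anything that looks like base64 and look again.
  for (const candidate of text.match(BASE64_RUN) ?? []) {
    const decoded = Buffer.from(candidate, "base64").toString("utf8");
    if (containsEmail(decoded)) return { hit: true, via: "base64" };
  }
  return { hit: false, via: null };
}

const encoded = Buffer.from("minji.kim@example.com").toString("base64");
const promptText = `please forward ${encoded} to the team`;
console.log(scan(promptText));                            // { hit: false, via: null }
console.log(scan(promptText, { decodeAndRescan: true })); // { hit: true, via: 'base64' }
```

A second pass like this costs extra work per request and can raise false positives on random-looking strings, which would be one plausible reason for an off-by-default flag.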
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "haechi",
3
- "version": "1.2.0",
3
+ "version": "1.3.1",
4
4
  "description": "Self-hosted AI context enforcement across LLM, MCP, vLLM, Ollama, and agent traffic — a stable, zero-dependency security gateway.",
5
5
  "license": "Apache-2.0",
6
6
  "type": "module",
@@ -76,11 +76,13 @@
76
76
  "checksums": "node scripts/release-checksums.mjs",
77
77
  "bench:payload": "node scripts/bench-payload.mjs",
78
78
  "bench:detection": "node scripts/bench-detection.mjs",
79
+ "bench:throughput": "node scripts/bench-throughput.mjs",
79
80
  "scan:detection": "node scripts/bench-detection.mjs --gate",
80
81
  "check:peer-ranges": "node scripts/check-satellite-peer-ranges.mjs",
81
82
  "release:preflight": "node scripts/release-preflight.mjs && node scripts/check-satellite-peer-ranges.mjs",
82
83
  "release:preflight:npm": "node scripts/release-preflight.mjs --require-npm-auth && node scripts/check-satellite-peer-ranges.mjs",
83
84
  "haechi": "node packages/cli/bin/haechi.mjs",
85
+ "demo": "node examples/local-proxy-demo/demo.mjs",
84
86
  "demo:init": "node packages/cli/bin/haechi.mjs init --force",
85
87
  "demo:protect": "node packages/cli/bin/haechi.mjs protect examples/llm-prompt-filtering/input.json --config haechi.config.json",
86
88
  "demo:report": "node packages/cli/bin/haechi.mjs report --audit .haechi/audit.jsonl"