npm - shield-harness - Versions diffs - 0.3.0 → 0.5.0 - Mend

shield-harness 0.3.0 → 0.5.0

Files changed (9)

package/.claude/hooks/lib/automode-detect.js +184 -0
package/.claude/hooks/lib/policy-drift.js +322 -0
package/.claude/hooks/sh-config-guard.js +184 -2
package/.claude/hooks/sh-session-start.js +83 -0
package/.claude/permissions-spec.json +16 -1
package/.claude/policies/openshell-generated.yaml +105 -0
package/README.ja.md +98 -33
package/README.md +97 -32
package/package.json +5 -2

package/README.md CHANGED

@@ -2,9 +2,9 @@
 # Shield Harness
-**Auto-defense security harness for Claude Code — approval-free, safe autonomous development**
+**Hook-driven auto-defense security harness for Claude Code**
-> **Alpha (v0.1.0)**: Security model is under active development. Permission rules and design documents are being aligned. Not recommended for production use yet.
+> **v0.5.0**: 22 hooks, 4-layer defense (L1 Permissions + L2 Hooks + L3 Sandbox + L3b OpenShell), 426 tests including 108 OWASP AITG attack simulations + 35 Auto Mode defense tests.
 [![English](https://img.shields.io/badge/lang-English-blue?style=flat-square)](#)
 [![日本語](https://img.shields.io/badge/lang-日本語-red?style=flat-square)](README.ja.md)
@@ -25,13 +25,13 @@ npx shield-harness init [--profile minimal|standard|strict]
 ## Why Shield Harness
 - **Hooks-driven defense**: 22 security hooks monitor every Claude Code operation
-- **Approval-free mode**: Delegate all security decisions to hooks, eliminating human approval dialogs
+- **Automated security decisions**: Hooks handle all security judgments in real time — no manual approval bottleneck
 - **fail-close principle**: Automatically stops when safety conditions cannot be verified
 - **Evidence recording**: Tamper-proof SHA-256 hash chain records all allow/deny decisions
 ## Architecture Overview
-3-layer defense model:
+4-layer defense model:
 | Layer    | Defense            | Implementation                                     |
 | -------- | ------------------ | -------------------------------------------------- |
@@ -50,30 +50,30 @@ npx shield-harness init [--profile minimal|standard|strict]
 ## Hook Catalog
-| #   | Hook             | Event                 | Responsibility                                   |
-| --- | ---------------- | --------------------- | ------------------------------------------------ |
-| 1   | permission       | PreToolUse            | 4-category tool usage classification             |
-| 2   | gate             | PreToolUse            | 7 attack vector inspection for Bash commands     |
-| 3   | injection-guard  | PreToolUse            | 9-category 50+ pattern injection detection       |
-| 4   | data-boundary    | PreToolUse            | Production data boundary + jurisdiction tracking |
-| 5   | quiet-inject     | PreToolUse            | Auto-inject quiet flags                          |
-| 6   | evidence         | PostToolUse           | SHA-256 hash chain evidence                      |
-| 7   | output-control   | PostToolUse           | Output truncation + token budget                 |
-| 8   | dep-audit        | PostToolUse           | Package install detection                        |
-| 9   | lint-on-save     | PostToolUse           | Auto lint execution                              |
-| 10  | session-start    | SessionStart          | Session init + integrity baseline                |
-| 11  | session-end      | SessionEnd            | Cleanup + statistics                             |
-| 12  | circuit-breaker  | Stop                  | Retry limit (3 attempts)                         |
-| 13  | config-guard     | ConfigChange          | Settings change monitoring                       |
-| 14  | user-prompt      | UserPromptSubmit      | User input injection scanning                    |
-| 15  | permission-learn | PermissionRequest     | Permission learning guard                        |
-| 16  | elicitation      | Elicitation           | Phishing + scope guard                           |
-| 17  | subagent         | SubagentStart         | Subagent budget constraint (25%)                 |
-| 18  | instructions     | InstructionsLoaded    | Rule file integrity monitoring                   |
-| 19  | precompact       | PreCompact            | Pre-compaction backup                            |
-| 20  | postcompact      | PostCompact           | Post-compaction restore + verify                 |
-| 21  | worktree         | WorktreeCreate/Remove | Security propagation + evidence merge            |
-| 22  | task-gate        | TaskCompleted         | Test gate                                        |
+| #   | Hook             | Event                 | Responsibility                                                                       |
+| --- | ---------------- | --------------------- | ------------------------------------------------------------------------------------ |
+| 1   | permission       | PreToolUse            | 4-category tool usage classification                                                 |
+| 2   | gate             | PreToolUse            | 7 attack vector inspection for Bash commands                                         |
+| 3   | injection-guard  | PreToolUse            | 9-category 50+ pattern injection detection                                           |
+| 4   | data-boundary    | PreToolUse            | Production data boundary + jurisdiction tracking                                     |
+| 5   | quiet-inject     | PreToolUse            | Auto-inject quiet flags                                                              |
+| 6   | evidence         | PostToolUse           | SHA-256 hash chain evidence                                                          |
+| 7   | output-control   | PostToolUse           | Output truncation + token budget                                                     |
+| 8   | dep-audit        | PostToolUse           | Package install detection                                                            |
+| 9   | lint-on-save     | PostToolUse           | Auto lint execution                                                                  |
+| 10  | session-start    | SessionStart          | Session init + integrity baseline                                                    |
+| 11  | session-end      | SessionEnd            | Cleanup + statistics                                                                 |
+| 12  | circuit-breaker  | Stop                  | Retry limit (3 attempts)                                                             |
+| 13  | config-guard     | ConfigChange          | Settings change monitoring + OpenShell policy file protection + Auto Mode protection |
+| 14  | user-prompt      | UserPromptSubmit      | User input injection scanning                                                        |
+| 15  | permission-learn | PermissionRequest     | Permission learning guard                                                            |
+| 16  | elicitation      | Elicitation           | Phishing + scope guard                                                               |
+| 17  | subagent         | SubagentStart         | Subagent budget constraint (25%)                                                     |
+| 18  | instructions     | InstructionsLoaded    | Rule file integrity monitoring                                                       |
+| 19  | precompact       | PreCompact            | Pre-compaction backup                                                                |
+| 20  | postcompact      | PostCompact           | Post-compaction restore + verify                                                     |
+| 21  | worktree         | WorktreeCreate/Remove | Security propagation + evidence merge                                                |
+| 22  | task-gate        | TaskCompleted         | Test gate                                                                            |
 ## Pipeline
@@ -146,10 +146,75 @@ Key benefits for Windows users:
 - Policies exist **outside** the agent process — the agent cannot disable its own guardrails
 - Runs on Docker Desktop + WSL2 backend (typical Windows dev setup)
-- Reduces residual risk from 5% to <1%
+- Significantly reduces residual risk from Layer 1-2 pattern matching limitations
 - Freely removable — stop the container and Shield Harness falls back to Layer 1-2
-> **Note**: OpenShell is Alpha (v0.0.13) — APIs may change with future releases.
+> **Note**: OpenShell is Alpha (v0.0.13) — APIs may change with future releases. Shield Harness GA Phase integration is complete (ADR-037): config guard policy file protection, policy drift check, and full documentation are ready.
+#### Setup
+**Prerequisites**: [Docker Desktop](https://www.docker.com/products/docker-desktop/) (WSL2 backend on Windows)
+```bash
+# 1. Install Docker Desktop and verify it is running
+#    https://www.docker.com/products/docker-desktop/
+docker --version
+# 2. Install OpenShell CLI
+pip install openshell
+# 3. Generate policy from permissions-spec.json
+#    Creates .claude/policies/openshell-generated.yaml
+npx shield-harness policy generate
+# 4. Start OpenShell container and run Claude Code inside it
+#    Docker pulls the sandbox image automatically on first run
+#    Kernel-level enforcement (Landlock/Seccomp/Network NS) is active inside the container
+openshell run --policy .claude/policies/openshell-generated.yaml
+```
+Claude Code running inside the OpenShell container automatically receives Layer 3b kernel enforcement. Shield Harness detects this at session start (`sh-session-start.js`) — no additional configuration required.
+Without OpenShell, Shield Harness falls back to Layer 1-2 defense (no degradation in hook protection).
+Policy files are protected by:
+- `permissions.deny`: `Edit/Write(.claude/policies/**)` blocks agent modification
+- `sandbox.denyWrite`: `.claude/policies` in filesystem deny list
+- `sh-config-guard.js`: Hash tracking detects policy file tampering or weakening
+- `sh-session-start.js`: Drift check at session start verifies spec-policy alignment
+## Testing
+```bash
+# Run all tests (426 tests including attack simulations)
+npm test
+# Run attack simulation tests only
+node --test tests/attack-sim-*.test.js
+```
+| Test Suite                    | Category                               | Tests |
+| ----------------------------- | -------------------------------------- | ----- |
+| attack-sim-prompt-injection   | AITG-APP-01: Direct Prompt Injection   | 25    |
+| attack-sim-indirect-injection | AITG-APP-02: Indirect Prompt Injection | 18    |
+| attack-sim-data-leak          | AITG-APP-03: Sensitive Data Leak       | 20    |
+| attack-sim-agentic-limits     | AITG-APP-06: Agentic Behavior Limits   | 18    |
+| attack-sim-sandbox-escape     | NVIDIA 3-axis: Sandbox Escape          | 15    |
+| attack-sim-defense-chain      | SAIF: Defense-in-depth Chain           | 12    |
+| attack-sim-automode-bypass    | Auto Mode: soft_deny/soft_allow bypass | 15    |
+## Auto Mode Awareness (v0.5.0)
+Shield Harness detects Claude Code's Auto Mode (Research Preview) configuration at session start and protects against dangerous settings:
+| Setting                | Risk                                                       | Shield Harness Response                                              |
+| ---------------------- | ---------------------------------------------------------- | -------------------------------------------------------------------- |
+| `autoMode.soft_deny`   | **CRITICAL** — disables all classifier default protections | Config-guard blocks addition; session-start outputs CRITICAL warning |
+| `autoMode.soft_allow`  | WARN — auto-approves specific tools                        | Config-guard blocks expansion; session-start outputs WARNING         |
+| `autoMode.environment` | Safe — informational only                                  | Detected and recorded in session                                     |
+All existing hooks (PreToolUse, PostToolUse) fire normally under Auto Mode — `permissions.deny` rules remain absolute. Auto Mode's classifier cannot override hook denials.
 ## Channel Integration
@@ -178,11 +243,11 @@ Shield Harness follows [Semantic Versioning](https://semver.org/):
 | `minor` | New features (backward compatible), Phase must-tasks completed | OCSF support, new hook, CLI option   |
 | `major` | Breaking changes                                               | Schema incompatible, settings change |
-**Release trigger**: `git tag v1.x.x && git push origin v1.x.x` triggers `release.yml` (automated npm publish + GitHub Release). Security fixes trigger an immediate patch release.
+**Release trigger**: `git tag vX.Y.Z && git push origin vX.Y.Z` triggers `release.yml` (automated npm publish + GitHub Release). Security fixes trigger an immediate patch release.
 ## References
-Shield Harness was designed by surveying 40+ Claude Code security projects. Key references:
+Key references:
 | Project                                                                      | Influence                                                                                                          |
 | ---------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
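
Note on the "Evidence recording" bullet in the README diff above: the README describes a SHA-256 hash chain over allow/deny decisions but does not show its layout. The following is a minimal sketch of the general technique only, not code from shield-harness. The names `GENESIS`, `hashEntry`, `appendDecision`, and `verifyChain`, and the shape of the decision records, are all assumptions made for illustration.

```javascript
// Hypothetical sketch of a SHA-256 hash chain; not taken from shield-harness.
const { createHash } = require("node:crypto");

// Assumed placeholder used as the "previous hash" of the first entry.
const GENESIS = "0".repeat(64);

// Hash a decision record together with the hash of the entry before it.
function hashEntry(prevHash, decision) {
  return createHash("sha256")
    .update(prevHash)
    .update(JSON.stringify(decision))
    .digest("hex");
}

// Return a new chain with the decision linked to the current tail.
function appendDecision(chain, decision) {
  const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS;
  return [...chain, { decision, prevHash, hash: hashEntry(prevHash, decision) }];
}

// Recompute every link; an edited, reordered, or mid-chain deleted entry fails.
function verifyChain(chain) {
  let prevHash = GENESIS;
  for (const entry of chain) {
    if (entry.prevHash !== prevHash) return false;
    if (entry.hash !== hashEntry(prevHash, entry.decision)) return false;
    prevHash = entry.hash;
  }
  return true;
}

let chain = [];
chain = appendDecision(chain, { tool: "Bash", verdict: "deny" });
chain = appendDecision(chain, { tool: "Read", verdict: "allow" });
console.log(verifyChain(chain)); // true

// Rewriting a recorded verdict after the fact is detected on verification.
const tampered = chain.map((e) => ({ ...e, decision: { ...e.decision } }));
tampered[0].decision.verdict = "allow";
console.log(verifyChain(tampered)); // false
```

In this sketch, dropping entries from the tail of the chain is not detected, because nothing outside the chain records the latest hash. Whether and how shield-harness handles that case is not visible in this diff.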

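Note on the "Auto Mode Awareness (v0.5.0)" table in the README diff above: the three `autoMode` settings map to three severities. The mapping can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `automode-detect.js`: the function name `classifyAutoMode`, the `INFO` label for the "Safe" row, and the assumption that each setting is either absent, a non-empty array, or some other defined value are all hypothetical.

```javascript
// Hypothetical sketch of the severity mapping in the README table.
// The real detection logic in automode-detect.js may differ.

// Severity per setting, copied from the README table ("Safe" is labeled INFO here).
const SEVERITY = {
  soft_deny: "CRITICAL",
  soft_allow: "WARN",
  environment: "INFO",
};

// Assumption: a setting counts as "present" if it is a non-empty array
// or any other value that is neither undefined nor null.
function isSet(value) {
  if (Array.isArray(value)) return value.length > 0;
  return value !== undefined && value !== null;
}

// Return one finding per autoMode setting found in a parsed settings object.
function classifyAutoMode(settings) {
  const autoMode = (settings && settings.autoMode) || {};
  const findings = [];
  for (const key of Object.keys(SEVERITY)) {
    if (isSet(autoMode[key])) {
      findings.push({ key: `autoMode.${key}`, severity: SEVERITY[key] });
    }
  }
  return findings;
}

// Example input with assumed value shapes.
const findings = classifyAutoMode({
  autoMode: { soft_deny: ["example-rule"], environment: ["example-note"] },
});
for (const f of findings) {
  console.log(`${f.severity}: ${f.key}`);
}
```

A session-start hook could print findings like these as warnings, while a config guard could reject a settings change that introduces a `CRITICAL` or `WARN` finding; the README describes both behaviors, but their implementation is not shown in this part of the diff.
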
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "shield-harness",
-  "version": "0.3.0",
+  "version": "0.5.0",
   "description": "Security harness for Claude Code — hooks-driven, zero-hassle defense",
   "bin": {
     "shield-harness": "./bin/shield-harness.js"
@@ -27,7 +27,10 @@
     "test:openshell-detect": "node --test tests/openshell-detect.test.js",
     "test:openshell-evidence": "node --test tests/openshell-evidence.test.js",
     "test:tier-policy": "node --test tests/tier-policy-gen.test.js",
-    "test:policy-effectiveness": "node --test tests/policy-effectiveness.test.js"
+    "test:policy-effectiveness": "node --test tests/policy-effectiveness.test.js",
+    "test:automode": "node --test tests/automode-detect.test.js",
+    "test:automode-guard": "node --test tests/config-guard-automode.test.js",
+    "test:attack-automode": "node --test tests/attack-sim-automode-bypass.test.js"
   },
   "keywords": [
     "claude-code",