PyPI - agent-intent-guard - Versions diffs - 0.1.0__tar.gz - Mend

agent-intent-guard 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (47) hide show

agent_intent_guard-0.1.0/PKG-INFO ADDED Viewed

@@ -0,0 +1,485 @@
+Metadata-Version: 2.4
+Name: agent-intent-guard
+Version: 0.1.0
+Summary: MCP tool-call guardrails with static and semantic policy checks
+License-Expression: MIT
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3 :: Only
+Classifier: Operating System :: OS Independent
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+Requires-Dist: PyYAML>=6.0.0
+Requires-Dist: requests>=2.31.0
+Requires-Dist: litellm>=1.82.0
+Provides-Extra: dev
+Requires-Dist: pytest>=8.0.0; extra == "dev"
+# IntentGuard
+IntentGuard is a Python guardrail layer for MCP tool calls. It runs as a proxy between an agent client and an MCP server, enforcing both static policy checks and optional semantic intent checks before a tool call is forwarded.
+## What is implemented (MVP)
+The current implementation covers all 4 roadmap phases from `agent.md`:
+1. **CLI Interceptor (Phase 1)**
+   `intent_guard/proxy.py` + `intent_guard/sdk/mcp_proxy.py` implement a stdio proxy that intercepts `tools/call` JSON-RPC requests and can block/allow calls.
+2. **Static Engine (Phase 2)**
+   `intent_guard/sdk/engine.py` loads YAML policy and enforces:
+   - `forbidden_tools`
+   - `protected_paths` (glob/fnmatch style)
+   - `max_tokens_per_call`
+   - `custom_policies` (tool-specific argument requirements/forbidden arguments)
+3. **Semantic Guardrail Providers (Phase 3)**
+   `intent_guard/sdk/providers.py` supports:
+   - `OllamaProvider` (`POST /api/generate`)
+   - `LiteLLMProvider` (`litellm.completion`) using `LLM_MODEL` and `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` from env
+   Both providers include retries (exponential backoff + jitter) and a circuit breaker.
+4. **Pause & Resume Feedback Loop (Phase 4)**
+   `terminal_approval_prompt` provides interactive approval for flagged calls (`Allow? [y/N]`).
+## Repository layout
+```text
+intent_guard/
+├── __init__.py
+├── proxy.py
+└── sdk/
+    ├── __init__.py
+    ├── engine.py
+    ├── mcp_proxy.py
+    └── providers.py
+schema/
+└── policy.yaml
+tests/
+├── conftest.py
+└── test_integration_phases.py
+```
+## Installation
+```bash
+python3 -m venv .venv
+.venv/bin/pip install -r requirements.txt
+```
+## Run tests
+```bash
+.venv/bin/pytest -q
+```
+Run live Ollama semantic tests only (requires local Ollama + `llama3.1:8b` available):
+```bash
+.venv/bin/pytest -q -m runOllamaProvider
+```
+If local model responses are slow, increase timeout (seconds):
+```bash
+OLLAMA_TIMEOUT_SECONDS=120 .venv/bin/pytest -q -m runOllamaProvider
+```
+The live semantic suite defaults to `OLLAMA_RAW=false` and bounded generation tuned for `llama3.1:8b`. You can tune:
+```bash
+OLLAMA_TIMEOUT_SECONDS=60 OLLAMA_NUM_PREDICT=256 OLLAMA_RAW=false \
+  .venv/bin/pytest -q -m runOllamaProvider
+```
+Integration tests cover all phases:
+- phase 1: interception and logging behavior
+- phase 2: static policy blocking
+- phase 3: semantic provider flow (mocked Ollama HTTP call)
+- phase 4: approval allow/deny behavior
+## Policy file
+Use `schema/policy.yaml` as a starting point:
+```yaml
+static_rules:
+  forbidden_tools: ["delete_database", "purge_all"]
+  protected_paths: ["/etc/*", ".env", "src/auth/*"]
+  max_tokens_per_call: 4000
+  rate_limits:
+    enabled: 1 # required to turn rate limiting on; 0/false bypasses checks
+    default:
+      max_calls: 60
+      window_seconds: 60
+    by_tool:
+      write_file:
+        max_calls: 10
+        window_seconds: 60
+custom_policies:
+  - tool_name: write_file
+    args:
+      all_present: ["path", "content"]
+      should_not_present: ["sudo"]
+semantic_rules:
+  provider: ollama # or litellm
+  mode: enforce # off | enforce | advisory
+  prompt_version: "v2"
+  guardrail_model: llama3.1:8b
+  critical_intent_threshold: 0.85
+  retry_attempts: 2
+  retry_base_delay_seconds: 0.25
+  retry_max_delay_seconds: 2.0
+  retry_jitter_ratio: 0.2
+  circuit_breaker_failures: 3
+  circuit_breaker_reset_seconds: 30
+  provider_fail_mode:
+    default: advisory # fail-open
+    by_tool:
+      delete_database: enforce # fail-closed
+  constraints:
+    - intent: modify_source_code
+      allowed_scope: Actions must only affect UI components or styles.
+      forbidden_scope: Should not modify database schemas or auth logic.
+```
+### Rubric scoring (v2)
+Set `prompt_version: "v2"` to switch from opaque LLM-assigned scores to
+multi-signal rubric scoring. Instead of asking the LLM for a single confidence
+number, the engine asks concrete yes/no questions across multiple dimensions
+and computes the score deterministically from the answers.
+```yaml
+semantic_rules:
+  prompt_version: "v2"
+  critical_intent_threshold: 0.85
+  scoring:
+    dimensions:
+      tool_task_alignment:
+        weight: 0.25
+      argument_scope_compliance:
+        weight: 0.30
+      no_forbidden_scope_violation:
+        weight: 0.30
+      no_side_effect_risk:
+        weight: 0.15
+```
+Default dimensions (used when `scoring` is omitted):
+| Dimension | Question | Default weight |
+|---|---|---|
+| `tool_task_alignment` | Is this tool appropriate for the stated task? | 0.25 |
+| `argument_scope_compliance` | Are arguments within the allowed scope? | 0.30 |
+| `no_forbidden_scope_violation` | Do arguments avoid the forbidden scope? | 0.30 |
+| `no_side_effect_risk` | Is the call free of destructive/exfil risk? | 0.15 |
+Score formula: `Σ(weight × pass) / Σ(weight)`. With 4 equal-pass dimensions
+the score is 1.0; any single failure drops below the 0.85 threshold.
+Decisions include `dimension_scores` with per-dimension `passed` and
+`evidence` for full auditability.
+## CLI usage
+```bash
+INTENT_GUARD_TASK="Only update frontend styles" \
+python -m intent_guard.proxy \
+  --policy schema/policy.yaml \
+  --target "npx @modelcontextprotocol/server-filesystem /path/to/repo" \
+  --model llama3.1:8b \
+  --approval-webhook "https://approval.internal/intent-guard" \
+  --approval-timeout 10 \
+  --approval-default-action deny
+```
+### Flags
+- `--policy`: YAML policy path
+- `--target`: target MCP server command
+- `--model`: optional Ollama model name for semantic checks
+- `--task`: optional task context (or set `INTENT_GUARD_TASK`)
+- `--ask-approval`: prompt user before allowing flagged calls
+- `--approval-webhook`: call this webhook for non-interactive approval decisions
+- `--approval-timeout`: timeout (seconds) for webhook approvals
+- `--approval-default-action`: `allow` or `deny` when webhook approval times out/fails
+## Native hook integration
+IntentGuard can run as the policy engine behind native hooks in Claude Code, Copilot, and Cursor.
+### Evaluate command
+Use the unified command:
+```bash
+intent-guard evaluate --policy schema/policy.yaml
+```
+Input:
+- Reads a hook payload JSON object from stdin
+- Supports generic keys like `tool_name`, `arguments`, `task_context`
+- Also supports nested payloads (`params.name`, `params.arguments`) and common aliases (`tool_input`, `args`, `prompt`)
+Output:
+- Prints a `GuardDecision` JSON object to stdout
+- Exit code `0` for allow, `1` for block, `2` for invalid input
+### Hook config templates
+Template files are shipped under `hooks/`:
+- `hooks/claude-code/settings.json`
+- `hooks/copilot/hooks.json`
+- `hooks/cursor/hooks.json`
+Each template invokes:
+```bash
+cat | intent-guard evaluate --policy schema/policy.yaml
+```
+This lets platform-native hooks call IntentGuard directly instead of wrapping only MCP servers.
+## Encoded payload detection
+Static checks can decode and normalize argument payloads before matching:
+- URL decoding
+- Unicode normalization (NFKC)
+- Base64 decoding (when valid)
+Enable or disable via:
+```yaml
+static_rules:
+  decode_arguments: true
+```
+When enabled, injection, sensitive-data, and protected-path checks run against decoded variants to catch obfuscated bypasses.
+## Response-side inspection
+IntentGuard can inspect MCP server responses before forwarding them to the client.
+Configure `response_rules` in policy:
+```yaml
+response_rules:
+  action: block # block | warn | redact
+  detect_base64: true
+  patterns:
+    - name: "GitHub Token"
+      pattern: "gh[ps]_[A-Za-z0-9_]{36,}"
+```
+Behavior:
+- `block`: return JSON-RPC error and suppress original response
+- `warn`: forward response and log warning decision
+- `redact`: redact matched text and forward sanitized response
+## Tool description change detection (rug-pull protection)
+IntentGuard can snapshot MCP `tools/list` metadata and detect changes over time.
+Configure:
+```yaml
+tool_change_rules:
+  enabled: true
+  action: warn # warn | block
+```
+Behavior:
+- On first `tools/list`, stores snapshot in `.intent-guard/tool-snapshots/<server-hash>.json`
+- On subsequent `tools/list`, compares `name`, `description`, and `inputSchema`
+- `warn`: log warning and continue
+- `block`: block response when drift is detected
+### Semantic mode and provider failure behavior
+`semantic_rules.mode` controls normal semantic enforcement:
+- `off`: semantic check disabled
+- `enforce`: semantic failures block tool calls
+- `advisory`: semantic failures are logged as warnings but calls are allowed
+`semantic_rules.provider_fail_mode` controls behavior when semantic provider is unavailable:
+- supports `default` and per-tool `by_tool` override
+- values use the same mode set: `off|enforce|advisory`
+Behavior matrix for tool criticality tiers (example mapping):
+| Tool tier | `provider_fail_mode` | Outcome on provider outage |
+|---|---|---|
+| Critical tools | `enforce` | Fail-closed (block + approval required) |
+| Standard tools | `advisory` | Fail-open with warning decision |
+| Low-risk tools | `off` | Fail-open without warning severity |
+Define tiers by assigning tools in `provider_fail_mode.by_tool`.
+`semantic_rules.prompt_version` is copied into every semantic decision and log entry as `semantic_prompt_version` so prompt changes are auditable.
+### Semantic decision caching
+To reduce repeated provider calls for identical semantic evaluations:
+```yaml
+semantic_rules:
+  decision_cache:
+    enabled: true
+    max_size: 256
+    ttl_seconds: 300
+```
+Cache key uses `(tool_name, arguments, task_context)`. Static checks always run; only semantic verdicts are cached.
+### LiteLLM provider
+To use the API provider, set in `.env` (or process env):
+```bash
+LLM_MODEL=claude-3-5-sonnet-20241022
+ANTHROPIC_API_KEY=...
+# or OPENAI_API_KEY=...
+```
+Then set `semantic_rules.provider: litellm` (or just set `LLM_MODEL` and omit explicit provider).
+### CI break-glass options
+- `INTENT_GUARD_BREAK_GLASS_TOKEN`: if set, flagged calls are auto-approved with override metadata.
+- `INTENT_GUARD_BREAK_GLASS_SIGNED_TOKEN` + `INTENT_GUARD_BREAK_GLASS_SIGNING_KEY`: optional HMAC-signed break-glass token for CI. Token format is `<base64url(json payload)>.<base64url(signature)>` where signature is `HMAC-SHA256(payload_part, signing_key)` and payload contains future `exp` (unix timestamp), for example `{"exp": 4102444800}`.
+- `INTENT_GUARD_APPROVAL_AUTH_TOKEN`: bearer token added to webhook approval requests.
+## SDK usage (Python)
+```python
+from intent_guard import IntentGuardSDK
+guard = IntentGuardSDK(
+    policy_path="schema/policy.yaml",
+    local_model="llama3.1:8b",
+    task_context="Only modify UI components"
+)
+decision = guard.evaluate("write_file", {"path": "src/auth/config.py"})
+print(decision.allowed, decision.reason)
+```
+## GuardDecision contract (stable)
+`GuardDecision` now includes machine-readable metadata for enforcement and analytics:
+- `decision_id` (UUID)
+- `code`
+- `severity`
+- `policy_name`
+- `policy_version`
+- `rule_id`
+- `timestamp` (UTC ISO-8601)
+- `override` (`who`/`why`/`ttl`, when manually approved)
+- `semantic_prompt_version` (when semantic checks are applied)
+Backward compatibility:
+- Existing fields `allowed`, `reason`, `requires_approval`, `semantic_score` are unchanged.
+- New fields are always present with safe defaults, so existing consumers can ignore them.
+## Semantic eval harness
+IntentGuard ships a lightweight semantic eval harness used in tests to measure model behavior on known-safe and known-unsafe tool calls.
+- Dataset fixtures: `tests/fixtures/semantic_eval_dataset.json`
+- Replay verdicts: `tests/fixtures/semantic_eval_verdicts.json`
+- Metrics computed: precision, recall, accuracy
+This enables reproducible regression checks for semantic policy quality.
+Versioning/migration strategy:
+- Keep parsing logic tolerant of unknown fields.
+- Use `policy_version` + `code` + `rule_id` for downstream contract evolution and dashboards.
+- Prefer adding new fields over changing/removing existing field semantics.
+## Usage examples with popular tools
+### 1) Claude Code (MCP server proxy)
+Configure the MCP server command to run through IntentGuard:
+```json
+{
+  "mcpServers": {
+    "filesystem": {
+      "command": "python",
+      "args": [
+        "-m",
+        "intent_guard.proxy",
+        "--policy",
+        "schema/policy.yaml",
+        "--target",
+        "npx @modelcontextprotocol/server-filesystem /path/to/repo",
+        "--ask-approval"
+      ],
+      "env": {
+        "INTENT_GUARD_TASK": "Refactor UI only; do not touch auth or database"
+      }
+    }
+  }
+}
+```
+### 2) Codex (MCP command wrapping)
+For Codex setups that support MCP server command configuration, point the server command to IntentGuard first, then to your real MCP server as `--target`:
+```bash
+python -m intent_guard.proxy \
+  --policy schema/policy.yaml \
+  --target "npx @modelcontextprotocol/server-filesystem /path/to/repo" \
+  --ask-approval
+```
+Use that command as the configured MCP server entry in your Codex environment.
+### 3) LangSmith / LangChain workflows
+Use IntentGuard before each tool execution and keep normal LangSmith tracing:
+```python
+from langsmith import traceable
+from intent_guard import IntentGuardSDK
+guard = IntentGuardSDK(
+    policy_path="schema/policy.yaml",
+    task_context="Only update docs and UI text"
+)
+@traceable(name="guarded_tool_call")
+def guarded_call(tool_name: str, args: dict, tool_callable):
+    decision = guard.evaluate(tool_name, args)
+    if not decision.allowed:
+        raise PermissionError(f"IntentGuard blocked: {decision.reason}")
+    return tool_callable(**args)
+```
+This keeps execution decisions visible in traces while enforcing IntentGuard policy at runtime.
+## Build and publish (pip / Artifactory)
+Build source and wheel distributions:
+```bash
+python3 -m venv .venv
+.venv/bin/pip install -U pip build twine
+.venv/bin/python -m build
+```
+Publish to your Artifactory PyPI repository:
+```bash
+export TWINE_USERNAME="<artifactory-username>"
+export TWINE_PASSWORD="<artifactory-password-or-token>"
+.venv/bin/python -m twine upload \
+  --repository-url "https://<artifactory-host>/artifactory/api/pypi/<pypi-repo>/local" \
+  dist/*
+```
+## Integration testing and Docker
+Current integration tests are in-process (`tests/test_integration_phases.py`) and do not require a database or cache service.
+If a future change adds external DB/cache dependencies, run those services in Docker for tests (same pattern as `temp-noob/rule-engine`) so test setup remains reproducible.