PyPI - agentpack-cli - Versions diffs - 0.3.9__tar.gz → 0.3.11__tar.gz - Mend

agentpack-cli 0.3.9.tar.gz → 0.3.11.tar.gz


Files changed (109)

{agentpack_cli-0.3.9 → agentpack_cli-0.3.11}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: agentpack-cli
-Version: 0.3.9
+Version: 0.3.11
 Summary: Local context engine for AI coding agents that ranks relevant files and builds task-focused context packs.
 License: MIT
 License-File: LICENSE
@@ -40,13 +40,14 @@ Description-Content-Type: text/markdown
 # AgentPack
 [![PyPI version](https://img.shields.io/pypi/v/agentpack-cli.svg)](https://pypi.org/project/agentpack-cli/)
+[![PyPI Downloads](https://static.pepy.tech/personalized-badge/agentpack-cli?period=total&units=INTERNATIONAL_SYSTEM&left_color=BLACK&right_color=GREEN&left_text=downloads)](https://pepy.tech/projects/agentpack-cli)
 [![npm version](https://img.shields.io/npm/v/@vishal2612200/agentpack.svg)](https://www.npmjs.com/package/@vishal2612200/agentpack)
 [![npm downloads](https://img.shields.io/npm/dm/@vishal2612200/agentpack.svg)](https://www.npmjs.com/package/@vishal2612200/agentpack)
 [![Python versions](https://img.shields.io/pypi/pyversions/agentpack-cli.svg)](https://pypi.org/project/agentpack-cli/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![CI](https://github.com/vishal2612200/agentpack/actions/workflows/ci.yml/badge.svg)](https://github.com/vishal2612200/agentpack/actions/workflows/ci.yml)
-> **Status: alpha (v0.3.9).** Works, tested, used in real sessions. Python and JavaScript/TypeScript are the best-supported languages. Public benchmark proof exists for the current suite, but broader repo coverage is still growing. API may change before 1.0.
+> **Status: alpha (v0.3.11).** Works, tested, used in real sessions. Python and JavaScript/TypeScript are the best-supported languages. Public benchmark proof exists for the current suite, but broader repo coverage is still growing. API may change before 1.0.
 >
 > **Platform note:** macOS, Linux, and Windows are supported. Windows support targets PowerShell plus Git for Windows. `cmd.exe` and bare Git setups are not a supported path yet.
@@ -64,6 +65,7 @@ Use AgentPack when a repo is too large to paste and you want faster, more consis
 - [Quality Bar](#quality-bar)
 - [Download Stats](#download-stats)
 - [Debugging Selection](#debugging-selection)
+- [Task Router](#task-router)
 - [Supported Integrations](#supported-integrations)
 - [Commands](#commands)
 - [Architecture](#architecture)
@@ -78,6 +80,7 @@ Use AgentPack when a repo is too large to paste and you want faster, more consis
 - **Local code intelligence**: extracts roles, domains, entrypoints, definitions, dependencies, env reads, side effects, and external systems using static analysis.
 - **Semantic repo map**: adds a compact module-level map before file context so agents orient faster.
 - **Freshness and deltas**: records task source, git state, snapshot hashes, selected-file deltas, stale-context warnings, MCP auto-refresh signals, and a machine-readable `agentpack:freshness` block in markdown fallback artifacts.
+- **Task router**: MCP and CLI surfaces route a task to relevant files, scoped rules, installed skills, suggested commands, and safety warnings without executing skills automatically.
 - **Agent integrations**: installs Claude Code, Cursor, Windsurf, Codex, Antigravity, VS Code tasks, git hooks, and MCP configuration.
 - **Local and measurable**: no API calls for scan, summarize, rank, pack, stats, or benchmark; quality is measured with expected-file evals.
@@ -291,6 +294,40 @@ agentpack guard --agent auto --repair-stale --refresh-context
 `guard` checks pack freshness, task freshness, repo snapshot freshness, and installed agent rules/hooks. With `--repair-stale --refresh-context`, it repairs stale AgentPack rule files and refreshes missing or stale context before returning success. `agentpack pack` also self-heals stale AgentPack rule blocks for the active agent, so older installs that still run `pack` get upgraded opportunistically.
+## Task Router
+AgentPack Router is the MCP-first path for agents that need a task map before loading full context. It returns:
+- files to read first
+- repo and tool rules to apply
+- installed skills to consider
+- commands to consider, never execute automatically
+- safety warnings for external side-effect skills
+- an agent-ready prompt block
+Use MCP when available:
+```text
+route_task("fix flaky payment webhook test")
+```
+Use CLI for inspection or scripting:
+```bash
+agentpack skills scan
+agentpack skills index
+agentpack route --task "fix flaky payment webhook test"
+agentpack route --task "fix flaky payment webhook test" --format json
+```
+Router reads skills and rules from `.claude/skills/`, `~/.claude/skills/`, `~/.codex/skills/`, `~/.agents/skills/`, `.agentpack/skills/`, `.cursor/rules/`, `AGENTS.md`, `CLAUDE.md`, and `GEMINI.md`. Rules are mandatory scoped instructions; skills are optional recommendations. The local `.agentpack/skills_index.json` stores metadata only and omits raw skill/rule bodies.
+Safety defaults:
+- skills are recommended, not executed
+- suggested commands are returned as strings with reasons
+- external side-effect skills, such as deploy or cloud mutation checklists, are warned and not selected unless explicitly allowed in config
 ## Before / After Agent Behavior
 Without AgentPack:
@@ -597,10 +634,14 @@ Command map:
 | `agentpack install` | Refresh or add an agent integration without changing project state |
 | `agentpack repair` | Restore missing or drifted integration files |
 | `agentpack pack` | Generate a ranked context pack for one task |
+| `agentpack route` | Route a task to files, rules, skills, commands, and safety warnings |
+| `agentpack skills scan` | Print discovered local/global skills and rules |
+| `agentpack skills index` | Write `.agentpack/skills_index.json` metadata for faster routing |
 | `agentpack watch` | Keep the context pack fresh while you work |
 | `agentpack doctor` | Audit hooks, agent files, CLI path, and repo health |
 | `agentpack explain` | Understand why a file was selected or omitted |
 | `agentpack benchmark` | Measure recall, precision, and misses against real tasks |
+| `agentpack eval` | Run deterministic failure evals with tests, diff limits, and taxonomy labels |
 | `agentpack tune` | Suggest fixes from recent pack metrics and benchmark misses |
 | `agentpack status` | Inspect current pack freshness and metadata |
 | `agentpack diff` | Show what changed between context snapshots |
@@ -893,6 +934,32 @@ This keeps unrelated dirty files from consuming the whole context budget while p
 ---
+### `agentpack route`
+Route a task without writing context files. This is the CLI debug/admin surface for the same router used by MCP `route_task`.
+```bash
+agentpack route --task "fix flaky payment webhook test"
+agentpack route --task "fix flaky payment webhook test" --format json
+```
+Output includes relevant files, applied rules, recommended skills, suggested commands, safety warnings, and an agent prompt. It uses the existing AgentPack file ranker in memory and does not write `.agentpack/context.md`.
+---
+### `agentpack skills`
+Inspect or index installed skills and rule files.
+```bash
+agentpack skills scan
+agentpack skills index
+```
+`scan` prints discovered artifacts. `index` writes `.agentpack/skills_index.json` with metadata only; raw skill and rule bodies are omitted from the index.
+---
 ### `agentpack quickstart`
 Show the shortest useful path for the current repo.
@@ -984,6 +1051,9 @@ Register in Claude Code settings (`~/.claude/settings.json`):
 | Tool | Description |
 |---|---|
+| `route_task(task)` | Read-only task router. Returns relevant files, applied rules, recommended skills, suggested commands, safety warnings, and an agent prompt as JSON. |
+| `get_skills()` | Return discovered skill/rule inventory as JSON. |
+| `explain_route(task)` | Return route JSON with positive skill score reasons for debugging router choices. |
 | `start_task(task, mode, budget, max_tokens)` | Recommended MCP-first entry point. Writes `.agentpack/task.md`, generates a ranked pack, and returns packed markdown. |
 | `pack_context(task, mode, budget, max_tokens)` | Generate a ranked context pack. If `task` is provided, writes it to `.agentpack/task.md`; if omitted, reads `task.md` or infers from git. |
 | `get_context()` | Return the latest pack. If `.agentpack/task.md` or the repo snapshot differs from the packed metadata, it auto-refreshes before returning; otherwise it prepends a freshness header. |
@@ -1211,6 +1281,82 @@ This command does not pretend a pack is correct. It gives the next thing to insp
 ---
+### `agentpack eval`
+Run deterministic failure evals. AgentPack does not run the coding agent and
+does not use an LLM judge; it verifies the current or replayed worktree with
+commands and diff policies.
+```bash
+agentpack eval --init
+# edit .agentpack/evals.toml with real failures and checks
+agentpack eval
+agentpack eval --case auth-timeout --prove-targets
+agentpack eval --capture auth-timeout --failure-class context --check "pytest tests/test_auth.py -q"
+agentpack eval --watch --until-pass
+agentpack eval --replay --prove-targets
+agentpack eval --variant baseline
+agentpack eval --variant agentpack
+agentpack eval --compare-variants baseline:agentpack
+agentpack eval --ci-template
+agentpack eval --report
+```
+Example case:
+```toml
+[[cases]]
+id = "auth-timeout"
+task = "fix auth token timeout"
+failure_class = "context"
+failure_source = "agent_failed"
+base_ref = "HEAD"
+patch_file = ".agentpack/evals/auth-timeout.patch"
+required_changed_files = ["src/auth/token.py"]
+forbidden_changed_files = ["src/db/**"]
+max_changed_files = 5
+max_changed_lines = 250
+agent = "codex"
+context_file = ".agentpack/context.md"
+context_hash = "..."
+selected_files = ["src/auth/token.py", "tests/test_auth.py"]
+[[cases.checks]]
+name = "tests"
+command = "pytest tests/test_auth.py -q"
+timeout_s = 120
+retries = 1 # optional, marks pass-after-fail checks as flaky
+```
+Use `eval` after an agent run: capture the real failure, add deterministic
+checks such as tests, typecheck, lint, schema validation, API contract tests,
+diff size, forbidden files, or golden outputs, then rerun until the harness
+passes. The model can propose; the harness must verify.
+For hands-free local iteration, keep `agentpack eval --watch --until-pass`
+running in a terminal while the agent or developer edits. It reruns when the
+case file, patch artifacts, golden files, or git diff content changes and stops
+when all deterministic checks pass. `--capture` stores the current patch under
+`.agentpack/evals/<case-id>.patch` plus context metadata; `--replay` checks out
+`base_ref` into an isolated git worktree, applies that patch, and runs the same
+deterministic checks there. To measure AgentPack's contribution, run the same
+case with `--variant baseline` and then with `--variant agentpack`;
+`--compare-variants baseline:agentpack` reports which cases improved, regressed,
+stayed unchanged, or still need both sides. Use `--ci-template` to scaffold a
+GitHub Actions workflow for `benchmarks/evals.toml`.
+Eval files are executable trust boundaries: commands in `checks.command` run
+locally and in CI. Review eval TOML from contributors with the same care as
+shell scripts or workflow files.
+Captured patch artifacts are secret-scanned with the same local redactor used
+for context packs before they are written. If a patch line contains a real
+secret, the artifact stores `[REDACTED:<type>]` and the case records
+`patch_redaction_warnings`. Secret-bearing patches may replay with redacted
+values; replace secrets with safe fixture values when exact replay matters.
+---
 ### `agentpack status`
 Check whether the context pack is stale.
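
Reviewer note on the `agentpack eval` example case in the hunk above: the diff policies it lists (`required_changed_files`, `forbidden_changed_files`, `max_changed_files`, `max_changed_lines`) can be read as simple checks over the set of changed paths. The sketch below is only an illustration of that reading, not AgentPack's implementation. The helper name `check_diff_policy`, its input shape, and the use of `fnmatch` for glob matching are all assumptions.

```python
from fnmatch import fnmatch


def check_diff_policy(changed, case):
    """Return a list of policy violations for a changed-file map.

    `changed` maps a repo-relative path to its changed-line count.
    `case` is a dict using the field names from the example TOML case.
    Hypothetical helper; not part of the agentpack package.
    """
    violations = []
    for required in case.get("required_changed_files", []):
        if required not in changed:
            violations.append(f"required file not changed: {required}")
    for pattern in case.get("forbidden_changed_files", []):
        for path in changed:
            # Assumption: patterns are shell-style globs matched with fnmatch.
            if fnmatch(path, pattern):
                violations.append(f"forbidden file changed: {path}")
    max_files = case.get("max_changed_files")
    if max_files is not None and len(changed) > max_files:
        violations.append(f"too many changed files: {len(changed)} > {max_files}")
    max_lines = case.get("max_changed_lines")
    total_lines = sum(changed.values())
    if max_lines is not None and total_lines > max_lines:
        violations.append(f"too many changed lines: {total_lines} > {max_lines}")
    return violations


# Field values copied from the example case in the README hunk.
case = {
    "required_changed_files": ["src/auth/token.py"],
    "forbidden_changed_files": ["src/db/**"],
    "max_changed_files": 5,
    "max_changed_lines": 250,
}

ok = check_diff_policy({"src/auth/token.py": 12, "tests/test_auth.py": 30}, case)
bad = check_diff_policy({"src/db/models.py": 3}, case)
```

Here `ok` is empty, while `bad` reports both the missing required file and the forbidden path. A real harness would derive the changed-file map from the git diff rather than take it as an argument.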

{agentpack_cli-0.3.9 → agentpack_cli-0.3.11}/README.md RENAMED

@@ -1,13 +1,14 @@
 # AgentPack
 [![PyPI version](https://img.shields.io/pypi/v/agentpack-cli.svg)](https://pypi.org/project/agentpack-cli/)
+[![PyPI Downloads](https://static.pepy.tech/personalized-badge/agentpack-cli?period=total&units=INTERNATIONAL_SYSTEM&left_color=BLACK&right_color=GREEN&left_text=downloads)](https://pepy.tech/projects/agentpack-cli)
 [![npm version](https://img.shields.io/npm/v/@vishal2612200/agentpack.svg)](https://www.npmjs.com/package/@vishal2612200/agentpack)
 [![npm downloads](https://img.shields.io/npm/dm/@vishal2612200/agentpack.svg)](https://www.npmjs.com/package/@vishal2612200/agentpack)
 [![Python versions](https://img.shields.io/pypi/pyversions/agentpack-cli.svg)](https://pypi.org/project/agentpack-cli/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![CI](https://github.com/vishal2612200/agentpack/actions/workflows/ci.yml/badge.svg)](https://github.com/vishal2612200/agentpack/actions/workflows/ci.yml)
-> **Status: alpha (v0.3.9).** Works, tested, used in real sessions. Python and JavaScript/TypeScript are the best-supported languages. Public benchmark proof exists for the current suite, but broader repo coverage is still growing. API may change before 1.0.
+> **Status: alpha (v0.3.11).** Works, tested, used in real sessions. Python and JavaScript/TypeScript are the best-supported languages. Public benchmark proof exists for the current suite, but broader repo coverage is still growing. API may change before 1.0.
 >
 > **Platform note:** macOS, Linux, and Windows are supported. Windows support targets PowerShell plus Git for Windows. `cmd.exe` and bare Git setups are not a supported path yet.
@@ -25,6 +26,7 @@ Use AgentPack when a repo is too large to paste and you want faster, more consis
 - [Quality Bar](#quality-bar)
 - [Download Stats](#download-stats)
 - [Debugging Selection](#debugging-selection)
+- [Task Router](#task-router)
 - [Supported Integrations](#supported-integrations)
 - [Commands](#commands)
 - [Architecture](#architecture)
@@ -39,6 +41,7 @@ Use AgentPack when a repo is too large to paste and you want faster, more consis
 - **Local code intelligence**: extracts roles, domains, entrypoints, definitions, dependencies, env reads, side effects, and external systems using static analysis.
 - **Semantic repo map**: adds a compact module-level map before file context so agents orient faster.
 - **Freshness and deltas**: records task source, git state, snapshot hashes, selected-file deltas, stale-context warnings, MCP auto-refresh signals, and a machine-readable `agentpack:freshness` block in markdown fallback artifacts.
+- **Task router**: MCP and CLI surfaces route a task to relevant files, scoped rules, installed skills, suggested commands, and safety warnings without executing skills automatically.
 - **Agent integrations**: installs Claude Code, Cursor, Windsurf, Codex, Antigravity, VS Code tasks, git hooks, and MCP configuration.
 - **Local and measurable**: no API calls for scan, summarize, rank, pack, stats, or benchmark; quality is measured with expected-file evals.
@@ -252,6 +255,40 @@ agentpack guard --agent auto --repair-stale --refresh-context
 `guard` checks pack freshness, task freshness, repo snapshot freshness, and installed agent rules/hooks. With `--repair-stale --refresh-context`, it repairs stale AgentPack rule files and refreshes missing or stale context before returning success. `agentpack pack` also self-heals stale AgentPack rule blocks for the active agent, so older installs that still run `pack` get upgraded opportunistically.
+## Task Router
+AgentPack Router is the MCP-first path for agents that need a task map before loading full context. It returns:
+- files to read first
+- repo and tool rules to apply
+- installed skills to consider
+- commands to consider, never execute automatically
+- safety warnings for external side-effect skills
+- an agent-ready prompt block
+Use MCP when available:
+```text
+route_task("fix flaky payment webhook test")
+```
+Use CLI for inspection or scripting:
+```bash
+agentpack skills scan
+agentpack skills index
+agentpack route --task "fix flaky payment webhook test"
+agentpack route --task "fix flaky payment webhook test" --format json
+```
+Router reads skills and rules from `.claude/skills/`, `~/.claude/skills/`, `~/.codex/skills/`, `~/.agents/skills/`, `.agentpack/skills/`, `.cursor/rules/`, `AGENTS.md`, `CLAUDE.md`, and `GEMINI.md`. Rules are mandatory scoped instructions; skills are optional recommendations. The local `.agentpack/skills_index.json` stores metadata only and omits raw skill/rule bodies.
+Safety defaults:
+- skills are recommended, not executed
+- suggested commands are returned as strings with reasons
+- external side-effect skills, such as deploy or cloud mutation checklists, are warned and not selected unless explicitly allowed in config
 ## Before / After Agent Behavior
 Without AgentPack:
@@ -558,10 +595,14 @@ Command map:
 | `agentpack install` | Refresh or add an agent integration without changing project state |
 | `agentpack repair` | Restore missing or drifted integration files |
 | `agentpack pack` | Generate a ranked context pack for one task |
+| `agentpack route` | Route a task to files, rules, skills, commands, and safety warnings |
+| `agentpack skills scan` | Print discovered local/global skills and rules |
+| `agentpack skills index` | Write `.agentpack/skills_index.json` metadata for faster routing |
 | `agentpack watch` | Keep the context pack fresh while you work |
 | `agentpack doctor` | Audit hooks, agent files, CLI path, and repo health |
 | `agentpack explain` | Understand why a file was selected or omitted |
 | `agentpack benchmark` | Measure recall, precision, and misses against real tasks |
+| `agentpack eval` | Run deterministic failure evals with tests, diff limits, and taxonomy labels |
 | `agentpack tune` | Suggest fixes from recent pack metrics and benchmark misses |
 | `agentpack status` | Inspect current pack freshness and metadata |
 | `agentpack diff` | Show what changed between context snapshots |
@@ -854,6 +895,32 @@ This keeps unrelated dirty files from consuming the whole context budget while p
 ---
+### `agentpack route`
+Route a task without writing context files. This is the CLI debug/admin surface for the same router used by MCP `route_task`.
+```bash
+agentpack route --task "fix flaky payment webhook test"
+agentpack route --task "fix flaky payment webhook test" --format json
+```
+Output includes relevant files, applied rules, recommended skills, suggested commands, safety warnings, and an agent prompt. It uses the existing AgentPack file ranker in memory and does not write `.agentpack/context.md`.
+---
+### `agentpack skills`
+Inspect or index installed skills and rule files.
+```bash
+agentpack skills scan
+agentpack skills index
+```
+`scan` prints discovered artifacts. `index` writes `.agentpack/skills_index.json` with metadata only; raw skill and rule bodies are omitted from the index.
+---
 ### `agentpack quickstart`
 Show the shortest useful path for the current repo.
@@ -945,6 +1012,9 @@ Register in Claude Code settings (`~/.claude/settings.json`):
 | Tool | Description |
 |---|---|
+| `route_task(task)` | Read-only task router. Returns relevant files, applied rules, recommended skills, suggested commands, safety warnings, and an agent prompt as JSON. |
+| `get_skills()` | Return discovered skill/rule inventory as JSON. |
+| `explain_route(task)` | Return route JSON with positive skill score reasons for debugging router choices. |
 | `start_task(task, mode, budget, max_tokens)` | Recommended MCP-first entry point. Writes `.agentpack/task.md`, generates a ranked pack, and returns packed markdown. |
 | `pack_context(task, mode, budget, max_tokens)` | Generate a ranked context pack. If `task` is provided, writes it to `.agentpack/task.md`; if omitted, reads `task.md` or infers from git. |
 | `get_context()` | Return the latest pack. If `.agentpack/task.md` or the repo snapshot differs from the packed metadata, it auto-refreshes before returning; otherwise it prepends a freshness header. |
@@ -1172,6 +1242,82 @@ This command does not pretend a pack is correct. It gives the next thing to insp
 ---
+### `agentpack eval`
+Run deterministic failure evals. AgentPack does not run the coding agent and
+does not use an LLM judge; it verifies the current or replayed worktree with
+commands and diff policies.
+```bash
+agentpack eval --init
+# edit .agentpack/evals.toml with real failures and checks
+agentpack eval
+agentpack eval --case auth-timeout --prove-targets
+agentpack eval --capture auth-timeout --failure-class context --check "pytest tests/test_auth.py -q"
+agentpack eval --watch --until-pass
+agentpack eval --replay --prove-targets
+agentpack eval --variant baseline
+agentpack eval --variant agentpack
+agentpack eval --compare-variants baseline:agentpack
+agentpack eval --ci-template
+agentpack eval --report
+```
+Example case:
+```toml
+[[cases]]
+id = "auth-timeout"
+task = "fix auth token timeout"
+failure_class = "context"
+failure_source = "agent_failed"
+base_ref = "HEAD"
+patch_file = ".agentpack/evals/auth-timeout.patch"
+required_changed_files = ["src/auth/token.py"]
+forbidden_changed_files = ["src/db/**"]
+max_changed_files = 5
+max_changed_lines = 250
+agent = "codex"
+context_file = ".agentpack/context.md"
+context_hash = "..."
+selected_files = ["src/auth/token.py", "tests/test_auth.py"]
+[[cases.checks]]
+name = "tests"
+command = "pytest tests/test_auth.py -q"
+timeout_s = 120
+retries = 1 # optional, marks pass-after-fail checks as flaky
+```
+Use `eval` after an agent run: capture the real failure, add deterministic
+checks such as tests, typecheck, lint, schema validation, API contract tests,
+diff size, forbidden files, or golden outputs, then rerun until the harness
+passes. The model can propose; the harness must verify.
+For hands-free local iteration, keep `agentpack eval --watch --until-pass`
+running in a terminal while the agent or developer edits. It reruns when the
+case file, patch artifacts, golden files, or git diff content changes and stops
+when all deterministic checks pass. `--capture` stores the current patch under
+`.agentpack/evals/<case-id>.patch` plus context metadata; `--replay` checks out
+`base_ref` into an isolated git worktree, applies that patch, and runs the same
+deterministic checks there. To measure AgentPack's contribution, run the same
+case with `--variant baseline` and then with `--variant agentpack`;
+`--compare-variants baseline:agentpack` reports which cases improved, regressed,
+stayed unchanged, or still need both sides. Use `--ci-template` to scaffold a
+GitHub Actions workflow for `benchmarks/evals.toml`.
+Eval files are executable trust boundaries: commands in `checks.command` run
+locally and in CI. Review eval TOML from contributors with the same care as
+shell scripts or workflow files.
+Captured patch artifacts are secret-scanned with the same local redactor used
+for context packs before they are written. If a patch line contains a real
+secret, the artifact stores `[REDACTED:<type>]` and the case records
+`patch_redaction_warnings`. Secret-bearing patches may replay with redacted
+values; replace secrets with safe fixture values when exact replay matters.
+---
 ### `agentpack status`
 Check whether the context pack is stale.
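
Reviewer note on `--compare-variants baseline:agentpack` in the hunk above: the README says the comparison reports which cases improved, regressed, stayed unchanged, or still need both sides. One way to picture that classification is sketched below. It assumes each variant's results reduce to a map of case id to pass/fail. The function name `compare_variants` and the label strings are hypothetical and are not taken from the package.

```python
def compare_variants(baseline, candidate):
    """Classify each case id by comparing pass/fail results of two variants.

    Both arguments map a case id to True (passed) or False (failed).
    Hypothetical helper; the labels mirror the README wording only loosely.
    """
    report = {}
    for case_id in sorted(set(baseline) | set(candidate)):
        if case_id not in baseline or case_id not in candidate:
            # Only one variant has a result, so no comparison is possible yet.
            report[case_id] = "needs_both"
        elif baseline[case_id] == candidate[case_id]:
            report[case_id] = "unchanged"
        elif candidate[case_id]:
            report[case_id] = "improved"
        else:
            report[case_id] = "regressed"
    return report


baseline_results = {"auth-timeout": False, "webhook-flaky": True, "cache-miss": True}
agentpack_results = {"auth-timeout": True, "webhook-flaky": True, "new-case": False}

report = compare_variants(baseline_results, agentpack_results)
```

With these inputs, `auth-timeout` is classified as improved, `webhook-flaky` as unchanged, and both `cache-miss` and `new-case` as needing the other side.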

{agentpack_cli-0.3.9 → agentpack_cli-0.3.11}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "agentpack-cli"
-version = "0.3.9"
+version = "0.3.11"
 description = "Local context engine for AI coding agents that ranks relevant files and builds task-focused context packs."
 readme = "README.md"
 requires-python = ">=3.10"

{agentpack_cli-0.3.9 → agentpack_cli-0.3.11}/src/agentpack/__init__.py RENAMED

@@ -1,3 +1,3 @@
 """AgentPack — task-aware context packing for AI coding agents."""
-__version__ = "0.3.9"
+__version__ = "0.3.11"

{agentpack_cli-0.3.9 → agentpack_cli-0.3.11}/src/agentpack/cli.py RENAMED

@@ -6,6 +6,7 @@ from agentpack.commands import (
     claude_cmd,
     diff,
     doctor,
+    eval_cmd,
     explain,
     guard,
     hook_cmd,
@@ -18,7 +19,9 @@ from agentpack.commands import (
     pack,
     quickstart,
     repair,
+    route,
     scan,
+    skills,
     stats,
     status,
     summarize,
@@ -55,11 +58,13 @@ for mod in [
     pack,
     install,
     repair,
+    route,
     migrate,
     monitor,
     explain,
     guard,
     doctor,
+    eval_cmd,
     tune,
     watch,
     claude_cmd,
@@ -67,6 +72,7 @@ for mod in [
     mcp_cmd,
     hook_cmd,
     quickstart,
+    skills,
 ]:
     mod.register(app)
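
Reviewer note on the `cli.py` hunks above: the three new modules (`route`, `eval_cmd`, `skills`) are wired in through the existing loop that calls `mod.register(app)` on each command module. The sketch below reproduces that pattern in isolation to show why adding a command only requires an import and a list entry. The `App` class and `make_module` helper are stand-ins invented for this example; the real `app` object and the module internals are not shown in the diff.

```python
from types import SimpleNamespace


class App:
    """Tiny stand-in for the CLI app object (assumption: not the real framework)."""

    def __init__(self):
        self.commands = {}

    def command(self, name):
        # Decorator factory that records a callable under a command name.
        def decorator(func):
            self.commands[name] = func
            return func
        return decorator


def make_module(name):
    """Build a fake command module exposing a `register(app)` hook."""
    def register(app):
        @app.command(name)
        def run():
            return f"ran {name}"
    return SimpleNamespace(register=register)


app = App()
route, eval_cmd, skills = (make_module(n) for n in ("route", "eval", "skills"))

# Same shape as the loop in cli.py: each module attaches its own commands.
for mod in [route, eval_cmd, skills]:
    mod.register(app)

print(sorted(app.commands))  # prints ['eval', 'route', 'skills']
```

Keeping registration inside each module means `cli.py` only needs to know the module list, which matches the small size of the hunks in this file.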
