agentpack-cli 0.1.29__tar.gz → 0.2.0__tar.gz

Files changed (93)
  1. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/PKG-INFO +270 -436
  2. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/README.md +269 -435
  3. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/pyproject.toml +1 -1
  4. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/__init__.py +1 -1
  5. agentpack_cli-0.2.0/src/agentpack/analysis/monorepo.py +181 -0
  6. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/analysis/ranking.py +164 -2
  7. agentpack_cli-0.2.0/src/agentpack/analysis/repo_map.py +94 -0
  8. agentpack_cli-0.2.0/src/agentpack/analysis/task_classifier.py +48 -0
  9. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/application/pack_service.py +444 -15
  10. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/cli.py +42 -2
  11. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/benchmark.py +183 -8
  12. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/doctor.py +48 -1
  13. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/explain.py +43 -1
  14. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/hook_cmd.py +20 -1
  15. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/init.py +2 -36
  16. agentpack_cli-0.2.0/src/agentpack/commands/install.py +243 -0
  17. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/pack.py +16 -4
  18. agentpack_cli-0.2.0/src/agentpack/commands/repair.py +40 -0
  19. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/scan.py +45 -5
  20. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/stats.py +158 -18
  21. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/status.py +29 -1
  22. agentpack_cli-0.2.0/src/agentpack/commands/tune.py +158 -0
  23. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/config.py +8 -0
  24. agentpack_cli-0.2.0/src/agentpack/core/context_pack.py +535 -0
  25. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/git.py +39 -0
  26. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/models.py +9 -1
  27. agentpack_cli-0.2.0/src/agentpack/integrations/agents.py +233 -0
  28. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/mcp_server.py +39 -0
  29. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/renderers/compact.py +14 -1
  30. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/renderers/markdown.py +25 -4
  31. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/summaries/offline.py +96 -2
  32. agentpack_cli-0.1.29/src/agentpack/commands/install.py +0 -294
  33. agentpack_cli-0.1.29/src/agentpack/core/context_pack.py +0 -250
  34. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/.gitignore +0 -0
  35. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/LICENSE +0 -0
  36. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/adapters/__init__.py +0 -0
  37. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/adapters/antigravity.py +0 -0
  38. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/adapters/base.py +0 -0
  39. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/adapters/claude.py +0 -0
  40. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/adapters/codex.py +0 -0
  41. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/adapters/cursor.py +0 -0
  42. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/adapters/detect.py +0 -0
  43. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/adapters/generic.py +0 -0
  44. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/adapters/windsurf.py +0 -0
  45. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/analysis/__init__.py +0 -0
  46. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/analysis/dependency_graph.py +0 -0
  47. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/analysis/go_imports.py +0 -0
  48. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/analysis/java_imports.py +0 -0
  49. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/analysis/js_ts_imports.py +0 -0
  50. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/analysis/python_imports.py +0 -0
  51. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/analysis/rust_imports.py +0 -0
  52. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/analysis/symbols.py +0 -0
  53. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/analysis/tests.py +0 -0
  54. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/application/__init__.py +0 -0
  55. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/__init__.py +0 -0
  56. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/_shared.py +0 -0
  57. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/claude_cmd.py +0 -0
  58. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/diff.py +0 -0
  59. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/mcp_cmd.py +0 -0
  60. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/monitor.py +0 -0
  61. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/quickstart.py +0 -0
  62. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/summarize.py +0 -0
  63. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/commands/watch.py +0 -0
  64. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/__init__.py +0 -0
  65. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/bootstrap.py +0 -0
  66. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/cache.py +0 -0
  67. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/diff.py +0 -0
  68. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/git_hooks.py +0 -0
  69. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/global_install.py +0 -0
  70. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/ignore.py +0 -0
  71. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/merkle.py +0 -0
  72. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/redactor.py +0 -0
  73. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/scanner.py +0 -0
  74. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/snapshot.py +0 -0
  75. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/token_estimator.py +0 -0
  76. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/core/vscode_tasks.py +0 -0
  77. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/data/agentpack.md +0 -0
  78. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/installers/__init__.py +0 -0
  79. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/installers/antigravity.py +0 -0
  80. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/installers/claude.py +0 -0
  81. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/installers/codex.py +0 -0
  82. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/installers/cursor.py +0 -0
  83. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/installers/windsurf.py +0 -0
  84. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/integrations/__init__.py +0 -0
  85. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/integrations/git_hooks.py +0 -0
  86. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/integrations/global_install.py +0 -0
  87. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/integrations/vscode_tasks.py +0 -0
  88. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/renderers/__init__.py +0 -0
  89. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/renderers/receipts.py +0 -0
  90. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/session/__init__.py +0 -0
  91. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/session/state.py +0 -0
  92. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/summaries/__init__.py +0 -0
  93. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/src/agentpack/summaries/base.py +0 -0
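
The file list above follows a regular pattern: an optional index, a path (wrapped in `{old → new}` when the file exists in both versions), and `+added -removed` line counts. As a rough illustration only, the sketch below tallies those counts. The names `ENTRY`, `parse_entry`, and `tally` are hypothetical helpers written for this note; they are not part of agentpack or of the registry's diff tooling, and the pattern is an assumption based on the entries shown here.

```python
import re

# Assumed entry shapes, taken from the list above:
#   "  3. {pkg-0.1.29 → pkg-0.2.0}/pyproject.toml +1 -1"   (file in both versions)
#   "  5. pkg-0.2.0/src/new_module.py +181 -0"             (file in one version only)
ENTRY = re.compile(
    r"^\s*(?:\d+\.\s+)?(?P<path>\S.*?)\s+\+(?P<added>\d+)\s+-(?P<removed>\d+)\s*$"
)


def parse_entry(line):
    """Return (path, added, removed) for a file-list line, or None if it does not match."""
    match = ENTRY.match(line)
    if match is None:
        return None
    return match.group("path"), int(match.group("added")), int(match.group("removed"))


def tally(lines):
    """Sum added and removed line counts and count entries with no changes."""
    added = removed = unchanged = 0
    for line in lines:
        parsed = parse_entry(line)
        if parsed is None:
            continue  # skip anything that is not a file-list entry
        _, plus, minus = parsed
        added += plus
        removed += minus
        if plus == 0 and minus == 0:
            unchanged += 1
    return added, removed, unchanged


sample = [
    "  3. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/pyproject.toml +1 -1",
    "  5. agentpack_cli-0.2.0/src/agentpack/analysis/monorepo.py +181 -0",
    "  34. {agentpack_cli-0.1.29 → agentpack_cli-0.2.0}/.gitignore +0 -0",
]
print(tally(sample))  # (182, 1, 1)
```

Running `tally` over the full list would separate the files that actually changed from the `+0 -0` entries, which only record that a file moved between the two archive roots unchanged.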
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: agentpack-cli
3
- Version: 0.1.29
3
+ Version: 0.2.0
4
4
  Summary: Task-aware context packing for AI coding agents — Claude, Cursor, Windsurf, Codex, and Antigravity
5
5
  License: MIT
6
6
  License-File: LICENSE
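
Each hunk in this diff opens with a header such as `@@ -1,6 +1,6 @@`, which in unified diff format gives the start line and length of the hunk in the old file, then in the new file. The following is a minimal sketch of reading those numbers; `HUNK` and `parse_hunk_header` are hypothetical names introduced here for illustration, not functions from agentpack.

```python
import re

# Unified diff hunk header: "@@ -old_start[,old_len] +new_start[,new_len] @@ optional context"
HUNK = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (old_start, old_len, new_start, new_len) for a hunk header line."""
    match = HUNK.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_len, new_start, new_len = match.groups()
    # In unified diff format an omitted length means the hunk covers one line.
    return (
        int(old_start),
        int(old_len) if old_len is not None else 1,
        int(new_start),
        int(new_len) if new_len is not None else 1,
    )


header = "@@ -44,156 +44,130 @@ Description-Content-Type: text/markdown"
old_start, old_len, new_start, new_len = parse_hunk_header(header)
print(new_len - old_len)  # -26
```

For the hunk that follows, the result means the rewritten README region is 26 lines shorter than the region it replaces.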
@@ -44,156 +44,130 @@ Description-Content-Type: text/markdown
44
44
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
45
45
  [![CI](https://github.com/vishal2612200/agentpack/actions/workflows/ci.yml/badge.svg)](https://github.com/vishal2612200/agentpack/actions/workflows/ci.yml)
46
46
 
47
- > **Status: alpha (v0.1.29).** Works, tested, used in real sessions. Python and JavaScript/TypeScript are the best-supported languages. Not yet validated across a wide range of repos. API may change before 1.0.
47
+ > **Status: alpha (v0.2.0).** Works, tested, used in real sessions. Python and JavaScript/TypeScript are the best-supported languages. Not yet validated across a wide range of repos. API may change before 1.0.
48
48
  >
49
49
  > **Platform note:** macOS and Linux are fully supported. Windows support is not yet implemented (git hooks use POSIX shell; the Claude Code session hooks use `python3`/`rm -f`). Contributions welcome.
50
50
 
51
- **Task-aware context packing for AI coding agents.**
51
+ **Local context engine for AI coding agents.**
52
52
 
53
- AgentPack scans a repository, ranks files for the task you are working on, and writes a compact markdown context pack for Claude Code, Cursor, Windsurf, Codex, Antigravity, CI jobs, or any LLM workflow.
53
+ AgentPack builds task-focused context packs for Claude Code, Cursor, Windsurf, Codex, Antigravity, CI jobs, and any LLM workflow that can read markdown. It scans your repo locally, ranks files for the task, compresses the result into a token budget, and keeps the pack fresh through CLI commands, MCP tools, hooks, and agent integrations.
54
54
 
55
- It is useful when the repo is too large to paste, but you still want the agent to start with more than a blank slate.
55
+ AgentPack is useful when a repo is too large to paste, but a blank agent session wastes time rediscovering the same code structure. It is a context preparation tool, not a coding agent.
56
56
 
57
- **What it is**
58
- - A local CLI for building task-focused context packs
59
- - A summary cache, import graph, ranking engine, and token-budget selector
60
- - Optional integrations for popular coding agents
61
- - An eval harness for measuring whether selected files match files you actually changed
57
+ ## Contents
62
58
 
63
- **What it is not**
64
- - Not a coding agent
65
- - Not a semantic code search engine
66
- - Not a replacement for manual inspection on high-stakes changes
67
- - Not yet proven across a large public benchmark suite
59
+ - [Features](#features)
60
+ - [Install](#install)
61
+ - [Quickstart](#quickstart)
62
+ - [Quality Bar](#quality-bar)
63
+ - [Debugging Selection](#debugging-selection)
64
+ - [Supported Integrations](#supported-integrations)
65
+ - [Commands](#commands)
66
+ - [Architecture](#architecture)
67
+ - [Known Limitations](#known-limitations)
68
+ - [Roadmap](#roadmap)
69
+ - [Development](#development)
68
70
 
69
- ---
70
-
71
- ## The problem
72
-
73
- Every time you start a task with an AI coding agent, it has no idea what's in your repo. It either:
71
+ ## Features
74
72
 
75
- 1. **Reads files on demand** (Claude Code, Cursor, Windsurf): dozens of tool calls, paying exploration cost every session, every turn, forever.
76
- 2. **Gets the whole repo dumped in** (repomix, gitingest): 50k–500k tokens of noise, most of it irrelevant to the task at hand.
77
- 3. **Gets nothing**: you hand-copy the 5 files you think matter and hope you got it right.
73
+ - **Task-focused packing**: ranks files from git changes, task terms, symbols, imports, related tests, configs, churn, and repo history.
74
+ - **Budget-aware compression**: emits `full`, `diff`, `symbols`, `skeleton`, or `summary` views instead of all-or-nothing file dumps.
75
+ - **Semantic repo map**: adds a compact module-level map before file context so agents orient faster.
76
+ - **Freshness and deltas**: records task source, git state, snapshot hashes, selected-file deltas, and stale-context warnings.
77
+ - **Agent integrations**: installs Claude Code, Cursor, Windsurf, Codex, Antigravity, VS Code tasks, git hooks, and MCP configuration.
78
+ - **Local and measurable**: no API calls for scan, summarize, rank, pack, stats, or benchmark; quality is measured with expected-file evals.
78
79
 
79
- None of these scale. On a 200-file codebase, option 1 wastes 5–10 turns just orienting. Option 2 degrades output quality (LLMs perform worse on long noisy context). Option 3 misses critical dependencies and configs constantly.
80
+ ## Install
80
81
 
81
- **The root cause:** agents don't know *what's relevant to your current task* without doing the work to figure that out — which costs tokens, time, and money on every session.
82
+ ```bash
83
+ pip install agentpack-cli
84
+ agentpack --version
85
+ ```
82
86
 
83
- ---
87
+ Requires Python 3.10+. The PyPI package is `agentpack-cli`; the command is `agentpack`.
84
88
 
85
- ## The solution
89
+ JavaScript-heavy teams can install the npm wrapper:
86
90
 
87
- AgentPack solves this with a one-time offline analysis pass:
91
+ ```bash
92
+ npm install -g @vishal2612200/agentpack
93
+ agentpack --version
94
+ ```
88
95
 
89
- 1. **Scans your repo once**: builds a summary cache of every file (signatures, imports, responsibilities). No API calls. Takes a few seconds.
90
- 2. **On each task** — uses git diff, import graph traversal, keyword/concept expansion, implementation-role boosts, and cross-layer relatedness to rank every file.
91
- 3. **Packs a tight context document** — changed files get full content, large changed files get relevant symbol bodies, dependencies and likely implementation files get summaries, everything else gets dropped.
92
- 4. **Explains pack quality** — noisy-pack diagnostics, score receipts, token-precision metrics, and benchmark miss reports show when the pack is broad or missing expected files.
93
- 5. **Stays current** — auto-repacks silently on commit, so next session starts fresh.
96
+ The npm package is a Node launcher around the Python implementation. It installs the matching `agentpack-cli` package into a per-version virtual environment on first run.
94
97
 
95
- The result: your agent starts with a focused map of the relevant code. It should reduce blind exploration, not replace the agent's own file reads or your judgment.
98
+ ## Quickstart
96
99
 
97
100
  ```bash
98
- pip install agentpack-cli
99
-
100
- # Show the fastest path for your repo
101
- agentpack quickstart --task "fix auth token expiry"
102
-
103
- # One-time setup per project
104
101
  cd your-project
105
- agentpack init # creates config/session/task.md + detected agent integration
106
-
107
- # Every terminal session
108
- agentpack watch # keeps context fresh automatically — that's it
102
+ agentpack init --agent codex # or claude, cursor, windsurf, antigravity
103
+ agentpack pack --task "fix auth token expiry"
109
104
  ```
110
105
 
111
- Then open Claude Code / Cursor / Windsurf / Codex / Antigravity and write your task normally. AgentPack keeps `.agentpack/context.md` current.
106
+ This creates `.agentpack/` state, installs the requested agent integration, generates a ranked context pack, and writes the adapter output for that agent. For active local work, keep context fresh with:
107
+
108
+ ```bash
109
+ agentpack watch
110
+ ```
112
111
 
113
- For power users who want background repacking on every commit and cd:
112
+ For a guided setup that explains each next step:
114
113
 
115
114
  ```bash
116
- # Advanced: global automation (opt-in repos only — never touches repos without .agentpack/)
117
- agentpack global-install --dry-run # preview first
118
- agentpack global-install
115
+ agentpack quickstart --task "fix auth token expiry"
119
116
  ```
120
117
 
121
- Supported agents: **Claude Code**, **Cursor**, **Windsurf**, **Codex**, **Antigravity** (Google), or any LLM that can read markdown.
118
+ ## Project Scope
122
119
 
123
- ---
120
+ **AgentPack is:**
121
+
122
+ - A local context engine for building task-focused packs for AI coding agents.
123
+ - A CLI, MCP server, hook runner, and integration layer.
124
+ - A summary cache, import graph, ranking engine, semantic repo map, and token-budget selector.
125
+ - An eval harness for measuring whether selected files match files you actually changed.
124
126
 
125
- ## What to expect
127
+ **AgentPack is not:**
126
128
 
127
- AgentPack's strongest value is repeatable orientation: it gives the agent a compact first-pass map before tool calls begin.
129
+ - A coding agent.
130
+ - A hosted service.
131
+ - A semantic code search engine.
132
+ - A replacement for normal source inspection on critical changes.
133
+ - Proven across a large public benchmark suite yet.
128
134
 
129
- Typical results on large repos:
135
+ ## Quality Bar
136
+
137
+ AgentPack is best treated as a **ranked starting map**. It should reduce repeated orientation work, but the agent and reviewer still own correctness.
130
138
 
131
139
  | Signal | What good looks like |
132
140
  |---|---|
133
- | Token reduction | 90-99% smaller than raw repo text |
141
+ | Token reduction | 90-99% smaller than raw repo text on large repos |
134
142
  | Pack size | Usually 8k-25k tokens for a specific task |
135
- | Pack time | Seconds on warm cache; first summarize pass is slower |
136
- | Recall | Should be high for files you later edit; validate with `agentpack benchmark` |
137
- | Precision | Often modest; summaries are cheap but can still add noise |
138
-
139
- The compression number is easy to verify, but it is not the same as usefulness. The important question is: **did AgentPack include the files you actually needed?**
143
+ | Pack time | Seconds on a warm cache; first summarize pass is slower |
144
+ | Recall | Expected files appear near the top; validate with `agentpack benchmark --misses` |
145
+ | Precision | Good enough to reduce exploration; summaries and repo maps may still include noise |
146
+ | Freshness | Stale packs are clearly marked by task, git, and snapshot checks |
140
147
 
141
- Use the built-in eval flow:
148
+ Use real repo evals instead of trusting compression numbers:
142
149
 
143
150
  ```bash
144
151
  agentpack benchmark --init
145
- # add real historical tasks and files you actually changed
152
+ # add historical tasks and files actually changed
146
153
  agentpack benchmark --compare --misses
154
+ agentpack benchmark --results-template
147
155
  ```
148
156
 
149
- For source checkouts, there is also a small smoke suite:
157
+ ## Debugging Selection
158
+
159
+ When AgentPack misses a file, the next command should explain the miss:
150
160
 
151
161
  ```bash
152
- agentpack benchmark --sample-fixtures --misses
162
+ agentpack benchmark --misses
163
+ agentpack explain --task "fix billing webhook" --file lib/billing/webhook.ts
164
+ agentpack explain --task "fix billing webhook" --omitted
165
+ agentpack explain --task "fix billing webhook" --budget-plan
153
166
  ```
154
167
 
155
- This runs FastAPI, Next.js, and mixed Python/TypeScript fixture tasks. It is a sanity check, not a substitute for real repo evals.
156
-
157
- ### Current quality bar
158
-
159
- AgentPack is best described as a **map, not a compass**. It is already good at token reduction, changed-file inclusion, related tests, imports, configs, and common concepts like auth/cache/rate limiting. Recent ranking work also improves full-stack tasks by pulling service/controller/schema/handler files when UI routes or pages match the same domain.
160
-
161
- Known weak spot: recall can still be low on unfamiliar product domains or cross-language flows. Use `benchmark --misses` and `agentpack explain` when an expected file is absent. Those commands show whether the miss was caused by ignore rules, low score, summary floor, budget cutoff, or missing task signal.
162
-
163
- ### Observed author-run numbers
168
+ `benchmark --misses` reports each expected file that was not selected, including whether it was ignored, scored too low, excluded by summary floor, cut by budget, or absent from the scan. `explain --file` shows the exact score signals for one file. `explain --budget-plan` shows how the token budget was spent across full, diff, symbols, skeleton, and summary modes.
164
169
 
165
- These are local author-session numbers, included as anecdotal context rather than a benchmark claim.
166
-
167
- #### Token Compression
168
-
169
- | Metric | Value |
170
- |--------|-------|
171
- | Sessions | 21 |
172
- | Avg compression | 99.3% |
173
- | Min / Max | 98.7% → 99.9% |
174
- | Total raw (theoretical) | 116.9M tokens |
175
- | Total packed (actual) | 683K tokens |
176
-
177
- Per session: ~4.1M raw repo → ~35K packed context.
178
-
179
- #### Cost (Sonnet 4.6, input tokens only)
180
-
181
- | Scenario | Cost |
182
- |----------|------|
183
- | Full repo dumped each run | ~$350 |
184
- | With AgentPack | ~$2.05 |
185
- | **Realistic** (10% manual cherry-pick baseline) | **~$33 saved** |
186
-
187
- > Honest note: raw_tokens = full repo estimate. Real savings depend on how much context you'd pass manually. Compression ratio (99%+) is verifiable; dollar figure is scenario-dependent.
188
-
189
- #### Quality Signal
190
-
191
- - 42 commits in 7 days (~6/day) vs 4.9/day before
192
- - Shift from single-file fixes → multi-system coordinated fixes
193
- - AgentPack provides dependent files + callers in context → fixes root cause, not symptom
194
- - Correlation observed, causation not isolated
195
-
196
- ---
170
+ This is the core reliability loop: pack, measure recall, inspect misses, then tune task wording, `.agentignore`, or scoring weights.
197
171
 
198
172
  ## When it helps
199
173
 
@@ -217,9 +191,9 @@ These are repo dumpers. They pack a repo (or subset) into a file and hand it to
217
191
 
218
192
  What they don't do: decide what's relevant to *your task*. You specify the scope — files, globs, directories — and they package your decision. If you want "only the files that matter for fixing this auth bug", you have to figure that out yourself. On a 200-file repo, that's 80% of the work.
219
193
 
220
- AgentPack does that selection automatically. You give it a task string; it uses git diff, import graph traversal, and keyword scoring to rank every file, then cuts to fit your token budget. You don't touch globs.
194
+ AgentPack does that selection automatically. You give it a task string; it uses task classification, git diff, import graph traversal, semantic summaries, and keyword scoring to rank every file, then cuts to fit your token budget. You don't touch globs.
221
195
 
222
- The other difference: all three pack uniformly (full content or nothing). AgentPack is selective by inclusion mode — changed files get full content, unchanged deps get summaries, unrelated files get dropped. A repomix dump of a 50k-token repo stays 50k tokens. An agentpack of the same repo for a specific task is typically 8k–20k.
196
+ The other difference: all three pack uniformly (full content or nothing). AgentPack is selective by inclusion mode — changed files can be full source, relevant diff hunks, symbol bodies, interface skeletons, or summaries; unrelated files get dropped. A repomix dump of a 50k-token repo stays 50k tokens. An agentpack of the same repo for a specific task is typically 8k–20k.
223
197
 
224
198
  **Use repomix/gitingest if:** you want to dump an entire small repo into a chat UI for a one-shot question. Zero setup, great for "explain this codebase."
225
199
 
@@ -241,7 +215,7 @@ These tools have native file access via tool calls. Claude reads exactly the fil
241
215
 
242
216
  AgentPack's value here is different: `agentpack init --agent <x>` configures your agent to read or inject a ranked context pack and auto-repack when the repo changes. On large repos where tool-call exploration piles up across turns, this front-loads the cost once instead of paying per-turn.
243
217
 
244
- ### Where agentpack genuinely wins
218
+ ### Where AgentPack Wins
245
219
 
246
220
  | Scenario | repomix | gitingest | code2prompt | aider | agentpack |
247
221
  |---|---|---|---|---|---|
@@ -250,6 +224,7 @@ AgentPack's value here is different: `agentpack init --agent <x>` configures you
250
224
  | Auto task inference from git | ✗ | ✗ | ✗ | partial | ✓ |
251
225
  | Relevance ranking by task | ✗ | ✗ | ✗ | ✗ | ✓ |
252
226
  | Import graph traversal | ✗ | ✗ | ✗ | ✓ | ✓ |
227
+ | Monorepo workspace hints | ✗ | ✗ | ✗ | manual | ✓ |
253
228
  | Token budget enforcement | manual | manual | manual | ✓ | ✓ |
254
229
  | Cursor / Windsurf / Codex / Antigravity install | ✗ | ✗ | ✗ | ✗ | ✓ |
255
230
  | Zero API calls | ✓ | ✓ | ✓ | ✗ | ✓ |
@@ -258,64 +233,17 @@ AgentPack's value here is different: `agentpack init --agent <x>` configures you
258
233
 
259
234
  _*`--agent generic` outputs standard markdown. Claude adapter has richer instructions._
260
235
 
261
- ### What agentpack does NOT do well
236
+ ### What AgentPack Does Not Do Well
262
237
 
263
238
  - **Interactive sessions on small repos**: if your whole repo is <20k tokens, a simple repo dump may be enough
264
239
  - **One-shot public repo questions**: gitingest's "replace hub with ingest" is faster for quick read-only exploration
265
240
  - **Guaranteed source-of-truth selection**: AgentPack ranks likely files; it can miss task-critical files. Use `agentpack benchmark --misses`, `agentpack explain`, and normal `rg`/agent file reads for correctness.
266
241
  - **Deep semantic understanding**: keyword/concept scoring, imports, symbols, and path roles help, but they are not an LLM-level code understanding system
242
+ - **Public proof without real cases**: bundled fixtures are smoke tests. Strong claims need historical tasks from real repos and published results.
267
243
 
268
244
  ---
269
245
 
270
- ## Install
271
-
272
- ```bash
273
- pip install agentpack-cli
274
- ```
275
-
276
- Requires Python 3.10+.
277
-
278
- > **PyPI note:** The package is `agentpack-cli` (the name `agentpack` was already taken). The CLI command is still `agentpack`.
279
-
280
- ### npm wrapper
281
-
282
- AgentPack can also be installed from npm:
283
-
284
- ```bash
285
- npm install -g @vishal2612200/agentpack
286
- agentpack --version
287
- ```
288
-
289
- The npm package is a thin Node.js wrapper around the Python CLI. It requires Node.js 18+ and Python 3.10+, then installs the matching `agentpack-cli` PyPI package into a per-version virtual environment on first run. This keeps the implementation single-source while giving JavaScript-heavy teams a familiar install path.
290
-
291
- ---
292
-
293
- ## Start Once, Then Work Normally
294
-
295
- For a guided two-minute path in any repo:
296
-
297
- ```bash
298
- agentpack quickstart --task "fix auth token expiry"
299
- ```
300
-
301
- It shows the exact commands to initialize, set task text, generate a first pack, inspect stats, start watch mode, and scaffold a small benchmark file for your own tasks.
302
-
303
- The full workflow:
304
-
305
- ```bash
306
- # One-time project setup
307
- agentpack init # creates config/session/task.md + detected agent integration
308
-
309
- # Every terminal session — just one command
310
- agentpack watch # auto-resumes session, refreshes context on file/task changes
311
- ```
312
-
313
- Then open Claude Code / Cursor / Codex and write your coding task normally.
314
-
315
- - AgentPack keeps `.agentpack/context.md` and `.agentpack/context.claude.md` fresh while `watch` is running.
316
- - To change the task: edit `.agentpack/task.md` directly, or tell Claude — it updates the file itself. `watch` picks up the change automatically.
317
-
318
- ### Agent integration matrix
246
+ ## Supported Integrations
319
247
 
320
248
  | Agent | Automation level | Method |
321
249
  |---|---|---|
@@ -326,7 +254,7 @@ Then open Claude Code / Cursor / Codex and write your coding task normally.
326
254
  | Antigravity | Medium | `init` writes `GEMINI.md`, VS Code task + git hooks |
327
255
  | Generic | Basic | `watch` mode + read `context.md` |
328
256
 
329
- ### Honest limitations
257
+ ### Integration limitations
330
258
 
331
259
  - AgentPack cannot intercept prompts inside IDEs — Cursor/Windsurf rely on rules being followed.
332
260
  - Claude wrapper (`agentpack claude`) is the most deterministic integration.
@@ -335,29 +263,6 @@ Then open Claude Code / Cursor / Codex and write your coding task normally.
335
263
 
336
264
  ---
337
265
 
338
- ## Quickstart
339
-
340
- ```bash
341
- pip install agentpack-cli
342
- cd your-project
343
- agentpack init # one-time setup: config/session/task.md + detected agent integration
344
- agentpack watch # in another terminal — keeps context fresh automatically
345
- ```
346
-
347
- Then open your agent and write your task normally.
348
-
349
- **Power users (global automation):**
350
-
351
- ```bash
352
- agentpack global-install --dry-run # preview
353
- agentpack global-install # apply
354
- source ~/.zshrc
355
- ```
356
-
357
- Then opt each project in: `cd your-project && agentpack init`. After that repo hooks or shell hooks keep context fresh, and Claude Code gets prompt-time context hints — no manual steps.
358
-
359
- ---
360
-
361
266
  ## Agent setup
362
267
 
363
268
  `agentpack init` is the normal one-command project setup. It creates `.agentpack/` state and installs the detected agent integration. Re-run it any time; integration writes are idempotent and never clobber unrelated config.
@@ -448,6 +353,7 @@ Builds an offline summary of every file — no API calls, no network. Each summa
448
353
  - What the file does and its responsibility
449
354
  - Exported classes, functions, signatures with extracted bodies
450
355
  - Import dependencies
356
+ - Likely side effects, public API shape, error paths, and test hints
451
357
 
452
358
  Summaries are stored in `.agentpack/cache/` keyed by file hash. Only changed files are re-summarized on the next pack.
453
359
 
@@ -473,7 +379,7 @@ Token counts use tiktoken `cl100k_base` — a close approximation to Claude's ac
473
379
 
474
380
  ## CI/CD: pack per PR
475
381
 
476
- ### agentpack's own CI
382
+ ### AgentPack's Own CI
477
383
 
478
384
  agentpack uses two workflows:
479
385
 
@@ -527,6 +433,34 @@ Reviewers download the artifact and open it in their agent of choice. No repo cl
527
433
 
528
434
  ## Commands
529
435
 
436
+ Most users only need four commands:
437
+
438
+ ```bash
439
+ agentpack init --agent codex
440
+ agentpack pack --task "describe the change"
441
+ agentpack watch
442
+ agentpack doctor --agent all
443
+ ```
444
+
445
+ Command map:
446
+
447
+ | Command | Use when |
448
+ |---|---|
449
+ | `agentpack init` | Set up `.agentpack/` and install one agent integration for a repo |
450
+ | `agentpack install` | Refresh or add an agent integration without changing project state |
451
+ | `agentpack repair` | Restore missing or drifted integration files |
452
+ | `agentpack pack` | Generate a ranked context pack for one task |
453
+ | `agentpack watch` | Keep the context pack fresh while you work |
454
+ | `agentpack doctor` | Audit hooks, agent files, CLI path, and repo health |
455
+ | `agentpack explain` | Understand why a file was selected or omitted |
456
+ | `agentpack benchmark` | Measure recall, precision, and misses against real tasks |
457
+ | `agentpack tune` | Suggest fixes from recent pack metrics and benchmark misses |
458
+ | `agentpack status` | Inspect current pack freshness and metadata |
459
+ | `agentpack diff` | Show what changed between context snapshots |
460
+ | `agentpack monitor` | Review recent pack runs and quality signals |
461
+ | `agentpack scan` | Inspect packable, ignored, binary, and largest files |
462
+ | `agentpack global-install` | Install opt-in global hooks for initialized repos |
463
+
530
464
  ### `agentpack global-install`
531
465
 
532
466
  Install once — works in every repo from that point on. The recommended first step.
@@ -583,13 +517,15 @@ Diagnose your agentpack installation — checks CLI, git template hooks, git con
583
517
 
584
518
  ```bash
585
519
  agentpack doctor
520
+ agentpack doctor --agent codex
521
+ agentpack doctor --agent all
586
522
  ```
587
523
 
588
524
  Example output:
589
525
 
590
526
  ```
591
527
  CLI
592
- ✓ agentpack found at /usr/local/bin/agentpack (0.1.0)
528
+ ✓ agentpack found at /usr/local/bin/agentpack (0.1.x)
593
529
 
594
530
  Git template hooks (~/.git-templates/hooks/)
595
531
  ✓ post-commit
@@ -621,6 +557,7 @@ Some checks failed. Run the suggested commands above to fix.
621
557
  ```
622
558
 
623
559
  The new checks in `doctor`:
560
+ - **Agent matrix audit**: `--agent all` checks Claude, Cursor, Windsurf, Codex, Antigravity, and Generic in one pass, including Codex `.codex/hooks.json` lifecycle hooks.
624
561
  - **Local vs global hooks**: warns when Claude hooks are only in the per-project `.claude/settings.json` — context won't auto-inject in other repos
625
562
  - **Slash command presence**: checks both local (`.claude/commands/`) and global (`~/.claude/commands/`) installations
626
563
  - **Source checkout mismatch**: warns when you're inside an AgentPack source checkout but the `agentpack` executable imports the installed site-packages copy. Use `PYTHONPATH=src python -m agentpack.cli ...` or `pip install -e .` for local development.
@@ -661,7 +598,7 @@ Also installs the detected agent integration:
661
598
 
662
599
  ### `agentpack install`
663
600
 
664
- Repair or reconfigure agent-specific files without reinitializing project state.
601
+ Install or refresh one agent integration without reinitializing project state.
665
602
 
666
603
  ```bash
667
604
  agentpack install # auto-detect IDE
@@ -676,6 +613,18 @@ All installs are idempotent — safe to re-run, merge with existing config, neve
676
613
 
677
614
  ---
678
615
 
616
+ ### `agentpack repair`
617
+
618
+ Repair missing or drifted integration files. It uses the same installer contract as `init` and `install`, but is named for the "make this repo healthy again" workflow.
619
+
620
+ ```bash
621
+ agentpack repair # repair auto-detected agent
622
+ agentpack repair --agent codex # AGENTS.md + .codex/hooks.json + git hooks
623
+ agentpack repair --agent all # repair every supported integration
624
+ ```
625
+
626
+ ---
627
+
679
628
  ### `agentpack summarize`
680
629
 
681
630
  Build or refresh the offline summary cache. **No API calls, ever.**
@@ -696,6 +645,7 @@ Generate a context pack.
696
645
  ```bash
697
646
  agentpack pack --task "fix auth session bug" # auto-detects your IDE
698
647
  agentpack pack --agent claude --task "fix auth bug" # explicit agent
648
+ agentpack pack --workspace apps/web --task "fix web auth"
699
649
 
700
650
  # Only include changes since a git ref
701
651
  agentpack pack --task "review these changes" --since main
@@ -712,6 +662,7 @@ Options:
712
662
  | `--task` | `auto` | Task description, or `auto` to infer from git |
713
663
  | `--mode` | `balanced` | Budget mode: `minimal`, `balanced`, `deep` |
714
664
  | `--budget` | 0 (uses config default 25000) | Token budget |
665
+ | `--workspace` | — | Restrict packing to a monorepo workspace and write `.agentpack/workspaces/<workspace>/context.md` |
715
666
  | `--since` | — | Only include files changed since this git ref |
716
667
  | `--session` | off | Re-pack on every file change (watch mode) |
717
668
  | `--refresh` | off | Force rebuild summaries before packing |
@@ -726,6 +677,18 @@ Options:
726
677
 
727
678
  `pack` also prints diagnostics when the pack looks noisy: very short task text, no changed files, mostly filename matches, mostly summaries, many symbol matches, weak summaries excluded by the score floor, or summaries excluded by the mode cap.
728
679
 
680
+ AgentPack uses budget-aware compression when building context:
681
+
682
+ | Include mode | Used for |
683
+ |--------------|----------|
684
+ | `full` | Small or highly relevant changed files |
685
+ | `diff` | Large changed files where the edit hunk is more useful than the whole file |
686
+ | `symbols` | Focused implementation bodies under budget pressure |
687
+ | `skeleton` | Imports plus public class/function signatures |
688
+ | `summary` | Lower-priority supporting files |
689
+
690
+ This keeps unrelated dirty files from consuming the whole context budget while preserving changed-file recall.
691
+
729
692
  ---
730
693
 
731
694
  ### `agentpack quickstart`
@@ -742,18 +705,12 @@ agentpack quickstart --task "fix auth token expiry" --write
742
705
 
743
706
  ---
744
707
 
745
- ### `agentpack session` _(removed)_
746
-
747
- Session management was removed in v0.1.12. `agentpack init` bootstraps the session automatically. Use `agentpack watch` to keep context current. To change the task, edit `.agentpack/task.md`.
748
-
749
- ---
750
-
751
708
  ### `agentpack watch`
752
709
 
753
710
  Watch for file and task changes, refresh context automatically.
754
711
 
755
712
  ```bash
756
- agentpack watch # uses session agent/mode if session active
713
+ agentpack watch # refresh context on source/task changes
757
714
  agentpack watch --debounce 3.0 # wait 3s after last change before refresh
758
715
  ```
759
716
 
@@ -807,6 +764,10 @@ Register in Claude Code settings (`~/.claude/settings.json`):
807
764
  | `pack_context(task, mode, budget, max_tokens)` | Generate a ranked context pack for a task. Returns packed markdown, truncated to `max_tokens` (default 20,000). |
808
765
  | `get_context()` | Return the latest pre-built pack instantly (no repack). Prepends a freshness/staleness header so you know if it's stale. |
809
766
  | `refresh()` | Refresh using the current `task.md` or git-inferred task. |
767
+ | `explain_file(path, task)` | Show score, inclusion mode, reasons, symbols, imports, and importers for one file. |
768
+ | `get_related_files(path, depth)` | Return import-graph neighbours and related tests for a file. |
769
+ | `get_delta_context(max_files)` | Return the latest selected-file delta plus top current selected files. Useful for cheap prompt-time refresh checks. |
770
+ | `get_stats()` | Return latest pack stats, savings, selection quality, excluded files, and benchmark-style signals. |
810
771
 
811
772
  **Staleness detection:** `get_context()` compares the snapshot hash from when the pack was built against the current repo snapshot. If files changed since last pack, it prepends:
812
773
  ```
@@ -828,6 +789,7 @@ agentpack explain --task "fix auth session bug"
828
789
  agentpack explain --task auto
829
790
  agentpack explain --file src/auth/session.py # per-file score breakdown
830
791
  agentpack explain --omitted # top-10 excluded files
792
+ agentpack explain --budget-plan # modes, token costs, value/token
831
793
  ```
832
794
 
833
795
  Per-file breakdown (`--file`):
@@ -849,7 +811,7 @@ src/auth/session.py
849
811
  symbols: create_session, revoke_session, validate_session
850
812
  ```
851
813
 
852
- Use `--omitted` to see what was left out and why. Use `--file` when a file you expected isn't showing up.
814
+ Use `--omitted` to see what was left out and why. Use `--file` when a file you expected isn't showing up. Use `--budget-plan` to inspect how the compression planner spent the token budget.
853
815
 
854
816
  ---
855
817
 
@@ -861,9 +823,11 @@ Measure token efficiency, file selection quality, and speed across tasks.
861
823
  agentpack benchmark --task "fix auth token expiry" # single task
862
824
  agentpack benchmark --task "fix auth bug" --compare # compare minimal/balanced/deep
863
825
  agentpack benchmark --init # scaffold .agentpack/benchmark.toml
826
+ agentpack benchmark --results-template # scaffold publishable results note
864
827
  agentpack benchmark # run all cases in benchmark.toml
865
828
  agentpack benchmark --sample-fixtures # source checkout demo evals
866
829
  agentpack benchmark --misses # explain expected-file misses
830
+ agentpack benchmark --prove-targets # fail if recall/token precision targets miss
867
831
  ```
868
832
 
869
833
  Output per case:
@@ -904,6 +868,7 @@ Mode comparison: fix auth token expiry
904
868
  task = "fix auth token expiry"
905
869
  mode = "balanced"
906
870
  task_type = "backend-api"
871
+ workspace = "apps/api" # optional, for monorepos
907
872
  expected_files = [
908
873
  "src/auth/token.py",
909
874
  "src/auth/session.py",
@@ -917,6 +882,8 @@ expected_files = [
917
882
 
918
883
  Use `--misses` when recall is low. It prints each expected file that was not selected with status, rank, score, and scoring reasons, which helps separate ignored files, budget cuts, low scores, and missing dependency signals.
919
884
 
885
+ Use `--prove-targets` in CI or release prep when benchmark cases have `expected_files`. By default it requires average recall >=60% and token precision >=50%; tune with `--min-recall` and `--min-token-precision`.
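
For intuition, a gate of this kind could look like the sketch below. It is an assumption-laden illustration, not the code behind `--prove-targets`: the function names are hypothetical, recall uses the standard definition (expected files that were selected, divided by all expected files), and only the 60% and 50% defaults come from the text above.

```python
def recall(selected: set[str], expected: set[str]) -> float:
    """Fraction of expected files that were selected.

    Assumes `expected` is non-empty (cases without expected files are
    assumed to be filtered out by the caller).
    """
    return len(selected & expected) / len(expected)


def meets_targets(
    recalls: list[float],
    token_precision: float,
    min_recall: float = 0.60,           # default stated in the README text
    min_token_precision: float = 0.50,  # default stated in the README text
) -> bool:
    """Return True when average recall and token precision both hit targets."""
    if not recalls:
        return False  # nothing measured, so nothing is proven
    avg_recall = sum(recalls) / len(recalls)
    return avg_recall >= min_recall and token_precision >= min_token_precision
```

A CI job would exit non-zero when a check like `meets_targets(...)` returns `False`.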
886
+
920
887
  Add `task_type` to group results by workflow area. Benchmark summaries report average precision, recall, F1, and token noise by type, so a repo can show "backend-api is good, frontend-web is noisy" instead of hiding that under one aggregate.
921
888
 
922
889
  ---
@@ -925,6 +892,12 @@ Add `task_type` to group results by workflow area. Benchmark summaries report av
925
892
 
926
893
  Scan the repo and report file statistics.
927
894
 
895
+ ```bash
896
+ agentpack scan
897
+ agentpack scan --largest 20
898
+ agentpack scan --ignored-summary
899
+ ```
900
+
928
901
  ```
929
902
  Files discovered: 1,248
930
903
  Files ignored/binary: 230
@@ -933,6 +906,8 @@ Raw estimated tokens: 940,000
933
906
  Tokens after ignore: 210,000
934
907
  ```
935
908
 
909
+ Use `--largest` to find high-token files still entering packs. Use `--ignored-summary` when repo counts look surprising; it groups ignored and binary files by common directories or file extensions.
910
+
936
911
  ---
937
912
 
938
913
  ### `agentpack stats`
@@ -945,7 +920,7 @@ agentpack stats
945
920
 
946
921
  When a session is active, shows session panel (agent, mode, started, refresh count) above token stats. Also lists top included files from the latest pack and avg recall/precision/F1 over the last 10 runs.
947
922
 
948
- Newer metrics include token-weighted precision. File precision answers "how many selected files were later changed"; token precision answers "how many selected tokens were spent on files later changed." `stats` also breaks token precision down by inclusion mode (`full`, `symbols`, `summary`) so summary noise is visible.
923
+ Newer metrics include token-weighted precision. File precision answers "how many selected files were later changed"; token precision answers "how many selected tokens were spent on files later changed." Context precision also credits obvious read-only support context, such as paired tests beside changed source files. `stats` breaks token precision down by inclusion mode (`full`, `symbols`, `summary`) so summary noise is visible. In monorepos, it also reports selected-file distribution by workspace when workspace metadata exists.
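
The difference between the two precision metrics is easiest to see in a small worked example. The sketch below follows the definitions quoted above. The function names and the `selected` mapping (path to token count) are hypothetical, not AgentPack's internal API.

```python
def file_precision(selected: dict[str, int], changed: set[str]) -> float:
    """Share of selected files that were later changed."""
    if not selected:
        return 0.0
    hits = sum(1 for path in selected if path in changed)
    return hits / len(selected)


def token_precision(selected: dict[str, int], changed: set[str]) -> float:
    """Share of selected tokens spent on files that were later changed."""
    total = sum(selected.values())
    if total == 0:
        return 0.0
    useful = sum(tokens for path, tokens in selected.items() if path in changed)
    return useful / total


# Two files selected, only the small one was later changed.
pack = {"src/auth/token.py": 100, "docs/overview.md": 300}
changed_later = {"src/auth/token.py"}
print(file_precision(pack, changed_later))   # 0.5
print(token_precision(pack, changed_later))  # 0.25
```

Here half the selected files were useful, but only a quarter of the tokens were, which is the kind of noise a token-weighted metric exposes.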
949
924
 
950
925
  To build a real usefulness signal for your repo:
951
926
 
@@ -954,13 +929,31 @@ agentpack benchmark --sample-fixtures
954
929
 
955
930
  agentpack benchmark --init
956
931
  # edit .agentpack/benchmark.toml with real tasks + files you actually changed
957
- agentpack benchmark --compare --misses
932
+ agentpack benchmark --compare --misses --prove-targets
958
933
  ```
959
934
 
960
- `--sample-fixtures` runs bundled FastAPI, Next.js, and mixed Python/TypeScript fixture evals from an AgentPack source checkout. It is a smoke test, not a claim about your repo.
935
+ `--sample-fixtures` runs bundled FastAPI, Next.js, mixed Python/TypeScript, Django REST-style, Go service, and Rails-style fixture evals from an AgentPack source checkout. It is a smoke test, not a claim about your repo.
961
936
 
962
937
  For an 8+ usefulness signal, use `benchmark.toml` with real third-party or customer-style repos: 5-20 historical tasks, `task_type` labels, the files actually changed for each task, and `--compare` results for recall, F1, rank@K, and token noise. That is better than trusting generic benchmarks because it tells you whether AgentPack selects the files that matter in code the package has never seen.
963
938
 
939
+ See [benchmarks/README.md](benchmarks/README.md) for the public smoke-suite fixtures, quality gates, and the recommended miss-debugging workflow.
940
+
941
+ ---
942
+
943
+ ### `agentpack tune`
944
+
945
+ Turn noisy `stats` and `benchmark --misses` output into next actions.
946
+
947
+ ```bash
948
+ agentpack tune
949
+ agentpack tune --write
950
+ agentpack tune --no-benchmark
951
+ ```
952
+
953
+ `tune` reads `.agentpack/metrics.jsonl` and, when present, `.agentpack/benchmark_results.jsonl`. It flags low token precision, zero-value summaries, repeated noisy paths, support-context gaps, and benchmark miss patterns. `--write` saves the same guidance to `.agentpack/tuning.md`.
954
+
955
+ This command does not pretend a pack is correct. It gives the next thing to inspect: lower mode, explain noisy files, adjust `.agentignore`, add benchmark cases, or inspect budget/score misses.
956
+
964
957
  ---
965
958
 
966
959
  ### `agentpack status`
@@ -969,11 +962,14 @@ Check whether the context pack is stale.
969
962
 
970
963
  ```bash
971
964
  agentpack status
965
+ agentpack status --deep
972
966
  # Context pack is up to date.
973
967
  # Task: fix auth session bug
974
968
  # Generated: 2026-04-29T12:00:00Z
975
969
  ```
976
970
 
971
+ `--deep` also prints the active agent, CLI path, current task, and integration health for the detected agent.
972
+
977
973
  ---
978
974
 
979
975
  ### `agentpack diff`
@@ -1004,23 +1000,20 @@ agentpack monitor --clear
1004
1000
  ## How it works
1005
1001
 
1006
1002
  ```
1007
- 1. Scan repo → apply .agentignore → hash every file
1008
- 2. Build current snapshot → diff against previous snapshot
1009
- 3. Get git changed/staged files (+ --since <ref> if specified)
1010
- 4. Build import dependency graph (Python/JS/TS: full; Go/Rust/Java: best-effort)
1011
- 5. Detect related test files
1012
- 6. Extract task keywords + concept synonym expansion
1013
- 7. Enrich keywords from changed file content (high-frequency identifiers)
1014
- 8. Score every file, rank by score
1015
- 9. Select within token budget
1016
- 10. For each selected file:
1017
-     changed + small → full content
1018
-     changed + large → symbol bodies (ast.get_source_segment)
1019
-     unchanged dep → summary + signatures
1020
-     low-score file → summary only
1021
- 11. Generate context receipts (why each file included/excluded)
1022
- 12. Render markdown for target agent → save context pack
1023
- 13. Save snapshot + metadata + metrics
1003
+ 1. Scan repo → apply .agentignore → skip generated AgentPack outputs → hash files
1004
+ 2. Build offline summaries → role, imports, symbols, side effects, public API, errors, test hints
1005
+ 3. Build import dependency graph → Python/JS/TS full, Go/Rust/Java/Kotlin best-effort
1006
+ 4. Detect changed files → snapshot diff + git working tree + staged + optional --since ref
1007
+ 5. Classify task → bugfix / feature / docs / release / infra / audit / test / ui / refactor
1008
+ 6. Extract weighted task terms → literals, variants, concept synonyms, changed-file identifiers
1009
+ 7. Score every file → changes, task terms, symbols, content, deps, tests, configs, churn
1010
+ 8. Apply history learning → gently downrank files that were repeatedly selected as noise
1011
+ 9. Build semantic repo map → compact module/group map reserved inside the token budget
1012
+ 10. Select by value per token → full / diff / symbols / skeleton / summary / omit
1013
+ 11. For large diffs → score hunks against task keywords and keep the most relevant hunks
1014
+ 12. Redact secrets at materialization → before content reaches any renderer or adapter
1015
+ 13. Render context → freshness, task class, repo map, delta since last pack, receipts, files
1016
+ 14. Persist state → adapter output, canonical .agentpack/context.md, snapshot, metadata, metrics
1024
1017
  ```
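
Step 10 of the pipeline above (select by value per token, downgrading before omitting) can be sketched as a greedy loop. This is a simplified illustration under stated assumptions, not AgentPack's selector: the `Candidate` type, the `MODE_COST` fractions, and the `select` function are hypothetical, and the `diff` mode is left out for brevity.

```python
from dataclasses import dataclass

# Hypothetical cost of each mode as a fraction of the full file's tokens,
# ordered from richest to cheapest.
MODE_COST = [("full", 1.0), ("symbols", 0.5), ("skeleton", 0.25), ("summary", 0.1)]


@dataclass
class Candidate:
    path: str
    score: float  # relevance score from the ranking step
    tokens: int   # token count of the full file


def select(candidates: list[Candidate], budget: int) -> dict[str, str]:
    """Greedily pick an include mode per file, best value per token first."""
    chosen: dict[str, str] = {}
    remaining = budget
    ranked = sorted(candidates, key=lambda c: c.score / max(c.tokens, 1), reverse=True)
    for cand in ranked:
        # Try the richest mode first, then downgrade until something fits.
        for mode, fraction in MODE_COST:
            cost = max(1, int(cand.tokens * fraction))
            if cost <= remaining:
                chosen[cand.path] = mode
                remaining -= cost
                break
        # If no mode fits, the file is omitted.
    return chosen
```

With a budget of 150 tokens, a 100-token file scored 100 is kept as `full`, and a 200-token file scored 50 is downgraded to `skeleton` instead of being dropped.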
1025
1018
 
1026
1019
  ---
@@ -1153,7 +1146,11 @@ Works like `.gitignore`. Default rules exclude:
1153
1146
  └────────────────────┬────────────────────┘
1154
1147
 
1155
1148
  ┌────────────────────▼────────────────────┐
1156
- │ ANALYSIS LAYER │
1149
+ │ SUMMARY + ANALYSIS LAYER │
1150
+ │ │
1151
+ │ Summary cache ── role, imports, │
1152
+ │ (offline) symbols, side effects, │
1153
+ │ public API, errors │
1157
1154
  │ │
1158
1155
  │ Import graph ── Python AST │
1159
1156
  │ (6 languages) ─ JS/TS regex │
@@ -1170,16 +1167,7 @@ Works like `.gitignore`. Default rules exclude:
1170
1167
  │ Task keywords ── stopwords + variants│
1171
1168
  │ ── concept synonyms │
1172
1169
  │ ── content enrichment │
1173
- └────────────────────┬────────────────────┘
1174
-
1175
- ┌────────────────────▼────────────────────┐
1176
- │ SUMMARY CACHE (offline, local) │
1177
- │ │
1178
- │ key: path + hash + provider + schema │
1179
- │ hit → instant, zero I/O │
1180
- │ miss → build from AST/regex, cache it │
1181
- │ │
1182
- │ offline ── AST / regex extract │
1170
+ │ Task class ── bugfix/docs/release │
1183
1171
  └────────────────────┬────────────────────┘
1184
1172
 
1185
1173
  ┌────────────────────▼────────────────────┐
@@ -1201,17 +1189,27 @@ Works like `.gitignore`. Default rules exclude:
1201
1189
  │ +50 dep +40 rev-dep │
1202
1190
  │ +35 test +25 config +20 recent │
1203
1191
  │ -50 large unrelated │
1192
+ │ History noise penalty from metrics │
1193
+ └────────────────────┬────────────────────┘
1194
+
1195
+ ┌────────────────────▼────────────────────┐
1196
+ │ REPO MAP │
1197
+ │ │
1198
+ │ Compact semantic map grouped by module │
1199
+ │ Reserved inside the context budget │
1204
1200
  └────────────────────┬────────────────────┘
1205
1201
 
1206
1202
  ┌────────────────────▼────────────────────┐
1207
1203
  │ BUDGET SELECTION │
1208
1204
  │ │
1209
- │ Sort by score, consume until budget │
1205
+ │ Sort by changed/task/value-per-token │
1210
1206
  │ │
1211
1207
  │ changed + small ──▶ full content │
1212
- │ changed + large ──▶ symbol bodies │
1213
- │ unchanged dep ──▶ summary + sigs │
1214
- │ low score ──▶ summary only │
1208
+ │ changed + large ──▶ task-scored diff │
1209
+ │ task symbols ──▶ symbol bodies │
1210
+ │ interface view ──▶ skeleton │
1211
+ │ low context ──▶ summary/omit │
1212
+ │ budget fallback ──▶ downgrade first │
1215
1213
  └────────────────────┬────────────────────┘
1216
1214
 
1217
1215
  ┌────────────────────▼────────────────────┐
@@ -1224,6 +1222,8 @@ Works like `.gitignore`. Default rules exclude:
1224
1222
  │ Antigravity adapter ──▶ .agent/skills/agentpack/SKILL.md │
1225
1223
  │ Generic adapter ──▶ context.md │
1226
1224
  │ │
1225
+ │ Freshness + task class + repo map │
1226
+ │ Delta since last pack │
1227
1227
  │ Context receipts (why each file in/out)│
1228
1228
  │ Secret redaction (AWS/GH/OpenAI tokens)│
1229
1229
  └─────────────────────────────────────────┘
@@ -1239,16 +1239,16 @@ src/agentpack/
1239
1239
  agentpack.md # bundled /agentpack slash command for Claude CLI
1240
1240
 
1241
1241
  application/
1242
- pack_service.py # PackPlanner: shared scan→rank→select pipeline
1242
+ pack_service.py # PackPlanner: shared scan→summarize→graph→rank→repo_map→select pipeline
1243
1243
  # PackService: materializes plan → writes context file
1244
1244
  # AdapterRegistry: maps agent names to adapter instances
1245
1245
  # PackRequest / PackResult / PackPlan DTOs
1246
1246
 
1247
1247
  domain/ (via core/models.py)
1248
1248
  FileInfo, ScanResult # scan output (packable / ignored / binary)
1249
- Symbol, FileSummary # summary cache objects
1249
+ Symbol, FileSummary # summary cache objects (role, side_effects, public_api, errors, tests)
1250
1250
  SelectedFile, Receipt # selection output with redaction_warnings
1251
- ContextPack # final artifact with redaction_warnings
1251
+ ContextPack # final artifact with freshness, repo_map, delta_summary, redaction_warnings
1252
1252
  DependencyNode # typed graph node (path, imports, imported_by, tests)
1253
1253
  DependencyGraph # typed graph container (nodes dict + dict-like accessors)
1254
1254
 
@@ -1262,7 +1262,7 @@ src/agentpack/
1262
1262
  git.py # subprocess git + task inference from branch/commits
1263
1263
  merkle.py # root hash: sort(path:hash) → sha256
1264
1264
  cache.py # summary cache keyed path+hash+provider+version
1265
- context_pack.py # select_files: budget selection + secret redaction
1265
+ context_pack.py # select_files: full/diff/symbols/skeleton/summary + hunk scoring + redaction
1266
1266
  token_estimator.py # tiktoken cl100k_base (approximate)
1267
1267
  redactor.py # redact_secrets: fires at content materialization
1268
1268
  bootstrap.py # is_initialized, bootstrap_if_needed
@@ -1277,9 +1277,12 @@ src/agentpack/
1277
1277
  symbols.py # AST symbols + body via ast.get_source_segment
1278
1278
  tests.py # source → test file mapping heuristics
1279
1279
  ranking.py # keyword extraction, concept synonyms, scoring
1280
+ monorepo.py # workspace detection + workspace ownership helpers
1281
+ repo_map.py # compact semantic repo map reserved inside token budget
1282
+ task_classifier.py # coarse task class for freshness/rendering/scoring context
1280
1283
 
1281
1284
  summaries/
1282
- offline.py # zero-API: AST/regex → imports, symbols, summary
1285
+ offline.py # zero-API: AST/regex → imports, symbols, role, side effects, API, errors
1283
1286
  base.py # cache-or-build orchestration (parallel, ThreadPool+ProcessPool)
1284
1287
 
1285
1288
  adapters/ # context rendering only — no installation logic
@@ -1300,15 +1303,18 @@ src/agentpack/
1300
1303
  antigravity.py # AntigravityInstaller: GEMINI.md + auto-repack
1301
1304
 
1302
1305
  integrations/ # system/tool integration (not core domain)
1306
+ agents.py # shared agent install/check/repair contract for all supported agents
1303
1307
  git_hooks.py # install/remove .git/hooks post-commit/merge/checkout
1304
1308
  vscode_tasks.py # install/remove .vscode/tasks.json entries
1305
1309
  global_install.py # global: git template hooks + shell rc hook
1306
1310
 
1307
1311
  renderers/
1308
- markdown.py # renders pre-redacted ContextPack to markdown
1312
+ markdown.py # renders pre-redacted ContextPack to markdown, including freshness/map/delta
1309
1313
  compact.py # compact protocol format for session context files
1310
1314
  receipts.py # context receipt formatter
1311
1315
 
1316
+ mcp_server.py # MCP tools: pack_context, get_context, explain, related, stats, delta
1317
+
1312
1318
  session/
1313
1319
  state.py # SessionState dataclass + load/save/create/stop helpers
1314
1320
  __init__.py # re-exports from state.py
@@ -1316,6 +1322,7 @@ src/agentpack/
1316
1322
  commands/ # CLI only — parse args, call services/installers
1317
1323
  pack.py # agentpack pack → PackService.run()
1318
1324
  install.py # agentpack install / global-install → installers/
1325
+ repair.py # agentpack repair → shared integration repair
1319
1326
  init.py # agentpack init
1320
1327
  quickstart.py # agentpack quickstart — guided first-run commands
1321
1328
  scan.py # agentpack scan
@@ -1326,6 +1333,7 @@ src/agentpack/
1326
1333
  monitor.py # agentpack monitor
1327
1334
  explain.py # agentpack explain
1328
1335
  doctor.py # agentpack doctor
1336
+ tune.py # agentpack tune — tuning suggestions from metrics + benchmark misses
1329
1337
  hook_cmd.py # agentpack hook — Claude prompt hook + stale detection
1330
1338
  mcp_cmd.py # agentpack mcp — MCP server entrypoint
1331
1339
  watch.py # agentpack watch — file watcher with debounce
@@ -1337,212 +1345,21 @@ src/agentpack/
1337
1345
 
1338
1346
  - **Redaction at materialization**: secrets are stripped inside `select_files()` before content reaches any renderer or adapter. Every output format gets redacted content automatically — no per-renderer redaction needed.
1339
1347
  - **`ScanResult` splits cleanly**: `scan()` returns `ScanResult(packable, ignored, binary)` — downstream code only processes `packable` files, eliminating `if f.ignored or f.binary` guards throughout.
1340
- - **`PackPlanner` owns shared planning**: `PackPlanner.plan()` runs scan → summarize → graph → rank → select and returns a `PackPlan`. Both `pack` and `explain` use the same planner — no duplicated pipeline logic, no drift.
1341
- - **`PackService` materializes a plan**: takes a `PackPlan`, builds the `ContextPack` artifact, delegates rendering to `AdapterRegistry`, persists snapshot + metadata + metrics.
1348
+ - **`PackPlanner` owns shared planning**: `PackPlanner.plan()` runs scan → summarize → graph → changes → rank → repo map → select and returns a `PackPlan`. Both `pack` and `explain` use the same planner — no duplicated pipeline logic, no drift.
1349
+ - **`PackService` materializes a plan**: takes a `PackPlan`, computes delta since the previous pack, builds the `ContextPack` artifact, delegates rendering to `AdapterRegistry`, persists snapshot + metadata + metrics.
1350
+ - **Mode selection is value-aware**: changed files can be `full`, `diff`, `symbols`, `skeleton`, or `summary`. Large diffs keep task-relevant hunks first, and tight budgets downgrade files before dropping them.
1351
+ - **Repo maps are first-class context**: `analysis/repo_map.py` builds a compact semantic map before file context, and its token cost is reserved before file selection.
1352
+ - **Metrics feed history learning**: selection accuracy records hit/noise paths, token precision, mode counts, and mode tokens. Later packs gently penalize repeated noisy paths unless they are currently changed.
1353
+ - **Git history feeds recall**: files that historically changed in the same commits as live changed files receive a small boost, helping related tests, schemas, services, and configs surface without forcing full-content inclusion.
1354
+ - **Co-change is guarded by precision history**: one-off co-change neighbors are ignored, and paths repeatedly measured as noise do not get revived by history boosts.
1355
+ - **Precision guardrails adapt to bad history**: when summary token precision stays near zero, later packs raise the summary score floor, cap summaries more aggressively, and suppress summaries entirely for no-live-change packs. Weak filename-only matches are also damped unless other signals confirm them.
1342
1356
  - **`AdapterRegistry` maps agent → adapter**: adding a new agent output format requires one entry in `AdapterRegistry.get()`, not changes to `PackService`.
1343
1357
  - **`detect_agent()` runs at invocation time**: `--agent auto` (the default) calls `detect_agent()` fresh on every `pack` run and git hook execution — so context is always written for the active IDE, even when switching between agents or running in CI.
1344
1358
  - **`DependencyGraph` is typed**: `dependency_graph.build()` returns `DependencyGraph(nodes: dict[str, DependencyNode])` — no more `dict[str, dict]` with stringly-typed keys like `"imported_by"`. Typos are caught at the model layer.
1345
1359
  - **`integrations/` vs `core/`**: git hooks, shell rc patching, and VS Code tasks are infrastructure concerns — they live in `integrations/`, not `core/`. `core/` is pure domain logic.
1346
1360
  - **Adapters render; installers configure**: `adapters/` knows how to write a context file for an agent. `installers/` knows how to configure the agent's tool (CLAUDE.md, .cursorrules, settings.json). They are separate concerns and separate classes.
1347
-
1348
- ---
1349
-
1350
- ## Practical examples
1351
-
1352
- ### Bug fix: "I have a failing test, help me fix it"
1353
-
1354
- ```bash
1355
- # You're debugging a test failure in the auth module
1356
- agentpack pack --task "fix failing test in auth token validation"
1357
- ```
1358
-
1359
- AgentPack selects: the failing test file (modified), `auth/token.py` (dep), `auth/session.py` (dep), `config/settings.py` (config), skips 180 unrelated files. Your agent gets 12k tokens of precisely relevant context and starts debugging immediately.
1360
-
1361
- ---
1362
-
1363
- ### Feature: "Add rate limiting to the API"
1364
-
1365
- ```bash
1366
- # On a feature branch, nothing modified yet
1367
- agentpack pack --task "add rate limiting to REST API endpoints"
1368
- ```
1369
-
1370
- Keyword expansion activates: "rate limiting" → `throttle`, `leaky`, `bucket`, `quota`. AgentPack scores: `middleware/` directory (path keyword `api`), existing `throttle.py` or `leaky_bucket.py` (content keyword), `routes/*.py` (deps). Your agent gets the full middleware stack and starts implementing, not exploring.
1371
-
1372
- ---
1373
-
1374
- ### Code review: "Review my PR before I push"
1375
-
1376
- ```bash
1377
- # Review only what changed vs main
1378
- agentpack pack --task "code review auth refactor" --since main
1379
- ```
1380
-
1381
- Only files touched in this branch are included (full content). Everything else is summaries or omitted. Your agent reviews exactly the diff-visible code, not the whole codebase.
1382
-
1383
- ---
1384
-
1385
- ### Refactor: "Help me refactor the database layer"
1386
-
1387
- ```bash
1388
- agentpack pack --task "refactor database connection pooling" --mode deep
1389
- ```
1390
-
1391
- `--mode deep` adds: related docs, more full-content files, broader dep tree. Use when the task touches many files and you want your agent to see more context upfront.
1392
-
1393
- ---
1394
-
1395
- ### CI: automated context on every PR
1396
-
1397
- Add to `.github/workflows/agentpack-context.yml` — see the full example in [CI/CD: pack per PR](#cicd-pack-per-pr). Reviewers and CI bots get focused context without cloning the repo.
1398
-
1399
- ---
1400
-
1401
- ### Session mode: keep context fresh while you work
1402
-
1403
- ```bash
1404
- # One-time project setup
1405
- agentpack init # creates config/session/task.md + detected agent integration
1406
- # Edit .agentpack/task.md to set your task
1407
-
1408
- # Every terminal session — just one command
1409
- agentpack watch # keeps context fresh automatically
1410
-
1411
- # Change task mid-session: edit .agentpack/task.md directly
1412
- # watch detects the change and refreshes automatically
1413
- ```
1414
-
1415
- ---
-
- ### Debug why a file isn't showing up
-
- ```bash
- agentpack explain --task "fix rate limiting in auth middleware"
- # Top selected files:
- # 1. src/auth/middleware.py score=180 [full] modified, filename keyword match
- # 2. src/auth/limiter.py score=130 [symbols] dep + content keyword "throttle"
- # ...
- # Excluded:
- # - src/payments/billing.py score=8 score too low
- ```
-
- ---
-
- ## Tips & tricks
-
- ### Let `--task auto` do the work
-
- Skip writing a task description — agentpack infers it from your branch name, changed files, and recent commits:
-
- ```bash
- agentpack pack --task auto
- ```
-
- Priority order (strongest → weakest):
-
- | Source | Example output |
- |--------|---------------|
- | `task.md` (explicit) | `"migrate DB schema to multi-tenant"` |
- | branch + staged files | `"feat add-rate-limiting: payments, throttle"` |
- | staged files only | `"payments, throttle"` |
- | branch + unstaged | `"feat add-rate-limiting: session, token"` |
- | branch + latest commit | `"feat add-rate-limiting: fix token expiry check"` |
- | branch name alone | `"feat add-rate-limiting"` |
- | unstaged files | `"session, token"` |
- | recent commit messages | `"fix token expiry check; add pagination"` |
- | recently modified files | `"session, payments"` (noisy — last resort) |
-
- The heuristic that fired is logged: `Auto task (branch+staged): feat add-rate-limiting: payments`.
-
- The more descriptive your branch names (`feat/add-rate-limiting` beats `dev`) and the more you stage before running, the more accurate the inference.
-
- ### Concept synonym expansion
-
- AgentPack expands task keywords automatically — "rate limiting" expands to `throttle`, `leaky`, `bucket`, `quota`, `debounce`; "auth" expands to `jwt`, `bearer`, `token`, `oauth`; "cache" expands to `lru`, `memoize`, `redis`, `ttl`; domain terms such as `kundali` expand toward astrology/chart/compatibility terms. Files that implement a concept but don't use its exact name can still rank.
-
- ### Full-stack role boosts
-
- When a task points at a page, route, or API surface, AgentPack also gives a controlled boost to related implementation roles such as `service`, `controller`, `schema`, `handler`, `repository`, and `client`. This helps full-stack tasks pull backend implementation files instead of only frontend entrypoints.
-
- This is still heuristic. If a service should have appeared and did not, add it as an `expected_files` entry in `benchmark.toml` and run:
-
- ```bash
- agentpack benchmark --compare --misses
- ```
-
- ### Content-based keyword enrichment
-
- When you run `agentpack pack`, changed file content is scanned for high-frequency identifiers. If you're editing `session_manager.py` that mentions `validate_token` 30 times, `validate` and `token` are added as keywords — related files that use the same terms get a score boost even if your task string didn't mention them.
-
- ### Commit the summary cache for instant team packs
-
- ```bash
- agentpack init --share-cache
- git add .agentpack/cache/
- git commit -m "chore: add agentpack summary cache"
- ```
-
- Every teammate and CI job skips the summarize step. `agentpack pack` is significantly faster from a warm cache.
-
- ### Use `--since` for PR reviews
-
- ```bash
- agentpack pack --task "review auth changes" --since main
- ```
-
- Only includes files changed since `main`. Cuts out noise from unrelated work in long-running branches.
-
- ### Tune the budget for your use case
-
- ```bash
- agentpack pack --task "fix bug" --mode minimal # changed files only, fewest tokens
- agentpack pack --task "refactor" --mode deep # everything including docs
- agentpack pack --task "fix bug" --budget 40000 # explicit token cap
- ```
-
- `balanced` (default) is right for most tasks. Use `minimal` for quick fixes, `deep` when architectural context matters.
-
- ### Watch mode for active sessions
-
- ```bash
- agentpack init # one-time setup (creates session/task.md + detected agent integration)
- agentpack watch # in another terminal — auto-resumes each time
- ```
-
- Refreshes `.agentpack/context.md` every time you save a file. Change the task by editing `.agentpack/task.md` directly — or tell Claude and it writes the file itself. `watch` picks up the change automatically.
-
- ### Debug file selection with `explain`
-
- ```bash
- agentpack explain --task "fix auth session bug"
- ```
-
- Shows ranked scores and reasons before committing to a pack. Use when a file you expect isn't appearing.
-
- For repeatable evals, prefer `benchmark --misses` because it compares selected files against the files you actually changed for historical tasks.
-
- ### Check what got included and why
-
- Every pack includes a context receipt explaining each file's inclusion or exclusion:
-
- ```
- - `src/auth.py` included because modified, filename keyword match
- - `tests/test_auth.py` summarized because test for src/auth.py
- - `src/unrelated_big.py` excluded because score too low
- ```
-
- Use this to tune your `.agentignore` or scoring weights when irrelevant files keep appearing.
-
- ### Tune scoring weights per project
-
- If tests are always irrelevant to your tasks, drop their weight. If config files are critical, raise them:
-
- ```toml
- # .agentpack/config.toml
- [scoring]
- related_test = 5 # was 35 — tests rarely relevant
- config_file = 60 # was 25 — configs always matter here
- ```
+ - **Agent integration contract is shared**: `integrations/agents.py` defines install, audit, and repair behavior for Claude, Cursor, Windsurf, Codex, Antigravity, and Generic. `install`, `repair`, `doctor --agent all`, and release verification use the same contract.
+ - **MCP and hooks use deltas when possible**: MCP exposes `get_delta_context()`, and prompt hooks can emit task/top-file/delta hints instead of injecting the full context every time.
 
  ---
 
@@ -1560,7 +1377,8 @@ config_file = 60 # was 25 — configs always matter here
  ## Known limitations
 
  - **Windows**: not supported. Git hooks use POSIX shell (`#!/bin/sh`, `>/dev/null 2>&1 &`). The Claude Code session hooks use `python3` and `rm -f`. Contributions welcome.
- - **Monorepos**: single-root repos only. If you `agentpack pack` from a monorepo root, all packages are scanned together with no workspace awareness. Workaround: `cd packages/my-pkg && agentpack init && agentpack pack`.
+ - **Monorepos**: workspace-aware ranking supports npm/pnpm, Cargo, and `go.work` layouts. `--workspace` creates filtered per-workspace outputs. Package dependency hints currently come from npm/pnpm `package.json`; Cargo/Go workspace membership is detected, but package-manager dependency edges for Cargo/Go are not yet modeled.
+ - **Public benchmark proof**: source-checkout fixture results are useful regressions, not market proof. Use `agentpack benchmark --results-template` to publish real historical task results.
  - **Symbol extraction**: Python (AST, full) and JavaScript/TypeScript (regex, arrow functions + classes) are well-supported. Go, Rust, Java, Kotlin have import graph traversal but no symbol extraction — they fall back to file-level summaries.
  - **Selection recall**: ranking is heuristic. It can miss files when task language differs from code language, when repos have unusual architecture, or when important files are only connected at runtime.
  - **Secret redaction**: covers AWS keys, GitHub tokens, OpenAI/Anthropic keys, JWTs, and private key blocks. Not a substitute for a dedicated secrets scanner on sensitive repos.
@@ -1569,6 +1387,18 @@ config_file = 60 # was 25 — configs always matter here
 
  ---
 
+ ## Roadmap
+
+ Next release target: **0.2.0 = benchmark + recall release**.
+
+ - Expand public source-checkout fixtures and publish reproducible `benchmark --sample-fixtures --compare --misses` output.
+ - Raise recall on real historical tasks while keeping token precision healthy; target 60%+ recall, 50%+ token precision, and balanced packs under 25k tokens.
+ - Improve second-pass expansion beyond current imports, reverse imports, related tests, historical co-change, and workspace hints with framework route/service/schema pairs.
+ - Make MCP pull flows more prominent so agents can ask for `explain_file`, `get_related_files`, and `get_delta_context` instead of relying only on a static startup pack.
+ - Keep integration contracts stable across Claude, Cursor, Windsurf, Codex, Antigravity, and Generic before any 1.0 work.
+
+ ---
+
  ## Optional dependencies
 
  ```bash
@@ -1598,9 +1428,13 @@ python -m ruff check src tests
  python -m build
  npm test --prefix npm
  (cd npm && npm pack --dry-run)
+ pytest tests/test_agent_integration_matrix.py -q
  agentpack benchmark --sample-fixtures --misses
+ agentpack doctor
  ```
 
+ For npm publish, configure GitHub secret `NPM_TOKEN`. `agentpack doctor` warns locally when neither `NPM_TOKEN` nor `NODE_AUTH_TOKEN` is present, and the npm publish workflow fails early with a clear error if the secret is missing.
+
  Good contribution areas:
 
  - More real-world benchmark fixtures and public repo eval cases