PyPI - agentpack-cli - Versions diffs - 0.1.20__tar.gz → 0.1.21__tar.gz - Mend

agentpack-cli 0.1.20.tar.gz → 0.1.21.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (84)

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: agentpack-cli
-Version: 0.1.20
+Version: 0.1.21
 Summary: Token-aware context packing for AI coding agents — Claude, Cursor, Windsurf, and Codex
 License: MIT
 License-File: LICENSE
@@ -44,7 +44,7 @@ Description-Content-Type: text/markdown
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![CI](https://github.com/vishal2612200/agentpack/actions/workflows/ci.yml/badge.svg)](https://github.com/vishal2612200/agentpack/actions/workflows/ci.yml)
-> **Status: alpha (v0.1.20).** Works, tested, used in real sessions. Python and JavaScript/TypeScript are the best-supported languages. Not yet validated across a wide range of repos. API may change before 1.0.
+> **Status: alpha (v0.1.21).** Works, tested, used in real sessions. Python and JavaScript/TypeScript are the best-supported languages. Not yet validated across a wide range of repos. API may change before 1.0.
 >
 > **Platform note:** macOS and Linux are fully supported. Windows support is not yet implemented (git hooks use POSIX shell; the Claude Code session hooks use `python3`/`rm -f`). Contributions welcome.
@@ -72,8 +72,9 @@ AgentPack solves this with a one-time offline analysis pass:
 1. **Scans your repo once** — builds a summary cache of every file (signatures, imports, responsibilities). No API calls. Takes a few seconds.
 2. **On each task** — uses git diff + import graph traversal + keyword scoring to rank every file by relevance to what you're working on.
-3. **Packs a tight context document** — changed files get full content, dependencies get summaries, everything else gets dropped. Typically 8k–20k tokens for a 200-file repo.
-4. **Stays current** — auto-repacks silently on commit, so next session starts fresh.
+3. **Packs a tight context document** — changed files get full content, large changed files get relevant symbol bodies, dependencies get summaries, everything else gets dropped. Typically 8k–20k tokens for a 200-file repo.
+4. **Explains pack quality** — noisy-pack diagnostics, score receipts, and token-precision metrics show when the pack is broad and where token noise lives.
+5. **Stays current** — auto-repacks silently on commit, so next session starts fresh.
 The result: your agent starts every session with a focused, accurate picture of the relevant code — without you doing anything after opt-in.
@@ -546,6 +547,7 @@ Some checks failed. Run the suggested commands above to fix.
 The new checks in `doctor`:
 - **Local vs global hooks**: warns when Claude hooks are only in the per-project `.claude/settings.json` — context won't auto-inject in other repos
 - **Slash command presence**: checks both local (`.claude/commands/`) and global (`~/.claude/commands/`) installations
+- **Source checkout mismatch**: warns when you're inside an AgentPack source checkout but the `agentpack` executable imports the installed site-packages copy. Use `PYTHONPATH=src python -m agentpack.cli ...` or `pip install -e .` for local development.
 ---
@@ -632,9 +634,11 @@ Options:
 | Mode | What's included |
 |------|----------------|
-| `minimal` | Changed files + direct configs only |
-| `balanced` | Changed files + deps + reverse deps + tests + summaries |
-| `deep` | Everything in balanced + docs + more full-content files |
+| `minimal` | Changed files + direct configs, with a small summary cap |
+| `balanced` | Changed files + deps + reverse deps + tests + capped summaries |
+| `deep` | Everything in balanced + docs + more full-content files, uncapped summaries |
+`pack` also prints diagnostics when the pack looks noisy: very short task text, no changed files, mostly filename matches, mostly summaries, many symbol matches, weak summaries excluded by the score floor, or summaries excluded by the mode cap.
 ---
@@ -832,7 +836,9 @@ Show session state, token statistics, and selection accuracy for the last pack.
 agentpack stats
 ```
-When a session is active, shows session panel (agent, mode, started, refresh count) above token stats. Also lists top included files by score and avg recall/precision/F1 over the last 10 runs.
+When a session is active, shows session panel (agent, mode, started, refresh count) above token stats. Also lists top included files from the latest pack and avg recall/precision/F1 over the last 10 runs.
+Newer metrics include token-weighted precision. File precision answers "how many selected files were later changed"; token precision answers "how many selected tokens were spent on files later changed." `stats` also breaks token precision down by inclusion mode (`full`, `symbols`, `summary`) so summary noise is visible.
 ---
@@ -917,7 +923,7 @@ agentpack monitor --clear
 | Large unrelated file | −50 |
 | Ignored/binary | −100 |
-Keyword scoring uses concept synonym expansion — "rate limiting" in the task expands to `throttle`, `leaky`, `bucket`, `quota` etc., so `leaky_bucket.py` ranks correctly even if the file name doesn't literally contain "rate".
+Keyword scoring uses weighted concept synonym expansion — literal task terms are strongest, normalized variants are slightly weaker, and broad concept synonyms are weaker again. "rate limiting" still expands to `throttle`, `leaky`, `bucket`, `quota`, but broad expansions no longer dominate literal task terms. Matching is token-based, so `task` does not accidentally match every `tasks.py`.
 ---
@@ -934,6 +940,10 @@ ignore_file = ".agentignore"
 default_budget = 25000
 default_mode = "balanced"
 max_file_tokens = 4000
+min_summary_score = 60
+max_summary_files_minimal = 15
+max_summary_files_balanced = 40
+max_summary_files_deep = 0
 include_tests = true
 include_configs = true
 include_receipts = true

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/README.md RENAMED

@@ -5,7 +5,7 @@
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![CI](https://github.com/vishal2612200/agentpack/actions/workflows/ci.yml/badge.svg)](https://github.com/vishal2612200/agentpack/actions/workflows/ci.yml)
-> **Status: alpha (v0.1.20).** Works, tested, used in real sessions. Python and JavaScript/TypeScript are the best-supported languages. Not yet validated across a wide range of repos. API may change before 1.0.
+> **Status: alpha (v0.1.21).** Works, tested, used in real sessions. Python and JavaScript/TypeScript are the best-supported languages. Not yet validated across a wide range of repos. API may change before 1.0.
 >
 > **Platform note:** macOS and Linux are fully supported. Windows support is not yet implemented (git hooks use POSIX shell; the Claude Code session hooks use `python3`/`rm -f`). Contributions welcome.
@@ -33,8 +33,9 @@ AgentPack solves this with a one-time offline analysis pass:
 1. **Scans your repo once** — builds a summary cache of every file (signatures, imports, responsibilities). No API calls. Takes a few seconds.
 2. **On each task** — uses git diff + import graph traversal + keyword scoring to rank every file by relevance to what you're working on.
-3. **Packs a tight context document** — changed files get full content, dependencies get summaries, everything else gets dropped. Typically 8k–20k tokens for a 200-file repo.
-4. **Stays current** — auto-repacks silently on commit, so next session starts fresh.
+3. **Packs a tight context document** — changed files get full content, large changed files get relevant symbol bodies, dependencies get summaries, everything else gets dropped. Typically 8k–20k tokens for a 200-file repo.
+4. **Explains pack quality** — noisy-pack diagnostics, score receipts, and token-precision metrics show when the pack is broad and where token noise lives.
+5. **Stays current** — auto-repacks silently on commit, so next session starts fresh.
 The result: your agent starts every session with a focused, accurate picture of the relevant code — without you doing anything after opt-in.
@@ -507,6 +508,7 @@ Some checks failed. Run the suggested commands above to fix.
 The new checks in `doctor`:
 - **Local vs global hooks**: warns when Claude hooks are only in the per-project `.claude/settings.json` — context won't auto-inject in other repos
 - **Slash command presence**: checks both local (`.claude/commands/`) and global (`~/.claude/commands/`) installations
+- **Source checkout mismatch**: warns when you're inside an AgentPack source checkout but the `agentpack` executable imports the installed site-packages copy. Use `PYTHONPATH=src python -m agentpack.cli ...` or `pip install -e .` for local development.
 ---
@@ -593,9 +595,11 @@ Options:
 | Mode | What's included |
 |------|----------------|
-| `minimal` | Changed files + direct configs only |
-| `balanced` | Changed files + deps + reverse deps + tests + summaries |
-| `deep` | Everything in balanced + docs + more full-content files |
+| `minimal` | Changed files + direct configs, with a small summary cap |
+| `balanced` | Changed files + deps + reverse deps + tests + capped summaries |
+| `deep` | Everything in balanced + docs + more full-content files, uncapped summaries |
+`pack` also prints diagnostics when the pack looks noisy: very short task text, no changed files, mostly filename matches, mostly summaries, many symbol matches, weak summaries excluded by the score floor, or summaries excluded by the mode cap.
 ---
@@ -793,7 +797,9 @@ Show session state, token statistics, and selection accuracy for the last pack.
 agentpack stats
 ```
-When a session is active, shows session panel (agent, mode, started, refresh count) above token stats. Also lists top included files by score and avg recall/precision/F1 over the last 10 runs.
+When a session is active, shows session panel (agent, mode, started, refresh count) above token stats. Also lists top included files from the latest pack and avg recall/precision/F1 over the last 10 runs.
+Newer metrics include token-weighted precision. File precision answers "how many selected files were later changed"; token precision answers "how many selected tokens were spent on files later changed." `stats` also breaks token precision down by inclusion mode (`full`, `symbols`, `summary`) so summary noise is visible.
 ---
@@ -878,7 +884,7 @@ agentpack monitor --clear
 | Large unrelated file | −50 |
 | Ignored/binary | −100 |
-Keyword scoring uses concept synonym expansion — "rate limiting" in the task expands to `throttle`, `leaky`, `bucket`, `quota` etc., so `leaky_bucket.py` ranks correctly even if the file name doesn't literally contain "rate".
+Keyword scoring uses weighted concept synonym expansion — literal task terms are strongest, normalized variants are slightly weaker, and broad concept synonyms are weaker again. "rate limiting" still expands to `throttle`, `leaky`, `bucket`, `quota`, but broad expansions no longer dominate literal task terms. Matching is token-based, so `task` does not accidentally match every `tasks.py`.
 ---
@@ -895,6 +901,10 @@ ignore_file = ".agentignore"
 default_budget = 25000
 default_mode = "balanced"
 max_file_tokens = 4000
+min_summary_score = 60
+max_summary_files_minimal = 15
+max_summary_files_balanced = 40
+max_summary_files_deep = 0
 include_tests = true
 include_configs = true
 include_receipts = true

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "agentpack-cli"
-version = "0.1.20"
+version = "0.1.21"
 description = "Token-aware context packing for AI coding agents — Claude, Cursor, Windsurf, and Codex"
 readme = "README.md"
 requires-python = ">=3.10"

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/src/agentpack/__init__.py RENAMED

@@ -1,3 +1,3 @@
 """AgentPack — token-aware context packing for AI coding agents."""
-__version__ = "0.1.20"
+__version__ = "0.1.21"

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/src/agentpack/analysis/ranking.py RENAMED

@@ -181,29 +181,38 @@ CONFIG_NAMES = {
 _DEFAULT_WEIGHTS = ScoringWeights()
-def extract_keywords(task: str) -> set[str]:
+def _add_keyword_weight(weights: dict[str, float], keyword: str, weight: float) -> None:
+    weights[keyword] = max(weights.get(keyword, 0.0), weight)
+def extract_keyword_weights(task: str) -> dict[str, float]:
     words = re.split(r"[^a-zA-Z0-9]+", task.lower())
-    keywords: set[str] = set()
+    keyword_weights: dict[str, float] = {}
     for word in words:
         if len(word) < 3:
             continue
         if word in _STOPWORDS:
             continue
-        keywords.add(word)
+        _add_keyword_weight(keyword_weights, word, 1.0)
         if word in _VARIANTS:
-            keywords.add(_VARIANTS[word])
+            _add_keyword_weight(keyword_weights, _VARIANTS[word], 0.75)
-    # expand via concept map (one level only — no recursion to avoid explosion)
-    expanded: set[str] = set()
-    for kw in keywords:
+    # Expand via concept map one level only. Expanded concepts are weaker than
+    # literal task words so broad terms like "task" do not dominate ranking.
+    expanded: dict[str, float] = {}
+    for kw in keyword_weights:
         if kw in _CONCEPT_MAP:
             for synonym in _CONCEPT_MAP[kw]:
-                expanded.add(synonym)
-                # also apply _VARIANTS to expanded terms
+                _add_keyword_weight(expanded, synonym, 0.35)
                 if synonym in _VARIANTS:
-                    expanded.add(_VARIANTS[synonym])
-    keywords.update(expanded)
-    return keywords
+                    _add_keyword_weight(expanded, _VARIANTS[synonym], 0.35)
+    for kw, weight in expanded.items():
+        _add_keyword_weight(keyword_weights, kw, weight)
+    return keyword_weights
+def extract_keywords(task: str) -> set[str]:
+    return set(extract_keyword_weights(task))
 def enrich_keywords_from_files(
@@ -255,21 +264,62 @@ def enrich_keywords_from_files(
     return keywords | set(top)
-def _path_matches_keywords(path: str, keywords: set[str]) -> bool:
-    path_lower = path.lower()
-    return any(kw in path_lower for kw in keywords)
+def enrich_keyword_weights_from_files(
+    keyword_weights: dict[str, float],
+    changed_paths: set[str],
+    files: list[FileInfo],
+    max_new_keywords: int = 20,
+) -> dict[str, float]:
+    enriched = dict(keyword_weights)
+    enriched_keywords = enrich_keywords_from_files(set(keyword_weights), changed_paths, files, max_new_keywords)
+    for keyword in enriched_keywords - set(keyword_weights):
+        enriched[keyword] = 0.5
+    return enriched
+def _tokens_for_match(text: str) -> set[str]:
+    """Return identifier-ish tokens for exact keyword matching."""
+    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
+    raw_tokens = re.split(r"[^a-zA-Z0-9]+", spaced.lower())
+    return {tok for tok in raw_tokens if tok}
+def _keyword_token_weights(keywords: set[str] | dict[str, float]) -> dict[str, float]:
+    if isinstance(keywords, dict):
+        items = keywords.items()
+    else:
+        items = ((keyword, 1.0) for keyword in keywords)
+    token_weights: dict[str, float] = {}
+    for keyword, weight in items:
+        for token in _tokens_for_match(keyword):
+            if len(token) >= 3:
+                token_weights[token] = max(token_weights.get(token, 0.0), weight)
+    return token_weights
+def _match_weight(text: str, keywords: set[str] | dict[str, float]) -> float:
+    token_weights = _keyword_token_weights(keywords)
+    matches = _tokens_for_match(text) & set(token_weights)
+    return max((token_weights[token] for token in matches), default=0.0)
-def _content_matches_keywords(text: str, keywords: set[str]) -> int:
-    text_lower = text.lower()
-    return sum(1 for kw in keywords if kw in text_lower)
+def _path_matches_keywords(path: str, keywords: set[str] | dict[str, float]) -> float:
+    return _match_weight(path, keywords)
-def _symbol_matches_keywords(symbols: list[str], keywords: set[str]) -> bool:
+def _content_matches_keywords(text: str, keywords: set[str] | dict[str, float]) -> tuple[int, float]:
+    token_weights = _keyword_token_weights(keywords)
+    text_tokens = _tokens_for_match(text)
+    matches = text_tokens & set(token_weights)
+    return len(matches), sum(token_weights[token] for token in matches)
+def _symbol_matches_keywords(symbols: list[str], keywords: set[str] | dict[str, float]) -> float:
+    best_weight = 0.0
     for sym in symbols:
-        if any(kw in sym.lower() for kw in keywords):
-            return True
-    return False
+        best_weight = max(best_weight, _match_weight(sym, keywords))
+    return best_weight
 def score_files(
@@ -278,7 +328,7 @@ def score_files(
     staged_paths: set[str],
     recently_modified: list[str],
     dep_graph: "DependencyGraph | dict",
-    keywords: set[str],
+    keywords: set[str] | dict[str, float],
     include_tests: bool = True,
     include_configs: bool = True,
     weights: ScoringWeights | None = None,
@@ -315,8 +365,9 @@ def score_files(
             score += w.staged
             reasons.append("staged")
-        if _path_matches_keywords(fi.path, keywords):
-            score += w.filename_keyword
+        filename_weight = _path_matches_keywords(fi.path, keywords)
+        if filename_weight > 0:
+            score += w.filename_keyword * filename_weight
             reasons.append("filename keyword match")
         node = dep_graph.get(fi.path)
@@ -327,27 +378,28 @@ def score_files(
                 (s["name"] if isinstance(s, dict) else s.name)
                 for s in raw_syms
             ]
-        if _symbol_matches_keywords(sym_names, keywords):
-            score += w.symbol_keyword
+        symbol_weight = _symbol_matches_keywords(sym_names, keywords)
+        if symbol_weight > 0:
+            score += w.symbol_keyword * symbol_weight
             reasons.append("symbol keyword match")
         if fi.content is not None:
-            hits = _content_matches_keywords(fi.content, keywords)
+            hits, hit_weight = _content_matches_keywords(fi.content, keywords)
             if hits > 0:
-                score += min(w.content_keyword_max, hits * w.content_keyword_per_hit)
+                score += min(w.content_keyword_max, hit_weight * w.content_keyword_per_hit)
                 reasons.append(f"content keyword match ({hits})")
         elif fi.abs_path.exists():
             try:
                 text = fi.abs_path.read_text(errors="replace")
-                hits = _content_matches_keywords(text, keywords)
+                hits, hit_weight = _content_matches_keywords(text, keywords)
                 if hits > 0:
-                    score += min(w.content_keyword_max, hits * w.content_keyword_per_hit)
+                    score += min(w.content_keyword_max, hit_weight * w.content_keyword_per_hit)
                     reasons.append(f"content keyword match ({hits})")
             except OSError:
                 pass
         for dep_path in node.imports:
-            if dep_path in changed_paths or _path_matches_keywords(dep_path, keywords):
+            if dep_path in changed_paths or _path_matches_keywords(dep_path, keywords) > 0:
                 score += w.direct_dep
                 reasons.append("direct dependency of changed file")
                 break
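
Reviewer note: the ranking.py hunks above replace substring matching with exact token matching and attach a weight to each keyword (1.0 for literal task words, 0.75 for normalized variants, 0.35 for concept synonyms). The sketch below is a simplified, self-contained illustration of that behavior, not the package's code. `STOPWORDS`, `VARIANTS`, and `CONCEPT_MAP` are tiny hypothetical stand-ins for `_STOPWORDS`, `_VARIANTS`, and `_CONCEPT_MAP`, whose real contents are not shown in this diff, and the sketch skips the step that applies variants to expanded synonyms.

```python
import re

# Hypothetical stand-ins; the real tables are not shown in this diff.
STOPWORDS = {"the", "and", "for"}
VARIANTS = {"limiting": "limit"}
CONCEPT_MAP = {"rate": ["throttle", "quota"]}


def keyword_weights(task: str) -> dict[str, float]:
    """Literal words get 1.0, normalized variants 0.75, concept synonyms 0.35."""
    weights: dict[str, float] = {}

    def add(keyword: str, weight: float) -> None:
        # Keep the strongest weight seen for a keyword.
        weights[keyword] = max(weights.get(keyword, 0.0), weight)

    for word in re.split(r"[^a-zA-Z0-9]+", task.lower()):
        if len(word) < 3 or word in STOPWORDS:
            continue
        add(word, 1.0)
        if word in VARIANTS:
            add(VARIANTS[word], 0.75)
    # One level of concept expansion over a snapshot of the current keywords.
    for keyword in list(weights):
        for synonym in CONCEPT_MAP.get(keyword, []):
            add(synonym, 0.35)
    return weights


def tokens(text: str) -> set[str]:
    """Split on camelCase boundaries and non-alphanumerics, then lowercase."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return {tok for tok in re.split(r"[^a-zA-Z0-9]+", spaced.lower()) if tok}


def match_weight(text: str, weights: dict[str, float]) -> float:
    """Strongest weight among keywords that appear as whole tokens in text."""
    return max((weights[tok] for tok in tokens(text) & set(weights)), default=0.0)


weights = keyword_weights("fix rate limiting for the task queue")
print(match_weight("src/tasks.py", weights))        # 0.0: "tasks" is not the token "task"
print(match_weight("src/throttle.py", weights))     # 0.35: matched via a concept synonym
print(match_weight("src/RateLimiter.py", weights))  # 1.0: camelCase split yields "rate"
```

Because `score_files` now multiplies `w.filename_keyword` and `w.symbol_keyword` by this weight, a synonym-only match contributes roughly a third of what a literal match does.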

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/src/agentpack/application/pack_service.py RENAMED

@@ -16,7 +16,12 @@ from agentpack.core import git
 from agentpack.core.context_pack import select_files, save_pack_metadata
 from agentpack.core.models import ContextPack, DependencyGraph, FileInfo, ScanResult, SelectedFile, Receipt
 from agentpack.core.token_estimator import estimate_tokens
-from agentpack.analysis.ranking import score_files, extract_keywords, enrich_keywords_from_files, boost_paired_tests
+from agentpack.analysis.ranking import (
+    score_files,
+    extract_keyword_weights,
+    enrich_keyword_weights_from_files,
+    boost_paired_tests,
+)
 from agentpack.analysis.tests import find_related_tests
 from agentpack.analysis import dependency_graph as dep_graph_mod
 from agentpack.summaries.base import build_all_summaries
@@ -131,8 +136,9 @@ class FileRanker:
         root: Path | None = None,
     ) -> RankResult:
         from agentpack.core import git as _git
-        keywords = extract_keywords(task)
-        keywords = enrich_keywords_from_files(keywords, changes.all_changed, packable)
+        keyword_weights = extract_keyword_weights(task)
+        keyword_weights = enrich_keyword_weights_from_files(keyword_weights, changes.all_changed, packable)
+        keywords = set(keyword_weights)
         all_paths = {f.path for f in packable}
         for fi in packable:
@@ -149,7 +155,7 @@ class FileRanker:
             staged_paths=changes.git_staged,
             recently_modified=changes.recently_modified,
             dep_graph=dep_graph,
-            keywords=keywords,
+            keywords=keyword_weights,
             include_tests=cfg.context.include_tests,
             include_configs=cfg.context.include_configs,
             weights=cfg.scoring,
@@ -209,6 +215,8 @@ class PackPlanner:
             budget=effective_budget,
             max_file_tokens=cfg.context.max_file_tokens,
             keywords=rank_result.keywords,
+            min_summary_score=cfg.context.min_summary_score,
+            max_summary_files=_summary_cap_for_mode(cfg, request.mode),
         )
         phase_times["select"] = time.perf_counter() - t0
@@ -317,6 +325,8 @@ class PackService:
             selected_count=len(plan.selected),
             changed_count=len(plan.all_changed),
             selected_paths=[sf.path for sf in plan.selected],
+            selected_tokens={sf.path: _sf_tokens(sf) for sf in plan.selected},
+            selected_modes={sf.path: sf.include_mode for sf in plan.selected},
             selected_hints=[{"path": sf.path, "why": sf.reasons[0] if sf.reasons else ""} for sf in plan.selected[:8]],
             current_changed=plan.all_changed,
             excluded_count=len(excluded_receipts),
@@ -347,6 +357,16 @@ def _sf_tokens(sf: SelectedFile) -> int:
     return estimate_tokens("\n".join(parts)) if parts else 50
+def _summary_cap_for_mode(cfg: Any, mode: str) -> int:
+    if mode == "minimal":
+        return cfg.context.max_summary_files_minimal
+    if mode == "balanced":
+        return cfg.context.max_summary_files_balanced
+    if mode == "deep":
+        return cfg.context.max_summary_files_deep
+    return 0
 def _load_last_record(metrics_path: Path) -> dict[str, Any] | None:
     """Return the most recent metrics record that has selected_paths."""
     if not metrics_path.exists():
@@ -390,11 +410,41 @@ def _compute_selection_accuracy(
     recall = len(hits) / len(actual_changed)
     precision = len(hits) / len(prev_selected)
     f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) > 0 else 0.0
-    return {
+    result = {
         "selection_recall": round(recall, 3),
         "selection_precision": round(precision, 3),
         "selection_f1": round(f1, 3),
     }
+    token_map = prev.get("selected_tokens") or {}
+    if isinstance(token_map, dict):
+        total_tokens = sum(v for v in token_map.values() if isinstance(v, int | float))
+        hit_tokens = sum(
+            token_map.get(path, 0)
+            for path in hits
+            if isinstance(token_map.get(path, 0), int | float)
+        )
+        if total_tokens > 0:
+            token_precision = hit_tokens / total_tokens
+            result["selection_token_precision"] = round(token_precision, 3)
+            result["selection_noise_pct"] = round((1 - token_precision) * 100, 1)
+        mode_map = prev.get("selected_modes") or {}
+        if isinstance(mode_map, dict):
+            for mode in ("full", "symbols", "summary"):
+                mode_paths = {path for path, value in mode_map.items() if value == mode}
+                mode_total = sum(
+                    token_map.get(path, 0)
+                    for path in mode_paths
+                    if isinstance(token_map.get(path, 0), int | float)
+                )
+                if mode_total <= 0:
+                    continue
+                mode_hit_tokens = sum(
+                    token_map.get(path, 0)
+                    for path in mode_paths & hits
+                    if isinstance(token_map.get(path, 0), int | float)
+                )
+                result[f"selection_token_precision_{mode}"] = round(mode_hit_tokens / mode_total, 3)
+    return result
 def _record_metrics(
@@ -409,6 +459,8 @@ def _record_metrics(
     selected_count: int,
     changed_count: int,
     selected_paths: list[str],
+    selected_tokens: dict[str, int],
+    selected_modes: dict[str, str],
     current_changed: set[str],
     selected_hints: list[dict] | None = None,
     excluded_count: int = 0,
@@ -428,6 +480,8 @@ def _record_metrics(
         "excluded_files": excluded_count,
         "excluded_paths": excluded_paths or [],
         "selected_paths": selected_paths,
+        "selected_tokens": selected_tokens,
+        "selected_modes": selected_modes,
         "selected_hints": selected_hints or [],
         "phases": {k: round(v, 3) for k, v in phase_times.items()},
         "total_s": round(sum(phase_times.values()), 3),
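
Reviewer note: the new `selection_token_precision` metric weights each selected file by its token cost instead of counting files equally. The sketch below reproduces only the arithmetic, with made-up paths and token counts. The function name `token_precision` is illustrative and is not part of the package.

```python
def token_precision(selected_tokens: dict[str, int], later_changed: set[str]) -> dict[str, float]:
    """Compare file-level precision with token-weighted precision."""
    total = sum(selected_tokens.values())
    if not selected_tokens or total <= 0:
        return {}
    # A "hit" is a selected file that was later changed.
    hits = set(selected_tokens) & later_changed
    hit_tokens = sum(selected_tokens[path] for path in hits)
    file_precision = len(hits) / len(selected_tokens)
    tok_precision = hit_tokens / total
    return {
        "selection_precision": round(file_precision, 3),
        "selection_token_precision": round(tok_precision, 3),
        "selection_noise_pct": round((1 - tok_precision) * 100, 1),
    }


# Hypothetical pack: one large relevant file plus three small irrelevant ones.
selected = {"auth.py": 3000, "session.py": 500, "utils.py": 250, "docs.md": 250}
metrics = token_precision(selected, later_changed={"auth.py"})
print(metrics)
```

In this example only one of four files is a hit (file precision 0.25), yet 3000 of 4000 tokens went to it (token precision 0.75, noise 25.0%), which is the distinction the README text draws between the two metrics.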

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/src/agentpack/commands/doctor.py RENAMED

@@ -3,6 +3,7 @@ from __future__ import annotations
 import os
 import shutil
 import subprocess
+import sys
 from pathlib import Path
 import typer
@@ -37,6 +38,15 @@ def register(app: typer.Typer) -> None:
             console.print("  [red]✗[/] agentpack not on PATH — run: pipx install agentpack-cli")
             ok = False
+        try:
+            root = _root()
+            warning = _source_checkout_warning(root, Path(__file__), sys.executable, binary)
+            if warning:
+                console.print(f"  [yellow]![/] {warning}")
+                ok = False
+        except Exception:
+            pass
         # --- Git template hooks ---
         console.print("\n[bold]Git template hooks (~/.git-templates/hooks/)[/]")
         hooks_dir = _GIT_TEMPLATE_DIR / "hooks"
@@ -234,6 +244,30 @@ def _check_agent_file(root: Path, filename: str, agent: str) -> None:
         console.print(f"  [dim]-[/] {filename} not present (optional)")
+def _source_checkout_warning(
+    root: Path,
+    package_file: Path,
+    executable: str,
+    binary: str | None,
+) -> str | None:
+    source_pkg = root / "src" / "agentpack"
+    if not source_pkg.exists():
+        return None
+    try:
+        package_path = package_file.resolve()
+        source_path = source_pkg.resolve()
+    except OSError:
+        return None
+    if package_path.is_relative_to(source_path):
+        return None
+    binary_text = f" via {binary}" if binary else ""
+    return (
+        "source checkout detected, but CLI imports installed package "
+        f"at {package_path}{binary_text}. Use `PYTHONPATH=src python -m agentpack.cli ...` "
+        "or install editable with `pip install -e .`."
+    )
 def _print_summary(ok: bool) -> None:
     console.print("")
     if ok:
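
Reviewer note: the core of `_source_checkout_warning` is a path-containment test using `Path.is_relative_to` (available since Python 3.9). The sketch below isolates that test with a throwaway directory layout. `imports_from_checkout` is a hypothetical helper name, and the directory names are invented for the example.

```python
import tempfile
from pathlib import Path


def imports_from_checkout(root: Path, package_file: Path) -> bool:
    """True when package_file lives under root/src/agentpack."""
    source_pkg = (root / "src" / "agentpack").resolve()
    return package_file.resolve().is_relative_to(source_pkg)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "checkout"
    (root / "src" / "agentpack" / "commands").mkdir(parents=True)
    # Module loaded from the checkout itself: no warning would be needed.
    local = root / "src" / "agentpack" / "commands" / "doctor.py"
    # Module loaded from an installed copy elsewhere: this is the mismatch case.
    installed = Path(tmp) / "site-packages" / "agentpack" / "commands" / "doctor.py"
    in_checkout = imports_from_checkout(root, local)
    from_site_packages = imports_from_checkout(root, installed)
    print(in_checkout, from_site_packages)
```

Resolving both paths before comparing means symlinked checkouts and relative paths are compared on their real locations, which matches the `resolve()` calls in the hunk above.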

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/src/agentpack/commands/explain.py RENAMED

@@ -167,6 +167,8 @@ def register(app: typer.Typer) -> None:
             budget=deep_budget,
             max_file_tokens=cfg.context.max_file_tokens,
             keywords=plan.keywords,
+            min_summary_score=cfg.context.min_summary_score,
+            max_summary_files=0,
         )
         deep_selected_paths = {
             r.path for r in deep_receipts if r.action in ("included", "summarized")

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/src/agentpack/commands/pack.py RENAMED

@@ -133,6 +133,16 @@ def _print_pack_summary(result: PackResult) -> None:
     console.print()
     console.print(Columns([stats, files_tbl], equal=False, expand=False))
+    diagnostics = _pack_diagnostics(result)
+    if diagnostics:
+        diag_text = "\n".join(f"  [yellow]![/] {line}" for line in diagnostics)
+        console.print(Panel(
+            diag_text,
+            title="[bold yellow]Pack diagnostics[/]",
+            border_style="yellow",
+            padding=(0, 1),
+        ))
     if changed_files:
         console.print(f"\n[bold]Changed files[/] ({len(changed_files)}):")
         console.print(changed_lines)
@@ -161,6 +171,37 @@ def _print_pack_summary(result: PackResult) -> None:
     console.print()
+def _pack_diagnostics(result: PackResult) -> list[str]:
+    selected = result.pack.selected_files
+    receipts = result.pack.receipts
+    diagnostics: list[str] = []
+    summary_count = sum(1 for sf in selected if sf.include_mode == "summary")
+    filename_matches = sum(1 for sf in selected if "filename keyword match" in sf.reasons)
+    symbol_matches = sum(1 for sf in selected if "symbol keyword match" in sf.reasons)
+    score_floor_excluded = sum(1 for r in receipts if r.reason == "summary score below floor")
+    summary_cap_excluded = sum(1 for r in receipts if r.reason == "summary cap reached")
+    task_words = [
+        part for part in result.pack.task.replace("_", " ").replace("-", " ").split()
+        if len(part) >= 3
+    ]
+    if len(task_words) <= 3:
+        diagnostics.append("Task is very short; add subsystem, file, or symptom words for better precision.")
+    if not result.changed_files:
+        diagnostics.append("No changed files detected; pack relies mostly on task keywords and cached summaries.")
+    if selected and filename_matches / len(selected) >= 0.6:
+        diagnostics.append("Most selected files matched by filename; task terms may be broad.")
+    if selected and summary_count / len(selected) >= 0.7:
+        diagnostics.append("Pack is mostly summaries; use minimal mode or a more specific task for edit work.")
+    if symbol_matches > 25:
+        diagnostics.append(f"Many symbol matches selected ({symbol_matches}); inspect repeated task terms with explain.")
+    if score_floor_excluded:
+        diagnostics.append(f"{score_floor_excluded} weak summaries excluded by score floor.")
+    if summary_cap_excluded:
+        diagnostics.append(f"{summary_cap_excluded} summaries excluded by mode cap.")
+    return diagnostics[:5]
 def _pack_watch(
     agent: str,
     task: str,

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/src/agentpack/commands/stats.py RENAMED

@@ -45,13 +45,8 @@ def register(app: typer.Typer) -> None:
                     + content.count("Included as: **symbols**")
                 )
-        full_files = [f for f in scan_result.packable
-                      if f.estimated_tokens <= cfg.context.max_file_tokens]
-        manual_estimate = min(after_ignore, sum(f.estimated_tokens for f in full_files[:20]))
-        vs_manual = (1 - packed / manual_estimate) * 100 if manual_estimate > 0 else 0
         # --- Session info ---
-        from agentpack.session.state import load_session, CONTEXT_FILE
+        from agentpack.session.state import load_session
         session = load_session(root)
         if session:
@@ -80,9 +75,19 @@ def register(app: typer.Typer) -> None:
                 except Exception:
                     pass
-        context_path_obj = root / CONTEXT_FILE
-        if context_path_obj.exists():
-            top_files = _parse_top_files(context_path_obj)
+        if meta:
+            context_path_obj = root / meta.get("context_path", "")
+            if context_path_obj.exists():
+                top_files = _parse_top_files(context_path_obj)
+        token_by_path = {f.path: f.estimated_tokens for f in scan_result.packable}
+        top_estimate = sum(token_by_path.get(path, 0) for path, _mode, _why in top_files[:20])
+        if top_estimate <= 0:
+            full_files = [f for f in scan_result.packable
+                          if f.estimated_tokens <= cfg.context.max_file_tokens]
+            top_estimate = sum(f.estimated_tokens for f in full_files[:20])
+        top_estimate = min(after_ignore, top_estimate)
+        vs_top_files = (1 - packed / top_estimate) * 100 if top_estimate > 0 else 0
         # --- Token table ---
         token_tbl = Table(title="Last Context", box=box.SIMPLE, show_header=False, padding=(0, 2))
@@ -92,7 +97,7 @@ def register(app: typer.Typer) -> None:
         token_tbl.add_row("after ignore", f"{after_ignore:,}")
         token_tbl.add_row("packed tokens", f"{packed:,}")
         token_tbl.add_row("vs raw repo", f"[green]{saving:.1f}% smaller[/]")
-        token_tbl.add_row("vs manual (~20 files)", f"[green]{vs_manual:.1f}% smaller[/]")
+        token_tbl.add_row("vs top-20 full files", f"[green]{vs_top_files:.1f}% smaller[/]")
         token_tbl.add_row("files ignored", f"{ignored_count:,}")
         token_tbl.add_row("files full", f"{included_count:,}")
         token_tbl.add_row("files summarized", f"{summarized_count:,}")
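
Note: the new "vs top-20 full files" figure compares the packed token count against the raw size of the top included files. A minimal sketch of that arithmetic follows, assuming plain dicts and lists in place of the scan result. The helper names `top_files_baseline` and `percent_smaller` are hypothetical, and the fallback branch used when no top files resolve is left out.

```python
def top_files_baseline(
    token_by_path: dict[str, int],
    top_paths: list[str],
    after_ignore: int,
    limit: int = 20,
) -> int:
    """Sum raw token estimates for the first `limit` top files, capped by the repo total."""
    # Paths missing from the scan contribute 0, matching token_by_path.get(path, 0) in the diff.
    estimate = sum(token_by_path.get(path, 0) for path in top_paths[:limit])
    return min(after_ignore, estimate)


def percent_smaller(packed: int, baseline: int) -> float:
    """Percentage by which `packed` is smaller than `baseline`; 0 when the baseline is empty."""
    return (1 - packed / baseline) * 100 if baseline > 0 else 0.0


tokens = {"a.py": 3000, "b.py": 5000, "c.py": 2000}
baseline = top_files_baseline(tokens, ["a.py", "b.py", "missing.py"], after_ignore=100_000)
saving = percent_smaller(2000, baseline)
```

In this made-up example the baseline is 8,000 tokens and a 2,000-token pack is reported as 75.0% smaller.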
@@ -115,17 +120,34 @@ def register(app: typer.Typer) -> None:
             avg_recall = sum(r["selection_recall"] for r in accuracy_rows) / len(accuracy_rows)
             avg_precision = sum(r["selection_precision"] for r in accuracy_rows) / len(accuracy_rows)
             avg_f1 = sum(r["selection_f1"] for r in accuracy_rows) / len(accuracy_rows)
+            token_rows = [r for r in accuracy_rows if "selection_token_precision" in r]
+            avg_token_precision = (
+                sum(r["selection_token_precision"] for r in token_rows) / len(token_rows)
+                if token_rows else None
+            )
+            mode_token_precision: dict[str, float] = {}
+            for mode in ("full", "symbols", "summary"):
+                key = f"selection_token_precision_{mode}"
+                rows = [r for r in accuracy_rows if key in r]
+                if rows:
+                    mode_token_precision[mode] = sum(r[key] for r in rows) / len(rows)
             console.print()
             acc_tbl = Table(title=f"Selection Accuracy (last {len(accuracy_rows)} runs)", box=box.SIMPLE, show_header=False, padding=(0, 2))
             acc_tbl.add_column(style="dim")
             acc_tbl.add_column(justify="right", style="bold")
             acc_tbl.add_row("avg recall", f"{avg_recall:.1%}")
             acc_tbl.add_row("avg precision", f"{avg_precision:.1%}")
+            if avg_token_precision is not None:
+                acc_tbl.add_row("avg token precision", f"{avg_token_precision:.1%}")
+                for mode, value in mode_token_precision.items():
+                    acc_tbl.add_row(f"{mode} token precision", f"{value:.1%}")
             acc_tbl.add_row("avg F1", f"{avg_f1:.1%}")
             console.print(acc_tbl)
             console.print("[dim]recall = how many changed files were in the previous pack[/]")
+            if avg_token_precision is not None:
+                console.print("[dim]token precision = share of previous pack tokens spent on files later changed[/]")
-        console.print("[dim]'manual' = hand-picking 20 most relevant full files[/]")
+        console.print("[dim]'top-20 full files' = raw full contents for top included files, capped at 20[/]")
 def _load_accuracy_rows(metrics_path: Path, n: int = 10) -> list[dict]:
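
Note: the stats output describes token precision as the share of previous pack tokens spent on files later changed. The sketch below illustrates that definition and the averaging over only those rows that carry the key, as the diff does. How the package actually records the metric is not shown here, so `token_precision` and `average_if_present` are hypothetical helpers.

```python
from typing import Optional


def token_precision(pack_tokens: dict[str, int], changed: set[str]) -> float:
    """Share of pack tokens that went to files which were later changed."""
    total = sum(pack_tokens.values())
    if total == 0:
        return 0.0
    hit = sum(tokens for path, tokens in pack_tokens.items() if path in changed)
    return hit / total


def average_if_present(rows: list[dict], key: str) -> Optional[float]:
    """Average `key` over rows that contain it; None when no row does."""
    values = [row[key] for row in rows if key in row]
    return sum(values) / len(values) if values else None


precision = token_precision({"a.py": 600, "b.py": 300, "c.py": 100}, {"a.py", "c.py"})
rows = [
    {"selection_token_precision": 0.5},
    {"selection_recall": 1.0},  # older row without the new metric
    {"selection_token_precision": 1.0},
]
avg = average_if_present(rows, "selection_token_precision")
```

Skipping rows without the key keeps metrics files written by older versions readable, which appears to be the reason for the `if ... in r` filters in the diff.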

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/src/agentpack/core/config.py RENAMED

@@ -22,6 +22,10 @@ class ContextConfig(BaseModel):
     default_budget: int = 25000
     default_mode: str = "balanced"
     max_file_tokens: int = 4000
+    min_summary_score: float = 60
+    max_summary_files_minimal: int = 15
+    max_summary_files_balanced: int = 40
+    max_summary_files_deep: int = 0
     include_tests: bool = True
     include_configs: bool = True
     include_receipts: bool = True
@@ -91,6 +95,10 @@ exclude_globs = []
 default_budget = 25000   # token budget per pack
 default_mode = "balanced"  # minimal | balanced | deep
 max_file_tokens = 4000   # files larger than this are summarised, not inlined
+min_summary_score = 60   # unchanged summary files below this score are excluded
+max_summary_files_minimal = 15   # 0 = no cap
+max_summary_files_balanced = 40  # 0 = no cap
+max_summary_files_deep = 0       # deep mode stays uncapped
 include_tests = true
 include_configs = true
 include_receipts = true
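
Note: the four new settings pair one score floor with a per-mode summary cap, where 0 means "no cap". A minimal sketch of resolving the cap for a mode is shown below. It uses a plain dataclass instead of the package's pydantic model, and both `ContextCaps` and `summary_cap_for` are hypothetical names. How the package itself maps a mode to its cap is not visible in this hunk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContextCaps:
    # Defaults copied from the diff.
    min_summary_score: float = 60
    max_summary_files_minimal: int = 15
    max_summary_files_balanced: int = 40
    max_summary_files_deep: int = 0


def summary_cap_for(mode: str, caps: ContextCaps) -> Optional[int]:
    """Return the summary cap for `mode`, or None when the mode is uncapped (value 0)."""
    if mode not in ("minimal", "balanced", "deep"):
        raise ValueError(f"unknown mode: {mode!r}")
    value = getattr(caps, f"max_summary_files_{mode}")
    return value if value > 0 else None


caps = ContextCaps()
```

With the defaults, `summary_cap_for("balanced", caps)` gives 40 and `summary_cap_for("deep", caps)` gives `None`.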

{agentpack_cli-0.1.20 → agentpack_cli-0.1.21}/src/agentpack/core/context_pack.py RENAMED

@@ -124,11 +124,14 @@ def select_files(
     budget: int,
     max_file_tokens: int,
     keywords: set[str] | None = None,
+    min_summary_score: float = 0,
+    max_summary_files: int = 0,
 ) -> tuple[list[SelectedFile], list[Receipt]]:
     opts = _MODE_WEIGHTS[mode]
     selected: list[SelectedFile] = []
     receipts: list[Receipt] = []
     tokens_used = 0
+    summaries_used = 0
     kw = keywords or set()
     for fi, score, reasons in sorted(scored, key=lambda x: -x[1]):
@@ -142,6 +145,12 @@ def select_files(
         is_changed = fi.path in changed_paths
         summary_data = summaries.get(fi.path)
+        will_be_summary = not is_changed and not (
+            opts["extra_full"] and fi.estimated_tokens <= max_file_tokens
+        )
+        if will_be_summary and score < min_summary_score:
+            receipts.append(Receipt(path=fi.path, action="excluded", reason="summary score below floor"))
+            continue
         # Determine inclusion mode
         if is_changed and fi.estimated_tokens <= max_file_tokens:
@@ -163,11 +172,17 @@ def select_files(
             content = None
             tok = min(fi.estimated_tokens, 200)
+        if mode_str == "summary" and max_summary_files > 0 and summaries_used >= max_summary_files:
+            receipts.append(Receipt(path=fi.path, action="excluded", reason="summary cap reached"))
+            continue
         if tokens_used + tok > budget:
             receipts.append(Receipt(path=fi.path, action="excluded", reason="budget exhausted"))
             continue
         tokens_used += tok
+        if mode_str == "summary":
+            summaries_used += 1
         # Build symbol list
         syms: list[Symbol] = []
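
Note: the score floor and summary cap added to `select_files` can be illustrated with a reduced loop. The sketch below is a simplification under stated assumptions: every unchanged file is treated as a summary (the `extra_full` path and the token budget are ignored), files are plain `(path, score)` tuples, and `filter_summaries` is a hypothetical name.

```python
def filter_summaries(
    scored: list[tuple[str, float]],
    changed: set[str],
    min_summary_score: float = 0,
    max_summary_files: int = 0,
) -> tuple[list[str], list[tuple[str, str]]]:
    """Apply the score floor, then the cap, to unchanged files in descending score order."""
    kept: list[str] = []
    receipts: list[tuple[str, str]] = []
    summaries_used = 0
    for path, score in sorted(scored, key=lambda item: -item[1]):
        is_summary = path not in changed  # simplification: unchanged means summary
        if is_summary and score < min_summary_score:
            receipts.append((path, "summary score below floor"))
            continue
        # A cap of 0 disables the limit, as in the diff.
        if is_summary and max_summary_files > 0 and summaries_used >= max_summary_files:
            receipts.append((path, "summary cap reached"))
            continue
        kept.append(path)
        if is_summary:
            summaries_used += 1
    return kept, receipts


scored = [("a.py", 90), ("b.py", 80), ("c.py", 70), ("d.py", 50), ("e.py", 10)]
kept, receipts = filter_summaries(scored, changed={"e.py"}, min_summary_score=60, max_summary_files=2)
```

Here `a.py` and `b.py` fill the cap, `c.py` is dropped by the cap, `d.py` by the floor, and the changed file `e.py` is kept despite its low score because neither check applies to changed files.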