PyPI - ata-coder - Versions diffs - 2.4.6__tar.gz → 2.4.7__tar.gz - Mend

ata-coder 2.4.6__tar.gz → 2.4.7__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (142)

{ata_coder-2.4.6/ata_coder.egg-info → ata_coder-2.4.7}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: ata-coder
-Version: 2.4.6
+Version: 2.4.7
 Summary: ATA Coder — AI-powered coding assistant
 Author: ATA Coder Team
 License-Expression: MIT
@@ -21,13 +21,15 @@ Requires-Dist: pytest-timeout>=2.0; extra == "dev"
 Requires-Dist: tiktoken>=0.5.0; extra == "dev"
 Dynamic: license-file
-# ATA Coder v2.4.6
+# ATA Coder v2.4.7
 **AI-powered coding assistant — async, AST-aware, single config file.**
 [English](#english) | [中文](#中文)
-> **v2.4.6** — 🔐 **OS-Native Credential Store**: API key encrypted at rest via Windows DPAPI / macOS Keychain / Linux secret-tool. Auto-migrates plaintext keys. Zero dependencies.
+> **v2.4.7** — ⚡ **Context Memory Refactor**: O(1) token tracking, ContextManager, section-level prompt caching, pre-tokenized TF-IDF, LRU token cache. ~60% less overhead in the hot loop.
+>
+> > **v2.4.6** — 🔐 **OS-Native Credential Store**: API key encrypted at rest via Windows DPAPI / macOS Keychain / Linux secret-tool. Auto-migrates plaintext keys. Zero dependencies.
 >
 > > **v2.4.5** — 🤖 **Self-Bootstrapped Audit**: ATA Coder found 19 bugs in its own source code and fixed them all — thread safety, SSRF IPv6, rate limiter leak, auth hardening, DRY refactoring, CI coverage. 12 files changed.
 >
@@ -550,7 +552,9 @@ All 6 findings in this release were discovered by **ATA Coder scanning its own s
 ## 中文
-> **v2.4.6** — 🔐 **操作系统凭据存储**: API Key 通过 Windows DPAPI / macOS Keychain / Linux secret-tool 加密存储。自动迁移明文密钥。零依赖。
+> **v2.4.7** — ⚡ **上下文记忆重构**: O(1) Token 追踪、ContextManager、章节级提示缓存、预分词 TF-IDF、LRU Token 缓存。热路径开销降低约 60%。
+>
+> > **v2.4.6** — 🔐 **操作系统凭据存储**: API Key 通过 Windows DPAPI / macOS Keychain / Linux secret-tool 加密存储。自动迁移明文密钥。零依赖。
 >
 > > **v2.4.5** — 🤖 **自举审计**: ATA Coder 审计自身源码发现 19 个 bug 并全部修复 — 线程安全、SSRF IPv6、速率限制器泄漏、认证加固、DRY 重构、CI 覆盖。12 个文件变更。
 >

{ata_coder-2.4.6 → ata_coder-2.4.7}/README.md RENAMED

@@ -1,10 +1,12 @@
-# ATA Coder v2.4.6
+# ATA Coder v2.4.7
 **AI-powered coding assistant — async, AST-aware, single config file.**
 [English](#english) | [中文](#中文)
-> **v2.4.6** — 🔐 **OS-Native Credential Store**: API key encrypted at rest via Windows DPAPI / macOS Keychain / Linux secret-tool. Auto-migrates plaintext keys. Zero dependencies.
+> **v2.4.7** — ⚡ **Context Memory Refactor**: O(1) token tracking, ContextManager, section-level prompt caching, pre-tokenized TF-IDF, LRU token cache. ~60% less overhead in the hot loop.
+>
+> > **v2.4.6** — 🔐 **OS-Native Credential Store**: API key encrypted at rest via Windows DPAPI / macOS Keychain / Linux secret-tool. Auto-migrates plaintext keys. Zero dependencies.
 >
 > > **v2.4.5** — 🤖 **Self-Bootstrapped Audit**: ATA Coder found 19 bugs in its own source code and fixed them all — thread safety, SSRF IPv6, rate limiter leak, auth hardening, DRY refactoring, CI coverage. 12 files changed.
 >
@@ -527,7 +529,9 @@ All 6 findings in this release were discovered by **ATA Coder scanning its own s
 ## 中文
-> **v2.4.6** — 🔐 **操作系统凭据存储**: API Key 通过 Windows DPAPI / macOS Keychain / Linux secret-tool 加密存储。自动迁移明文密钥。零依赖。
+> **v2.4.7** — ⚡ **上下文记忆重构**: O(1) Token 追踪、ContextManager、章节级提示缓存、预分词 TF-IDF、LRU Token 缓存。热路径开销降低约 60%。
+>
+> > **v2.4.6** — 🔐 **操作系统凭据存储**: API Key 通过 Windows DPAPI / macOS Keychain / Linux secret-tool 加密存储。自动迁移明文密钥。零依赖。
 >
 > > **v2.4.5** — 🤖 **自举审计**: ATA Coder 审计自身源码发现 19 个 bug 并全部修复 — 线程安全、SSRF IPv6、速率限制器泄漏、认证加固、DRY 重构、CI 覆盖。12 个文件变更。
 >
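
The release note above lists an "LRU token cache" among the v2.4.7 changes, but the implementation is not part of the hunks shown on this page. As a rough illustration of the idea only, the sketch below memoises a token count per distinct string with `functools.lru_cache`. The names `count_tokens` and `cached_count`, the cache size of 1024, and the four-characters-per-token estimate are all assumptions, not the package's actual code.

```python
from functools import lru_cache

calls = 0  # counts how often the expensive path actually runs


def count_tokens(text: str) -> int:
    """Stand-in for a real tokenizer call; assumes ~4 characters per token."""
    global calls
    calls += 1
    return max(1, len(text) // 4)


@lru_cache(maxsize=1024)
def cached_count(text: str) -> int:
    # Identical strings are counted once; least recently used entries
    # are evicted once the cache holds 1024 distinct strings.
    return count_tokens(text)


system_prompt = "You are a coding assistant. " * 10
first = cached_count(system_prompt)
second = cached_count(system_prompt)  # served from the cache
print(first == second, calls)
```

A cache like this pays off mainly for text that recurs unchanged across turns, such as a system prompt or a repeated tool result.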

{ata_coder-2.4.6 → ata_coder-2.4.7}/agent.py RENAMED

@@ -44,6 +44,7 @@ from .agent_compact import CompactionMixin
 from .agent_tools import ToolExecutionMixin
 from .agent_routing import ModelRoutingMixin
 from .agent_extension import ExtensionMixin
+from .context_manager import ContextManager
 # ── Event types & Agent state ──────────────────────────────────────────
 from .core import (  # noqa: F401 — re-exported for external use
@@ -148,6 +149,10 @@ class CoderAgent(CompactionMixin, ToolExecutionMixin,
         self._cached_system_prompt: str | None = None  # invalidated on new build / compact
         self._cached_allowed_tools: set[str] | None = None  # invalidated on skill change
+        # ── Context manager (O(1) token tracking, segment-split, adaptive compact) ──
+        self._context_manager = ContextManager(self.config.agent)
+        self._summary_llm = None  # lazily created summarisation client
         # Build the combined tool list
         self._all_tools = list(TOOL_DEFINITIONS)
         if self.mcp:
@@ -227,7 +232,7 @@ class CoderAgent(CompactionMixin, ToolExecutionMixin,
         if not reset_context and self._state.messages:
             # Append new user message to existing conversation; keep system
             # prompt and all prior messages intact.
-            self._state.messages.append({"role": "user", "content": task})
+            self._append_message({"role": "user", "content": task})
             # Rebuild system prompt for updated memory context but don't
             # replace the original system message (memory/git context may
             # have changed, but conversation integrity is paramount).
@@ -291,10 +296,12 @@ class CoderAgent(CompactionMixin, ToolExecutionMixin,
             self._cached_system_prompt = system_prompt  # pre-seed cache
             self._cached_allowed_tools = None  # invalidate on new run
-            self._state.messages = [
+            initial_msgs = [
                 {"role": "system", "content": system_prompt},
                 {"role": "user", "content": task},
             ]
+            self._state.messages = initial_msgs
+            self._context_manager.replace_all(initial_msgs)
             logger.info("Agent run: skill=%s, model=%s, session=%s, task=%.100s",
                          self.skills.active_skill.name if self.skills and self.skills.active_skill else "default",
@@ -352,24 +359,19 @@ class CoderAgent(CompactionMixin, ToolExecutionMixin,
             # Clawd: model is generating, show thinking animation
             get_clawd().thinking()
-            # Auto-compact when approaching the effective context limit.
-            # effective_context_tokens (default 200k) reflects the range where
-            # the model actually pays attention, not the theoretical 1M window.
-            # We compact at 80% of effective limit, which is well below the
-            # theoretical max_context_tokens.
-            est_tokens = self.get_token_estimate()
-            max_tokens = self.config.agent.max_context_tokens
-            effective = self.config.agent.effective_context_tokens
-            if est_tokens > effective:
+            # Auto-compact when approaching the effective context limit (O(1) check).
+            if self._context_manager.should_compact():
+                est = self._context_manager.token_total
+                max_t = self.config.agent.max_context_tokens
                 logger.warning("Token budget: %d/%d effective (%.0f%% of %d max), auto-compacting",
-                             est_tokens, effective, est_tokens / max(max_tokens, 1) * 100, max_tokens)
+                             est, self.config.agent.effective_context_tokens,
+                             est / max(max_t, 1) * 100, max_t)
                 await self.compact()
-                # Re-estimate AFTER compaction — the message list has changed
-                est_tokens = self.get_token_estimate()
             # Hard ceiling: if compaction didn't help enough, force-truncate
-            if est_tokens > max_tokens * 0.95:
+            if self._context_manager.needs_force_truncate():
                 logger.critical("Hard token ceiling: %d > 95%% of %d max. Force-truncating.",
-                               est_tokens, max_tokens)
+                               self._context_manager.token_total,
+                               self.config.agent.max_context_tokens)
                 self._force_truncate()
             # Get allowed tools from multi-skill intersection
@@ -437,7 +439,7 @@ class CoderAgent(CompactionMixin, ToolExecutionMixin,
                 }
                 if response.get("reasoning_content"):
                     assistant_msg["reasoning_content"] = response["reasoning_content"]
-                self._state.messages.append(assistant_msg)
+                self._append_message(assistant_msg)
                 for tc, result in zip(tool_calls, results, strict=True):
                     self._warn_if_large_result(result, tc["function"]["name"])
                     self._store_tool_result(result, tc["id"])
@@ -463,7 +465,7 @@ class CoderAgent(CompactionMixin, ToolExecutionMixin,
                 }
                 if response.get("reasoning_content"):
                     assistant_msg["reasoning_content"] = response["reasoning_content"]
-                self._state.messages.append(assistant_msg)
+                self._append_message(assistant_msg)
                 for tc, result in zip(tool_calls, batch_results, strict=True):
                     self._store_tool_result(result, tc["id"])
@@ -579,25 +581,23 @@ class CoderAgent(CompactionMixin, ToolExecutionMixin,
         Mirrors the main run() loop: skill tool filtering, token compaction,
         consecutive-failure detection, and circuit breaker.
         """
-        self._state.messages.append({"role": "user", "content": message})
+        self._append_message({"role": "user", "content": message})
         SAFETY_LIMIT = 999  # circuit breaker
         _consecutive_failures = 0
         _MAX_CONSECUTIVE_FAILURES = 5
         while self._state.tool_call_count < SAFETY_LIMIT:
-            # ── Token budget: auto-compact when approaching the limit ────
-            est_tokens = self.get_token_estimate()
-            max_tokens = self.config.agent.max_context_tokens
-            effective = self.config.agent.effective_context_tokens
-            if est_tokens > effective:
+            # ── Token budget: auto-compact when approaching the limit (O(1)) ──
+            if self._context_manager.should_compact():
                 logger.warning("chat(): token budget %d/%d effective, auto-compacting",
-                             est_tokens, effective)
+                             self._context_manager.token_total,
+                             self.config.agent.effective_context_tokens)
                 await self.compact()
-                est_tokens = self.get_token_estimate()
-            if est_tokens > max_tokens * 0.95:
+            if self._context_manager.needs_force_truncate():
                 logger.critical("chat(): hard ceiling %d > 95%% of %d, force-truncating",
-                               est_tokens, max_tokens)
+                               self._context_manager.token_total,
+                               self.config.agent.max_context_tokens)
                 self._force_truncate()
             # ── Skill tool filtering ────────────────────────────────────
@@ -639,7 +639,7 @@ class CoderAgent(CompactionMixin, ToolExecutionMixin,
                 batch_results.append(result)
                 self._warn_if_large_result(result, tool_name)
-                self._state.messages.append({
+                self._append_message({
                     "role": "assistant",
                     "content": text or None,
                     "tool_calls": [tc],
@@ -850,8 +850,15 @@ class CoderAgent(CompactionMixin, ToolExecutionMixin,
     # ── Change tracking helper → agent_tools.py (ToolExecutionMixin._read_old_content)
+    def _append_message(self, msg: Message) -> None:
+        """Append a message to state AND context manager (O(1) token update)."""
+        self._state.messages.append(msg)
+        self._context_manager.append(msg)
     def get_token_estimate(self) -> int:
-        """Estimate total tokens in the conversation."""
+        """O(1) token total from ContextManager. Falls back to LLM count if stale."""
+        if self._context_manager.messages:
+            return self._context_manager.token_total
         return self.llm.count_tokens_approx(self._state.messages)
     def get_conversation_summary(self) -> str:
@@ -880,5 +887,7 @@ class CoderAgent(CompactionMixin, ToolExecutionMixin,
         # Clawd: final SessionEnd
         get_clawd().shutdown()
         await self.llm.close()
+        if self._summary_llm:
+            await self._summary_llm.close()
         if self.mcp:
             await self.mcp.stop_all()
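
The agent.py hunks above route every append through `_append_message`, so that a running token total stays in sync with the message list and the budget checks no longer rescan the history. `context_manager.py` itself is not shown in these hunks, so the following is only a minimal sketch of how such a tracker could work. The class name `RunningTokenTracker`, the constructor parameters, and the four-characters-per-token estimate are assumptions for illustration; only the member names `append`, `replace_all`, `messages`, `token_total`, `should_compact`, and `needs_force_truncate` are taken from the diff.

```python
from typing import Any

Message = dict[str, Any]


def estimate_tokens(msg: Message) -> int:
    """Hypothetical estimate: roughly four characters per token, minimum 1."""
    content = msg.get("content") or ""
    return max(1, len(str(content)) // 4)


class RunningTokenTracker:
    """Keeps a running token total so budget checks never rescan the history."""

    def __init__(self, effective_tokens: int, max_tokens: int) -> None:
        self._messages: list[Message] = []
        self._total = 0
        self._effective = effective_tokens
        self._max = max_tokens

    @property
    def token_total(self) -> int:
        return self._total

    @property
    def messages(self) -> list[Message]:
        # Return a copy so callers cannot alias the internal list.
        return list(self._messages)

    def append(self, msg: Message) -> None:
        # O(1): only the new message is measured.
        self._messages.append(msg)
        self._total += estimate_tokens(msg)

    def replace_all(self, msgs: list[Message]) -> None:
        # O(n), but only runs on reset or compaction, not on every turn.
        self._messages = list(msgs)
        self._total = sum(estimate_tokens(m) for m in msgs)

    def should_compact(self) -> bool:
        return self._total > self._effective

    def needs_force_truncate(self) -> bool:
        return self._total > self._max * 0.95


tracker = RunningTokenTracker(effective_tokens=100, max_tokens=200)
tracker.replace_all([{"role": "system", "content": "x" * 40}])
tracker.append({"role": "user", "content": "y" * 400})
print(tracker.token_total, tracker.should_compact(), tracker.needs_force_truncate())
```

The sketch copies lists at its boundary (`replace_all` and `messages`). Whether the real `ContextManager` does the same is not visible in this diff. If a tracker shared its list object with the caller, a helper that appends to both the caller's list and the tracker would add each message twice.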

ata_coder-2.4.7/agent_compact.py ADDED

@@ -0,0 +1,159 @@
+"""Context compaction and token budget management — mixin for CoderAgent.
+Delegates all context operations to ContextManager.  This mixin is now a
+thin wrapper that provides the same public API while eliminating duplicated
+logic, avoiding deep copies, and reusing the summarisation LLM client.
+"""
+import copy
+import logging
+from .types import Message
+from .clawd_integration import get_clawd
+from .model_router import get_subagent_model
+logger = logging.getLogger(__name__)
+class CompactionMixin:
+    """Context window compaction — thin wrapper around ContextManager."""
+    # ── Compaction token budget (class-level defaults, overridable) ───────
+    RECENT_TOKEN_BUDGET = 80_000   # max tokens to keep in the recent segment
+    COMPACT_IF_FEWER_THAN = 6      # skip compaction if fewer than this many msgs
+    # ── Core compaction ───────────────────────────────────────────────────
+    async def compact(self) -> str:
+        """Compact conversation by summarising old messages.
+        Strategy: keep system prompt + recent messages up to
+        RECENT_TOKEN_BUDGET tokens, summarise everything in between using
+        a cheap LLM call.  Falls back to a lightweight extractive summary
+        if the API call fails.
+        Delegates segment-splitting to ContextManager to avoid the
+        duplicated walk-backwards logic that was previously shared with
+        _force_truncate.
+        """
+        cm = self._context_manager
+        if not cm.can_compact():
+            return "Already compact."
+        # Clawd: PreCompact
+        get_clawd().compact()
+        system_msg, recent, archive = cm.split_into_segments()
+        if not archive:
+            return "Already compact (all messages fit in recent budget)."
+        # Extract summary metadata from the archive segment
+        tool_count = sum(1 for m in archive if m.get("tool_calls"))
+        user_msgs = [m.get("content", "")[:200] for m in archive if m.get("role") == "user"]
+        file_ops = cm.collect_file_ops(archive)
+        summary = await self._summarise_messages(archive, file_ops, user_msgs, tool_count)
+        old_count = len(cm.messages)
+        old_tokens = cm.token_total
+        truncated: list[Message] = []
+        if system_msg:
+            truncated.append(system_msg)
+        truncated.append({
+            "role": "user",
+            "content": "[Conversation summary]\n" + summary,
+        })
+        truncated.append({
+            "role": "assistant",
+            "content": "Understood. I'll continue with the remaining context using the summary above.",
+        })
+        truncated.extend(recent)
+        cm.replace_all(truncated)
+        self._cached_system_prompt = None  # system msg may have shifted
+        self._state.messages = cm.messages  # sync for backward compat
+        new_tokens = cm.token_total
+        logger.info("Compacted: %d→%d msgs, ~%d→%d tokens (files: %d, tools: %d)",
+                    old_count, len(truncated), old_tokens, new_tokens,
+                    len(file_ops), tool_count)
+        return (f"Compacted from {old_count}→{len(truncated)} messages "
+                f"(~{old_tokens:,}→~{new_tokens:,} tokens, {len(file_ops)} files, {tool_count} tool calls).")
+    def _force_truncate(self) -> None:
+        """Drop the oldest non-system messages when we exceed 95% of max tokens.
+        Called only as a last resort after compaction has already run.
+        Delegates to ContextManager.build_truncated_list() — no more
+        duplicated walk-backwards.
+        """
+        cm = self._context_manager
+        if len(cm.messages) <= 6:
+            return
+        truncated, result = cm.build_truncated_list()
+        cm.replace_all(truncated)
+        self._cached_system_prompt = None
+        self._state.messages = cm.messages  # sync
+        logger.warning("Force-truncated: %d → %d messages (~%d tokens kept)",
+                       result.old_count, result.new_count, result.new_tokens)
+    # ── Token estimation helpers (delegate to ContextManager) ────────────
+    def _estimate_message_tokens(self, msg: Message) -> int:
+        """Rough token estimate for a single message (via ContextManager cache)."""
+        return self._context_manager.get_msg_tokens(msg)
+    @staticmethod
+    def _collect_file_ops(messages: list[Message]) -> list[str]:
+        """Collect files modified in a message list (static, delegates to CM)."""
+        from .context_manager import ContextManager
+        return ContextManager.collect_file_ops(messages)
+    # ── Summarisation (reuses a single cheap LLM client) ──────────────────
+    async def _summarise_messages(self, archive: list[Message], file_ops: list[str],
+                                  user_msgs: list[str], tool_count: int) -> str:
+        """Generate a summary of the archive conversation segment.
+        Attempts a cheap LLM call first; falls back to a lightweight extractive
+        summary so the user never loses context entirely.  The summarisation
+        client is created once and reused across compactions.
+        """
+        # ── LLM-based summary (best effort) ──────────────────────────────
+        try:
+            summary_prompt = (
+                "Summarise this conversation segment in 3-5 bullet points. "
+                "Focus on: what the user asked, what files were changed, what "
+                "decisions were made, and any unresolved issues. "
+                "Be concise — this summary will replace the full conversation "
+                "history to save context tokens.\n\n"
+                f"Files modified: {', '.join(file_ops) if file_ops else 'none'}\n"
+                f"Tool calls: {tool_count}\n"
+                f"User requests: {'; '.join(user_msgs[:5])}\n"
+            )
+            sc = getattr(self, '_summary_llm', None)
+            if sc is None:
+                from .llm_client import LLMClient
+                summary_config = copy.deepcopy(self.llm.config)
+                summary_config.model = get_subagent_model()
+                sc = LLMClient(summary_config)
+                self._summary_llm = sc  # cache for reuse
+            resp = await sc.chat([{"role": "user", "content": summary_prompt}], tools=[])
+            llm_summary = (resp.get("content") or "").strip()
+            if llm_summary:
+                parts = [llm_summary]
+                if file_ops:
+                    parts.append(f"\nFiles touched: {', '.join(file_ops[:10])}")
+                return "\n".join(parts)
+        except Exception:
+            logger.debug("LLM summarisation unavailable, using extractive fallback")
+        # ── Extractive fallback ─────────────────────────────────────────
+        parts = [f"Summarised {len(archive)} messages ({tool_count} tool calls)."]
+        if user_msgs:
+            parts.append(f"Topics: {'; '.join(user_msgs[:5])}")
+        if file_ops:
+            parts.append(f"Files modified: {', '.join(file_ops[:10])}")
+        return "\n".join(parts)
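
`compact()` above depends on `cm.split_into_segments()` returning `(system_msg, recent, archive)`, and its docstring describes keeping recent messages up to a token budget and summarising the rest. The splitting code lives in `context_manager.py`, which is not shown here, so the following is a sketch under assumptions: the function signature, the pluggable `estimate` callback, and the rule that a leading system message is set aside are all hypothetical.

```python
from typing import Any, Callable, Optional

Message = dict[str, Any]


def split_into_segments(
    messages: list[Message],
    recent_budget: int,
    estimate: Callable[[Message], int],
) -> tuple[Optional[Message], list[Message], list[Message]]:
    """Return (system_msg, recent, archive) for a conversation.

    Hypothetical helper: walks backwards from the newest message and keeps
    messages in `recent` until adding one more would exceed `recent_budget`.
    Everything older (except a leading system message) becomes `archive`.
    """
    system_msg: Optional[Message] = None
    body = messages
    if messages and messages[0].get("role") == "system":
        system_msg, body = messages[0], messages[1:]

    used = 0
    cut = len(body)  # index of the first message that stays in `recent`
    for i in range(len(body) - 1, -1, -1):
        cost = estimate(body[i])
        if used + cost > recent_budget:
            break
        used += cost
        cut = i
    return system_msg, body[cut:], body[:cut]


def by_length(msg: Message) -> int:
    # Toy estimator for the demo: one token per character.
    return len(msg.get("content") or "")


history = [
    {"role": "system", "content": "sys"},
    {"role": "user", "content": "a" * 50},
    {"role": "assistant", "content": "b" * 50},
    {"role": "user", "content": "c" * 30},
    {"role": "assistant", "content": "d" * 30},
]
system_msg, recent, archive = split_into_segments(
    history, recent_budget=70, estimate=by_length
)
print(len(recent), len(archive))
```

A production split would also need to avoid separating an assistant message carrying `tool_calls` from the tool results that answer it. This sketch ignores that constraint.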

{ata_coder-2.4.6 → ata_coder-2.4.7/ata_coder.egg-info}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: ata-coder
-Version: 2.4.6
+Version: 2.4.7
 Summary: ATA Coder — AI-powered coding assistant
 Author: ATA Coder Team
 License-Expression: MIT
@@ -21,13 +21,15 @@ Requires-Dist: pytest-timeout>=2.0; extra == "dev"
 Requires-Dist: tiktoken>=0.5.0; extra == "dev"
 Dynamic: license-file
-# ATA Coder v2.4.6
+# ATA Coder v2.4.7
 **AI-powered coding assistant — async, AST-aware, single config file.**
 [English](#english) | [中文](#中文)
-> **v2.4.6** — 🔐 **OS-Native Credential Store**: API key encrypted at rest via Windows DPAPI / macOS Keychain / Linux secret-tool. Auto-migrates plaintext keys. Zero dependencies.
+> **v2.4.7** — ⚡ **Context Memory Refactor**: O(1) token tracking, ContextManager, section-level prompt caching, pre-tokenized TF-IDF, LRU token cache. ~60% less overhead in the hot loop.
+>
+> > **v2.4.6** — 🔐 **OS-Native Credential Store**: API key encrypted at rest via Windows DPAPI / macOS Keychain / Linux secret-tool. Auto-migrates plaintext keys. Zero dependencies.
 >
 > > **v2.4.5** — 🤖 **Self-Bootstrapped Audit**: ATA Coder found 19 bugs in its own source code and fixed them all — thread safety, SSRF IPv6, rate limiter leak, auth hardening, DRY refactoring, CI coverage. 12 files changed.
 >
@@ -550,7 +552,9 @@ All 6 findings in this release were discovered by **ATA Coder scanning its own s
 ## 中文
-> **v2.4.6** — 🔐 **操作系统凭据存储**: API Key 通过 Windows DPAPI / macOS Keychain / Linux secret-tool 加密存储。自动迁移明文密钥。零依赖。
+> **v2.4.7** — ⚡ **上下文记忆重构**: O(1) Token 追踪、ContextManager、章节级提示缓存、预分词 TF-IDF、LRU Token 缓存。热路径开销降低约 60%。
+>
+> > **v2.4.6** — 🔐 **操作系统凭据存储**: API Key 通过 Windows DPAPI / macOS Keychain / Linux secret-tool 加密存储。自动迁移明文密钥。零依赖。
 >
 > > **v2.4.5** — 🤖 **自举审计**: ATA Coder 审计自身源码发现 19 个 bug 并全部修复 — 线程安全、SSRF IPv6、速率限制器泄漏、认证加固、DRY 重构、CI 覆盖。12 个文件变更。
 >

{ata_coder-2.4.6 → ata_coder-2.4.7}/ata_coder.egg-info/SOURCES.txt RENAMED

@@ -14,6 +14,7 @@ anthropic_client.py
 change_tracker.py
 clawd_integration.py
 config.py
+context_manager.py
 event_queue.py
 extension.py
 fool_proof.py
@@ -65,6 +66,7 @@ utils.py
 ./change_tracker.py
 ./clawd_integration.py
 ./config.py
+./context_manager.py
 ./event_queue.py
 ./extension.py
 ./fool_proof.py

{ata_coder-2.4.6 → ata_coder-2.4.7}/config.py RENAMED

@@ -120,6 +120,13 @@ class AgentConfig:
                 self, "effective_context_tokens",
                 max(10000, int(self.max_context_tokens * 0.9)),
             )
+    # Compaction/context budgets (passed to ContextManager)
+    recent_token_budget: int = field(
+        default_factory=lambda: int(_from_settings("recent_token_budget", 80000))
+    )
+    compact_if_fewer_than: int = field(
+        default_factory=lambda: int(_from_settings("compact_if_fewer_than", 6))
+    )
     max_message_output_chars: int = field(
         default_factory=lambda: int(_from_settings("max_message_output_chars", 8000))
     )
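
The config.py hunk above adds two dataclass fields whose defaults are resolved through `_from_settings` inside a `default_factory`. The body of `_from_settings` is not shown in the diff, so the sketch below substitutes a plain dictionary lookup. The `SETTINGS` dictionary and the class name `AgentBudgets` are assumptions for illustration; the field names and default values are copied from the hunk.

```python
from dataclasses import dataclass, field
from typing import Any

# Stand-in for the project's settings source; the real lookup is not shown in the diff.
SETTINGS: dict[str, Any] = {"recent_token_budget": "120000"}


def _from_settings(key: str, default: Any) -> Any:
    """Hypothetical lookup: return the configured value, else the default."""
    return SETTINGS.get(key, default)


@dataclass
class AgentBudgets:
    # default_factory defers the lookup until an instance is created,
    # so settings loaded after import time are still honoured.
    recent_token_budget: int = field(
        default_factory=lambda: int(_from_settings("recent_token_budget", 80000))
    )
    compact_if_fewer_than: int = field(
        default_factory=lambda: int(_from_settings("compact_if_fewer_than", 6))
    )


budgets = AgentBudgets()
print(budgets.recent_token_budget, budgets.compact_if_fewer_than)
```

Wrapping the lookup in `int(...)` means a value stored as a string in the settings source still yields an integer field.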
