PyPI - kon-coding-agent - Versions diffs - 0.3.3__tar.gz → 0.3.5__tar.gz - Mend



Files changed (139)

kon_coding_agent-0.3.5/.github/workflows/test.yml ADDED

@@ -0,0 +1,33 @@
+name: Test
+on: [push, pull_request]
+permissions:
+  contents: read
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.12","3.13", "3.14"]
+    steps:
+    - uses: actions/checkout@v3
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v4
+      with:
+        python-version: ${{ matrix.python-version }}
+        cache: pip
+        cache-dependency-path: pyproject.toml
+    - name: Cache models
+      uses: actions/cache@v3
+      with:
+        path: ~/.cache
+        key: ${{ runner.os }}-torch-
+    - name: Install dependencies
+      run: |
+        pip install . --group dev
+    - name: Run tests
+      run: |
+        python -m pytest -s

{kon_coding_agent-0.3.3 → kon_coding_agent-0.3.5}/.kon/skills/kon-release-publish/SKILL.md RENAMED

@@ -33,38 +33,44 @@ Use this skill when the user asks to cut a new Kon version, tag it, publish to P
    - `git status --short --branch` must be clean (or confirm with user)
    - `git tag --list` and `git log --oneline <prev_tag>..HEAD` to summarize changes
-2. **Version bump**
+2. **Update CHANGELOG.md**
+   - Replace the `## [Unreleased]` section's `- No changes yet.` with a new versioned heading: `## <version> - YYYY-MM-DD`
+   - Use `git log --oneline <prev_tag>..HEAD` to categorize changes into `### Added`, `### Changed`, `### Fixed` sections
+   - Credit external contributors with `- @username`
+   - Commit message: `docs: update changelog for <version>`
+3. **Version bump**
    - Update version in all 3 files above
-3. **Quality gates**
+4. **Quality gates**
    - `uv run ruff format .`
    - `uv run ruff check .`
    - `uv run pyright .`
    - `uv run pytest`
-4. **Commit**
+5. **Commit**
    - Commit message: `build: bump version to <version>`
-5. **Tag**
+6. **Tag**
    - Annotated tag: `git tag -a v<version> -m "v<version> ..."`
-   - Include concise “changes since previous tag” bullets
+   - Include concise "changes since previous tag" bullets
-6. **Push**
+7. **Push**
    - `git push origin main`
    - `git push origin v<version>`
-7. **Build + verify artifacts**
+8. **Build + verify artifacts**
    - `rm -rf dist && uv build`
    - `uv run python -m twine check dist/*`
-8. **Publish to PyPI**
+9. **Publish to PyPI**
    - Prefer token file if present (example `~/.pypi-token`):
    - `TWINE_USERNAME=__token__ TWINE_PASSWORD="$(< ~/.pypi-token)" uv run python -m twine upload dist/*`
    - Verify:
      - `https://pypi.org/project/kon-coding-agent/<version>/`
      - `https://pypi.org/pypi/kon-coding-agent/json` reports latest version
-9. **Create GitHub release**
+10. **Create GitHub release**
    - If token exists at `~/.github-token`, call Releases API:
    - `POST /repos/<owner>/<repo>/releases` with:
      - `tag_name: v<version>`
@@ -83,6 +89,7 @@ Use this skill when the user asks to cut a new Kon version, tag it, publish to P
 ## Output checklist to report
+- Changelog updated for `<version>`
 - Version bumped in all files
 - Checks passed
 - Commit hash
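
A rough sketch of the changelog step in the release skill above, assuming the commits from `git log --oneline <prev_tag>..HEAD` have already been sorted into categories by hand. The function name `render_release_section` and the `entries` mapping are illustrative only and are not part of Kon.

```python
from datetime import date


def render_release_section(version: str, released: date, entries: dict[str, list[str]]) -> str:
    """Render a `## <version> - YYYY-MM-DD` changelog section (hypothetical helper)."""
    lines = [f"## {version} - {released.isoformat()}"]
    # Emit categories in the order the skill lists them, skipping empty ones.
    for category in ("Added", "Changed", "Fixed"):
        items = entries.get(category, [])
        if not items:
            continue
        lines.append(f"### {category}")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)


section = render_release_section(
    "0.3.4",
    date(2026, 4, 10),
    {"Fixed": ["Fixed local Gemma model thinking block compatibility."]},
)
print(section)
```

Contributor credit (`- @username`) is left to the caller here, since it is simply part of each entry's text.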

{kon_coding_agent-0.3.3 → kon_coding_agent-0.3.5}/AGENTS.md RENAMED

@@ -4,6 +4,7 @@
 - Don't add trivial docstrings. Only add docstrings when explaining complex functionality.
 - This project uses `uv`. Run `uv run ruff format .` after editing or creating any files.
+- If generating and running a Python script, use `uv run python` instead of `python`.
 ## Testing

{kon_coding_agent-0.3.3 → kon_coding_agent-0.3.5}/CHANGELOG.md RENAMED

@@ -6,6 +6,70 @@ All notable changes to this project will be documented in this file.
 - No changes yet.
+## 0.3.5 - 2026-04-18
+### Added
+- Standalone session HTML export with self-contained styling.
+- Configurable request timeout for API calls - @jspruit.
+- GitHub CI tests for Python 3.12–3.13 - @sukhbinder.
+- Width-aware popup lists and queue display.
+### Changed
+- Highlight color applied to the second column in floating lists.
+- Batch scroll during streaming, cache query_one lookups, pause spinner timer when idle.
+- Diff line length capped at 200 characters.
+### Fixed
+- Persist thinking level in session header and change all defaults to high.
+- Normalize OpenAI provider imports.
+- Widen resume popup labels.
+- Remove unused UI app import.
+### Performance
+- Added `gc.freeze()` and `PAUSE_GC_ON_SCROLL` to reduce GC stutters.
+## 0.3.4 - 2026-04-10
+### Fixed
+- Fixed Windows UTF-8 encoding errors in file operations - @sukhbinder.
+- Fixed local Gemma model thinking block compatibility.
+- Removed duplicate force-include for builtin skills in build config.
+### Changed
+- Updated local model documentation.
+## 0.3.3 - 2026-04-08
+### Added
+- Added queued agent steering between turns - @0xku.
+- Added bundled `/init` slash command for project scaffolding - @0xku.
+- Added help info for queue and steer queue commands.
+- Added GLM-5.1 support for zai provider.
+### Changed
+- Improved read tool directory listings.
+- Updated README with steer queue documentation.
+- Added `$` icon for bash, `%` for web tools, and `←` for edit tool.
+- Used muted color for shortcut key hints in exit/delete prompts.
+### Fixed
+- Let ESC interrupt retry backoff immediately - @0xku.
+- Fixed OpenAI login stdin leak by removing orphaned thread - @Meltedd.
+- Fixed OpenAI and Anthropic local compat with auth flags.
+- Fixed interrupt handling before handoff thread switch.
+- Fixed subprocess communication drain on cancellation/timeout.
+- Added zipfile path traversal validation.
+- Removed token throughput metrics.
 ## 0.3.2 - 2026-03-22
 ### Added

{kon_coding_agent-0.3.3 → kon_coding_agent-0.3.5}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: kon-coding-agent
-Version: 0.3.3
+Version: 0.3.5
 Summary: Minimal coding agent
 License-File: LICENSE
 Requires-Python: >=3.12

kon_coding_agent-0.3.5/docs/local-models.md ADDED

@@ -0,0 +1,80 @@
+# Local Models
+This document provides detailed information about running and configuring local models with Kon.
+## Tested Models
+> Tested on llama server build b8740
+| Model | Quantization | Context Length | TPS | System Specs |
+| ----- | -------------- | -------------- | --- | ------------ |
+| `zai-org/glm-4.7-flash` | Q4_K_M | 65,536 | N/A | i7-14700F × 28, 64GB RAM, 24GB VRAM (RTX 3090) |
+| `unsloth/Qwen3.5-27B-GGUF` | Q4_K_M | 32,768 | ~30 | i7-14700F × 28, 64GB RAM, 24GB VRAM (RTX 3090) |
+| `unsloth/gemma-4-26B-A4B-it-GGUF` | UD-Q4_K_M | 32,768 | ~100 | i7-14700F × 28, 64GB RAM, 24GB VRAM (RTX 3090) |
+Run Qwen3.5 27B on an RTX 3090 with a 32k context window using llama-server:
+```bash
+/path-to-llama-server/llama-server \
+  --model /path-to-model/Qwen3.5-27B-Q4_K_M.gguf \
+  --port 5000 \
+  --ctx-size 32768 \
+  --gpu-layers all \
+  --threads 8 \
+  --threads-batch 8 \
+  --batch-size 1024 \
+  --ubatch-size 512 \
+  --flash-attn on
+```
+On this machine, that setup generates at roughly 30 tokens per second.
+Then start Kon for a one-off local session:
+```bash
+kon --model unsloth/Qwen3.5-27B-GGUF --provider openai \
+  --base-url http://localhost:5000/v1 \
+  --openai-compat-auth none
+```
+Run Gemma 4 26B A4B on the same machine using llama-server:
+```bash
+/path-to-llama-server/llama-server \
+  --model /path-to-model/gemma-4-26B-A4B-it-UD-Q4_K_M.gguf \
+  --port 5000 \
+  --ctx-size 32768 \
+  --gpu-layers all \
+  --threads 8 \
+  --threads-batch 8 \
+  --batch-size 1024 \
+  --ubatch-size 512 \
+  --flash-attn on \
+  --temperature 1.5
+```
+Then start Kon against that local server:
+```bash
+kon --model unsloth/gemma-4-26B-A4B-it-GGUF --provider openai \
+  --base-url http://localhost:5000/v1 \
+  --openai-compat-auth none
+```
+To avoid passing provider, model, and auth flags every time you start Kon, you can define your local setup in `~/.kon/config.toml`. This also allows you to tune compaction to trigger at a specific point relative to your model's context window.
+If this is your default setup, put it in `~/.kon/config.toml` instead:
+```toml
+[llm]
+default_provider = "openai"
+default_model = "unsloth/gemma-4-26B-A4B-it-GGUF"
+default_base_url = "http://localhost:5000/v1"
+[llm.auth]
+openai_compat = "none" # or "auto"
+[compaction]
+# Set this close to your model's context size (e.g., 30000 for a 32k window)
+buffer_tokens = 27768 # 32768 - 5000 (safety margin)
+```

{kon_coding_agent-0.3.3 → kon_coding_agent-0.3.5}/pyproject.toml RENAMED

@@ -14,7 +14,7 @@ default = true
 [project]
 name = "kon-coding-agent"
-version = "0.3.3"
+version = "0.3.5"
 description = "Minimal coding agent"
 readme = "README.md"
 requires-python = ">=3.12"

{kon_coding_agent-0.3.3 → kon_coding_agent-0.3.5}/src/kon/config.py RENAMED

@@ -76,6 +76,7 @@ class LLMConfig(BaseModel):
     default_thinking_level: str
     system_prompt: SystemPromptConfig
     tool_call_idle_timeout_seconds: float = 180
+    request_timeout_seconds: float = 600
     auth: AuthConfig = AuthConfig()

{kon_coding_agent-0.3.3 → kon_coding_agent-0.3.5}/src/kon/defaults/config.toml RENAMED

@@ -11,6 +11,9 @@ default_thinking_level = "high"
 # Abort a tool call if it stays idle for this long during a turn.
 # Helps prevent stalled tool executions from hanging the agent loop.
 tool_call_idle_timeout_seconds = 180
+# HTTP request timeout for LLM API calls (in seconds).
+# Local models (e.g. llama.cpp) may need a higher value for long compaction requests.
+request_timeout_seconds = 600
 [llm.auth]
 # Auth policy for OpenAI-compatible and Anthropic-compatible endpoints.

{kon_coding_agent-0.3.3 → kon_coding_agent-0.3.5}/src/kon/llm/base.py RENAMED

@@ -77,7 +77,7 @@ class ProviderConfig:
     model: str = ""
     max_tokens: int = 8192
     temperature: float | None = None
-    thinking_level: str = "medium"
+    thinking_level: str = "high"
     provider: str | None = None
     session_id: str | None = None
     openai_compat_auth_mode: AuthMode = "auto"

{kon_coding_agent-0.3.3 → kon_coding_agent-0.3.5}/src/kon/llm/providers/openai_completions.py RENAMED

@@ -10,6 +10,8 @@ from openai.types.chat import (
     ChatCompletionToolParam,
 )
+from kon import config as kon_config
 from ...core.types import (
     AssistantMessage,
     ImageContent,
@@ -30,7 +32,7 @@ from ...core.types import (
     Usage,
     UserMessage,
 )
-from ..base import BaseProvider, LLMStream, ProviderConfig, resolve_api_key
+from ..base import BaseProvider, LLMStream, ProviderConfig, is_local_base_url, resolve_api_key
 from .openai_compat import supports_developer_role
 from .sanitize import sanitize_surrogates
@@ -41,12 +43,13 @@ class OpenAICompletionsCompat:
     supports_developer_role: bool = True
     supports_reasoning_effort: bool = True
     max_tokens_field: Literal["max_tokens", "max_completion_tokens"] = "max_completion_tokens"
-    thinking_format: Literal["openai", "zai", "qwen"] = "openai"
+    thinking_format: Literal["openai", "zai", "qwen", "llama_gemma"] = "openai"
-def _detect_compat(provider: str, base_url: str) -> OpenAICompletionsCompat:
+def _detect_compat(provider: str, base_url: str, model: str = "") -> OpenAICompletionsCompat:
     normalized_provider = provider.lower()
     normalized_base_url = base_url.lower()
+    normalized_model = model.lower()
     is_zai = (
         normalized_provider == "zai"
         or normalized_provider == "zhipu"
@@ -61,6 +64,13 @@ def _detect_compat(provider: str, base_url: str) -> OpenAICompletionsCompat:
             thinking_format="zai",
         )
+    if is_local_base_url(base_url) and "gemma" in normalized_model:
+        return OpenAICompletionsCompat(
+            supports_developer_role=supports_developer_role(provider, base_url),
+            supports_reasoning_effort=False,
+            thinking_format="llama_gemma",
+        )
     return OpenAICompletionsCompat(
         supports_developer_role=supports_developer_role(provider, base_url)
     )
@@ -91,8 +101,14 @@ class OpenAICompletionsProvider(BaseProvider):
                 "Set OPENAI_API_KEY or ZAI_API_KEY environment variable, "
                 'or configure llm.auth.openai_compat = "auto"/"none" for local endpoints.'
             )
-        self._client = AsyncOpenAI(api_key=api_key, base_url=config.base_url)
-        self._compat = _detect_compat(config.provider or "", config.base_url or "")
+        self._client = AsyncOpenAI(
+            api_key=api_key,
+            base_url=config.base_url,
+            timeout=kon_config.llm.request_timeout_seconds,
+        )
+        self._compat = _detect_compat(
+            config.provider or "", config.base_url or "", config.model or ""
+        )
     async def _stream_impl(
         self,
@@ -138,7 +154,7 @@ class OpenAICompletionsProvider(BaseProvider):
         if compat.thinking_format == "zai":
             if thinking_level and thinking_level != "none":
                 extra_body["thinking"] = {"type": "enabled"}
-        elif compat.thinking_format == "qwen":
+        elif compat.thinking_format in {"qwen", "llama_gemma"}:
             extra_body["enable_thinking"] = bool(thinking_level and thinking_level != "none")
         elif (
             self.supports_reasoning_effort
@@ -238,11 +254,16 @@ class OpenAICompletionsProvider(BaseProvider):
         if system_prompt:
             role = "developer" if (compat and compat.supports_developer_role) else "system"
+            prompt_content = sanitize_surrogates(system_prompt)
+            if (
+                compat
+                and compat.thinking_format == "llama_gemma"
+                and self.config.thinking_level != "none"
+                and not prompt_content.startswith("<|think|>")
+            ):
+                prompt_content = "<|think|>" + prompt_content
             result.append(
-                cast(
-                    ChatCompletionMessageParam,
-                    {"role": role, "content": sanitize_surrogates(system_prompt)},
-                )
+                cast(ChatCompletionMessageParam, {"role": role, "content": prompt_content})
             )
         pending_images: list[ImageContent] = []
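
The Gemma-specific changes in this file can be condensed into a standalone sketch. It mirrors the logic visible in the hunks above (local base URL plus "gemma" in the model name selects the `llama_gemma` format, which then prefixes the system prompt with `<|think|>` unless thinking is disabled). The `Compat` dataclass, the `is_local_base_url` stand-in, and `build_system_prompt` are simplified assumptions for illustration; the real helper in `kon.llm.base` is not shown in this diff and may behave differently.

```python
from dataclasses import dataclass
from typing import Literal
from urllib.parse import urlparse


@dataclass(frozen=True)
class Compat:
    # Simplified stand-in for OpenAICompletionsCompat.
    supports_reasoning_effort: bool = True
    thinking_format: Literal["openai", "llama_gemma"] = "openai"


def is_local_base_url(base_url: str) -> bool:
    # Assumption: "local" means a loopback host. The real check is not in the diff.
    host = urlparse(base_url).hostname or ""
    return host in {"localhost", "127.0.0.1", "::1"}


def detect_compat(base_url: str, model: str = "") -> Compat:
    # Local server + Gemma model: no reasoning_effort, Gemma-style thinking.
    if is_local_base_url(base_url) and "gemma" in model.lower():
        return Compat(supports_reasoning_effort=False, thinking_format="llama_gemma")
    return Compat()


def build_system_prompt(prompt: str, compat: Compat, thinking_level: str) -> str:
    # Prefix once, and only when thinking is enabled for the Gemma format.
    if (
        compat.thinking_format == "llama_gemma"
        and thinking_level != "none"
        and not prompt.startswith("<|think|>")
    ):
        return "<|think|>" + prompt
    return prompt


compat = detect_compat("http://localhost:5000/v1", "unsloth/gemma-4-26B-A4B-it-GGUF")
print(build_system_prompt("You are Kon.", compat, "high"))
```

The `startswith` guard keeps the prefix idempotent, so a prompt that already carries the marker is passed through unchanged.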

{kon_coding_agent-0.3.3 → kon_coding_agent-0.3.5}/src/kon/llm/providers/openai_responses.py RENAMED

@@ -4,6 +4,8 @@ from typing import Any
 from openai import APIStatusError, AsyncOpenAI, RateLimitError
+from kon import config as kon_config
 from ...core.types import (
     AssistantMessage,
     ImageContent,
@@ -51,6 +53,7 @@ class OpenAIResponsesProvider(BaseProvider):
                 api_key=self.config.api_key,
                 base_url=self.config.base_url,
                 default_headers=self._headers,
+                timeout=kon_config.llm.request_timeout_seconds,
             )
         return self._client
