PyPI - zapcode-ai - Versions diffs - 1.3.0__tar.gz → 1.4.0__tar.gz - Mend

zapcode-ai 1.3.0.tar.gz → 1.4.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

{zapcode_ai-1.3.0 → zapcode_ai-1.4.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: zapcode-ai
-Version: 1.3.0
+Version: 1.4.0
 Summary: AI SDK integration for Zapcode — let LLMs write and execute TypeScript safely
 Project-URL: Homepage, https://github.com/TheUncharted/zapcode
 Project-URL: Repository, https://github.com/TheUncharted/zapcode
@@ -47,10 +47,10 @@ Description-Content-Type: text/markdown
 AI agents are more capable when they **write code** instead of chaining tool calls. Code gives agents loops, conditionals, variables, and composition — things that tool chains simulate poorly.
-- [CodeMode](https://blog.cloudflare.com/codemode-ai-agent-coding) — Cloudflare on why agents should write code
-- [Programmatic Tool Calling](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/tool-use-examples#programmatic-tool-calling) — Anthropic's approach
-- [Code Execution with MCP](https://www.anthropic.com/engineering/code-execution-mcp) — Anthropic engineering
-- [Smol Agents](https://huggingface.co/docs/smolagents/en/index) — Hugging Face's code-first agents
+- [Codemode](https://blog.cloudflare.com/code-mode/) from Cloudflare
+- [Programmatic Tool Calling](https://platform.claude.com/docs/en/agents-and-tools/tool-use/programmatic-tool-calling) from Anthropic
+- [Code Execution with MCP](https://www.anthropic.com/engineering/code-execution-with-mcp) from Anthropic
+- [Smol Agents](https://github.com/huggingface/smolagents) from Hugging Face
 **But running AI-generated code is dangerous and slow.**
@@ -176,7 +176,7 @@ if (!state.completed) {
 }
 ```
-See [`examples/typescript/basic.ts`](examples/typescript/basic.ts) for more.
+See [`examples/typescript/basic/main.ts`](examples/typescript/basic/main.ts) for more.
 ### Python
@@ -213,7 +213,7 @@ if state.get("suspended"):
     result = restored.resume({"condition": "Clear", "temp": 26})
 ```
-See [`examples/python/basic.py`](examples/python/basic.py) for more.
+See [`examples/python/basic/main.py`](examples/python/basic/main.py) for more.
 <details>
 <summary><strong>Rust</strong></summary>
@@ -251,7 +251,7 @@ if let VmState::Suspended { snapshot, .. } = state {
 }
 ```
-See [`examples/rust/basic.rs`](examples/rust/basic.rs) for more.
+See [`examples/rust/basic/basic.rs`](examples/rust/basic/basic.rs) for more.
 </details>
 <details>
@@ -272,7 +272,7 @@ console.log(result.output);  // 120
 </script>
 ```
-See [`examples/wasm/index.html`](examples/wasm/index.html) for a full playground.
+See [`examples/wasm/basic/index.html`](examples/wasm/basic/index.html) for a full playground.
 </details>
 ## AI Agent Usage
@@ -326,7 +326,7 @@ const { text } = await generateText({
 Under the hood: the LLM writes TypeScript code that calls your tools → Zapcode executes it in a sandbox → tool calls suspend the VM → your `execute` functions run on the host → results flow back in. All in ~2µs startup + tool execution time.
-See [`examples/typescript/ai-agent-zapcode-ai.ts`](examples/typescript/ai-agent-zapcode-ai.ts) for the full working example.
+See [`examples/typescript/ai-agent/ai-agent-zapcode-ai.ts`](examples/typescript/ai-agent/ai-agent-zapcode-ai.ts) for the full working example.
 <details>
 <summary><strong>Anthropic SDK</strong></summary>
@@ -391,7 +391,7 @@ while state.get("suspended"):
 print(state["output"])
 ```
-See [`examples/typescript/ai-agent-anthropic.ts`](examples/typescript/ai-agent-anthropic.ts) and [`examples/python/ai_agent_anthropic.py`](examples/python/ai_agent_anthropic.py).
+See [`examples/typescript/ai-agent/ai-agent-anthropic.ts`](examples/typescript/ai-agent/ai-agent-anthropic.ts) and [`examples/python/ai-agent/ai_agent_anthropic.py`](examples/python/ai-agent/ai_agent_anthropic.py).
 </details>
 <details>
@@ -478,6 +478,63 @@ langchain_tool = b.custom["langchain"]
 The adapter receives an `AdapterContext` with everything needed: system prompt, tool name, tool JSON schema, and a `handleToolCall` function. Return whatever shape your SDK expects.
 </details>
+## Auto-Fix, Debug & Execution Tracing
+### Auto-fix (`autoFix`)
+When enabled, execution errors are returned as tool results instead of throwing — letting the LLM see the error and self-correct on the next step.
+**TypeScript:**
+```typescript
+const { system, tools } = zapcode({
+  autoFix: true,
+  tools: { /* ... */ },
+});
+```
+**Python:**
+```python
+zap = zapcode(auto_fix=True, tools={...})
+```
+### Execution Trace
+Every execution produces a trace tree with timing for each phase (parse → compile → execute). Use `printTrace()` / `print_trace()` to display the full session trace, or `getTrace()` / `get_trace()` to access the trace programmatically.
+**TypeScript:**
+```typescript
+const { system, tools, printTrace, getTrace } = zapcode({
+  autoFix: true,
+  tools: { /* ... */ },
+});
+// After running...
+printTrace();
+// ✓ zapcode.session  12.3ms
+//   ✓ execute_code    8.1ms
+//     ✓ parse          0.2ms
+//     ✓ compile        0.1ms
+//     ✓ execute        7.8ms
+const trace = getTrace(); // TraceSpan tree
+```
+**Python:**
+```python
+zap = zapcode(auto_fix=True, tools={...})
+# After running...
+zap.print_trace()
+trace = zap.get_trace()  # TraceSpan tree
+```
+### Debug Logging
+For detailed logging of generated code, tool calls, and output, see the debug-tracing examples which show how to inspect each execution step:
+- [TypeScript debug-tracing example](examples/typescript/debug-tracing/main.ts)
+- [Python debug-tracing example](examples/python/debug-tracing/main.py)
 ## What Zapcode Can and Cannot Do
 **Can do:**
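
The "Execution Trace" section added above says the trace can be read programmatically. As a rough illustration of what that might look like on the Python side, the sketch below walks a span tree and totals time per tool. It does not import `zapcode_ai`: `Span` is a local stand-in that copies only a few fields of the `TraceSpan` dataclass from this release, and the tool names `get_weather` and `search_flights` are hypothetical. The `tool_call` span name and the `zapcode.tool.name` attribute are taken from the `__init__.py` changes in this diff.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Iterator


@dataclass
class Span:
    """Local stand-in for TraceSpan (only the fields this sketch needs)."""
    name: str
    duration_ms: float = 0.0
    attributes: dict[str, Any] = field(default_factory=dict)
    children: list[Span] = field(default_factory=list)


def walk(span: Span) -> Iterator[Span]:
    """Yield the span and all of its descendants, depth first."""
    yield span
    for child in span.children:
        yield from walk(child)


def time_per_tool(root: Span) -> dict[str, float]:
    """Sum duration_ms of every 'tool_call' span, grouped by tool name."""
    totals: dict[str, float] = defaultdict(float)
    for span in walk(root):
        if span.name == "tool_call":
            totals[span.attributes["zapcode.tool.name"]] += span.duration_ms
    return dict(totals)


# Hand-built tree standing in for what get_trace() might return.
root = Span("session", 9.0, children=[
    Span("attempt_1", 8.0, children=[
        Span("tool_call", 3.0, {"zapcode.tool.name": "get_weather"}),
        Span("tool_call", 2.0, {"zapcode.tool.name": "get_weather"}),
        Span("tool_call", 1.5, {"zapcode.tool.name": "search_flights"}),
    ]),
])

print(time_per_tool(root))  # {'get_weather': 5.0, 'search_flights': 1.5}
```

Because the spans are plain dataclasses, the same walk could feed a metrics exporter or a test assertion instead of `print`.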

{zapcode_ai-1.3.0 → zapcode_ai-1.4.0}/README.md RENAMED

@@ -21,10 +21,10 @@
 AI agents are more capable when they **write code** instead of chaining tool calls. Code gives agents loops, conditionals, variables, and composition — things that tool chains simulate poorly.
-- [CodeMode](https://blog.cloudflare.com/codemode-ai-agent-coding) — Cloudflare on why agents should write code
-- [Programmatic Tool Calling](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/tool-use-examples#programmatic-tool-calling) — Anthropic's approach
-- [Code Execution with MCP](https://www.anthropic.com/engineering/code-execution-mcp) — Anthropic engineering
-- [Smol Agents](https://huggingface.co/docs/smolagents/en/index) — Hugging Face's code-first agents
+- [Codemode](https://blog.cloudflare.com/code-mode/) from Cloudflare
+- [Programmatic Tool Calling](https://platform.claude.com/docs/en/agents-and-tools/tool-use/programmatic-tool-calling) from Anthropic
+- [Code Execution with MCP](https://www.anthropic.com/engineering/code-execution-with-mcp) from Anthropic
+- [Smol Agents](https://github.com/huggingface/smolagents) from Hugging Face
 **But running AI-generated code is dangerous and slow.**
@@ -150,7 +150,7 @@ if (!state.completed) {
 }
 ```
-See [`examples/typescript/basic.ts`](examples/typescript/basic.ts) for more.
+See [`examples/typescript/basic/main.ts`](examples/typescript/basic/main.ts) for more.
 ### Python
@@ -187,7 +187,7 @@ if state.get("suspended"):
     result = restored.resume({"condition": "Clear", "temp": 26})
 ```
-See [`examples/python/basic.py`](examples/python/basic.py) for more.
+See [`examples/python/basic/main.py`](examples/python/basic/main.py) for more.
 <details>
 <summary><strong>Rust</strong></summary>
@@ -225,7 +225,7 @@ if let VmState::Suspended { snapshot, .. } = state {
 }
 ```
-See [`examples/rust/basic.rs`](examples/rust/basic.rs) for more.
+See [`examples/rust/basic/basic.rs`](examples/rust/basic/basic.rs) for more.
 </details>
 <details>
@@ -246,7 +246,7 @@ console.log(result.output);  // 120
 </script>
 ```
-See [`examples/wasm/index.html`](examples/wasm/index.html) for a full playground.
+See [`examples/wasm/basic/index.html`](examples/wasm/basic/index.html) for a full playground.
 </details>
 ## AI Agent Usage
@@ -300,7 +300,7 @@ const { text } = await generateText({
 Under the hood: the LLM writes TypeScript code that calls your tools → Zapcode executes it in a sandbox → tool calls suspend the VM → your `execute` functions run on the host → results flow back in. All in ~2µs startup + tool execution time.
-See [`examples/typescript/ai-agent-zapcode-ai.ts`](examples/typescript/ai-agent-zapcode-ai.ts) for the full working example.
+See [`examples/typescript/ai-agent/ai-agent-zapcode-ai.ts`](examples/typescript/ai-agent/ai-agent-zapcode-ai.ts) for the full working example.
 <details>
 <summary><strong>Anthropic SDK</strong></summary>
@@ -365,7 +365,7 @@ while state.get("suspended"):
 print(state["output"])
 ```
-See [`examples/typescript/ai-agent-anthropic.ts`](examples/typescript/ai-agent-anthropic.ts) and [`examples/python/ai_agent_anthropic.py`](examples/python/ai_agent_anthropic.py).
+See [`examples/typescript/ai-agent/ai-agent-anthropic.ts`](examples/typescript/ai-agent/ai-agent-anthropic.ts) and [`examples/python/ai-agent/ai_agent_anthropic.py`](examples/python/ai-agent/ai_agent_anthropic.py).
 </details>
 <details>
@@ -452,6 +452,63 @@ langchain_tool = b.custom["langchain"]
 The adapter receives an `AdapterContext` with everything needed: system prompt, tool name, tool JSON schema, and a `handleToolCall` function. Return whatever shape your SDK expects.
 </details>
+## Auto-Fix, Debug & Execution Tracing
+### Auto-fix (`autoFix`)
+When enabled, execution errors are returned as tool results instead of throwing — letting the LLM see the error and self-correct on the next step.
+**TypeScript:**
+```typescript
+const { system, tools } = zapcode({
+  autoFix: true,
+  tools: { /* ... */ },
+});
+```
+**Python:**
+```python
+zap = zapcode(auto_fix=True, tools={...})
+```
+### Execution Trace
+Every execution produces a trace tree with timing for each phase (parse → compile → execute). Use `printTrace()` / `print_trace()` to display the full session trace, or `getTrace()` / `get_trace()` to access the trace programmatically.
+**TypeScript:**
+```typescript
+const { system, tools, printTrace, getTrace } = zapcode({
+  autoFix: true,
+  tools: { /* ... */ },
+});
+// After running...
+printTrace();
+// ✓ zapcode.session  12.3ms
+//   ✓ execute_code    8.1ms
+//     ✓ parse          0.2ms
+//     ✓ compile        0.1ms
+//     ✓ execute        7.8ms
+const trace = getTrace(); // TraceSpan tree
+```
+**Python:**
+```python
+zap = zapcode(auto_fix=True, tools={...})
+# After running...
+zap.print_trace()
+trace = zap.get_trace()  # TraceSpan tree
+```
+### Debug Logging
+For detailed logging of generated code, tool calls, and output, see the debug-tracing examples which show how to inspect each execution step:
+- [TypeScript debug-tracing example](examples/typescript/debug-tracing/main.ts)
+- [Python debug-tracing example](examples/python/debug-tracing/main.py)
 ## What Zapcode Can and Cannot Do
 **Can do:**
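
The auto-fix behaviour described in the README hunk above (errors come back as results rather than exceptions, so the model can try again) can be pictured as a small retry loop. The sketch below is an assumption about how a caller might drive that loop, not code from the package: `FakeResult`, `fake_handle_tool_call`, and the `revise` callback are hypothetical stand-ins, and only the `output` and `error` field names are borrowed from `ExecutionResult` in this release.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FakeResult:
    """Hypothetical stand-in with the `output` and `error` fields of ExecutionResult."""
    output: Any = None
    error: str | None = None


def fake_handle_tool_call(code: str) -> FakeResult:
    """Stand-in executor: reports an error for code containing 'bug' instead of raising."""
    if "bug" in code:
        return FakeResult(
            error="Execution failed: bug is not defined. Please fix your code and try again."
        )
    return FakeResult(output=42)


def run_with_auto_fix(
    first_attempt: str,
    revise: Callable[[str, str], str],
    max_attempts: int = 3,
) -> tuple[FakeResult, int]:
    """Run code; on error, hand (code, error) to `revise` and retry up to max_attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    code = first_attempt
    for attempt in range(1, max_attempts + 1):
        result = fake_handle_tool_call(code)
        if result.error is None:
            return result, attempt
        # In a real agent, the error text would be sent back to the LLM here.
        code = revise(code, result.error)
    return result, max_attempts


# A trivial 'model' that fixes the code by renaming the broken call.
result, attempts = run_with_auto_fix(
    "return bug()", lambda code, err: code.replace("bug", "fix")
)
print(attempts, result.output)  # 2 42
```

Capping the number of attempts keeps a model that never converges from looping forever; the limit of 3 here is an arbitrary choice for the sketch.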

{zapcode_ai-1.3.0 → zapcode_ai-1.4.0}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "zapcode-ai"
-version = "1.3.0" # x-release-please-version
+version = "1.4.0" # x-release-please-version
 description = "AI SDK integration for Zapcode — let LLMs write and execute TypeScript safely"
 readme = "README.md"
 requires-python = ">=3.10"

{zapcode_ai-1.3.0 → zapcode_ai-1.4.0}/src/zapcode_ai/__init__.py RENAMED

@@ -29,6 +29,8 @@ Works with any AI SDK:
 from __future__ import annotations
+import json
+import time
 from dataclasses import dataclass, field
 from typing import Any, Callable, Awaitable
@@ -55,12 +57,27 @@ class ToolDefinition:
     execute: Callable[..., Any]  # (args: dict) -> Any or awaitable
+@dataclass
+class TraceSpan:
+    """A single span in the execution trace. OTel-compatible shape."""
+    name: str
+    start_time: float  # ms since epoch
+    end_time: float = 0.0
+    duration_ms: float = 0.0
+    status: str = "ok"  # "ok" or "error"
+    attributes: dict[str, Any] = field(default_factory=dict)
+    children: list[TraceSpan] = field(default_factory=list)
 @dataclass
 class ExecutionResult:
     """Result of executing guest code."""
+    code: str
     output: Any
     stdout: str
     tool_calls: list[dict[str, Any]]
+    error: str | None = None
+    trace: TraceSpan | None = None
 # ---------------------------------------------------------------------------
@@ -142,6 +159,39 @@ Rules:
     return "\n\n".join(parts)
+# ---------------------------------------------------------------------------
+# Trace helpers
+# ---------------------------------------------------------------------------
+def _create_span(name: str, attributes: dict[str, Any] | None = None) -> TraceSpan:
+    return TraceSpan(
+        name=name,
+        start_time=time.time() * 1000,
+        attributes=attributes or {},
+    )
+def _end_span(span: TraceSpan, status: str | None = None) -> TraceSpan:
+    span.end_time = time.time() * 1000
+    span.duration_ms = span.end_time - span.start_time
+    if status:
+        span.status = status
+    return span
+def _print_trace(span: TraceSpan, indent: int = 0) -> None:
+    prefix = "" if indent == 0 else "│ " * (indent - 1) + "├─ "
+    icon = "✗" if span.status == "error" else "✓"
+    duration = "<1ms" if span.duration_ms < 1 else f"{span.duration_ms:.0f}ms"
+    attrs = " ".join(
+        f"{k}={str(v)[:80]}" for k, v in span.attributes.items()
+        if not k.startswith("zapcode.code")  # don't dump full code in trace
+    )
+    print(f"{prefix}{icon} {span.name} ({duration}){' ' + attrs if attrs else ''}")
+    for child in span.children:
+        _print_trace(child, indent + 1)
 # ---------------------------------------------------------------------------
 # Execution engine
 # ---------------------------------------------------------------------------
@@ -152,48 +202,100 @@ def _execute_code(
     *,
     memory_limit_bytes: int | None = None,
     time_limit_ms: int | None = None,
+    debug: bool = False,
+    auto_fix: bool = False,
 ) -> ExecutionResult:
     tool_names = list(tool_defs.keys())
     tool_calls: list[dict[str, Any]] = []
+    tracing = debug or auto_fix
-    kwargs: dict[str, Any] = {"external_functions": tool_names}
-    if time_limit_ms is not None:
-        kwargs["time_limit_ms"] = time_limit_ms
-    if memory_limit_bytes is not None:
-        kwargs["memory_limit_bytes"] = memory_limit_bytes
-    sandbox = Zapcode(code, **kwargs)
-    state = sandbox.start()
-    while state.get("suspended"):
-        fn_name = state["function_name"]
-        args = state["args"]
-        tool_def = tool_defs.get(fn_name)
-        if not tool_def:
-            raise ValueError(
-                f"Guest code called unknown function '{fn_name}'. "
-                f"Available: {', '.join(tool_names)}"
-            )
-        # Build named args from positional args
-        param_names = list(tool_def.parameters.keys())
-        named_args = {
-            param_names[i]: args[i]
-            for i in range(min(len(param_names), len(args)))
-        }
+    exec_span = _create_span("execute", {"zapcode.code": code}) if tracing else None
-        result = tool_def.execute(named_args)
-        tool_calls.append({"name": fn_name, "args": args, "result": result})
+    try:
+        kwargs: dict[str, Any] = {"external_functions": tool_names}
+        if time_limit_ms is not None:
+            kwargs["time_limit_ms"] = time_limit_ms
+        if memory_limit_bytes is not None:
+            kwargs["memory_limit_bytes"] = memory_limit_bytes
-        snapshot: ZapcodeSnapshot = state["snapshot"]
-        state = snapshot.resume(result)
+        sandbox = Zapcode(code, **kwargs)
+        state = sandbox.start()
-    return ExecutionResult(
-        output=state.get("output"),
-        stdout=state.get("stdout", ""),
-        tool_calls=tool_calls,
-    )
+        while state.get("suspended"):
+            fn_name = state["function_name"]
+            args = state["args"]
+            tool_def = tool_defs.get(fn_name)
+            if not tool_def:
+                raise ValueError(
+                    f"Guest code called unknown function '{fn_name}'. "
+                    f"Available: {', '.join(tool_names)}"
+                )
+            # Build named args from positional args
+            param_names = list(tool_def.parameters.keys())
+            named_args = {
+                param_names[i]: args[i]
+                for i in range(min(len(param_names), len(args)))
+            }
+            tool_span = _create_span("tool_call", {
+                "zapcode.tool.name": fn_name,
+                "zapcode.tool.args": json.dumps(args, default=str),
+            }) if tracing else None
+            result = tool_def.execute(named_args)
+            tool_calls.append({"name": fn_name, "args": args, "result": result})
+            if tool_span:
+                tool_span.attributes["zapcode.tool.result"] = json.dumps(result, default=str)
+                _end_span(tool_span)
+                exec_span.children.append(tool_span)
+            snapshot: ZapcodeSnapshot = state["snapshot"]
+            state = snapshot.resume(result)
+        stdout = state.get("stdout", "")
+        if exec_span:
+            exec_span.attributes["zapcode.output"] = json.dumps(state.get("output"), default=str)
+            if stdout:
+                exec_span.attributes["zapcode.stdout"] = stdout
+            _end_span(exec_span)
+        if debug and exec_span:
+            _print_trace(exec_span)
+        return ExecutionResult(
+            code=code,
+            output=state.get("output"),
+            stdout=stdout,
+            tool_calls=tool_calls,
+            trace=exec_span,
+        )
+    except Exception as err:
+        error_msg = str(err)
+        if exec_span:
+            exec_span.attributes["zapcode.error"] = error_msg
+            _end_span(exec_span, "error")
+        if not auto_fix:
+            if debug and exec_span:
+                _print_trace(exec_span)
+            raise
+        if debug and exec_span:
+            _print_trace(exec_span)
+        return ExecutionResult(
+            code=code,
+            output=None,
+            stdout="",
+            tool_calls=tool_calls,
+            error=f"Execution failed: {error_msg}. Please fix your code and try again.",
+            trace=exec_span,
+        )
 # ---------------------------------------------------------------------------
@@ -241,6 +343,12 @@ class ZapcodeAI:
     custom: dict[str, Any] = field(default_factory=dict)
     """Output from custom adapters, keyed by adapter name."""
+    get_trace: Callable[[], TraceSpan | None] = field(default=lambda: None)
+    """Get the full session trace tree. Available when debug or auto_fix is enabled."""
+    print_trace: Callable[[], None] = field(default=lambda: None)
+    """Print the full session trace tree to the console."""
 # ---------------------------------------------------------------------------
 # Main entry point
@@ -252,6 +360,8 @@ def zapcode(
     system: str | None = None,
     memory_limit_bytes: int | None = None,
     time_limit_ms: int = 10_000,
+    debug: bool = False,
+    auto_fix: bool = False,
     adapters: list[Adapter] | None = None,
 ) -> ZapcodeAI:
     """
@@ -263,6 +373,11 @@ def zapcode(
     - `handle_tool_call(code)` → Universal handler for any SDK
     - `custom` → Output from custom adapters
+    Args:
+        debug: Log generated code, tool calls, and output to the console.
+        auto_fix: When True, execution errors are returned as tool results
+            instead of raising. The LLM sees the error and can self-correct.
     Example with Anthropic SDK::
         from zapcode_ai import zapcode, ToolDefinition, ParamDef
@@ -293,13 +408,30 @@ def zapcode(
                 print(result.output)
     """
     system_prompt = _build_system_prompt(tools, system)
+    tracing = debug or auto_fix
+    # Session-level trace collects all attempts
+    session_trace: TraceSpan | None = (
+        _create_span("session", {"zapcode.tools": ", ".join(tools.keys())})
+        if tracing else None
+    )
+    attempt_count = 0
     def handle_tool_call(code: str) -> ExecutionResult:
-        return _execute_code(
+        nonlocal attempt_count
+        attempt_count += 1
+        result = _execute_code(
             code, tools,
             memory_limit_bytes=memory_limit_bytes,
             time_limit_ms=time_limit_ms,
+            debug=debug,
+            auto_fix=auto_fix,
         )
+        if session_trace and result.trace:
+            result.trace.name = f"attempt_{attempt_count}"
+            result.trace.attributes["zapcode.attempt"] = attempt_count
+            session_trace.children.append(result.trace)
+        return result
     # Anthropic SDK format
     anthropic_tools = [
@@ -335,12 +467,28 @@ def zapcode(
         for adapter in adapters:
             custom[adapter.name] = adapter.adapt(ctx)
+    def get_trace() -> TraceSpan | None:
+        if not session_trace:
+            return None
+        status = "ok" if any(c.status == "ok" for c in session_trace.children) else "error"
+        _end_span(session_trace, status)
+        return session_trace
+    def print_session_trace() -> None:
+        trace = get_trace()
+        if trace:
+            print("\n─── Zapcode Trace ───")
+            _print_trace(trace)
+            print("─────────────────────\n")
     return ZapcodeAI(
         system=system_prompt,
         anthropic_tools=anthropic_tools,
         openai_tools=openai_tools,
         handle_tool_call=handle_tool_call,
         custom=custom,
+        get_trace=get_trace,
+        print_trace=print_session_trace,
     )
@@ -350,6 +498,8 @@ def execute(
     *,
     memory_limit_bytes: int | None = None,
     time_limit_ms: int | None = None,
+    debug: bool = False,
+    auto_fix: bool = False,
 ) -> ExecutionResult:
     """
     Execute TypeScript code directly in a Zapcode sandbox with tool resolution.
@@ -374,4 +524,6 @@ def execute(
         code, tools,
         memory_limit_bytes=memory_limit_bytes,
         time_limit_ms=time_limit_ms,
+        debug=debug,
+        auto_fix=auto_fix,
     )
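
To make the Python trace format in this file easier to picture, here is a self-contained sketch adapted from the `_print_trace` helper and the `get_trace` status rule shown in the hunks above. It is an adaptation rather than the package's code: `TraceSpan` is trimmed to the fields used, `render_trace` returns lines instead of printing, and `session_status` is a hypothetical name for the "ok if any attempt succeeded" rule. The rendered shape (attempt names, tree prefixes, whole-millisecond durations) differs from the TypeScript sample output in the README.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class TraceSpan:
    """Trimmed local copy of the span shape from the diff."""
    name: str
    duration_ms: float = 0.0
    status: str = "ok"  # "ok" or "error"
    attributes: dict[str, Any] = field(default_factory=dict)
    children: list[TraceSpan] = field(default_factory=list)


def render_trace(span: TraceSpan, indent: int = 0) -> list[str]:
    """Same formatting rules as _print_trace, but collected into a list of lines."""
    prefix = "" if indent == 0 else "│ " * (indent - 1) + "├─ "
    icon = "✗" if span.status == "error" else "✓"
    duration = "<1ms" if span.duration_ms < 1 else f"{span.duration_ms:.0f}ms"
    attrs = " ".join(
        f"{k}={str(v)[:80]}" for k, v in span.attributes.items()
        if not k.startswith("zapcode.code")  # full source is kept out of the output
    )
    lines = [f"{prefix}{icon} {span.name} ({duration}){' ' + attrs if attrs else ''}"]
    for child in span.children:
        lines.extend(render_trace(child, indent + 1))
    return lines


def session_status(session: TraceSpan) -> str:
    """A session counts as 'ok' if at least one attempt succeeded."""
    return "ok" if any(c.status == "ok" for c in session.children) else "error"


session = TraceSpan("session", 12.0, children=[
    TraceSpan("attempt_1", 0.4, "error", {"zapcode.code": "...", "zapcode.attempt": 1}),
    TraceSpan("attempt_2", 8.0, "ok", {"zapcode.attempt": 2}),
])
session.status = session_status(session)
print("\n".join(render_trace(session)))
# ✓ session (12ms)
# ├─ ✗ attempt_1 (<1ms) zapcode.attempt=1
# ├─ ✓ attempt_2 (8ms) zapcode.attempt=2
```

Note that a failed first attempt followed by a successful retry still yields an "ok" session, which matches the auto-fix intent of letting the model recover.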

{zapcode_ai-1.3.0 → zapcode_ai-1.4.0}/.gitignore RENAMED

File without changes
