ai-agent-inspector 1.0.0__tar.gz → 1.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {ai_agent_inspector-1.0.0/ai_agent_inspector.egg-info → ai_agent_inspector-1.1.0}/PKG-INFO +122 -26
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/README.md +120 -25
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/agent_inspector/__init__.py +30 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/agent_inspector/cli.py +22 -3
- ai_agent_inspector-1.1.0/agent_inspector/ui/static/app.css +1203 -0
- ai_agent_inspector-1.1.0/agent_inspector/ui/templates/index.html +749 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0/ai_agent_inspector.egg-info}/PKG-INFO +122 -26
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/ai_agent_inspector.egg-info/SOURCES.txt +4 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/ai_agent_inspector.egg-info/requires.txt +1 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/pyproject.toml +2 -1
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_api.py +47 -0
- ai_agent_inspector-1.1.0/tests/test_autogen_adapter.py +402 -0
- ai_agent_inspector-1.1.0/tests/test_cli.py +91 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_config.py +22 -0
- ai_agent_inspector-1.1.0/tests/test_crewai_adapter.py +549 -0
- ai_agent_inspector-1.1.0/tests/test_multi_agent.py +958 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_storage.py +51 -0
- ai_agent_inspector-1.0.0/agent_inspector/ui/static/app.css +0 -630
- ai_agent_inspector-1.0.0/agent_inspector/ui/templates/index.html +0 -441
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/LICENSE +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/agent_inspector/ui/static/app.js +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/ai_agent_inspector.egg-info/dependency_links.txt +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/ai_agent_inspector.egg-info/entry_points.txt +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/ai_agent_inspector.egg-info/top_level.txt +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/setup.cfg +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_api_auth.py +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_api_error_paths.py +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_exporter.py +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_exporters.py +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_init_imports.py +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_interfaces.py +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_langchain_adapter.py +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_processing.py +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_queue.py +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_trace.py +0 -0
- {ai_agent_inspector-1.0.0 → ai_agent_inspector-1.1.0}/tests/test_ui_setup.py +0 -0

PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: ai-agent-inspector
-Version: 1.0.0
+Version: 1.1.0
 Summary: Framework-agnostic observability for AI agents
 Author-email: Agent Inspector Team <team@agentinspector.dev>
 License: MIT
@@ -27,6 +27,7 @@ Requires-Dist: uvicorn[standard]>=0.24.0
 Requires-Dist: cryptography>=41.0.0
 Requires-Dist: jinja2>=3.1.0
 Requires-Dist: python-dotenv>=1.0.0
+Requires-Dist: openai>=2.16.0
 Provides-Extra: langchain
 Requires-Dist: langchain>=0.1.0; extra == "langchain"
 Provides-Extra: otel
@@ -154,7 +155,7 @@ Traditional tools model systems as function calls and spans. Agent Inspector mod
 
 ### Storage
 - **SQLite** – WAL mode for concurrent access; runs and steps tables; indexes on run_id and timestamp.
-- **Pruning** – CLI `prune --retention-days N` and optional `--vacuum`; API/DB support for retention.
+- **Pruning** – CLI `prune --retention-days N` and optional `--retention-max-bytes BYTES`, `--vacuum`; API/DB support for retention by age and by size.
 - **Backup** – CLI `backup /path/to/backup.db` for full DB copy.
 - **Export to JSON** – **API** `GET /v1/runs/{run_id}/export` returns run metadata + timeline with decoded event data; **CLI** `agent-inspector export <run_id> [--output file.json]` and `agent-inspector export --all [--limit N] [--output file.json]` for backup or migration.

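For context on the export flow described in the Storage bullets above, a typical pair of invocations looks like this; the run ID and file names are illustrative, and the flags are the ones listed in the README text:

```bash
# Export a single run to a JSON file
agent-inspector export 1234abcd --output run_1234abcd.json

# Export all runs (optionally capped) for backup or migration
agent-inspector export --all --limit 100 --output all_runs.json
```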
@@ -506,6 +507,7 @@ export TRACE_ENCRYPTION_KEY=your-secret-key-here
 # Storage
 export TRACE_DB_PATH=agent_inspector.db
 export TRACE_RETENTION_DAYS=30
+export TRACE_RETENTION_MAX_BYTES=
 
 # API
 export TRACE_API_HOST=127.0.0.1
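The hunk above adds `TRACE_RETENTION_MAX_BYTES` with an empty default. As a sketch, a concrete size cap could look like the following; the byte value is illustrative, and treating an empty value as "no size limit" is an assumption based on the empty default shown:

```bash
# Cap the SQLite database at roughly 100 MB; oldest runs are pruned first
export TRACE_RETENTION_MAX_BYTES=104857600
```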
@@ -604,24 +606,78 @@ search_flights_agent("Find flights from SFO to JFK")
 This example makes real LLM calls and runs multiple scenarios.
 
 ```bash
-cp
+cp examples/.env.example examples/.env
 ```
 
-Set these in
-- `
-- `
-- `OPENAI_MODEL`
+Set these in `examples/.env`:
+- `OPENAI_API_KEY` - Your API key
+- `OPENAI_BASE_URL` - API endpoint (e.g., `https://api.openai.com/v1` or your custom provider)
+- `OPENAI_MODEL` - Model name (e.g., `gpt-4o-mini`, `glm-4.7`)
+- `OPENAI_TEMPERATURE` - Temperature setting (default: 0.2)
+- `OPENAI_TIMEOUT` - Timeout in seconds (default: 120)
+
+Install dependencies:
+```bash
+uv add openai python-dotenv
+```
 
 Run a single question:
 ```bash
-python examples/real_agent.py "What is 13 * (7 + 5)?"
+uv run python examples/real_agent.py "What is 13 * (7 + 5)?"
 ```
 
 Run the full scenario suite:
 ```bash
-python examples/real_agent.py --suite
+uv run python examples/real_agent.py --suite
+```
+
+### Multi-Agent Example
+
+This example demonstrates a realistic multi-agent customer support system with:
+- **Agent spawning** with different models per agent
+- **Intelligent routing** to specialized agents (billing, technical, triage, manager)
+- **Tool execution** with realistic operations (profile lookup, billing history, system logs)
+- **Agent communication** with handoffs for escalations
+- **Detailed responses** with contextual, professional customer service replies
+- **Escalation workflow** where complex issues get manager oversight
+
+```bash
+cp examples/.env.example examples/.env
+```
+
+Configure in `examples/.env`:
+- `OPENAI_API_KEY` - Your API key
+- `OPENAI_BASE_URL` - API endpoint
+- `OPENAI_MODEL` - Default model for all agents
+- `MODEL_TRIAGE` - Model for triage agent (optional, falls back to `OPENAI_MODEL`)
+- `MODEL_BILLING` - Model for billing agent (optional)
+- `MODEL_TECHNICAL` - Model for technical agent (optional)
+- `MODEL_MANAGER` - Model for manager agent (optional)
+
+Install dependencies:
+```bash
+uv add openai python-dotenv
+```
+
+Run in simulated mode (no API needed):
+```bash
+python examples/multi_agent.py
 ```
 
+Run with real LLM calls:
+```bash
+uv run python examples/multi_agent.py
+```
+
+The example traces:
+- Customer requests with routing analysis
+- Agent-specific tool usage with realistic results
+- Detailed, contextual responses for each customer issue
+- Escalation flows with manager handoffs
+- Task assignment and completion tracking
+
+Note: Without `openai` package and valid API key, this example will use simulated responses with realistic agent behavior. Install `openai` with `uv add openai` and configure `OPENAI_API_KEY` in `examples/.env` for real LLM calls. Use `uv run python` to execute the script with uv's virtual environment.
+
 ### With LangChain (Automatic)
 
 ```python
@@ -716,6 +772,30 @@ with trace.run("planning_agent", user_id="user123") as main_ctx:
     trace.final(answer="I've booked your flight. Confirmation: CONF-12345")
 ```
 
+### Async / asyncio
+
+Context is propagated via `contextvars`, so tracing works with asyncio as long as each task has its own `trace.run()` (one run per task). Do not share a single run across concurrent tasks.
+
+```python
+import asyncio
+from agent_inspector import trace
+
+async def agent_task(name: str, query: str):
+    with trace.run(name):
+        trace.llm(model="gpt-4", prompt=query, response=f"Processed: {query}")
+        trace.final(answer=f"Done: {query}")
+        return name
+
+async def main():
+    results = await asyncio.gather(
+        agent_task("agent_1", "Query A"),
+        agent_task("agent_2", "Query B"),
+    )
+    return results
+
+asyncio.run(main())
+```
+
 ### Memory Operations
 
 ```python
@@ -836,35 +916,41 @@ result = chain.run("Your query", callbacks=callbacks)
 
 ### Creating Custom Adapters
 
-
+Use the Trace SDK directly when your framework has no LangChain-style callback API. Checklist:
+
+1. **Entry point** – Wrap agent execution in `trace.run("run_name")` so there is an active context.
+2. **LLM calls** – Where your framework invokes the model, call `context.llm(model=..., prompt=..., response=...)`.
+3. **Tool calls** – Where tools are executed, call `context.tool(tool_name=..., tool_args=..., tool_result=...)`.
+4. **Final answer** – When the agent finishes, call `context.final(answer=...)`.
+5. **Errors** – On failure, call `context.error(error_type=..., error_message=..., critical=...)`.
+
+Template:
 
 ```python
-from agent_inspector import
+from agent_inspector import trace, get_trace
 
 class CustomAdapter:
-    def __init__(self,
-        self.trace =
-
-    def on_llm_call(self, model, prompt, response):
-        """Handle LLM calls in your framework."""
+    def __init__(self, trace_instance=None):
+        self.trace = trace_instance or get_trace()
+
+    def on_llm_call(self, model: str, prompt: str, response: str):
         context = self.trace.get_active_context()
         if context:
             context.llm(model=model, prompt=prompt, response=response)
-
-    def on_tool_call(self, tool_name,
-        """Handle tool calls in your framework."""
+
+    def on_tool_call(self, tool_name: str, tool_args: dict, tool_result: str):
         context = self.trace.get_active_context()
         if context:
-            context.tool(tool_name=tool_name, tool_args=
+            context.tool(tool_name=tool_name, tool_args=tool_args, tool_result=tool_result)
 
-# Use
-with trace.run("
+# Use: always run inside trace.run() so get_active_context() returns a context
+with trace.run("my_agent"):
     adapter = CustomAdapter()
-
-    # Your framework code
     adapter.on_llm_call("gpt-4", "Hello", "Hi there!")
 ```
 
+For LangChain-like frameworks, extend `BaseCallbackHandler` and pass the handler into the framework's callback list; see the LangChain adapter source for the pattern.
+
 ---
 
 ## Development
@@ -952,8 +1038,12 @@ Releases are automated with [Release Please](https://github.com/googleapis/relea
 - **fix:** – bug fix (bumps patch version)
 - **feat!:** or **BREAKING CHANGE:** – breaking change (bumps major version)
 
+To force a specific version, add `Release-As: X.Y.Z` in the commit message footer (e.g. `Release-As: 1.1.0`).
+
 When you merge the Release PR, a tag is created and the [publish workflow](.github/workflows/publish.yml) publishes to PyPI (OIDC).
 
+**First release:** Release Please only creates Release PRs for commits *since* the latest release. If you have no release yet, create the initial tag so it has a baseline: `git tag v1.0.0 && git push origin v1.0.0`. After that, new `feat:`/`fix:` commits (not `docs:` or `chore:`) will get Release PRs.
+
 ---
 
 ## Contributing
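To illustrate the `Release-As` footer documented in the hunk above, a commit using it might look like this (the commit message itself is hypothetical):

```bash
# Conventional commit whose footer pins the next release version
git commit -m "feat: add size-based retention" -m "Release-As: 1.1.0"
```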
@@ -1011,8 +1101,8 @@ agent-inspector server [--host HOST] [--port PORT]
 # View statistics
 agent-inspector stats
 
-# Prune old traces
-agent-inspector prune [--retention-days N] [--vacuum]
+# Prune old traces (optionally by size: --retention-max-bytes BYTES)
+agent-inspector prune [--retention-days N] [--retention-max-bytes BYTES] [--vacuum]
 
 # Vacuum database
 agent-inspector vacuum
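A concrete run of the updated prune command, with illustrative values (30-day retention and a ~100 MB size cap):

```bash
# Drop runs older than 30 days, then oldest runs until the DB is at or below 100 MB, then VACUUM
agent-inspector prune --retention-days 30 --retention-max-bytes 104857600 --vacuum
```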
@@ -1047,6 +1137,12 @@ Agent Inspector is designed for minimal overhead:
 - Background thread: ~5MB (batch processing)
 - Database: Varies with trace volume
 
+### Scaling and alerting
+
+- **Single process / moderate load** – Use the default SQLite storage with sampling and retention (e.g. `retention_days`, optional `retention_max_bytes`). Suitable for one or a few worker processes.
+- **High throughput or many writers** – Use an OTLP or custom exporter to send traces to a central backend (e.g. Jaeger, Tempo, Grafana). The built-in UI and API then serve only that process; aggregate viewing is in your backend.
+- **Alerting** – The SDK does not push alerts. Use the API from your own checks: e.g. `GET /v1/stats` for `failed_runs`, `recent_runs_24h`, or `queue.events_dropped` (when the default Trace is in use), and alert when thresholds are exceeded. Optionally run `agent-inspector prune` on a schedule to enforce retention.
+
 ---
 
 ## Security
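To make the alerting bullet in the last hunk concrete, here is a minimal polling sketch. The base URL/port, the `AGENT_INSPECTOR_URL` variable, and the exact JSON shape of `/v1/stats` are assumptions; only the endpoint path and the counter names (`failed_runs`, `queue.events_dropped`) come from the README text above.

```python
import json
import os
import urllib.request

# Assumed server address; adjust to match TRACE_API_HOST and the --port you pass to `agent-inspector server`
BASE_URL = os.environ.get("AGENT_INSPECTOR_URL", "http://127.0.0.1:8000")

def check_agent_inspector() -> list[str]:
    """Return alert messages derived from the /v1/stats counters named in the README."""
    with urllib.request.urlopen(f"{BASE_URL}/v1/stats", timeout=10) as resp:
        stats = json.load(resp)
    alerts = []
    if stats.get("failed_runs", 0) > 0:
        alerts.append(f"failed runs: {stats['failed_runs']}")
    dropped = stats.get("queue", {}).get("events_dropped", 0)
    if dropped:
        alerts.append(f"dropped trace events: {dropped}")
    return alerts

if __name__ == "__main__":
    for alert in check_agent_inspector():
        print(f"ALERT: {alert}")
```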
README.md
@@ -107,7 +107,7 @@ Traditional tools model systems as function calls and spans. Agent Inspector mod
 
 ### Storage
 - **SQLite** – WAL mode for concurrent access; runs and steps tables; indexes on run_id and timestamp.
-- **Pruning** – CLI `prune --retention-days N` and optional `--vacuum`; API/DB support for retention.
+- **Pruning** – CLI `prune --retention-days N` and optional `--retention-max-bytes BYTES`, `--vacuum`; API/DB support for retention by age and by size.
 - **Backup** – CLI `backup /path/to/backup.db` for full DB copy.
 - **Export to JSON** – **API** `GET /v1/runs/{run_id}/export` returns run metadata + timeline with decoded event data; **CLI** `agent-inspector export <run_id> [--output file.json]` and `agent-inspector export --all [--limit N] [--output file.json]` for backup or migration.
 
@@ -459,6 +459,7 @@ export TRACE_ENCRYPTION_KEY=your-secret-key-here
 # Storage
 export TRACE_DB_PATH=agent_inspector.db
 export TRACE_RETENTION_DAYS=30
+export TRACE_RETENTION_MAX_BYTES=
 
 # API
 export TRACE_API_HOST=127.0.0.1
@@ -557,24 +558,78 @@ search_flights_agent("Find flights from SFO to JFK")
 This example makes real LLM calls and runs multiple scenarios.
 
 ```bash
-cp
+cp examples/.env.example examples/.env
 ```
 
-Set these in
-- `
-- `
-- `OPENAI_MODEL`
+Set these in `examples/.env`:
+- `OPENAI_API_KEY` - Your API key
+- `OPENAI_BASE_URL` - API endpoint (e.g., `https://api.openai.com/v1` or your custom provider)
+- `OPENAI_MODEL` - Model name (e.g., `gpt-4o-mini`, `glm-4.7`)
+- `OPENAI_TEMPERATURE` - Temperature setting (default: 0.2)
+- `OPENAI_TIMEOUT` - Timeout in seconds (default: 120)
+
+Install dependencies:
+```bash
+uv add openai python-dotenv
+```
 
 Run a single question:
 ```bash
-python examples/real_agent.py "What is 13 * (7 + 5)?"
+uv run python examples/real_agent.py "What is 13 * (7 + 5)?"
 ```
 
 Run the full scenario suite:
 ```bash
-python examples/real_agent.py --suite
+uv run python examples/real_agent.py --suite
+```
+
+### Multi-Agent Example
+
+This example demonstrates a realistic multi-agent customer support system with:
+- **Agent spawning** with different models per agent
+- **Intelligent routing** to specialized agents (billing, technical, triage, manager)
+- **Tool execution** with realistic operations (profile lookup, billing history, system logs)
+- **Agent communication** with handoffs for escalations
+- **Detailed responses** with contextual, professional customer service replies
+- **Escalation workflow** where complex issues get manager oversight
+
+```bash
+cp examples/.env.example examples/.env
+```
+
+Configure in `examples/.env`:
+- `OPENAI_API_KEY` - Your API key
+- `OPENAI_BASE_URL` - API endpoint
+- `OPENAI_MODEL` - Default model for all agents
+- `MODEL_TRIAGE` - Model for triage agent (optional, falls back to `OPENAI_MODEL`)
+- `MODEL_BILLING` - Model for billing agent (optional)
+- `MODEL_TECHNICAL` - Model for technical agent (optional)
+- `MODEL_MANAGER` - Model for manager agent (optional)
+
+Install dependencies:
+```bash
+uv add openai python-dotenv
+```
+
+Run in simulated mode (no API needed):
+```bash
+python examples/multi_agent.py
 ```
 
+Run with real LLM calls:
+```bash
+uv run python examples/multi_agent.py
+```
+
+The example traces:
+- Customer requests with routing analysis
+- Agent-specific tool usage with realistic results
+- Detailed, contextual responses for each customer issue
+- Escalation flows with manager handoffs
+- Task assignment and completion tracking
+
+Note: Without `openai` package and valid API key, this example will use simulated responses with realistic agent behavior. Install `openai` with `uv add openai` and configure `OPENAI_API_KEY` in `examples/.env` for real LLM calls. Use `uv run python` to execute the script with uv's virtual environment.
+
 ### With LangChain (Automatic)
 
 ```python
@@ -669,6 +724,30 @@ with trace.run("planning_agent", user_id="user123") as main_ctx:
     trace.final(answer="I've booked your flight. Confirmation: CONF-12345")
 ```
 
+### Async / asyncio
+
+Context is propagated via `contextvars`, so tracing works with asyncio as long as each task has its own `trace.run()` (one run per task). Do not share a single run across concurrent tasks.
+
+```python
+import asyncio
+from agent_inspector import trace
+
+async def agent_task(name: str, query: str):
+    with trace.run(name):
+        trace.llm(model="gpt-4", prompt=query, response=f"Processed: {query}")
+        trace.final(answer=f"Done: {query}")
+        return name
+
+async def main():
+    results = await asyncio.gather(
+        agent_task("agent_1", "Query A"),
+        agent_task("agent_2", "Query B"),
+    )
+    return results
+
+asyncio.run(main())
+```
+
 ### Memory Operations
 
 ```python
@@ -789,35 +868,41 @@ result = chain.run("Your query", callbacks=callbacks)
 
 ### Creating Custom Adapters
 
-
+Use the Trace SDK directly when your framework has no LangChain-style callback API. Checklist:
+
+1. **Entry point** – Wrap agent execution in `trace.run("run_name")` so there is an active context.
+2. **LLM calls** – Where your framework invokes the model, call `context.llm(model=..., prompt=..., response=...)`.
+3. **Tool calls** – Where tools are executed, call `context.tool(tool_name=..., tool_args=..., tool_result=...)`.
+4. **Final answer** – When the agent finishes, call `context.final(answer=...)`.
+5. **Errors** – On failure, call `context.error(error_type=..., error_message=..., critical=...)`.
+
+Template:
 
 ```python
-from agent_inspector import
+from agent_inspector import trace, get_trace
 
 class CustomAdapter:
-    def __init__(self,
-        self.trace =
-
-    def on_llm_call(self, model, prompt, response):
-        """Handle LLM calls in your framework."""
+    def __init__(self, trace_instance=None):
+        self.trace = trace_instance or get_trace()
+
+    def on_llm_call(self, model: str, prompt: str, response: str):
         context = self.trace.get_active_context()
         if context:
             context.llm(model=model, prompt=prompt, response=response)
-
-    def on_tool_call(self, tool_name,
-        """Handle tool calls in your framework."""
+
+    def on_tool_call(self, tool_name: str, tool_args: dict, tool_result: str):
         context = self.trace.get_active_context()
         if context:
-            context.tool(tool_name=tool_name, tool_args=
+            context.tool(tool_name=tool_name, tool_args=tool_args, tool_result=tool_result)
 
-# Use
-with trace.run("
+# Use: always run inside trace.run() so get_active_context() returns a context
+with trace.run("my_agent"):
     adapter = CustomAdapter()
-
-    # Your framework code
     adapter.on_llm_call("gpt-4", "Hello", "Hi there!")
 ```
 
+For LangChain-like frameworks, extend `BaseCallbackHandler` and pass the handler into the framework's callback list; see the LangChain adapter source for the pattern.
+
 ---
 
 ## Development
@@ -905,8 +990,12 @@ Releases are automated with [Release Please](https://github.com/googleapis/relea
 - **fix:** – bug fix (bumps patch version)
 - **feat!:** or **BREAKING CHANGE:** – breaking change (bumps major version)
 
+To force a specific version, add `Release-As: X.Y.Z` in the commit message footer (e.g. `Release-As: 1.1.0`).
+
 When you merge the Release PR, a tag is created and the [publish workflow](.github/workflows/publish.yml) publishes to PyPI (OIDC).
 
+**First release:** Release Please only creates Release PRs for commits *since* the latest release. If you have no release yet, create the initial tag so it has a baseline: `git tag v1.0.0 && git push origin v1.0.0`. After that, new `feat:`/`fix:` commits (not `docs:` or `chore:`) will get Release PRs.
+
 ---
 
 ## Contributing
@@ -964,8 +1053,8 @@ agent-inspector server [--host HOST] [--port PORT]
 # View statistics
 agent-inspector stats
 
-# Prune old traces
-agent-inspector prune [--retention-days N] [--vacuum]
+# Prune old traces (optionally by size: --retention-max-bytes BYTES)
+agent-inspector prune [--retention-days N] [--retention-max-bytes BYTES] [--vacuum]
 
 # Vacuum database
 agent-inspector vacuum
@@ -1000,6 +1089,12 @@ Agent Inspector is designed for minimal overhead:
 - Background thread: ~5MB (batch processing)
 - Database: Varies with trace volume
 
+### Scaling and alerting
+
+- **Single process / moderate load** – Use the default SQLite storage with sampling and retention (e.g. `retention_days`, optional `retention_max_bytes`). Suitable for one or a few worker processes.
+- **High throughput or many writers** – Use an OTLP or custom exporter to send traces to a central backend (e.g. Jaeger, Tempo, Grafana). The built-in UI and API then serve only that process; aggregate viewing is in your backend.
+- **Alerting** – The SDK does not push alerts. Use the API from your own checks: e.g. `GET /v1/stats` for `failed_runs`, `recent_runs_24h`, or `queue.events_dropped` (when the default Trace is in use), and alert when thresholds are exceeded. Optionally run `agent-inspector prune` on a schedule to enforce retention.
+
 ---
 
 ## Security
agent_inspector/__init__.py
@@ -43,6 +43,11 @@ from .core.interfaces import Exporter, Sampler
 from .core.trace import (
     Trace,
     TraceContext,
+    agent_communication,
+    agent_handoff,
+    agent_join,
+    agent_leave,
+    agent_spawn,
     error,
     final,
     get_trace,
@@ -51,6 +56,8 @@ from .core.trace import (
     memory_write,
     run,
     set_trace,
+    task_assign,
+    task_complete,
     tool,
 )
 
@@ -71,6 +78,11 @@ except ImportError:
 
 # Event types
 from .core.events import (
+    AgentCommunicationEvent,
+    AgentHandoffEvent,
+    AgentJoinEvent,
+    AgentLeaveEvent,
+    AgentSpawnEvent,
     BaseEvent,
     ErrorEvent,
     EventStatus,
@@ -81,6 +93,8 @@ from .core.events import (
     MemoryWriteEvent,
     RunEndEvent,
     RunStartEvent,
+    TaskAssignmentEvent,
+    TaskCompletionEvent,
     ToolCallEvent,
 )
 
@@ -100,6 +114,14 @@ __all__ = [
     "final",
     "get_trace",
     "set_trace",
+    # Multi-agent tracing
+    "agent_spawn",
+    "agent_join",
+    "agent_leave",
+    "agent_communication",
+    "agent_handoff",
+    "task_assign",
+    "task_complete",
     # Configuration
     "TraceConfig",
     "Profile",
@@ -121,6 +143,14 @@ __all__ = [
     "MemoryWriteEvent",
     "ErrorEvent",
     "FinalAnswerEvent",
+    # Multi-agent event types
+    "AgentSpawnEvent",
+    "AgentJoinEvent",
+    "AgentLeaveEvent",
+    "AgentCommunicationEvent",
+    "AgentHandoffEvent",
+    "TaskAssignmentEvent",
+    "TaskCompletionEvent",
     # Adapters (optional)
     "enable_langchain",
     # API server (optional)
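The `__init__.py` hunks above re-export the new multi-agent helpers and event classes from the package root, so in 1.1.0 they can be imported directly (a sketch showing imports only; call signatures are not part of this diff):

```python
# New in 1.1.0: multi-agent helpers and event types are importable from the package root
from agent_inspector import (
    agent_spawn,
    agent_handoff,
    task_assign,
    task_complete,
    AgentSpawnEvent,
    TaskCompletionEvent,
)
```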
agent_inspector/cli.py
@@ -113,18 +113,30 @@ def cmd_prune(args):
     # Override with CLI arguments
     if args.retention_days is not None:
         config.retention_days = args.retention_days
+    if getattr(args, "retention_max_bytes", None) is not None:
+        config.retention_max_bytes = args.retention_max_bytes
 
     # Initialize database
     db = Database(config)
     db.initialize()
 
-    # Prune
+    # Prune by age first
     deleted_count = db.prune_old_runs(retention_days=config.retention_days)
+    size_deleted = 0
+
+    # Then prune by size if configured
+    max_bytes = config.retention_max_bytes
+    if max_bytes is not None and max_bytes > 0:
+        size_deleted = db.prune_by_size(max_bytes)
+        deleted_count += size_deleted
+        if size_deleted > 0:
+            print(f"✅ Pruned {size_deleted} runs by size (max_bytes={max_bytes})")
 
     if deleted_count > 0:
-
+        if size_deleted == 0:  # only age-based pruning had effect
+            print(f"✅ Pruned {deleted_count} old runs")
 
-    # Optionally vacuum to reclaim space
+    # Optionally vacuum to reclaim space (prune_by_size already vacuums)
     if args.vacuum:
         print("💾 Running VACUUM to reclaim disk space...")
         if db.vacuum():
@@ -408,6 +420,13 @@ Examples:
         type=int,
         help="Retention period in days (default: from config)",
     )
+    prune_parser.add_argument(
+        "--retention-max-bytes",
+        type=int,
+        default=None,
+        metavar="BYTES",
+        help="Prune oldest runs until DB size is at or below BYTES (optional; from config if set)",
+    )
     prune_parser.add_argument(
         "--vacuum",
         action="store_true",