PyPI - fableforge-agent-telemetry - Versions diffs - 0.1.0__tar.gz - Mend

fableforge-agent-telemetry 0.1.0__tar.gz

Files changed (20)

fableforge_agent_telemetry-0.1.0/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2025 FableForge Contributors
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

fableforge_agent_telemetry-0.1.0/PKG-INFO ADDED

@@ -0,0 +1,21 @@
+Metadata-Version: 2.4
+Name: fableforge-agent-telemetry
+Version: 0.1.0
+Summary: Datadog for AI agents — real-time observability, token tracking, cost estimation, error rates
+Requires-Python: >=3.10
+License-File: LICENSE
+Requires-Dist: fastapi>=0.110.0
+Requires-Dist: uvicorn[standard]>=0.27.0
+Requires-Dist: clickhouse-driver>=0.2.6
+Requires-Dist: sqlalchemy>=2.0.0
+Requires-Dist: pydantic>=2.5.0
+Requires-Dist: plotly>=5.18.0
+Requires-Dist: click>=8.1.0
+Requires-Dist: tiktoken>=0.6.0
+Requires-Dist: rich>=13.7.0
+Requires-Dist: jinja2>=3.1.0
+Provides-Extra: dev
+Requires-Dist: pytest>=8.0.0; extra == "dev"
+Requires-Dist: pytest-asyncio>=0.23.0; extra == "dev"
+Requires-Dist: httpx>=0.27.0; extra == "dev"
+Dynamic: license-file

fableforge_agent_telemetry-0.1.0/README.md ADDED

@@ -0,0 +1,269 @@
+# AgentTelemetry
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE) [![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/) [![Tests](https://img.shields.io/badge/tests-0-yellow.svg)](tests/)
+**Datadog for AI agents** — real-time observability, token tracking, cost estimation, and error rates for autonomous agent sessions.
+## Features
+- **Multi-format trace ingestion** — Parse traces from Glint-Research, armand0e, and v-Fable formats with auto-detection
+- **Token tracking** — Count tokens with tiktoken, estimate costs with real per-model pricing
+- **Cost estimation** — Detailed breakdowns for Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku, GPT-4, GPT-4o, GPT-4o-mini, Qwen3-Coder
+- **Error tracking** — Automated error classification (bash, edit, timeout, rate limit, auth, context overflow, etc.) and recovery rate calculation
+- **Interactive dashboard** — FastAPI + Plotly charts for session timelines, heatmaps, cost pies, and error breakdowns
+- **Dual storage** — ClickHouse for production, SQLite for local/dev with automatic fallback
+- **CLI** — Analyze traces, start dashboards, and generate reports from the command line
+## Installation
+```bash
+pip install -e .
+# ClickHouse support (production) needs no extra: clickhouse-driver is already a core dependency
+# Development:
+pip install -e ".[dev]"
+```
+## Quick Start
+### Analyze a Trace File
+```bash
+# Auto-detect format and analyze
+agenttelemetry analyze trace.jsonl
+# Specify format explicitly
+agenttelemetry analyze trace.jsonl --format glint
+# Analyze without storing
+agenttelemetry analyze trace.jsonl --no-store
+```
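Reviewer note: the package's `auto_detect_format` lives in `collector.py`, which is not part of this excerpt. As a rough illustration only, the three sample records shown further down in this README differ in their discriminator key (`kind`, `type`, `event`), so detection could plausibly key off the first record. The function name `guess_trace_format` and the marker table are hypothetical, not the package's API.

```python
import json

# Hypothetical marker keys, taken from the sample records in this README.
# The real detection logic in collector.py may use different signals.
FORMAT_MARKERS = {"kind": "vfable", "type": "glint", "event": "armand0e"}


def guess_trace_format(first_line: str) -> str:
    """Guess the trace format from the first JSONL record (illustrative only)."""
    record = json.loads(first_line)
    for key, fmt in FORMAT_MARKERS.items():
        if key in record:
            return fmt
    raise ValueError("unrecognized trace format")


print(guess_trace_format('{"kind": "tool_use", "session_id": "abc123"}'))
print(guess_trace_format('{"event": "invocation", "id": "span-001"}'))
```

Checking a single key on the first record is cheap, but it would misclassify a file that mixes formats; a more careful detector might sample several lines.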
+### Cost Report
+```bash
+# Detailed cost breakdown
+agenttelemetry cost trace.jsonl
+# Output:
+Model:           gpt-4
+Input tokens:       1,234,567  ($37.037010)
+Output tokens:         98,765  ($5.925900)
+Cache read:           500,000  ($0.000000)
+Cache creation:             0  ($0.000000)
+─────────────────────────────────────
+Total:           $42.962910
+```
+### Error Report
+```bash
+# Show errors with classification and recovery rates
+agenttelemetry errors trace.jsonl
+```
+### Token Counting
+```bash
+# Count tokens in a string
+agenttelemetry tokens "Hello, world!" --model gpt-4
+```
+### Dashboard
+```bash
+# Start the interactive dashboard
+agenttelemetry dashboard
+# Custom host/port
+agenttelemetry dashboard --host 0.0.0.0 --port 9000
+```
+With the default host and port, open `http://127.0.0.1:8088/dashboard` to see session metrics, timelines, heatmaps, and cost reports.
+## Analyzing Fable5 Traces
+Fable5 (v-Fable) traces can be analyzed directly:
+```bash
+# Ingest a Fable5 session trace
+agenttelemetry analyze ~/.fable/sessions/2025-01-15-abc123.jsonl
+# View cost breakdown
+agenttelemetry cost ~/.fable/sessions/2025-01-15-abc123.jsonl
+# Check errors and recovery
+agenttelemetry errors ~/.fable/sessions/2025-01-15-abc123.jsonl
+# Start dashboard with ingested data
+agenttelemetry dashboard
+# Then open http://127.0.0.1:8088/dashboard
+```
+### v-Fable Trace Format
+The v-Fable format uses these fields per JSONL line:
+```json
+{
+  "kind": "tool_use",
+  "timestamp": "2025-01-15T10:30:00Z",
+  "session_id": "abc123",
+  "span_id": "span-001",
+  "parent_span_id": null,
+  "tool_name": "Bash",
+  "tokens": {"prompt": 1500, "completion": 800, "cache_read": 200, "cache_write": 50},
+  "duration_ms": 2345.6,
+  "cost_usd": 0.0234,
+  "model": "claude-3.5-sonnet",
+  "status": "success",
+  "error_message": null
+}
+```
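Reviewer note: a minimal sketch of how one v-Fable line might be flattened into the attribute names that `cli.py` reads from a span (`input_tokens`, `output_tokens`, `cache_read`, `cache_creation`). The mapping from `prompt`/`completion`/`cache_write` is an assumption based on the field names alone; the real `parse_vfable_trace` is not shown in this excerpt, and `normalize_vfable` is a hypothetical helper.

```python
import json


def normalize_vfable(line: str) -> dict:
    """Flatten one v-Fable JSONL record into span-like fields (illustrative)."""
    record = json.loads(line)
    tokens = record.get("tokens") or {}
    return {
        "session_id": record["session_id"],
        "span_id": record["span_id"],
        "tool_name": record.get("tool_name"),
        "model": record.get("model"),
        # Assumed mapping: prompt -> input, completion -> output,
        # cache_write -> cache_creation.
        "input_tokens": tokens.get("prompt", 0),
        "output_tokens": tokens.get("completion", 0),
        "cache_read": tokens.get("cache_read", 0),
        "cache_creation": tokens.get("cache_write", 0),
        "duration_ms": record.get("duration_ms", 0.0),
        "status": record.get("status"),
    }


sample = (
    '{"kind": "tool_use", "session_id": "abc123", "span_id": "span-001", '
    '"tool_name": "Bash", "tokens": {"prompt": 1500, "completion": 800, '
    '"cache_read": 200, "cache_write": 50}, "duration_ms": 2345.6, '
    '"model": "claude-3.5-sonnet", "status": "success"}'
)
print(normalize_vfable(sample))
```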
+### Glint-Research Format
+```json
+{
+  "type": "tool_call",
+  "timestamp": "2025-01-15T10:30:00Z",
+  "session_id": "glint-session-1",
+  "span_id": "span-001",
+  "tool": "Edit",
+  "usage": {
+    "input_tokens": 2000,
+    "output_tokens": 500,
+    "cache_read_input_tokens": 300,
+    "cache_creation_input_tokens": 100
+  },
+  "duration_ms": 1500.0,
+  "model": "gpt-4o",
+  "error": null
+}
+```
+### armand0e Format
+Uses paired `invocation`/`response` events:
+```json
+{"event": "invocation", "id": "span-001", "session": "arm-session", "tool": {"name": "Write", "input": {"path": "/tmp/file.py"}}, "model": "claude-3.5-sonnet"}
+{"event": "response", "id": "span-001", "tokens": {"in": 1500, "out": 800, "cached": 200}, "latency_ms": 1800.0, "model": "claude-3.5-sonnet"}
+```
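Reviewer note: because armand0e splits each call across two events, a parser has to join them on `id`. The sketch below shows one straightforward way to do that with a dictionary of pending invocations. It is an assumption about how the pairing could work, not the package's `parse_armand0e_trace`; `pair_events` and the output field names are hypothetical.

```python
import json


def pair_events(lines):
    """Join invocation/response events that share an id (illustrative)."""
    pending = {}  # invocation events waiting for their response, keyed by id
    spans = []
    for line in lines:
        event = json.loads(line)
        if event["event"] == "invocation":
            pending[event["id"]] = event
        elif event["event"] == "response":
            call = pending.pop(event["id"], None)
            if call is None:
                continue  # response with no matching invocation: skipped here
            tokens = event.get("tokens", {})
            spans.append({
                "span_id": event["id"],
                "session_id": call.get("session"),
                "tool_name": call.get("tool", {}).get("name"),
                "model": event.get("model") or call.get("model"),
                "input_tokens": tokens.get("in", 0),
                "output_tokens": tokens.get("out", 0),
                "cache_read": tokens.get("cached", 0),
                "duration_ms": event.get("latency_ms", 0.0),
            })
    return spans


trace = [
    '{"event": "invocation", "id": "span-001", "session": "arm-session", '
    '"tool": {"name": "Write", "input": {"path": "/tmp/file.py"}}, '
    '"model": "claude-3.5-sonnet"}',
    '{"event": "response", "id": "span-001", '
    '"tokens": {"in": 1500, "out": 800, "cached": 200}, '
    '"latency_ms": 1800.0, "model": "claude-3.5-sonnet"}',
]
print(pair_events(trace))
```

Invocations that never receive a response stay in `pending` and are dropped in this sketch; a real parser might instead emit them as timed-out or errored spans.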
+## Pricing Reference
+| Model | Input ($/1M tok) | Output ($/1M tok) | Cache Read ($/1M) | Cache Write ($/1M) |
+|---|---|---|---|---|
+| Claude 3.5 Sonnet | $3.00 | $15.00 | $0.30 | $3.75 |
+| Claude 3 Opus | $15.00 | $75.00 | $1.50 | $18.75 |
+| Claude 3 Haiku | $0.25 | $1.25 | $0.03 | $0.30 |
+| GPT-4 | $30.00 | $60.00 | — | — |
+| GPT-4o | $2.50 | $10.00 | $1.25 | — |
+| GPT-4o-mini | $0.15 | $0.60 | $0.075 | — |
+| Qwen3-Coder | $0.50 | $2.00 | $0.10 | $0.50 |
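Reviewer note: the rates above are per million tokens, so each cost component is `tokens * rate / 1_000_000`. The sketch below applies that arithmetic to a few rows of the table. It assumes cache-read and cache-write tokens are billed on top of input tokens rather than subtracted from them, and it treats "—" cells as zero; the package's own `estimate_cost` is not shown in this excerpt and may differ. `sketch_cost` is a hypothetical name.

```python
# USD per 1M tokens, copied from the Pricing Reference table.
# Cells shown as "—" in the table are treated as 0.0 here (an assumption).
PRICES = {
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00, "cache_read": 0.30, "cache_write": 3.75},
    "gpt-4": {"input": 30.00, "output": 60.00, "cache_read": 0.0, "cache_write": 0.0},
    "gpt-4o": {"input": 2.50, "output": 10.00, "cache_read": 1.25, "cache_write": 0.0},
}


def sketch_cost(input_tokens, output_tokens, model, cache_read=0, cache_write=0):
    """Return a per-component cost breakdown in USD (illustrative)."""
    rates = PRICES[model]
    counts = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read,
        "cache_write": cache_write,
    }
    parts = {name: counts[name] * rates[name] / 1_000_000 for name in counts}
    parts["total"] = sum(parts.values())
    return parts


breakdown = sketch_cost(10_000, 5_000, "claude-3.5-sonnet", cache_read=3_000)
print(f"{breakdown['total']:.6f}")  # prints 0.105900
```

For the call above, the components are $0.03 input, $0.075 output, and $0.0009 cache read.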
+## Architecture
+```
+agent_telemetry/
+├── __init__.py          # Package exports
+├── models.py            # Pydantic models (Span, SessionMetrics, ToolMetrics, CostReport)
+├── collector.py         # Trace ingestion (parse_glint_trace, parse_armand0e_trace, parse_vfable_trace)
+├── token_tracker.py     # Token counting + cost estimation with real pricing
+├── error_tracker.py      # Error detection, classification, recovery rate
+├── storage.py           # ClickHouse + SQLite dual storage
+├── dashboard.py         # FastAPI dashboard with Plotly charts
+└── cli.py               # Click CLI (analyze, dashboard, cost, errors, tokens)
+```
+## API Endpoints
+| Endpoint | Description |
+|---|---|
+| `GET /dashboard` | Main dashboard with session list |
+| `GET /sessions/{id}` | Session detail with metrics table |
+| `GET /sessions/{id}/timeline` | Tool call timeline (Plotly bar chart) |
+| `GET /sessions/{id}/heatmap` | Tool usage heatmap (Plotly) |
+| `GET /cost/report` | Cost breakdown across all sessions |
+| `GET /cost/report?session_id=X` | Cost breakdown for a specific session |
+| `GET /errors/report` | Error report across all sessions |
+| `GET /errors/report?session_id=X` | Error report for a specific session |
+## Python API
+```python
+from agent_telemetry.collector import ingest_trace, calculate_metrics
+from agent_telemetry.token_tracker import estimate_cost, count_tokens
+from agent_telemetry.error_tracker import generate_error_report
+from agent_telemetry.storage import TelemetryStorage
+# Analyze a trace
+spans = ingest_trace("trace.jsonl", fmt="vfable")
+metrics = calculate_metrics(spans)
+# Estimate cost
+cost = estimate_cost(10000, 5000, model="claude-3.5-sonnet", cache_read=3000)
+print(f"Total: ${cost.total_cost:.6f}")
+# Store in database
+storage = TelemetryStorage()
+storage.store_spans(spans)
+storage.store_session_metrics(metrics["session"])
+# Generate error report
+report = generate_error_report("session-123", spans=spans)
+print(f"Errors: {report.total_errors}, Recovery rate: {report.recovery_rate:.0%}")
+# Query spans
+session_spans = storage.query_spans(session_id="session-123")
+tool_spans = storage.query_spans(tool_name="Bash")
+```
+## Development
+```bash
+# Install dev dependencies
+pip install -e ".[dev]"
+# Run tests
+pytest tests/
+# Run specific test file
+pytest tests/test_token_tracker.py -v
+```
+## License
+MIT
+## Ecosystem
+Part of the [FableForge](../) ecosystem — 21 open-source projects built from 210K real agent traces:
+| Project | Description |
+| --- | --- |
+| **[Anvil](../anvil)** | Self-verified coding agent |
+| **[VerifyLoop](../verifyloop)** | Plan→Execute→Verify→Recover framework |
+| **[ErrorRecovery](../error-recovery)** | Self-healing middleware (3,725 error patterns) |
+| **[FableForge-14B](../fableforge-14b)** | The fine-tuned 14B model (4-stage training) |
+| **[ShellWhisperer](../shell-whisperer)** | 1.5B edge agent (phone/RPi, 50ms) |
+| **[ReasonCritic](../reason-critic)** | Verification model (130 benchmark tasks) |
+| **[TraceCompiler](../trace-compiler)** | Compile traces → LoRA skills |
+| **[AgentRuntime](../agent-runtime)** | Persistent agent daemon (systemd for AI) |
+| **[AgentSwarm](../agent-swarm)** | Multi-agent from real trace transitions |
+| **[AgentTelemetry](../agent-telemetry)** | Datadog for agents (token tracking, costs) |
+| **[BenchAgent](../bench-agent)** | HumanEval for tool-use (107 tasks) |
+| **[AgentDev](../agent-dev)** | VSCode extension with verification |
+| **[TraceViz](../trace-viz)** | Trace replay visualizer (Next.js) |
+| **[AgentSkills](../agent-skills)** | npm for agent behaviors |
+| **[AgentCurriculum](../agent-curriculum)** | 5-stage progressive training |
+| **[AgentFuzzer](../agent-fuzzer)** | Adversarial testing for agents |
+| **[AgentConstitution](../agent-constitution)** | Safety guardrails from traces |
+| **[CostOptimizer](../cost-optimizer)** | Token cost reduction (50-80%) |
+| **[AgentProfiler](../agent-profiler)** | Behavioral fingerprinting |
+| **[TrajectoryDistiller](../trajectory-distiller)** | Trace→training data pipeline |
+| **[Fable5-Dataset](../fable5-dataset)** | HuggingFace dataset release |

fableforge_agent_telemetry-0.1.0/pyproject.toml ADDED

@@ -0,0 +1,37 @@
+[build-system]
+requires = ["setuptools>=68.0", "wheel"]
+build-backend = "setuptools.build_meta"
+[project]
+name = "fableforge-agent-telemetry"
+version = "0.1.0"
+description = "Datadog for AI agents — real-time observability, token tracking, cost estimation, error rates"
+requires-python = ">=3.10"
+dependencies = [
+    "fastapi>=0.110.0",
+    "uvicorn[standard]>=0.27.0",
+    "clickhouse-driver>=0.2.6",
+    "sqlalchemy>=2.0.0",
+    "pydantic>=2.5.0",
+    "plotly>=5.18.0",
+    "click>=8.1.0",
+    "tiktoken>=0.6.0",
+    "rich>=13.7.0",
+    "jinja2>=3.1.0",
+]
+[project.optional-dependencies]
+dev = [
+    "pytest>=8.0.0",
+    "pytest-asyncio>=0.23.0",
+    "httpx>=0.27.0",
+]
+[project.scripts]
+agenttelemetry = "agent_telemetry.cli:cli"
+[tool.setuptools.packages.find]
+where = ["src"]
+[tool.pytest.ini_options]
+testpaths = ["tests"]

fableforge_agent_telemetry-0.1.0/setup.cfg ADDED

@@ -0,0 +1,4 @@
+[egg_info]
+tag_build =
+tag_date = 0

fableforge_agent_telemetry-0.1.0/src/agent_telemetry/__init__.py ADDED

@@ -0,0 +1,7 @@
+"""AgentTelemetry — Datadog for AI agents."""
+__version__ = "0.1.0"
+from agent_telemetry.models import Span, SessionMetrics, ToolMetrics, CostReport
+__all__ = ["Span", "SessionMetrics", "ToolMetrics", "CostReport", "__version__"]

fableforge_agent_telemetry-0.1.0/src/agent_telemetry/cli.py ADDED

@@ -0,0 +1,212 @@
+"""CLI for AgentTelemetry — analyze traces, view costs, start dashboard."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+from typing import Optional
+import click
+from rich.console import Console
+from rich.table import Table
+from agent_telemetry.collector import (
+    auto_detect_format,
+    calculate_metrics,
+    ingest_trace,
+)
+from agent_telemetry.error_tracker import classify_error, generate_error_report
+from agent_telemetry.models import Span
+from agent_telemetry.storage import TelemetryStorage
+from agent_telemetry.token_tracker import estimate_cost, format_cost_table
+console = Console()
+@click.group()
+@click.version_option(version="0.1.0")
+def cli() -> None:
+    """AgentTelemetry — Datadog for AI agents."""
+@cli.command()
+@click.argument("trace_file", type=click.Path(exists=True))
+@click.option("--format", "fmt", type=click.Choice(["glint", "armand0e", "vfable", "auto"]), default="auto", help="Trace format")
+@click.option("--store/--no-store", default=True, help="Store results in database")
+def analyze(trace_file: str, fmt: str, store: bool) -> None:
+    """Analyze a trace file and display metrics."""
+    if fmt == "auto":
+        fmt = auto_detect_format(trace_file)
+        console.print(f"[dim]Detected format: {fmt}[/dim]")
+    spans = ingest_trace(trace_file, fmt=fmt)
+    if not spans:
+        console.print("[red]No spans found in trace file.[/red]")
+        sys.exit(1)
+    console.print(f"[green]Loaded {len(spans)} spans[/green]")
+    metrics_result = calculate_metrics(spans)
+    session = metrics_result["session"]
+    tools = metrics_result["tools"]
+    console.print(f"\n[bold]Session:[/bold] {session.session_id}")
+    console.print(f"[bold]Model:[/bold]    {session.model}")
+    console.print(f"[bold]Duration:[/bold]  {session.duration_seconds:.1f}s")
+    metrics_table = Table(title="Session Metrics", show_header=True)
+    metrics_table.add_column("Metric", style="cyan")
+    metrics_table.add_column("Value", justify="right")
+    metrics_table.add_row("Total Tokens", f"{session.total_tokens:,}")
+    metrics_table.add_row("Total Cost", f"${session.total_cost:.6f}")
+    metrics_table.add_row("Tool Calls", str(session.tool_calls))
+    metrics_table.add_row("Errors", str(session.error_count))
+    metrics_table.add_row("Avg Duration", f"{session.avg_tool_duration_ms:.0f}ms")
+    metrics_table.add_row("P50 Duration", f"{session.p50_duration_ms:.0f}ms")
+    metrics_table.add_row("P95 Duration", f"{session.p95_duration_ms:.0f}ms")
+    metrics_table.add_row("P99 Duration", f"{session.p99_duration_ms:.0f}ms")
+    metrics_table.add_row("Cache Hit Rate", f"{session.cache_hit_rate:.1%}")
+    console.print(metrics_table)
+    tool_table = Table(title="Tool Metrics", show_header=True)
+    tool_table.add_column("Tool", style="cyan")
+    tool_table.add_column("Calls", justify="right")
+    tool_table.add_column("Avg ms", justify="right")
+    tool_table.add_column("P95 ms", justify="right")
+    tool_table.add_column("Error Rate", justify="right")
+    tool_table.add_column("Cost", justify="right", style="green")
+    for name, tm in sorted(tools.items()):
+        tool_table.add_row(
+            name,
+            str(tm.call_count),
+            f"{tm.avg_duration_ms:.0f}",
+            f"{tm.p95_duration_ms:.0f}",
+            f"{tm.error_rate:.1%}",
+            f"${tm.total_cost_usd:.6f}",
+        )
+    console.print(tool_table)
+    if store:
+        storage = TelemetryStorage()
+        storage.store_spans(spans)
+        storage.store_session_metrics(session)
+        console.print(f"\n[dim]Stored {len(spans)} spans in database[/dim]")
+@cli.command()
+@click.option("--host", default="127.0.0.1", help="Host to bind to")
+@click.option("--port", default=8088, type=int, help="Port to bind to")
+def dashboard(host: str, port: int) -> None:
+    """Start the interactive dashboard server."""
+    import uvicorn
+    from agent_telemetry.dashboard import app
+    console.print(f"[green]Starting AgentTelemetry dashboard on http://{host}:{port}[/green]")
+    console.print("[dim]Press Ctrl+C to stop[/dim]")
+    uvicorn.run(app, host=host, port=port)
+@cli.command()
+@click.argument("trace_file", type=click.Path(exists=True))
+@click.option("--format", "fmt", type=click.Choice(["glint", "armand0e", "vfable", "auto"]), default="auto", help="Trace format")
+def cost(trace_file: str, fmt: str) -> None:
+    """Show cost breakdown for a trace file."""
+    if fmt == "auto":
+        fmt = auto_detect_format(trace_file)
+        console.print(f"[dim]Detected format: {fmt}[/dim]")
+    spans = ingest_trace(trace_file, fmt=fmt)
+    if not spans:
+        console.print("[red]No spans found in trace file.[/red]")
+        sys.exit(1)
+    models: dict[str, list[Span]] = {}
+    for s in spans:
+        models.setdefault(s.model, []).append(s)
+    breakdowns = []
+    for model, model_spans in sorted(models.items()):
+        bd = estimate_cost(
+            sum(s.input_tokens for s in model_spans),
+            sum(s.output_tokens for s in model_spans),
+            model,
+            sum(s.cache_read for s in model_spans),
+            sum(s.cache_creation for s in model_spans),
+        )
+        breakdowns.append(bd)
+    console.print(format_cost_table(breakdowns))
+    total = sum(b.total_cost for b in breakdowns)
+    console.print(f"\n[bold green]Grand Total: ${total:.6f}[/bold green]")
+@cli.command()
+@click.argument("trace_file", type=click.Path(exists=True))
+@click.option("--format", "fmt", type=click.Choice(["glint", "armand0e", "vfable", "auto"]), default="auto", help="Trace format")
+def errors(trace_file: str, fmt: str) -> None:
+    """Show error report for a trace file."""
+    if fmt == "auto":
+        fmt = auto_detect_format(trace_file)
+        console.print(f"[dim]Detected format: {fmt}[/dim]")
+    spans = ingest_trace(trace_file, fmt=fmt)
+    if not spans:
+        console.print("[red]No spans found in trace file.[/red]")
+        sys.exit(1)
+    session_id = spans[0].session_id
+    report = generate_error_report(session_id, spans=spans)
+    console.print(f"\n[bold]Error Report: {session_id}[/bold]")
+    console.print(f"Total Errors:  {report.total_errors}")
+    console.print(f"Recovered:      {report.recovered_errors}")
+    console.print(f"Recovery Rate:  {report.recovery_rate:.0%}")
+    if report.errors_by_type:
+        type_table = Table(title="Errors by Type", show_header=True)
+        type_table.add_column("Error Type", style="red")
+        type_table.add_column("Count", justify="right")
+        for etype, count in sorted(report.errors_by_type.items(), key=lambda x: -x[1]):
+            type_table.add_row(etype, str(count))
+        console.print(type_table)
+    if report.errors:
+        error_table = Table(title="Error Details", show_header=True)
+        error_table.add_column("Span ID", style="dim")
+        error_table.add_column("Type", style="red")
+        error_table.add_column("Tool", style="cyan")
+        error_table.add_column("Message", max_width=60)
+        error_table.add_column("Recovered")
+        for e in report.errors[:50]:
+            recovered = "[green]✓[/green]" if e.recovered else "[red]✗[/red]"
+            error_table.add_row(
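+                # Append an ellipsis only when the span ID is actually truncated
+                e.span_id if len(e.span_id) <= 12 else e.span_id[:12] + "...",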
+                e.error_type,
+                e.tool_name,
+                e.error_message[:60],
+                recovered,
+            )
+        console.print(error_table)
+@cli.command()
+@click.argument("text")
+@click.option("--model", default="gpt-4", help="Model name for token counting")
+def tokens(text: str, model: str) -> None:
+    """Count tokens in a text string."""
+    from agent_telemetry.token_tracker import count_tokens
+    n = count_tokens(text, model)
+    console.print(f"[bold]{n:,}[/bold] tokens ({model})")
+    bd = estimate_cost(n, 0, model)
+    console.print(f"Input cost (no output): ${bd.input_cost:.6f}")
+if __name__ == "__main__":
+    cli()
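Reviewer note: the `analyze` command prints P50, P95, and P99 durations taken from the session metrics, but the calculation itself sits in `collector.py`, which this excerpt does not include. For readers who want a reference point, here is a sketch of one common definition, the nearest-rank percentile. It is not necessarily the method the package uses, and `nearest_rank_percentile` is a hypothetical name.

```python
import math


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        return 0.0
    ordered = sorted(values)
    # Rank is 1-based; clamp to 1 so that pct=0 returns the minimum.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


durations_ms = [120.0, 340.0, 95.0, 2345.6, 1500.0]
for pct in (50, 95, 99):
    print(f"P{pct}: {nearest_rank_percentile(durations_ms, pct):.0f}ms")
```

With only five samples, P95 and P99 both land on the largest value, which is worth keeping in mind when reading tail latencies for short sessions.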