npm - @cephalization/phoenix-insight - Versions diffs - 0.1.0 → 0.3.0 - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/README.md +454 -337
package/dist/cli.js +64 -9
package/dist/snapshot/context.js +171 -62
package/dist/snapshot/utils.js +112 -0
package/dist/tsconfig.esm.tsbuildinfo +1 -1
package/package.json +1 -1
package/src/cli.ts +106 -34
package/src/snapshot/context.ts +200 -75
package/src/snapshot/utils.ts +140 -0

package/README.md CHANGED

@@ -2,9 +2,9 @@
 [![npm version](https://img.shields.io/npm/v/@cephalization/phoenix-insight.svg)](https://www.npmjs.com/package/@cephalization/phoenix-insight)
-A filesystem-native AI agent CLI for querying Phoenix instances using the "bash + files" paradigm inspired by [Vercel's agent architecture](https://vercel.com/blog/how-to-build-agents-with-filesystems-and-bash).
+Phoenix Insight brings AI-powered analysis to your [Phoenix](https://github.com/Arize-ai/phoenix) observability data using the ["bash + files" paradigm](https://vercel.com/blog/how-to-build-agents-with-filesystems-and-bash). Instead of hiding data behind opaque API calls, Phoenix Insight materializes your traces, experiments, datasets, and prompts as a structured filesystem. An AI agent then explores this data using standard bash commands you already know: `cat`, `grep`, `jq`, `awk`, and more.
-Phoenix Insight transforms your Phoenix observability data into a structured filesystem, then uses an AI agent with bash tools to analyze it through natural language queries. This approach provides transparency, flexibility, and power that traditional APIs can't match.
+This filesystem-native approach provides transparency that traditional APIs can't match. Every query the agent runs is visible and reproducible. You can inspect the exact files it reads, copy its commands, and run them yourself. The data is just files, and the analysis is just bash, making AI-driven observability debuggable, auditable, and extensible with any tool in your Unix toolkit.
 ## Installation
@@ -22,139 +22,51 @@ npx @cephalization/phoenix-insight "your query"
 ## Quick Start
 ```bash
-# Start interactive mode (no arguments needed)
+# Start interactive mode
 phoenix-insight
-# Query Phoenix data with natural language
+# Analyze Phoenix data with natural language
 phoenix-insight "What are the most common errors in the last hour?"
-# Local mode with persistent storage
-phoenix-insight --local "analyze trace patterns"
-# Force fresh data
-phoenix-insight --refresh "show me the slowest endpoints"
 # Show help
 phoenix-insight help
 ```
-## How It Works
-Phoenix Insight operates in three phases:
-1. **Data Ingestion**: Fetches data from your Phoenix instance and creates a structured filesystem snapshot
-2. **AI Analysis**: An AI agent explores the data using bash commands (cat, grep, jq, awk, etc.)
-3. **Natural Language Results**: The agent synthesizes findings into clear, actionable insights
-### Filesystem Structure
-Phoenix data is organized into an intuitive REST-like hierarchy:
-```
-/phoenix/
-  _context.md                       # Start here! Human-readable summary
-  /projects/
-    index.jsonl                     # All projects
-    /{project_name}/
-      metadata.json                 # Project details
-      /spans/
-        index.jsonl                 # Trace spans (sampled)
-  /datasets/
-    index.jsonl                     # All datasets
-    /{dataset_name}/
-      metadata.json
-      examples.jsonl
-  /experiments/
-    index.jsonl                     # All experiments
-    /{experiment_id}/
-      metadata.json
-      runs.jsonl
-  /prompts/
-    index.jsonl                     # All prompts
-    /{prompt_name}/
-      metadata.json
-      /versions/
-        /{version}.md               # Prompt templates as markdown
-  /traces/                          # Fetched on-demand
-    /{trace_id}/
-      spans.jsonl
-      metadata.json
-  /_meta/
-    snapshot.json                   # Snapshot metadata
-```
-## Execution Modes
-Phoenix Insight supports two execution modes:
-### Sandbox Mode (default)
+## CLI Examples
-Uses [just-bash](https://github.com/vercel-labs/just-bash) for complete isolation:
-- **In-memory filesystem**: No disk writes
-- **Simulated bash**: 50+ built-in commands
-- **Zero risk**: Cannot access your system
-- **Perfect for**: CI/CD, demos, safe exploration
-### Local Mode (--local)
-Uses real bash and persistent storage:
-- **Persistent data**: Snapshots saved to `~/.phoenix-insight/`
-- **Full bash power**: All system commands available
-- **Incremental updates**: Only fetches new data
-- **Perfect for**: Power users, complex analysis, custom tools
-## Usage Examples
+This section covers all CLI usage, progressing from basic to advanced scenarios.
 ### Basic Queries
 ```bash
-# Analyze errors
+# Ask a question about your Phoenix data
 phoenix-insight "What types of errors are occurring most frequently?"
-# Performance analysis
+# Analyze performance patterns
 phoenix-insight "Find the slowest traces and identify patterns"
-# Experiment comparison
+# Compare experiments
 phoenix-insight "Compare success rates across recent experiments"
-# Dataset exploration
+# Explore datasets
 phoenix-insight "Show me statistics about my datasets"
 ```
-### Advanced Options
-```bash
-# Connect to remote Phoenix instance
-phoenix-insight "analyze traces" \
-  --base-url https://phoenix.example.com \
-  --api-key your-api-key
-# Increase span fetch limit (default: 1000 per project)
-phoenix-insight "deep trace analysis" --limit 5000
-# Stream responses in real-time
-phoenix-insight "complex analysis task" --stream
-# Use local mode for persistent storage
-phoenix-insight "experimental query" --local
-# Enable observability tracing (sends traces to Phoenix)
-phoenix-insight "analyze performance" --trace
-```
 ### Interactive Mode
-Start an interactive REPL session for multiple queries:
+Start an interactive REPL session for multiple queries without re-fetching data:
 ```bash
 # Start interactive mode (default when no query is provided)
-$ phoenix-insight
+phoenix-insight
 # Or explicitly with --interactive flag
-$ phoenix-insight --interactive
+phoenix-insight --interactive
+```
+Within interactive mode:
+```
 phoenix> What projects have the most spans?
 [Agent analyzes and responds...]
@@ -175,19 +87,23 @@ phoenix> exit
 Create or update snapshots separately from queries:
 ```bash
-# Create initial snapshot
+# Create initial snapshot (explicit command)
+phoenix-insight snapshot create
+# Create initial snapshot (shorthand, same as 'snapshot create')
 phoenix-insight snapshot
-# Force refresh (ignore cache)
+# Force refresh snapshot (ignore cache)
 phoenix-insight snapshot --refresh
-# Snapshot from specific Phoenix instance
-phoenix-insight snapshot \
-  --base-url https://phoenix.example.com \
-  --api-key your-api-key
+# Snapshot from a specific Phoenix instance
+phoenix-insight snapshot --base-url https://phoenix.example.com --api-key your-api-key
-# Enable observability tracing for snapshot process
-phoenix-insight snapshot --trace
+# Get the path to the latest snapshot
+phoenix-insight snapshot latest
+# List all available snapshots
+phoenix-insight snapshot list
 # Clean up local snapshots
 phoenix-insight prune
@@ -196,319 +112,345 @@ phoenix-insight prune
 phoenix-insight prune --dry-run
 ```
-### On-Demand Data Fetching
-The agent can fetch additional data during analysis:
+### Local Mode
 ```bash
-# In your query, the agent might discover it needs more data:
-"I need more spans to complete this analysis. Let me fetch them..."
-px-fetch-more spans --project my-project --limit 500
-# Or fetch a specific trace:
-"I'll get the full trace to understand the error..."
-px-fetch-more trace --trace-id abc123
+# Run a query in local mode (persistent storage, full bash capabilities)
+phoenix-insight --local "analyze trace patterns"
 ```
-## Configuration
+### Connection Options
-Phoenix Insight uses a layered configuration system with the following priority (highest to lowest):
+Connect to different Phoenix instances:
-1. **CLI arguments** - Options passed directly to the command
-2. **Environment variables** - `PHOENIX_*` environment variables
-3. **Config file** - JSON file at `~/.phoenix-insight/config.json`
+```bash
+# Connect to a remote Phoenix instance
+phoenix-insight --base-url https://phoenix.example.com "analyze traces"
-### Config File
+# Authenticate with an API key
+phoenix-insight --base-url https://phoenix.example.com --api-key your-api-key "show errors"
+```
-On first run, Phoenix Insight automatically creates a default config file at `~/.phoenix-insight/config.json` with all default values. You can edit this file to customize your settings.
+### Data Fetching Options
-**Config file location:**
+```bash
+# Increase span fetch limit (default: 1000 per project)
+phoenix-insight --limit 5000 "deep trace analysis"
-- Default: `~/.phoenix-insight/config.json`
-- Override with env var: `PHOENIX_INSIGHT_CONFIG=/path/to/config.json`
-- Override with CLI flag: `--config /path/to/config.json`
+# Force refresh of cached data
+phoenix-insight --refresh "show me the latest errors"
+```
-**Example config.json with all options:**
+### Output Options
-```json
-{
-  "baseUrl": "http://localhost:6006",
-  "apiKey": "your-api-key",
-  "limit": 1000,
-  "stream": true,
-  "mode": "sandbox",
-  "refresh": false,
-  "trace": false
-}
+```bash
+# Disable streaming for batch processing (streaming is enabled by default)
+phoenix-insight --no-stream "generate report" > report.txt
 ```
-| Config Key | Type                     | Default                 | Description                                   |
-| ---------- | ------------------------ | ----------------------- | --------------------------------------------- |
-| `baseUrl`  | string                   | `http://localhost:6006` | Phoenix server URL                            |
-| `apiKey`   | string                   | (none)                  | API key for authentication                    |
-| `limit`    | number                   | `1000`                  | Maximum spans to fetch per project            |
-| `stream`   | boolean                  | `true`                  | Enable streaming responses from the agent     |
-| `mode`     | `"sandbox"` \| `"local"` | `"sandbox"`             | Execution mode                                |
-| `refresh`  | boolean                  | `false`                 | Force refresh of snapshot data                |
-| `trace`    | boolean                  | `false`                 | Enable tracing of agent operations to Phoenix |
+### Observability
-### Environment Variables
+```bash
+# Enable tracing of agent operations to Phoenix
+phoenix-insight --trace "analyze performance"
+```
-| Variable                  | Config Key | Default                 | Description                |
-| ------------------------- | ---------- | ----------------------- | -------------------------- |
-| `PHOENIX_BASE_URL`        | `baseUrl`  | `http://localhost:6006` | Phoenix server URL         |
-| `PHOENIX_API_KEY`         | `apiKey`   | (none)                  | API key for authentication |
-| `PHOENIX_INSIGHT_LIMIT`   | `limit`    | `1000`                  | Max spans per project      |
-| `PHOENIX_INSIGHT_STREAM`  | `stream`   | `true`                  | Enable streaming           |
-| `PHOENIX_INSIGHT_MODE`    | `mode`     | `sandbox`               | Execution mode             |
-| `PHOENIX_INSIGHT_REFRESH` | `refresh`  | `false`                 | Force refresh snapshot     |
-| `PHOENIX_INSIGHT_TRACE`   | `trace`    | `false`                 | Enable tracing             |
-| `PHOENIX_INSIGHT_CONFIG`  | -          | -                       | Custom config file path    |
-| `DEBUG`                   | -          | `0`                     | Show detailed error info   |
+### On-Demand Data Fetching
-### Commands
+Within interactive mode, the agent can fetch additional data during analysis:
-Phoenix Insight provides several commands:
+```bash
+px-fetch-more spans --project my-project --limit 500
+px-fetch-more trace --trace-id abc123
+```
-- **Default (interactive mode)**: `phoenix-insight` - Start interactive REPL when no query is provided
-- **Query mode**: `phoenix-insight "your query"` - Analyze Phoenix data with natural language
-- **`help`**: Show help information
-- **`snapshot`**: Create or update a data snapshot from Phoenix
-- **`prune`**: Delete local snapshot directory to free up space
+### Combining Options
-### Command Line Options
+Combine multiple options for complex scenarios:
-| Option                | Description                        | Default          | Applies to     |
-| --------------------- | ---------------------------------- | ---------------- | -------------- |
-| `--config <path>`     | Custom config file path            | (auto-detected)  | all            |
-| `--sandbox`           | Run in sandbox mode (default)      | true             | query          |
-| `--local`             | Run in local mode                  | false            | query          |
-| `--base-url <url>`    | Phoenix server URL                 | env or localhost | all            |
-| `--api-key <key>`     | Phoenix API key                    | env or none      | all            |
-| `--refresh`           | Force fresh snapshot               | false            | query/snapshot |
-| `--limit <n>`         | Max spans per project              | 1000             | query          |
-| `--stream`            | Stream agent responses             | true             | query          |
-| `--interactive`, `-i` | Interactive REPL mode              | false            | query          |
-| `--trace`             | Enable tracing to Phoenix instance | false            | query/snapshot |
-| `--dry-run`           | Preview without making changes     | false            | prune          |
+```bash
+# Remote instance with authentication, local mode, and increased limit
+phoenix-insight --local --base-url https://phoenix.example.com \
+  --api-key your-api-key --limit 5000 "deep analysis of production traces"
-### Local Mode Storage
+# Refresh data, enable tracing, and stream output
+phoenix-insight --refresh --trace --stream "analyze error patterns over time"
+```
-In local mode, data is stored in:
+## Example Queries
-```
-~/.phoenix-insight/
-  config.json                  # Configuration (auto-created on first run)
-  /snapshots/
-    /{timestamp}/              # Each snapshot
-      /phoenix/                # Phoenix data
-  /cache/                      # API response cache
-```
+Phoenix Insight understands natural language queries about your observability data. Here are examples organized by analysis type to help you get started.
-To clean up local storage:
+### Error Analysis
 ```bash
-# Delete all local snapshots
-phoenix-insight prune
-# Preview what will be deleted
-phoenix-insight prune --dry-run
+# Find and categorize errors
+phoenix-insight "What are the most common errors in the last 24 hours?"
+phoenix-insight "Show me all failed spans and their error messages"
+phoenix-insight "Which services have the highest error rates?"
+phoenix-insight "Find traces that contain exceptions or timeouts"
 ```
-## Troubleshooting
-### Connection Issues
+### Performance & Latency
 ```bash
-# Test connection to Phoenix
-phoenix-insight snapshot
+# Identify performance bottlenecks
+phoenix-insight "What are the slowest LLM calls in my application?"
+phoenix-insight "Find traces where latency exceeds 5 seconds"
+phoenix-insight "Show p50, p95, and p99 latency by endpoint"
+phoenix-insight "Which operations have the highest latency variance?"
+```
-# If that fails, check your Phoenix instance:
-curl http://localhost:6006/v1/projects
+### Token Usage & Costs
-# Verify with explicit connection:
-phoenix-insight snapshot --base-url http://your-phoenix:6006
+```bash
+# Analyze LLM resource consumption
+phoenix-insight "Which LLM calls are consuming the most tokens?"
+phoenix-insight "Calculate total token usage for the chatbot project"
+phoenix-insight "Show token usage breakdown by model type"
+phoenix-insight "Find conversations that exceeded 10,000 tokens"
 ```
-### Authentication Errors
+### RAG Analysis
 ```bash
-# Set API key via environment
-export PHOENIX_API_KEY="your-key"
-phoenix-insight "your query"
-# Or pass directly
-phoenix-insight "your query" --api-key "your-key"
+# Examine retrieval-augmented generation patterns
+phoenix-insight "Show retrieved documents with low relevance scores"
+phoenix-insight "Find retrieval calls that returned no results"
+phoenix-insight "What's the average number of documents retrieved per query?"
+phoenix-insight "Identify queries where retrieval latency dominated total time"
 ```
-### Debug Mode
-For detailed error information:
+### Evaluations & Experiments
 ```bash
-# Enable debug output
-DEBUG=1 phoenix-insight "problematic query"
-# This shows:
-# - Full stack traces
-# - API request details
-# - Agent tool calls
-# - Raw responses
+# Review evaluation results and experiment metrics
+phoenix-insight "What's the hallucination rate across my experiments?"
+phoenix-insight "Compare accuracy scores between model versions"
+phoenix-insight "Show experiments sorted by success rate"
+phoenix-insight "Find evaluation runs where quality scores dropped below threshold"
 ```
-### Common Issues
-**"No snapshot found" in local mode**
+### Dataset Analysis
 ```bash
-# Create initial snapshot
-phoenix-insight snapshot
-# Or use --refresh to create on-demand
-phoenix-insight "query" --refresh
+# Explore datasets and examples
+phoenix-insight "Show statistics for my evaluation datasets"
+phoenix-insight "Find examples where the model failed quality checks"
+phoenix-insight "What's the distribution of example categories in my dataset?"
+phoenix-insight "List datasets with the most recent updates"
 ```
-**Out of memory in sandbox mode**
+### Prompt Engineering
 ```bash
-# Reduce span limit
-phoenix-insight "query" --sandbox --limit 500
-# Or use local mode for large datasets
-phoenix-insight "query" --local
+# Analyze prompt versions and performance
+phoenix-insight "List all prompt versions and their performance metrics"
+phoenix-insight "Compare outputs between prompt v1 and v2"
+phoenix-insight "Which prompt template has the lowest error rate?"
+phoenix-insight "Show the evolution of my summarization prompt"
 ```
-**Local storage getting too large**
+### Session & Conversation Analysis
 ```bash
-# Check what will be deleted
-phoenix-insight prune --dry-run
-# Clean up all local snapshots
-phoenix-insight prune
+# Understand user interaction patterns
+phoenix-insight "Show the conversation flow for session abc123"
+phoenix-insight "Find sessions with high user abandonment"
+phoenix-insight "What's the average conversation length by project?"
+phoenix-insight "Identify sessions where users repeated similar queries"
 ```
-**Agent can't find expected data**
+### Tool & Function Calls
 ```bash
-# Force refresh to get latest
-phoenix-insight "query" --refresh
-# Fetch more data on-demand (agent will do this automatically)
-px-fetch-more spans --project my-project --limit 2000
+# Analyze agent tool usage
+phoenix-insight "Which tools are being called most frequently?"
+phoenix-insight "Find tool calls that failed or timed out"
+phoenix-insight "Show the success rate for each tool type"
+phoenix-insight "What's the average latency for function calls?"
 ```
-## Observability
+## Command Reference
-Phoenix Insight can trace its own execution back to Phoenix for monitoring and debugging:
+Phoenix Insight provides several commands, each with its own options.
-```bash
-# Enable tracing for queries
-phoenix-insight "analyze errors" --trace
+### Query Command (Default)
-# Enable tracing in interactive mode
-phoenix-insight --interactive --trace
+The default command analyzes Phoenix data with natural language queries.
-# Enable tracing for snapshot creation
-phoenix-insight snapshot --trace
+```bash
+phoenix-insight [options] [query]
 ```
-When `--trace` is enabled:
+| Option              | Description                                   | Default                 | Example                                        |
+| ------------------- | --------------------------------------------- | ----------------------- | ---------------------------------------------- |
+| `--config <path>`   | Custom config file path                       | `~/.phoenix-insight/config.json` | `--config ./my-config.json`           |
+| `--sandbox`         | Run in sandbox mode with in-memory filesystem | `true`                  | `phoenix-insight --sandbox "query"`            |
+| `--local`           | Run in local mode with persistent storage     | `false`                 | `phoenix-insight --local "query"`              |
+| `--base-url <url>`  | Phoenix server URL                            | `http://localhost:6006` | `--base-url https://phoenix.example.com`       |
+| `--api-key <key>`   | Phoenix API key for authentication            | (none)                  | `--api-key your-api-key`                       |
+| `--refresh`         | Force refresh of cached snapshot data         | `false`                 | `phoenix-insight --refresh "show latest data"` |
+| `--limit <n>`       | Maximum spans to fetch per project            | `1000`                  | `--limit 5000`                                 |
+| `--stream`          | Stream agent responses in real-time           | `true`                  | `--no-stream` to disable                       |
+| `-i, --interactive` | Start interactive REPL mode                   | `false`                 | `phoenix-insight -i`                           |
+| `--trace`           | Enable tracing of agent operations to Phoenix | `false`                 | `phoenix-insight --trace "query"`              |
-- All agent operations are traced as spans
-- Tool calls and responses are captured
-- Performance metrics are recorded
-- Traces are sent to the same Phoenix instance being queried (or the one specified by --base-url)
+### Snapshot Command
-This is particularly useful for:
+Creates or updates a data snapshot from Phoenix without running a query.
-- Debugging slow queries
-- Understanding agent decision-making
-- Monitoring Phoenix Insight usage
-- Optimizing performance
+```bash
+phoenix-insight snapshot [options]
+phoenix-insight snapshot create [options]
+```
-## Agent Capabilities
+Note: `phoenix-insight snapshot` (without subcommand) is equivalent to `phoenix-insight snapshot create` for backward compatibility.
-The AI agent has access to:
+| Option             | Description                                   | Default                 | Example                                         |
+| ------------------ | --------------------------------------------- | ----------------------- | ----------------------------------------------- |
+| `--config <path>`  | Custom config file path                       | `~/.phoenix-insight/config.json` | `--config ./my-config.json`            |
+| `--base-url <url>` | Phoenix server URL                            | `http://localhost:6006` | `--base-url https://phoenix.example.com`        |
+| `--api-key <key>`  | Phoenix API key for authentication            | (none)                  | `--api-key your-api-key`                        |
+| `--refresh`        | Force refresh (ignore existing cache)         | `false`                 | `phoenix-insight snapshot --refresh`            |
+| `--limit <n>`      | Maximum spans to fetch per project            | `1000`                  | `phoenix-insight snapshot --limit 5000`         |
+| `--trace`          | Enable tracing of snapshot operations         | `false`                 | `phoenix-insight snapshot --trace`              |
-### Bash Commands (Sandbox Mode)
+### Snapshot Create Subcommand
-- **File operations**: `cat`, `ls`, `find`, `head`, `tail`
-- **Search & filter**: `grep`, `awk`, `sed`
-- **JSON processing**: `jq` (full featured)
-- **Analysis**: `sort`, `uniq`, `wc`
-- **And more**: 50+ commands via just-bash
+Explicitly creates a new snapshot from Phoenix data. This is the preferred way to create snapshots.
-### Bash Commands (Local Mode)
+```bash
+phoenix-insight snapshot create
+```
-- All commands available on your system
-- Custom tools: `ripgrep`, `fd`, `bat`, etc.
-- Full `jq`, `awk`, `sed` features
-- Any installed CLI tools
+Same options as `phoenix-insight snapshot`. Use `snapshot create` for clarity in scripts and automation.
-### Custom Commands
+### Snapshot Latest Command
-- `px-fetch-more spans`: Fetch additional spans
-- `px-fetch-more trace`: Fetch specific trace by ID
+Prints the absolute path to the latest snapshot directory.
-### Understanding Context
+```bash
+phoenix-insight snapshot latest
+```
-The agent always starts by reading `/_context.md` which provides:
+Outputs the path to stdout with no decoration (just the path). Exit code 0 on success, exit code 1 if no snapshots exist.
-- Summary of available data
-- Recent activity highlights
-- Data freshness information
-- Available commands reminder
+**Example usage:**
+```bash
+# Get the latest snapshot path
+phoenix-insight snapshot latest
+# Output: /Users/you/.phoenix-insight/snapshots/1704067200000-abc123/phoenix
+# Use in scripts
+SNAPSHOT_PATH=$(phoenix-insight snapshot latest)
+ls "$SNAPSHOT_PATH"
+# Check if snapshots exist
+if phoenix-insight snapshot latest > /dev/null 2>&1; then
+  echo "Snapshots available"
+else
+  echo "No snapshots found"
+fi
+```
-## Development
+### Snapshot List Command
-### Building from Source
+Lists all available snapshots with their timestamps.
 ```bash
-# Clone the repository
-git clone https://github.com/Arize-ai/phoenix.git
-cd phoenix/js/packages/phoenix-insight
+phoenix-insight snapshot list
+```
-# Install dependencies
-pnpm install
+Outputs one snapshot per line in the format `<timestamp> <path>` where timestamp is ISO 8601. Most recent first. Exit code 0 even if empty (just prints nothing).
-# Run in development
-pnpm dev "your query"
+**Example usage:**
-# Run tests
-pnpm test
+```bash
+# List all snapshots
+phoenix-insight snapshot list
+# Output:
+# 2024-01-01T12:30:00.000Z /Users/you/.phoenix-insight/snapshots/1704113400000-abc123/phoenix
+# 2024-01-01T10:00:00.000Z /Users/you/.phoenix-insight/snapshots/1704104400000-def456/phoenix
+# Count snapshots
+phoenix-insight snapshot list | wc -l
+# Get oldest snapshot path
+phoenix-insight snapshot list | tail -1 | cut -d' ' -f2
+# Process snapshots in a script
+phoenix-insight snapshot list | while read timestamp path; do
+  echo "Snapshot from $timestamp at $path"
+done
+```
-# Build for production
-pnpm build
+### Prune Command
-# Type checking
-pnpm typecheck
+Deletes the local snapshot directory to free up disk space.
+```bash
+phoenix-insight prune [options]
 ```
-### Architecture
+| Option      | Description                              | Default | Example                         |
+| ----------- | ---------------------------------------- | ------- | ------------------------------- |
+| `--dry-run` | Preview what would be deleted without actually deleting | `false` | `phoenix-insight prune --dry-run` |
-Phoenix Insight uses:
+### Help Command
-- **Commander.js** for CLI interface
-- **AI SDK** with Anthropic Claude for the agent
-- **just-bash** for sandbox execution
-- **Phoenix Client** for data fetching
-- **TypeScript** for type safety
-### Testing
+Displays help information and available options.
 ```bash
-# Run all tests
-pnpm test
+phoenix-insight help
+```
-# Run with coverage
-pnpm test -- --coverage
+No additional options. Shows usage information, all commands, and their options.
-# Run specific test file
-pnpm test src/modes/sandbox.test.ts
+## How It Works
+Phoenix Insight operates in three phases:
+1. **Data Ingestion**: Fetches data from your Phoenix instance and creates a structured filesystem snapshot
+2. **AI Analysis**: An AI agent explores the data using bash commands (cat, grep, jq, awk, etc.)
+3. **Natural Language Results**: The agent synthesizes findings into clear, actionable insights
+### Filesystem Structure
+Phoenix data is organized into an intuitive REST-like hierarchy:
-# Type checking
-pnpm typecheck
+```
+/phoenix/
+  _context.md                       # Start here! Human-readable summary
+  /projects/
+    index.jsonl                     # All projects
+    /{project_name}/
+      metadata.json                 # Project details
+      /spans/
+        index.jsonl                 # Trace spans (sampled)
+  /datasets/
+    index.jsonl                     # All datasets
+    /{dataset_name}/
+      metadata.json
+      examples.jsonl
+  /experiments/
+    index.jsonl                     # All experiments
+    /{experiment_id}/
+      metadata.json
+      runs.jsonl
+  /prompts/
+    index.jsonl                     # All prompts
+    /{prompt_name}/
+      metadata.json
+      /versions/
+        /{version}.md               # Prompt templates as markdown
+  /traces/                          # Fetched on-demand
+    /{trace_id}/
+      spans.jsonl
+      metadata.json
+  /_meta/
+    snapshot.json                   # Snapshot metadata
 ```
 ## Examples of Agent Analysis
@@ -573,11 +515,187 @@ The recommendations endpoint has high variability, suggesting cache misses.
 - Never put API keys in queries
 - Review agent actions with `--stream`
-## Contributing & Releases
+## Advanced Topics
+The following sections cover configuration, execution modes, and internal details for power users.
+### Configuration
+Phoenix Insight uses a layered configuration system with the following priority (highest to lowest):
+1. **CLI arguments** - Options passed directly to the command
+2. **Environment variables** - `PHOENIX_*` environment variables
+3. **Config file** - JSON file at `~/.phoenix-insight/config.json`
+#### Config File
+On first run, Phoenix Insight automatically creates a default config file at `~/.phoenix-insight/config.json` with all default values. You can edit this file to customize your settings.
+**Config file location:**
+- Default: `~/.phoenix-insight/config.json`
+- Override with env var: `PHOENIX_INSIGHT_CONFIG=/path/to/config.json`
+- Override with CLI flag: `--config /path/to/config.json`
+**Example config.json with all options:**
+```json
+{
+  "baseUrl": "http://localhost:6006",
+  "apiKey": "your-api-key",
+  "limit": 1000,
+  "stream": true,
+  "mode": "sandbox",
+  "refresh": false,
+  "trace": false
+}
+```
+| Config Key | Type                     | Default                 | Description                                   |
+| ---------- | ------------------------ | ----------------------- | --------------------------------------------- |
+| `baseUrl`  | string                   | `http://localhost:6006` | Phoenix server URL                            |
+| `apiKey`   | string                   | (none)                  | API key for authentication                    |
+| `limit`    | number                   | `1000`                  | Maximum spans to fetch per project            |
+| `stream`   | boolean                  | `true`                  | Enable streaming responses from the agent     |
+| `mode`     | `"sandbox"` \| `"local"` | `"sandbox"`             | Execution mode                                |
+| `refresh`  | boolean                  | `false`                 | Force refresh of snapshot data                |
+| `trace`    | boolean                  | `false`                 | Enable tracing of agent operations to Phoenix |
+#### Environment Variables
+| Variable                  | Config Key | Default                 | Description                |
+| ------------------------- | ---------- | ----------------------- | -------------------------- |
+| `PHOENIX_BASE_URL`        | `baseUrl`  | `http://localhost:6006` | Phoenix server URL         |
+| `PHOENIX_API_KEY`         | `apiKey`   | (none)                  | API key for authentication |
+| `PHOENIX_INSIGHT_LIMIT`   | `limit`    | `1000`                  | Max spans per project      |
+| `PHOENIX_INSIGHT_STREAM`  | `stream`   | `true`                  | Enable streaming           |
+| `PHOENIX_INSIGHT_MODE`    | `mode`     | `sandbox`               | Execution mode             |
+| `PHOENIX_INSIGHT_REFRESH` | `refresh`  | `false`                 | Force refresh snapshot     |
+| `PHOENIX_INSIGHT_TRACE`   | `trace`    | `false`                 | Enable tracing             |
+| `PHOENIX_INSIGHT_CONFIG`  | -          | -                       | Custom config file path    |
+| `DEBUG`                   | -          | `0`                     | Show detailed error info   |
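The mapping in the table above can be sketched as a small function that turns an environment object into config-key overrides. The variable names and config keys come from the table; the parsing rules (which strings count as `true` or `false`, rejecting non-positive limits) and the names `fromEnv`, `parseBool`, and `parseLimit` are assumptions made for illustration and may differ from what the CLI actually does.

```typescript
// Hypothetical sketch: parsing rules here are assumptions, not documented behavior.
type Env = Record<string, string | undefined>;

interface EnvOverrides {
  baseUrl?: string;
  apiKey?: string;
  limit?: number;
  stream?: boolean;
  mode?: "sandbox" | "local";
  refresh?: boolean;
  trace?: boolean;
}

// Assumed boolean spellings: "true"/"1" and "false"/"0" (case-insensitive).
function parseBool(raw: string | undefined): boolean | undefined {
  if (raw === undefined) return undefined;
  const v = raw.trim().toLowerCase();
  if (v === "true" || v === "1") return true;
  if (v === "false" || v === "0") return false;
  return undefined; // unrecognized values are ignored
}

// Accept only positive whole numbers for the span limit.
function parseLimit(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const n = Number(raw);
  return isFinite(n) && n > 0 && Math.floor(n) === n ? n : undefined;
}

// Only keys whose variables are set and valid appear in the result.
function fromEnv(env: Env): EnvOverrides {
  const out: EnvOverrides = {};
  const baseUrl = env["PHOENIX_BASE_URL"];
  if (baseUrl) out.baseUrl = baseUrl;
  const apiKey = env["PHOENIX_API_KEY"];
  if (apiKey) out.apiKey = apiKey;
  const limit = parseLimit(env["PHOENIX_INSIGHT_LIMIT"]);
  if (limit !== undefined) out.limit = limit;
  const stream = parseBool(env["PHOENIX_INSIGHT_STREAM"]);
  if (stream !== undefined) out.stream = stream;
  const mode = env["PHOENIX_INSIGHT_MODE"];
  if (mode === "sandbox" || mode === "local") out.mode = mode;
  const refresh = parseBool(env["PHOENIX_INSIGHT_REFRESH"]);
  if (refresh !== undefined) out.refresh = refresh;
  const trace = parseBool(env["PHOENIX_INSIGHT_TRACE"]);
  if (trace !== undefined) out.trace = trace;
  return out;
}
```

In a Node.js program, `fromEnv(process.env)` would produce the overrides. Leaving unset keys out of the result lets lower-priority sources, such as the config file, supply those values.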
+#### Local Mode Storage
+In local mode (`--local`), data persists at `~/.phoenix-insight/`:
+```
+~/.phoenix-insight/
+  config.json                  # Configuration (auto-created on first run)
+  snapshots/{timestamp}/       # Snapshot data
+  cache/                       # API response cache
+```
+Use `phoenix-insight prune` to clean up local storage.
+### Execution Modes
+Phoenix Insight supports two execution modes:
+| Mode | Flag | Filesystem | Bash | Use Case |
+|------|------|------------|------|----------|
+| **Sandbox** (default) | `--sandbox` | In-memory | [just-bash](https://github.com/vercel-labs/just-bash) (50+ commands) | CI/CD, demos, safe exploration |
+| **Local** | `--local` | Persistent (`~/.phoenix-insight/`) | Real system bash | Power users, complex analysis |
+### Agent Capabilities
+The AI agent has access to:
+#### Bash Commands (Sandbox Mode)
+- **File operations**: `cat`, `ls`, `find`, `head`, `tail`
+- **Search & filter**: `grep`, `awk`, `sed`
+- **JSON processing**: `jq` (full-featured)
+- **Analysis**: `sort`, `uniq`, `wc`
+- **And more**: 50+ commands via just-bash
+#### Bash Commands (Local Mode)
+- All commands available on your system
+- Custom tools: `ripgrep`, `fd`, `bat`, etc.
+- Full `jq`, `awk`, `sed` features
+- Any installed CLI tools
+#### Custom Commands
+- `px-fetch-more spans`: Fetch additional spans for deeper analysis
+- `px-fetch-more trace`: Fetch a specific trace by ID
+See [On-Demand Data Fetching](#on-demand-data-fetching) for usage examples.
+#### Understanding Context
+The agent always starts by reading `/_context.md`, which provides:
+- Summary of available data
+- Recent activity highlights
+- Data freshness information
+- Available commands reminder
+### Observability
+Phoenix Insight can trace its own execution back to Phoenix using `--trace`. When enabled, all agent operations, tool calls, and responses are traced as spans and sent to the Phoenix instance being queried. This is useful for debugging slow queries and understanding agent decision-making.
+### Troubleshooting
+For connection issues, authentication errors, debug mode, and common issues, see the [Troubleshooting Guide](./TROUBLESHOOTING.md).
+### Development
+#### Building from Source
+```bash
+# Clone the repository
+git clone https://github.com/Arize-ai/phoenix.git
+cd phoenix/js/packages/phoenix-insight
+# Install dependencies
+pnpm install
+# Run in development
+pnpm dev "your query"
+# Run tests
+pnpm test
+# Build for production
+pnpm build
+# Type checking
+pnpm typecheck
+```
+#### Architecture
+Phoenix Insight uses:
+- **Commander.js** for CLI interface
+- **AI SDK** with Anthropic Claude for the agent
+- **just-bash** for sandbox execution
+- **Phoenix Client** for data fetching
+- **TypeScript** for type safety
+#### Testing
+```bash
+# Run all tests
+pnpm test
+# Run with coverage
+pnpm test -- --coverage
+# Run specific test file
+pnpm test test/modes/sandbox.test.ts
+# Type checking
+pnpm typecheck
+```
+### Contributing & Releases
 Contributions are welcome! This project uses [changesets](https://github.com/changesets/changesets) for version management and automated releases.
-### Making Changes
+#### Making Changes
 1. Fork the repository and create a feature branch
 2. Make your changes and ensure tests pass (`pnpm test`)
@@ -594,7 +712,7 @@ pnpm changeset
 5. Commit the generated changeset file along with your changes
 6. Open a pull request
-### Release Process
+#### Release Process
 When your PR is merged to `main`:
@@ -602,19 +720,18 @@ When your PR is merged to `main`:
 2. This PR updates the version in `package.json` and generates `CHANGELOG.md` entries
 3. When the Version Packages PR is merged, the package is automatically published to npm
-### Changeset Guidelines
+#### Changeset Guidelines
 - **patch**: Bug fixes, documentation updates, internal refactoring
 - **minor**: New features, new CLI options, non-breaking enhancements
 - **major**: Breaking changes to CLI interface or behavior
-## Support
+### Support
 This software is provided "as is" without warranty of any kind. Use at your own risk.
 You may file GitHub issues at [https://github.com/cephalization/phoenix-insight/issues](https://github.com/cephalization/phoenix-insight/issues).
+### License
-## License
-Apache-2.0 - See [LICENSE](./LICENSE) for details.
+Apache-2.0 - See [LICENSE](./LICENSE) for details.