PyPI - srcodex - Versions diffs - 0.2.0__tar.gz → 0.2.1__tar.gz - Mend

srcodex 0.2.0.tar.gz → 0.2.1.tar.gz


Files changed (61)

{srcodex-0.2.0 → srcodex-0.2.1}/MANIFEST.in RENAMED

@@ -3,6 +3,9 @@ include LICENSE
 include pyproject.toml
 include .env.example
+# Include SQL schema files
+include srcodex/indexer/db_schema.sql
 # Include CSS files for TUI
 recursive-include srcodex/tui *.tcss

{srcodex-0.2.0/srcodex.egg-info → srcodex-0.2.1}/PKG-INFO RENAMED

@@ -1,8 +1,8 @@
 Metadata-Version: 2.4
 Name: srcodex
-Version: 0.2.0
+Version: 0.2.1
 Summary: Semantic code explorer with AI-powered search and analysis
-Author-email: Jonathan Antoun <jonathan.antoun@amd.com>
+Author: Jonathan L'Work
 License: MIT
 Project-URL: Homepage, https://github.com/Jonathan03ant/srcodex
 Project-URL: Repository, https://github.com/Jonathan03ant/srcodex
@@ -43,28 +43,67 @@ Unlike generic code assistants (Claude CLI, GitHub Copilot, etc.) that read enti
 | "Find all ioctls" | Grep + read matches (15K tokens) | Database search (300 tokens) |
 | "Explain module Y" | Read 10+ files (30K tokens) | Aggregate query (2K tokens) |
-**Result:** 90% more token-efficient, instant relationship queries, and unique capabilities impossible for file-based tools (call chains, data flow analysis, architecture visualization).
+**Result:** 99% more token-efficient, instant relationship queries, and unique capabilities impossible for file-based tools (call chains, data flow analysis, architecture visualization).
-## Features
+## Key Features
-- **Semantic Indexing**: Builds a persistent graph of symbols, functions, types, and their relationships
-- **AI-Powered Search**: Ask questions in natural language about your code
-- **Call Graph Analysis**: Trace function calls, dependencies, and execution paths
-- **Terminal UI**: Beautiful terminal interface with file browser and AI chat
-- **Multi-Language**: Supports C, C++, Python, and more
-- **Fast**: SQLite-backed graph queries with intelligent caching
-- **Portable**: `.srcodex/` directory makes indexed projects shareable
+- **Semantic Indexing Engine**: Extracts symbols, relationships, and cross-references from source code
+- **AI-Powered Chat**: Natural language queries about your codebase architecture
+- **Call Graph Analysis**: Trace function calls, caller chains, and dependency paths
+- **Terminal UI**: Full-featured TUI with file browser, search, and AI chat interface
+- **Multi-Language Support**: C, C++, Python, JavaScript, Go, Rust, and more
+- **Persistent Graph Database**: SQLite-backed semantic graph with relationship edges
+- **Portable**: `.srcodex/` directory makes indexed projects shareable across teams
+- **Token Efficient**: 99% reduction in API costs via semantic queries and intelligent caching
 ## Installation
+### From PyPI (Recommended)
 ```bash
 pip install srcodex
 ```
+### From Source
+```bash
+git clone https://github.com/Jonathan03ant/srcodex.git
+cd srcodex
+pip install -e .
+```
+## Prerequisites
+Before installing srcodex, you need these system tools:
+**Ubuntu/Debian:**
+```bash
+sudo apt install universal-ctags cscope
+```
+**macOS:**
+```bash
+brew install universal-ctags cscope
+```
+**Arch Linux:**
+```bash
+sudo pacman -S ctags cscope
+```
+**Other systems:** Install Universal CTags from https://github.com/universal-ctags/ctags
 ## Quick Start
 ```bash
-# Index your codebase (first time)
+# 1. Install srcodex
+pip install srcodex
+# 2. Configure API key
+export ANTHROPIC_API_KEY="your-api-key"
+# Or create .env file with ANTHROPIC_API_KEY=...
+# 3. Index your codebase (first time)
 cd /path/to/your/project
 srcodex
@@ -73,7 +112,7 @@ srcodex
 # [Indexing happens...]
 # [TUI launches]
-# Next time - instant launch
+# 4. Next time - instant launch (uses cached index)
 srcodex
 ```
@@ -96,51 +135,71 @@ Once indexed, use the TUI to:
 ## Configuration
-Copy `.env.example` to `.env` and configure your API key:
+srcodex requires a Claude API key from Anthropic.
+**Option 1: Environment Variable**
 ```bash
-# Public Anthropic API
-ANTHROPIC_API_KEY=sk-ant-your-key-here
-# Or enterprise gateway (if applicable)
-AMD_LLM_API_KEY=your-subscription-key
+export ANTHROPIC_API_KEY="sk-ant-your-key-here"
 ```
-## Requirements
+**Option 2: .env File**
+Create a `.env` file in your project directory:
+```bash
+ANTHROPIC_API_KEY=sk-ant-your-key-here
+```
-- Python 3.9+
-- Universal CTags (`brew install universal-ctags` or `apt install universal-ctags`)
-- Cscope (optional, for call graph)
-- Claude API key (Anthropic or enterprise gateway)
+Get your API key from https://console.anthropic.com/
 ## How It Works
-1. **Indexing**: Extracts symbols, relationships, and metadata using CTags and Cscope
-2. **Graph Building**: Creates semantic graph with typed edges (CALLS, INCLUDES, ACCESSES)
-3. **AI Integration**: Claude queries the graph using specialized tools (not reading full files)
-4. **Token Efficiency**: **99%+ reduction** in tokens vs. traditional code assistants
-   - **Breakthrough caching architecture**: 25-100 tokens per query after initial cache build
-   - Aggressive parallel tool batching (20-40 tools per iteration)
-   - 3-iteration cache strategy: iterations 1-3 cached, iteration 4 answers with cached data
-   - Semantic graph queries instead of file reads (10-100x more efficient)
-   - **Real example**: 500 input tokens vs 60,000+ for traditional file-based approaches
-   - Cache persists across queries - subsequent questions cost nearly nothing!
+**Indexing Phase:**
+1. Analyzes source code to extract symbols, functions, types, and relationships
+2. Builds a semantic graph database with typed edges (function calls, includes, data access)
+3. Stores everything in a persistent SQLite database
+**Query Phase:**
+1. You ask questions in natural language via the terminal UI
+2. Claude queries the semantic graph database using specialized tools
+3. Returns targeted answers without reading entire files
+**Why This Is Efficient:**
+- Traditional code assistants: Read full files (20K-60K tokens per query)
+- srcodex: Semantic graph queries (100-500 tokens per query)
+- Intelligent caching: First query builds cache, subsequent queries reuse it
+- Result: 99% reduction in API costs
 ## Project Structure
-After indexing, your project will have:
+After indexing, srcodex creates a `.srcodex/` directory in your project:
 ```
 your-project/
 ├── .srcodex/
-│   ├── metadata.json       # Project stats
-│   ├── config.toml         # Indexing config
+│   ├── metadata.json       # Project statistics
 │   ├── data/
-│   │   └── project.db      # Semantic graph
-│   └── logs/               # Debug logs
+│   │   └── project.db      # Semantic graph database
+│   ├── conversations/      # Chat history
+│   └── .debug/             # Debug logs
 └── [your source files...]
 ```
+The `.srcodex/` directory is portable - you can commit it to git or share it with your team to avoid re-indexing.
+## Performance
+**Indexing Speed** (varies by codebase size):
+- Small projects (< 100 files): 2-5 seconds
+- Medium projects (100-1000 files): 5-20 seconds
+- Large projects (1000+ files): 20-60 seconds
+**Query Speed:**
+- Database queries: < 100ms
+- AI responses: 2-10 seconds (depends on complexity)
+**Token Usage:**
+- First query: 500-2000 tokens (builds cache)
+- Subsequent queries: 25-200 tokens (uses cache)
 ## Development
 ```bash
@@ -151,8 +210,8 @@ cd srcodex
 # Install in development mode
 pip install -e .
-# Run tests
-pytest
+# Index and run on srcodex itself
+srcodex .
 ```
 ## License

srcodex-0.2.1/README.md ADDED

@@ -0,0 +1,201 @@
+# srcodex
+**Semantic code explorer with AI-powered search and analysis**
+srcodex builds a semantic graph of your codebase and provides AI-powered exploration through natural language queries. Think of it as an intelligent code search that understands relationships, call graphs, and architecture.
+## Why srcodex?
+Unlike generic code assistants (Claude CLI, GitHub Copilot, etc.) that read entire files to answer questions, srcodex uses a **semantic graph database** to understand your code:
+| Question | Generic Assistant | srcodex |
+|----------|------------------|---------|
+| "Who calls function X?" | Grep entire codebase (20K tokens) | `get_callers('X')` (200 tokens) |
+| "Show call chain A→B" | Read multiple files, manual tracing | Graph query (500 tokens) |
+| "Find all ioctls" | Grep + read matches (15K tokens) | Database search (300 tokens) |
+| "Explain module Y" | Read 10+ files (30K tokens) | Aggregate query (2K tokens) |
+**Result:** 99% more token-efficient, instant relationship queries, and unique capabilities impossible for file-based tools (call chains, data flow analysis, architecture visualization).
+## Key Features
+- **Semantic Indexing Engine**: Extracts symbols, relationships, and cross-references from source code
+- **AI-Powered Chat**: Natural language queries about your codebase architecture
+- **Call Graph Analysis**: Trace function calls, caller chains, and dependency paths
+- **Terminal UI**: Full-featured TUI with file browser, search, and AI chat interface
+- **Multi-Language Support**: C, C++, Python, JavaScript, Go, Rust, and more
+- **Persistent Graph Database**: SQLite-backed semantic graph with relationship edges
+- **Portable**: `.srcodex/` directory makes indexed projects shareable across teams
+- **Token Efficient**: 99% reduction in API costs via semantic queries and intelligent caching
+## Installation
+### From PyPI (Recommended)
+```bash
+pip install srcodex
+```
+### From Source
+```bash
+git clone https://github.com/Jonathan03ant/srcodex.git
+cd srcodex
+pip install -e .
+```
+## Prerequisites
+Before installing srcodex, you need these system tools:
+**Ubuntu/Debian:**
+```bash
+sudo apt install universal-ctags cscope
+```
+**macOS:**
+```bash
+brew install universal-ctags cscope
+```
+**Arch Linux:**
+```bash
+sudo pacman -S ctags cscope
+```
+**Other systems:** Install Universal CTags from https://github.com/universal-ctags/ctags
+## Quick Start
+```bash
+# 1. Install srcodex
+pip install srcodex
+# 2. Configure API key
+export ANTHROPIC_API_KEY="your-api-key"
+# Or create .env file with ANTHROPIC_API_KEY=...
+# 3. Index your codebase (first time)
+cd /path/to/your/project
+srcodex
+# Output:
+# No .srcodex/ found. Index this directory? (y/n) y
+# [Indexing happens...]
+# [TUI launches]
+# 4. Next time - instant launch (uses cached index)
+srcodex
+```
+## Usage
+Once indexed, use the TUI to:
+- Browse files and symbols
+- Search across your codebase
+- Chat with AI about your code architecture
+- Trace call chains and dependencies
+### Example AI Queries
+```
+"What does the init_system function do?"
+"Show me all functions that call malloc"
+"Trace the execution path from main to shutdown"
+"What structs are defined in config.h?"
+```
+## Configuration
+srcodex requires a Claude API key from Anthropic.
+**Option 1: Environment Variable**
+```bash
+export ANTHROPIC_API_KEY="sk-ant-your-key-here"
+```
+**Option 2: .env File**
+Create a `.env` file in your project directory:
+```bash
+ANTHROPIC_API_KEY=sk-ant-your-key-here
+```
+Get your API key from https://console.anthropic.com/
+## How It Works
+**Indexing Phase:**
+1. Analyzes source code to extract symbols, functions, types, and relationships
+2. Builds a semantic graph database with typed edges (function calls, includes, data access)
+3. Stores everything in a persistent SQLite database
+**Query Phase:**
+1. You ask questions in natural language via the terminal UI
+2. Claude queries the semantic graph database using specialized tools
+3. Returns targeted answers without reading entire files
+**Why This Is Efficient:**
+- Traditional code assistants: Read full files (20K-60K tokens per query)
+- srcodex: Semantic graph queries (100-500 tokens per query)
+- Intelligent caching: First query builds cache, subsequent queries reuse it
+- Result: 99% reduction in API costs
+## Project Structure
+After indexing, srcodex creates a `.srcodex/` directory in your project:
+```
+your-project/
+├── .srcodex/
+│   ├── metadata.json       # Project statistics
+│   ├── data/
+│   │   └── project.db      # Semantic graph database
+│   ├── conversations/      # Chat history
+│   └── .debug/             # Debug logs
+└── [your source files...]
+```
+The `.srcodex/` directory is portable - you can commit it to git or share it with your team to avoid re-indexing.
+## Performance
+**Indexing Speed** (varies by codebase size):
+- Small projects (< 100 files): 2-5 seconds
+- Medium projects (100-1000 files): 5-20 seconds
+- Large projects (1000+ files): 20-60 seconds
+**Query Speed:**
+- Database queries: < 100ms
+- AI responses: 2-10 seconds (depends on complexity)
+**Token Usage:**
+- First query: 500-2000 tokens (builds cache)
+- Subsequent queries: 25-200 tokens (uses cache)
+## Development
+```bash
+# Clone repository
+git clone https://github.com/Jonathan03ant/srcodex.git
+cd srcodex
+# Install in development mode
+pip install -e .
+# Index and run on srcodex itself
+srcodex .
+```
+## License
+MIT License - see LICENSE file for details
+## Contributing
+Contributions welcome! Please open an issue or pull request.
+## Links
+- [GitHub Repository](https://github.com/Jonathan03ant/srcodex)
+- [Issue Tracker](https://github.com/Jonathan03ant/srcodex/issues)
+- [Documentation](https://github.com/Jonathan03ant/srcodex/wiki)

{srcodex-0.2.0 → srcodex-0.2.1}/pyproject.toml RENAMED

@@ -4,13 +4,13 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "srcodex"
-version = "0.2.0"
+version = "0.2.1"
 description = "Semantic code explorer with AI-powered search and analysis"
 readme = "README.md"
 requires-python = ">=3.9"
 license = {text = "MIT"}
 authors = [
-    {name = "Jonathan Antoun", email = "jonathan.antoun@amd.com"}
+    {name = "Jonathan L'Work"}
 ]
 keywords = [
     "code-search",
@@ -45,3 +45,6 @@ srcodex = "srcodex.cli:main"
 Homepage = "https://github.com/Jonathan03ant/srcodex"
 Repository = "https://github.com/Jonathan03ant/srcodex"
 Issues = "https://github.com/Jonathan03ant/srcodex/issues"
+[tool.setuptools.package-data]
+srcodex = ["indexer/*.sql", "tui/*.tcss"]
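
The `package-data` entry above, together with the `MANIFEST.in` change, means `db_schema.sql` now ships inside the installed package rather than only in the source tree. The diff does not show how srcodex reads the file at runtime, so the following is only a minimal sketch of one common approach, assuming `importlib.resources` (available on the Python 3.9+ that the project requires). The helper name `load_packaged_text` is hypothetical and not part of srcodex.

```python
from importlib.resources import files


def load_packaged_text(package: str, resource: str) -> str:
    """Return the text of a data file shipped inside an importable package."""
    # files() returns a Traversable rooted at the package directory, so the
    # lookup works without hard-coding a path relative to __file__.
    return files(package).joinpath(resource).read_text(encoding="utf-8")


# Hypothetical usage, assuming srcodex 0.2.1 is installed:
#   schema_sql = load_packaged_text("srcodex.indexer", "db_schema.sql")
#   connection.executescript(schema_sql)
```

Reading the file through the package rather than through a filesystem path is what makes the `package-data` glob matter: if the `.sql` file is missing from the wheel, this lookup raises `FileNotFoundError` at runtime.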

{srcodex-0.2.0 → srcodex-0.2.1}/srcodex/cli.py RENAMED

@@ -17,6 +17,7 @@ from srcodex.indexer.indexer import Indexer
 from srcodex.indexer.field_access_analyzer import FieldAccessAnalyzer
 from srcodex.indexer.reference_ingestor import ReferenceIngestor
 from srcodex.indexer.reference_resolver import ReferenceResolver
+from importlib.metadata import version as get_version
 @click.command()
 @click.argument('path', default='.', type=click.Path(exists=True))
@@ -25,15 +26,17 @@ from srcodex.indexer.reference_resolver import ReferenceResolver
 def main(path, reindex, debug):
     """
     Launch srcodex TUI
     EXAMPLES:
         srcodex                  # Index current directory and launch
         srcodex /path/to/code    # Index specific directory
         srcodex --reindex        # Force re-index and launch
     """
-    click.echo("srcodex v0.1.0 - Semantic code explorer")
-    click.echo()
+    try:
+        pkg_version = get_version("srcodex")
+    except Exception:
+        pkg_version = "dev"
+    click.echo(f"srcodex v{pkg_version} - Semantic code explorer")
     project_path = Path(path).resolve()
     srcodex_dir = project_path / ".srcodex"
@@ -62,7 +65,7 @@ def main(path, reindex, debug):
         click.echo("\nError: No API key found!", err=True)
         click.echo("Please set either:")
         click.echo("  - ANTHROPIC_API_KEY (for public API)")
-        click.echo("  - AMD_LLM_API_KEY (for enterprise gateway)")
+        click.echo("  - AMD_LLM_API_KEY (for AMD enterprise gateway)")
         click.echo("\nSee .env.example for configuration details")
         sys.exit(1)
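
The first hunk above replaces the hard-coded `v0.1.0` banner with a lookup through `importlib.metadata`, falling back to `"dev"` when the lookup fails. As an illustration of the same pattern, here is a small standalone sketch. It differs from the diff in one respect that is my own choice, not the author's: it catches only `PackageNotFoundError` instead of a bare `Exception`. The function name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str, fallback: str = "dev") -> str:
    """Return the installed version of a distribution, or a fallback label."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Raised when no installed metadata exists for the distribution,
        # for example when running from a bare source checkout.
        return fallback


print(f"srcodex v{installed_version('srcodex')} - Semantic code explorer")
```

Catching the narrower exception keeps unrelated failures visible, while still covering the case the fallback is meant for.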

srcodex-0.2.1/srcodex/indexer/db_schema.sql ADDED

@@ -0,0 +1,123 @@
+-- SQLite database for storing symbols, references, and file contents
+-- Symbols (function/variable/struct/macro definitions)
+-- is the universal entity that represents anything that can be tracked in code.
+CREATE TABLE IF NOT EXISTS symbols (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    name TEXT NOT NULL,
+    type TEXT NOT NULL,           -- Normalized: function, struct, variable, macro, typedef, enum, etc.
+    kind_raw TEXT,                -- Raw ctags kind: prototype, function, variable, member, etc.
+    file_path TEXT NOT NULL,
+    line_number INTEGER NOT NULL,
+    signature TEXT,               -- Raw signature from ctags (e.g., "(uint32_t intr_sts, uint8_t device)"), NULL if not available
+    typeref TEXT,                 -- Raw typeref from ctags (e.g., "typename:void"), NULL if not available
+    scope TEXT,                   -- global, extern (deprecated - use is_file_scope instead)
+    scope_kind TEXT,              -- struct, union, enum, class (parent scope type)
+    scope_name TEXT,              -- PowerState, Dummy, etc. (parent scope name)
+    is_file_scope INTEGER,        -- 1 if file-local (static in C), 0 if not, NULL if unknown
+    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+);
+-- Create indexes for fast lookup
+CREATE INDEX IF NOT EXISTS idx_symbols_name ON symbols(name);
+CREATE INDEX IF NOT EXISTS idx_symbols_file ON symbols(file_path);
+CREATE INDEX IF NOT EXISTS idx_symbols_type ON symbols(type);
+CREATE INDEX IF NOT EXISTS idx_symbols_kind_raw ON symbols(kind_raw);
+CREATE INDEX IF NOT EXISTS idx_symbols_scope ON symbols(scope_kind, scope_name);
+CREATE INDEX IF NOT EXISTS idx_symbols_file_scope ON symbols(is_file_scope);
+-- Files (source file metadata - content NOT stored for performance/size)
+CREATE TABLE IF NOT EXISTS files (
+    path TEXT PRIMARY KEY,        -- Relative path from source_root
+    size INTEGER NOT NULL,
+    language TEXT,                -- c, h, python, makefile, etc.
+    sha1 TEXT,                    -- SHA1 hash for change detection
+    last_modified REAL,           -- mtime for change detection
+    last_indexed TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+);
+CREATE INDEX IF NOT EXISTS idx_files_language ON files(language);
+CREATE INDEX IF NOT EXISTS idx_files_sha1 ON files(sha1);
+-- Full-text search for symbols (FTS5 for fast text search)
+CREATE VIRTUAL TABLE IF NOT EXISTS symbols_fts USING fts5(
+    name,
+    file_path,
+    signature,
+    content=symbols,
+    content_rowid=id
+);
+-- Triggers to keep FTS table in sync with symbols table
+CREATE TRIGGER IF NOT EXISTS symbols_ai AFTER INSERT ON symbols BEGIN
+    INSERT INTO symbols_fts(rowid, name, file_path, signature)
+    VALUES (new.id, new.name, new.file_path, new.signature);
+END;
+CREATE TRIGGER IF NOT EXISTS symbols_ad AFTER DELETE ON symbols BEGIN
+    DELETE FROM symbols_fts WHERE rowid = old.id;
+END;
+CREATE TRIGGER IF NOT EXISTS symbols_au AFTER UPDATE ON symbols BEGIN
+    DELETE FROM symbols_fts WHERE rowid = old.id;
+    INSERT INTO symbols_fts(rowid, name, file_path, signature)
+    VALUES (new.id, new.name, new.file_path, new.signature);
+END;
+-- Metadata table for tracking indexing status
+CREATE TABLE IF NOT EXISTS metadata (
+    key TEXT PRIMARY KEY,
+    value TEXT,
+    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+);
+-- Raw references (untrusted cscope output, stored verbatim for debugging/replay)
+-- This is the ingestion layer: what cscope actually said, before semantic resolution
+CREATE TABLE IF NOT EXISTS raw_references (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    query_type TEXT NOT NULL,           -- 'callees', 'callers', 'includes', 'symbol'
+    query_symbol TEXT NOT NULL,         -- what we asked cscope for (function/header name)
+    source_file TEXT NOT NULL,          -- file from cscope output (relative POSIX path)
+    source_function TEXT,               -- function from cscope output (may be NULL for includes)
+    line_number INTEGER NOT NULL,       -- line number from cscope output
+    line_text TEXT,                     -- raw line content from cscope
+    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+);
+-- Indexes for raw_references (query patterns: by type/symbol, by source location)
+CREATE INDEX IF NOT EXISTS idx_raw_refs_query ON raw_references(query_type, query_symbol);
+CREATE INDEX IF NOT EXISTS idx_raw_refs_source ON raw_references(source_file, source_function);
+-- Symbol edges (trusted semantic graph: resolved symbol_id → symbol_id relationships)
+-- This is the semantic layer: machine-readable typed edges between symbols
+CREATE TABLE IF NOT EXISTS symbol_edges (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    edge_type TEXT NOT NULL,            -- 'CALLS', 'INCLUDES' (later: 'USES', 'DEFINES')
+    src_symbol_id INTEGER NOT NULL,     -- source symbol (who calls/includes)
+    dst_symbol_id INTEGER NOT NULL,     -- destination symbol (what is called/included)
+    source_file TEXT NOT NULL,          -- where the edge occurs (file path)
+    line_number INTEGER,                -- where the edge occurs (line number)
+    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+    FOREIGN KEY(src_symbol_id) REFERENCES symbols(id) ON DELETE CASCADE,
+    FOREIGN KEY(dst_symbol_id) REFERENCES symbols(id) ON DELETE CASCADE,
+    UNIQUE(edge_type, src_symbol_id, dst_symbol_id, source_file, line_number)
+);
+-- Indexes for symbol_edges (query patterns: find edges by type and direction)
+CREATE INDEX IF NOT EXISTS idx_edges_src ON symbol_edges(edge_type, src_symbol_id);
+CREATE INDEX IF NOT EXISTS idx_edges_dst ON symbol_edges(edge_type, dst_symbol_id);
+-- File edges (file-to-file relationships: includes, imports, etc.)
+-- This is separate from symbol_edges because files are not symbols
+CREATE TABLE IF NOT EXISTS file_edges (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    edge_type TEXT NOT NULL,        -- 'INCLUDES' (later: 'IMPORTS' for Python/JS)
+    src_file TEXT NOT NULL,         -- includer (repo-relative POSIX path)
+    dst_file TEXT NOT NULL,         -- included (repo-relative POSIX path)
+    line_number INTEGER,            -- where the include occurs
+    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+    UNIQUE(edge_type, src_file, dst_file, line_number)
+);
+-- Indexes for file_edges (query patterns: find dependencies by direction)
+CREATE INDEX IF NOT EXISTS idx_file_edges_src ON file_edges(edge_type, src_file);
+CREATE INDEX IF NOT EXISTS idx_file_edges_dst ON file_edges(edge_type, dst_file);
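
To make the `symbols` / `symbol_edges` layout above more concrete, here is a minimal sketch of a "who calls X?" lookup against it, using Python's standard `sqlite3` module and an in-memory database. The two tables are trimmed copies of the schema (unused columns omitted). The sample rows and the `build_demo_db` helper are invented for illustration, and this `get_callers` is a hypothetical stand-in that borrows the name from the README table; the real query tool is not part of this diff.

```python
import sqlite3

# Trimmed copy of two tables from the schema above; columns the query
# does not need are left out to keep the sketch short.
SCHEMA = """
CREATE TABLE symbols (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    type TEXT NOT NULL,
    file_path TEXT NOT NULL,
    line_number INTEGER NOT NULL
);
CREATE TABLE symbol_edges (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    edge_type TEXT NOT NULL,
    src_symbol_id INTEGER NOT NULL,
    dst_symbol_id INTEGER NOT NULL,
    source_file TEXT NOT NULL,
    line_number INTEGER,
    FOREIGN KEY(src_symbol_id) REFERENCES symbols(id) ON DELETE CASCADE,
    FOREIGN KEY(dst_symbol_id) REFERENCES symbols(id) ON DELETE CASCADE,
    UNIQUE(edge_type, src_symbol_id, dst_symbol_id, source_file, line_number)
);
CREATE INDEX idx_edges_dst ON symbol_edges(edge_type, dst_symbol_id);
"""


def get_callers(conn, callee_name):
    """Return (caller_name, file, line) for every CALLS edge into callee_name."""
    rows = conn.execute(
        """
        SELECT caller.name, e.source_file, e.line_number
        FROM symbol_edges AS e
        JOIN symbols AS callee ON callee.id = e.dst_symbol_id
        JOIN symbols AS caller ON caller.id = e.src_symbol_id
        WHERE e.edge_type = 'CALLS' AND callee.name = ?
        ORDER BY caller.name, e.line_number
        """,
        (callee_name,),
    )
    return rows.fetchall()


def build_demo_db():
    """Create an in-memory database with three invented symbols and edges."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign-key enforcement off unless it is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO symbols (id, name, type, file_path, line_number) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1, "main", "function", "src/main.c", 10),
            (2, "init_system", "function", "src/init.c", 5),
            (3, "shutdown", "function", "src/init.c", 40),
        ],
    )
    conn.executemany(
        "INSERT INTO symbol_edges "
        "(edge_type, src_symbol_id, dst_symbol_id, source_file, line_number) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            ("CALLS", 1, 2, "src/main.c", 12),
            ("CALLS", 1, 3, "src/main.c", 20),
            ("CALLS", 2, 3, "src/init.c", 9),
        ],
    )
    conn.commit()
    return conn
```

With the demo data, `get_callers(build_demo_db(), "shutdown")` returns the two edges from `init_system` and `main`. The `(edge_type, dst_symbol_id)` index from the schema is the one that serves this direction of lookup; the matching `idx_edges_src` index would serve the reverse "what does X call?" query.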

{srcodex-0.2.0 → srcodex-0.2.1/srcodex.egg-info}/PKG-INFO RENAMED

@@ -1,8 +1,8 @@
 Metadata-Version: 2.4
 Name: srcodex
-Version: 0.2.0
+Version: 0.2.1
 Summary: Semantic code explorer with AI-powered search and analysis
-Author-email: Jonathan Antoun <jonathan.antoun@amd.com>
+Author: Jonathan L'Work
 License: MIT
 Project-URL: Homepage, https://github.com/Jonathan03ant/srcodex
 Project-URL: Repository, https://github.com/Jonathan03ant/srcodex
@@ -43,28 +43,67 @@ Unlike generic code assistants (Claude CLI, GitHub Copilot, etc.) that read enti
 | "Find all ioctls" | Grep + read matches (15K tokens) | Database search (300 tokens) |
 | "Explain module Y" | Read 10+ files (30K tokens) | Aggregate query (2K tokens) |
-**Result:** 90% more token-efficient, instant relationship queries, and unique capabilities impossible for file-based tools (call chains, data flow analysis, architecture visualization).
+**Result:** 99% more token-efficient, instant relationship queries, and unique capabilities impossible for file-based tools (call chains, data flow analysis, architecture visualization).
-## Features
+## Key Features
-- **Semantic Indexing**: Builds a persistent graph of symbols, functions, types, and their relationships
-- **AI-Powered Search**: Ask questions in natural language about your code
-- **Call Graph Analysis**: Trace function calls, dependencies, and execution paths
-- **Terminal UI**: Beautiful terminal interface with file browser and AI chat
-- **Multi-Language**: Supports C, C++, Python, and more
-- **Fast**: SQLite-backed graph queries with intelligent caching
-- **Portable**: `.srcodex/` directory makes indexed projects shareable
+- **Semantic Indexing Engine**: Extracts symbols, relationships, and cross-references from source code
+- **AI-Powered Chat**: Natural language queries about your codebase architecture
+- **Call Graph Analysis**: Trace function calls, caller chains, and dependency paths
+- **Terminal UI**: Full-featured TUI with file browser, search, and AI chat interface
+- **Multi-Language Support**: C, C++, Python, JavaScript, Go, Rust, and more
+- **Persistent Graph Database**: SQLite-backed semantic graph with relationship edges
+- **Portable**: `.srcodex/` directory makes indexed projects shareable across teams
+- **Token Efficient**: 99% reduction in API costs via semantic queries and intelligent caching
 ## Installation
+### From PyPI (Recommended)
 ```bash
 pip install srcodex
 ```
+### From Source
+```bash
+git clone https://github.com/Jonathan03ant/srcodex.git
+cd srcodex
+pip install -e .
+```
+## Prerequisites
+Before installing srcodex, you need these system tools:
+**Ubuntu/Debian:**
+```bash
+sudo apt install universal-ctags cscope
+```
+**macOS:**
+```bash
+brew install universal-ctags cscope
+```
+**Arch Linux:**
+```bash
+sudo pacman -S ctags cscope
+```
+**Other systems:** Install Universal CTags from https://github.com/universal-ctags/ctags
 ## Quick Start
 ```bash
-# Index your codebase (first time)
+# 1. Install srcodex
+pip install srcodex
+# 2. Configure API key
+export ANTHROPIC_API_KEY="your-api-key"
+# Or create .env file with ANTHROPIC_API_KEY=...
+# 3. Index your codebase (first time)
 cd /path/to/your/project
 srcodex
@@ -73,7 +112,7 @@ srcodex
 # [Indexing happens...]
 # [TUI launches]
-# Next time - instant launch
+# 4. Next time - instant launch (uses cached index)
 srcodex
 ```
@@ -96,51 +135,71 @@ Once indexed, use the TUI to:
 ## Configuration
-Copy `.env.example` to `.env` and configure your API key:
+srcodex requires a Claude API key from Anthropic.
+**Option 1: Environment Variable**
 ```bash
-# Public Anthropic API
-ANTHROPIC_API_KEY=sk-ant-your-key-here
-# Or enterprise gateway (if applicable)
-AMD_LLM_API_KEY=your-subscription-key
+export ANTHROPIC_API_KEY="sk-ant-your-key-here"
 ```
-## Requirements
+**Option 2: .env File**
+Create a `.env` file in your project directory:
+```bash
+ANTHROPIC_API_KEY=sk-ant-your-key-here
+```
-- Python 3.9+
-- Universal CTags (`brew install universal-ctags` or `apt install universal-ctags`)
-- Cscope (optional, for call graph)
-- Claude API key (Anthropic or enterprise gateway)
+Get your API key from https://console.anthropic.com/
 ## How It Works
-1. **Indexing**: Extracts symbols, relationships, and metadata using CTags and Cscope
-2. **Graph Building**: Creates semantic graph with typed edges (CALLS, INCLUDES, ACCESSES)
-3. **AI Integration**: Claude queries the graph using specialized tools (not reading full files)
-4. **Token Efficiency**: **99%+ reduction** in tokens vs. traditional code assistants
-   - **Breakthrough caching architecture**: 25-100 tokens per query after initial cache build
-   - Aggressive parallel tool batching (20-40 tools per iteration)
-   - 3-iteration cache strategy: iterations 1-3 cached, iteration 4 answers with cached data
-   - Semantic graph queries instead of file reads (10-100x more efficient)
-   - **Real example**: 500 input tokens vs 60,000+ for traditional file-based approaches
-   - Cache persists across queries - subsequent questions cost nearly nothing!
+**Indexing Phase:**
+1. Analyzes source code to extract symbols, functions, types, and relationships
+2. Builds a semantic graph database with typed edges (function calls, includes, data access)
+3. Stores everything in a persistent SQLite database
+**Query Phase:**
+1. You ask questions in natural language via the terminal UI
+2. Claude queries the semantic graph database using specialized tools
+3. Returns targeted answers without reading entire files
+**Why This Is Efficient:**
+- Traditional code assistants: Read full files (20K-60K tokens per query)
+- srcodex: Semantic graph queries (100-500 tokens per query)
+- Intelligent caching: First query builds cache, subsequent queries reuse it
+- Result: 99% reduction in API costs
 ## Project Structure
-After indexing, your project will have:
+After indexing, srcodex creates a `.srcodex/` directory in your project:
 ```
 your-project/
 ├── .srcodex/
-│   ├── metadata.json       # Project stats
-│   ├── config.toml         # Indexing config
+│   ├── metadata.json       # Project statistics
 │   ├── data/
-│   │   └── project.db      # Semantic graph
-│   └── logs/               # Debug logs
+│   │   └── project.db      # Semantic graph database
+│   ├── conversations/      # Chat history
+│   └── .debug/             # Debug logs
 └── [your source files...]
 ```
+The `.srcodex/` directory is portable - you can commit it to git or share it with your team to avoid re-indexing.
+## Performance
+**Indexing Speed** (varies by codebase size):
+- Small projects (< 100 files): 2-5 seconds
+- Medium projects (100-1000 files): 5-20 seconds
+- Large projects (1000+ files): 20-60 seconds
+**Query Speed:**
+- Database queries: < 100ms
+- AI responses: 2-10 seconds (depends on complexity)
+**Token Usage:**
+- First query: 500-2000 tokens (builds cache)
+- Subsequent queries: 25-200 tokens (uses cache)
 ## Development
 ```bash
@@ -151,8 +210,8 @@ cd srcodex
 # Install in development mode
 pip install -e .
-# Run tests
-pytest
+# Index and run on srcodex itself
+srcodex .
 ```
 ## License

{srcodex-0.2.0 → srcodex-0.2.1}/srcodex.egg-info/SOURCES.txt RENAMED

@@ -30,6 +30,7 @@ srcodex/indexer/__init__.py
 srcodex/indexer/cscope_client.py
 srcodex/indexer/ctags_compat.py
 srcodex/indexer/ctags_parser.py
+srcodex/indexer/db_schema.sql
 srcodex/indexer/explorer.py
 srcodex/indexer/field_access_analyzer.py
 srcodex/indexer/indexer.py

srcodex-0.2.0/README.md DELETED

@@ -1,142 +0,0 @@
-# srcodex
-**Semantic code explorer with AI-powered search and analysis**
-srcodex builds a semantic graph of your codebase and provides AI-powered exploration through natural language queries. Think of it as an intelligent code search that understands relationships, call graphs, and architecture.
-## Why srcodex?
-Unlike generic code assistants (Claude CLI, GitHub Copilot, etc.) that read entire files to answer questions, srcodex uses a **semantic graph database** to understand your code:
-| Question | Generic Assistant | srcodex |
-|----------|------------------|---------|
-| "Who calls function X?" | Grep entire codebase (20K tokens) | `get_callers('X')` (200 tokens) |
-| "Show call chain A→B" | Read multiple files, manual tracing | Graph query (500 tokens) |
-| "Find all ioctls" | Grep + read matches (15K tokens) | Database search (300 tokens) |
-| "Explain module Y" | Read 10+ files (30K tokens) | Aggregate query (2K tokens) |
-**Result:** 90% more token-efficient, instant relationship queries, and unique capabilities impossible for file-based tools (call chains, data flow analysis, architecture visualization).
-## Features
-- **Semantic Indexing**: Builds a persistent graph of symbols, functions, types, and their relationships
-- **AI-Powered Search**: Ask questions in natural language about your code
-- **Call Graph Analysis**: Trace function calls, dependencies, and execution paths
-- **Terminal UI**: Beautiful terminal interface with file browser and AI chat
-- **Multi-Language**: Supports C, C++, Python, and more
-- **Fast**: SQLite-backed graph queries with intelligent caching
-- **Portable**: `.srcodex/` directory makes indexed projects shareable
-## Installation
-```bash
-pip install srcodex
-```
-## Quick Start
-```bash
-# Index your codebase (first time)
-cd /path/to/your/project
-srcodex
-# Output:
-# No .srcodex/ found. Index this directory? (y/n) y
-# [Indexing happens...]
-# [TUI launches]
-# Next time - instant launch
-srcodex
-```
-## Usage
-Once indexed, use the TUI to:
-- Browse files and symbols
-- Search across your codebase
-- Chat with AI about your code architecture
-- Trace call chains and dependencies
-### Example AI Queries
-```
-"What does the init_system function do?"
-"Show me all functions that call malloc"
-"Trace the execution path from main to shutdown"
-"What structs are defined in config.h?"
-```
-## Configuration
-Copy `.env.example` to `.env` and configure your API key:
-```bash
-# Public Anthropic API
-ANTHROPIC_API_KEY=sk-ant-your-key-here
-# Or enterprise gateway (if applicable)
-AMD_LLM_API_KEY=your-subscription-key
-```
-## Requirements
-- Python 3.9+
-- Universal CTags (`brew install universal-ctags` or `apt install universal-ctags`)
-- Cscope (optional, for call graph)
-- Claude API key (Anthropic or enterprise gateway)
-## How It Works
-1. **Indexing**: Extracts symbols, relationships, and metadata using CTags and Cscope
-2. **Graph Building**: Creates semantic graph with typed edges (CALLS, INCLUDES, ACCESSES)
-3. **AI Integration**: Claude queries the graph using specialized tools (not reading full files)
-4. **Token Efficiency**: **99%+ reduction** in tokens vs. traditional code assistants
-   - **Breakthrough caching architecture**: 25-100 tokens per query after initial cache build
-   - Aggressive parallel tool batching (20-40 tools per iteration)
-   - 3-iteration cache strategy: iterations 1-3 cached, iteration 4 answers with cached data
-   - Semantic graph queries instead of file reads (10-100x more efficient)
-   - **Real example**: 500 input tokens vs 60,000+ for traditional file-based approaches
-   - Cache persists across queries - subsequent questions cost nearly nothing!
-## Project Structure
-After indexing, your project will have:
-```
-your-project/
-├── .srcodex/
-│   ├── metadata.json       # Project stats
-│   ├── config.toml         # Indexing config
-│   ├── data/
-│   │   └── project.db      # Semantic graph
-│   └── logs/               # Debug logs
-└── [your source files...]
-```
-## Development
-```bash
-# Clone repository
-git clone https://github.com/Jonathan03ant/srcodex.git
-cd srcodex
-# Install in development mode
-pip install -e .
-# Run tests
-pytest
-```
-## License
-MIT License - see LICENSE file for details
-## Contributing
-Contributions welcome! Please open an issue or pull request.
-## Links
-- [GitHub Repository](https://github.com/Jonathan03ant/srcodex)
-- [Issue Tracker](https://github.com/Jonathan03ant/srcodex/issues)
-- [Documentation](https://github.com/Jonathan03ant/srcodex/wiki)