PyPI - ai-codeindex - Versions diffs - 0.7.0__py3-none-any.whl - Mend

ai-codeindex 0.7.0__py3-none-any.whl

Files changed (41)

ai_codeindex-0.7.0.dist-info/METADATA +966 -0
ai_codeindex-0.7.0.dist-info/RECORD +41 -0
ai_codeindex-0.7.0.dist-info/WHEEL +4 -0
ai_codeindex-0.7.0.dist-info/entry_points.txt +2 -0
ai_codeindex-0.7.0.dist-info/licenses/LICENSE +21 -0
codeindex/README_AI.md +767 -0
codeindex/__init__.py +11 -0
codeindex/adaptive_config.py +83 -0
codeindex/adaptive_selector.py +171 -0
codeindex/ai_helper.py +48 -0
codeindex/cli.py +40 -0
codeindex/cli_common.py +10 -0
codeindex/cli_config.py +97 -0
codeindex/cli_docs.py +66 -0
codeindex/cli_hooks.py +765 -0
codeindex/cli_scan.py +562 -0
codeindex/cli_symbols.py +295 -0
codeindex/cli_tech_debt.py +238 -0
codeindex/config.py +479 -0
codeindex/directory_tree.py +229 -0
codeindex/docstring_processor.py +342 -0
codeindex/errors.py +62 -0
codeindex/extractors/__init__.py +9 -0
codeindex/extractors/thinkphp.py +132 -0
codeindex/file_classifier.py +148 -0
codeindex/framework_detect.py +323 -0
codeindex/hierarchical.py +428 -0
codeindex/incremental.py +278 -0
codeindex/invoker.py +260 -0
codeindex/parallel.py +155 -0
codeindex/parser.py +740 -0
codeindex/route_extractor.py +98 -0
codeindex/route_registry.py +77 -0
codeindex/scanner.py +167 -0
codeindex/semantic_extractor.py +408 -0
codeindex/smart_writer.py +737 -0
codeindex/symbol_index.py +199 -0
codeindex/symbol_scorer.py +283 -0
codeindex/tech_debt.py +619 -0
codeindex/tech_debt_formatters.py +234 -0
codeindex/writer.py +164 -0

codeindex/README_AI.md ADDED

@@ -0,0 +1,767 @@
+<!-- Generated by codeindex at 2026-02-03T16:03:18+08:00 -->
+# README_AI.md - codeindex
+## Overview
+- **Files**: 28
+- **Symbols**: 183
+## Files
+### __init__.py
+_codeindex - AI-native code indexing tool for large codebases
+Usage:
+    codeindex scan <path>     # Scan a directory and generate README_AI.md
+    co_
+### adaptive_config.py
+_Adaptive symbols configuration.
+This module defines the configuration structure for adaptive symbol extraction,
+which allows dynamically adjusting th_
+**class** `class AdaptiveSymbolsConfig`
+> Configuration for adaptive symbol extraction.
+    Adaptive symbol extraction adjusts the number of
+### adaptive_selector.py
+_Adaptive symbol selector for dynamic symbol limit calculation.
+This module implements the core algorithm for adaptive symbol extraction,
+which adjust_
+**class** `class AdaptiveSymbolSelector`
+> Selects appropriate symbol limit based on file size.
+    This selector implements a tiered approach
+**Methods:**
+- `def calculate_limit(self, file_lines: int, total_symbols: int) -> int`
+- `def _determine_size_category(self, lines: int) -> str`
+- `def _apply_constraints(self, limit: int, total_symbols: int) -> int`
+### ai_enhancement.py
+_AI enhancement strategies for super large files (Epic 3.2).
+This module provides intelligent file size detection and strategy selection
+for optimizin_
+**class** `class SuperLargeFileDetection`
+> Result of super large file detection.
+**class** `class SymbolGroup`
+> Group of symbols by responsibility.
+**class** `class MultiTurnResult`
+> Result of multi-turn dialogue enhancement.
+**Functions:**
+- `def is_super_large_file(parse_result: ParseResult, config: Config) -> SuperLargeFileDetection`
+- `def select_enhancement_strategy(
+    parse_result: ParseResult,
+    config: Config,
+) -> EnhancementStrategy`
+- `def _group_symbols_by_responsibility(parse_result: ParseResult) -> list[SymbolGroup]`
+- `def _generate_round1_prompt(parse_result: ParseResult) -> str`
+- `def _generate_round2_prompt(
+    round1_output: str,
+    symbol_groups: list[SymbolGroup],
+) -> str`
+- `def _generate_round3_prompt(
+    round1_output: str,
+    round2_output: str,
+    parse_result: ParseResult,
+) -> str`
+- `def multi_turn_ai_enhancement(
+    parse_result: ParseResult,
+    config: Config,
+    ai_command: str,
+    timeout_per_round: int = 180,
+) -> MultiTurnResult`
+### ai_helper.py
+_AI enhancement helper functions (Epic 4 Story 4.1).
+This module provides reusable functions for AI enhancement operations,
+eliminating code duplicati_
+**Functions:**
+- `def aggregate_parse_results(
+    parse_results: list[ParseResult],
+    path: Path,
+) -> ParseResult`
+### cli.py
+_CLI entry point for codeindex.
+This module serves as the main entry point for the codeindex CLI tool.
+It imports and registers commands from speciali_
+**Functions:**
+- `def main()`
+### cli_common.py
+_Common utilities for CLI modules.
+This module provides shared resources used across all CLI command modules,
+such as the Rich console instance for fo_
+### cli_config.py
+_CLI commands for configuration and project status.
+This module provides commands for initializing configuration files,
+checking indexing status, and _
+**Functions:**
+- `def init(force: bool)`
+- `def status(root: Path)`
+- `def list_dirs(root: Path)`
+### cli_scan.py
+_CLI commands for scanning directories and generating README files.
+This module provides the core scanning functionality, including single directory
+s_
+**Functions:**
+- `def _process_directory_with_smartwriter(
+    dir_path: Path,
+    tree: DirectoryTree,
+    config: Config,
+) -> tuple[Path, bool, str, int]`
+- `def scan(
+    path: Path,
+    dry_run: bool,
+    fallback: bool,
+    quiet: bool,
+    timeout: int,
+    parallel: int | None,
+    docstring_mode: str | None,
+    show_cost: bool,
+)`
+- `def scan_all(
+    root: Path | None,
+    parallel: int | None,
+    timeout: int,
+    no_ai: bool,
+    fallback: bool,
+    quiet: bool,
+    hierarchical: bool,
+    docstring_mode: str | None,
+    show_cost: bool,
+)`
+### cli_symbols.py
+_CLI commands for symbol indexing and dependency analysis.
+This module provides commands for generating project-wide indices
+and analyzing code depend_
+**Functions:**
+- `def extract_module_purpose(
+    dir_path: Path,
+    config: Config,
+    output_file: str = "README_AI.md"
+) -> str`
+- `def index(root: Path, output: str)`
+- `def symbols(root: Path, output: str, quiet: bool)`
+- `def affected(since: str, until: str, as_json: bool)`
+### cli_tech_debt.py
+_CLI commands for technical debt analysis.
+This module provides the tech-debt command for analyzing technical debt
+in a directory, including file size_
+**Functions:**
+- `def _find_source_files(
+    path: Path, recursive: bool, languages: list[str] | None = None
+) -> list[Path]`
+- `def _analyze_files(
+    files: list[Path],
+    detector: TechDebtDetector,
+    reporter: TechDebtReporter,
+    show_progress: bool,
+) -> None`
+- `def _format_and_output(
+    report: TechDebtReport,
+    format: str,
+    output: Path | None,
+    quiet: bool,
+) -> None`
+- `def tech_debt(path: Path, format: str, output: Path | None, recursive: bool, quiet: bool)`
+### config.py
+_Configuration management for codeindex._
+**class** `class SymbolsConfig`
+> Configuration for symbol extraction.
+**class** `class GroupingConfig`
+> Configuration for file grouping.
+**class** `class SemanticConfig`
+> Configuration for semantic extraction.
+**class** `class IndexingConfig`
+> Configuration for smart indexing.
+**class** `class IncrementalConfig`
+> Configuration for incremental updates.
+**class** `class DocstringConfig`
+> Configuration for docstring extraction.
+**class** `class Config`
+> Configuration for codeindex.
+### directory_tree.py
+_Directory tree structure for hierarchical indexing._
+**class** `class DirectoryNode`
+> A node in the directory tree.
+**class** `class DirectoryTree`
+> Pre-scanned directory tree for determining index levels.
+    This enables two-pass indexing:
+    1.
+**Methods:**
+- `def _build_tree(self)`
+- `def print_tree(self, max_depth: int = 3)`
+### file_classifier.py
+_Unified file size classification system (Epic 4 Story 4.2).
+This module provides a unified approach to file size classification,
+replacing hard-coded_
+**class** `class FileSizeCategory(Enum)`
+> File size categories for classification.
+**class** `class FileSizeAnalysis`
+> Result of file size analysis.
+    Attributes:
+        category: File size category (enum)
+        f
+**class** `class FileSizeClassifier`
+> Unified file size classifier for all modules.
+    This classifier provides consistent file size det
+**Methods:**
+- `def classify(self, parse_result: ParseResult) -> FileSizeAnalysis`
+- `def is_super_large(self, parse_result: ParseResult) -> bool`
+- `def is_large(self, parse_result: ParseResult) -> bool`
+### framework_detect.py
+_Framework detection and pattern extraction for PHP projects._
+**class** `class RouteInfo`
+> Information about a route.
+**class** `class ModelInfo`
+> Information about a model.
+**class** `class FrameworkInfo`
+> Detected framework information.
+**Functions:**
+- `def detect_framework(root: Path) -> FrameworkType`
+- `def extract_thinkphp_routes(
+    parse_results: list[ParseResult],
+    module_name: str,
+) -> list[RouteInfo]`
+- `def extract_thinkphp_models(
+    parse_results: list[ParseResult],
+) -> list[ModelInfo]`
+- `def analyze_thinkphp_project(
+    root: Path,
+    parse_results_by_dir: dict[Path, list[ParseResult]],
+) -> FrameworkInfo`
+- `def format_framework_info(info: FrameworkInfo, max_routes: int = 20) -> str`
+### hierarchical.py
+_Bottom-up hierarchical processing for codeindex._
+**class** `class DirectoryInfo`
+> Information about a directory in the hierarchy.
+**Functions:**
+- `def build_directory_hierarchy(directories: List[Path]) -> Tuple[Dict[Path, DirectoryInfo], List[Path]]`
+- `def create_processing_batches(dir_info: Dict[Path, DirectoryInfo], max_workers: int) -> List[List[Path]]`
+- `def process_directory_batch(
+    batch: List[Path],
+    config: Config,
+    use_fallback: bool = False,
+    quiet: bool = False,
+    timeout: int = 120,
+    root_path: Path = None,
+) -> Dict[Path, bool]`
+- `def process_normal(path: Path, config: Config, use_fallback: bool, quiet: bool, timeout: int, root_path: Path = None) -> bool`
+- `def process_with_children(path: Path, config: Config, use_fallback: bool, quiet: bool, timeout: int) -> bool`
+- `def scan_directories_hierarchical(
+    root: Path,
+    config: Config,
+    max_workers: int = 8,
+    use_fallback: bool = True,
+    quiet: bool = False,
+    timeout: int = 120
+) -> bool`
+- `def generate_enhanced_fallback_readme(
+    dir_path: Path,
+    parse_results: list,
+    child_readmes: List[Path],
+    output_file: str = "README_AI.md"
+)`
+### incremental.py
+_Incremental update logic for codeindex.
+This module analyzes git changes and determines which directories
+need README_AI.md updates based on configur_
+**class** `class UpdateLevel(Enum)`
+> Update decision levels.
+**class** `class FileChange`
+> Represents a changed file.
+**class** `class ChangeAnalysis`
+> Analysis result of git changes.
+**Methods:**
+- `def to_dict(self) -> dict`
+**Functions:**
+- `def run_git_command(args: list[str], cwd: Path | None = None) -> str`
+- `def filter_code_files(
+    changes: list[FileChange],
+    languages: list[str],
+) -> list[FileChange]`
+- `def analyze_changes(
+    config: Config,
+    since: str = "HEAD~1",
+    until: str = "HEAD",
+    cwd: Path | None = None,
+) -> ChangeAnalysis`
+- `def should_update_project_index(analysis: ChangeAnalysis, config: Config) -> bool`
+### invoker.py
+_AI CLI invoker - calls external AI CLI tools._
+**class** `class InvokeResult`
+> Result of invoking AI CLI.
+**Functions:**
+- `def clean_ai_output(output: str) -> str`
+- `def validate_markdown_output(output: str) -> bool`
+- `def format_prompt(
+    dir_path: Path,
+    files_info: str,
+    symbols_info: str,
+    imports_info: str,
+) -> str`
+- `def invoke_ai_cli(
+    command_template: str,
+    prompt: str,
+    timeout: int = 120,
+    dry_run: bool = False,
+) -> InvokeResult`
+- `def invoke_ai_cli_stdin(
+    command: str,
+    prompt: str,
+    timeout: int = 120,
+    dry_run: bool = False,
+) -> InvokeResult`
+### parallel.py
+_Parallel processing utilities for codeindex._
+**class** `class BatchResult`
+> Result of processing a batch of files.
+**Functions:**
+- `def parse_files_parallel(
+    files: List[Path],
+    config: Config,
+    quiet: bool = False
+) -> list[ParseResult]`
+- `def scan_directories_parallel(
+    directories: List[Path],
+    config: Config,
+    quiet: bool = False
+) -> List[Path]`
+### parser.py
+_Multi-language AST parser using tree-sitter._
+**class** `class Symbol`
+> Represents a code symbol (class, function, etc.).
+**class** `class Import`
+> Represents an import statement.
+**class** `class ParseResult`
+> Result of parsing a file.
+**Functions:**
+- `def _get_node_text(node, source_bytes: bytes) -> str`
+- `def _extract_docstring(node, source_bytes: bytes) -> str`
+- `def _parse_function(
+    node,
+    source_bytes: bytes,
+    class_name: str = "",
+    decorators: list[str] | None = None
+) -> Symbol`
+- `def _parse_class(node, source_bytes: bytes) -> list[Symbol]`
+- `def _parse_import(node, source_bytes: bytes) -> Import | None`
+- `def _extract_module_docstring(tree, source_bytes: bytes) -> str`
+- `def parse_file(path: Path) -> ParseResult`
+- `def parse_directory(paths: list[Path]) -> list[ParseResult]`
+- `def _get_language(file_path: Path) -> str`
+- `def _extract_php_docstring(node, source_bytes: bytes) -> str`
+- `def _parse_php_function(node, source_bytes: bytes, class_name: str = "") -> Symbol`
+- `def _parse_php_method(node, source_bytes: bytes, class_name: str) -> Symbol`
+_... and 5 more symbols_
+### scanner.py
+_Directory scanner for codeindex._
+**class** `class ScanResult`
+> Result of scanning a directory.
+**Functions:**
+- `def should_exclude(path: Path, exclude_patterns: list[str], base_path: Path) -> bool`
+- `def scan_directory(
+    path: Path,
+    config: Config,
+    base_path: Path | None = None,
+    recursive: bool = True
+) -> ScanResult`
+- `def find_all_directories(root: Path, config: Config) -> list[Path]`
+### semantic_extractor.py
+_Business Semantic Extractor
+Story 4.4: Extract business semantics from directory structure
+Task 4.4.5: KISS Universal Description Generator
+This mod_
+**class** `class DirectoryContext`
+> Context information about a directory
+    Used to collect information for semantic extraction.
+**class** `class BusinessSemantic`
+> Business semantic information
+    Extracted description of what a directory does.
+**class** `class SimpleDescriptionGenerator`
+> Universal description generator: zero assumptions, zero semantic understanding
+    Only extracts ob
+**class** `class SemanticExtractor`
+> Extract business semantics from directory context
+    Supports two modes:
+    - Heuristic mode: KIS
+**Methods:**
+- `def generate(self, context: DirectoryContext) -> str`
+- `def _extract_path_context(self, path: str) -> str`
+- `def _analyze_symbol_pattern(self, symbols: List[str]) -> str`
+- `def _pluralize(self, suffix: str) -> str`
+- `def _extract_entity_names(self, symbols: List[str]) -> List[str]`
+- `def extract_directory_semantic(
+        self,
+        context: DirectoryContext
+    ) -> BusinessSemantic`
+- `def _heuristic_extract(self, context: DirectoryContext) -> BusinessSemantic`
+- `def _ai_extract(self, context: DirectoryContext) -> BusinessSemantic`
+- `def _build_ai_prompt(self, context: DirectoryContext) -> str`
+- `def _parse_ai_response(self, response: str) -> BusinessSemantic`
+### smart_writer.py
+_Smart README writer with grouping, size limits, and hierarchical levels._
+**class** `class WriteResult`
+> Result of writing a README file.
+**class** `class SmartWriter`
+> Smart README writer that generates appropriate content based on level.
+    Levels:
+    - overview:
+**Methods:**
+- `def write_readme(
+        self,
+        dir_path: Path,
+        parse_results: list[ParseResult],
+        level: LevelType = "detailed",
+        child_dirs: list[Path] | None = None,
+        output_file: str = "README_AI.md",
+    ) -> WriteResult`
+- `def _generate_overview(
+        self,
+        dir_path: Path,
+        parse_results: list[ParseResult],
+        child_dirs: list[Path],
+    ) -> str`
+- `def _generate_navigation(
+        self,
+        dir_path: Path,
+        parse_results: list[ParseResult],
+        child_dirs: list[Path],
+    ) -> str`
+- `def _generate_detailed(
+        self,
+        dir_path: Path,
+        parse_results: list[ParseResult],
+        child_dirs: list[Path],
+    ) -> str`
+- `def _group_files(self, results: list[ParseResult]) -> dict[str, list[ParseResult]]`
+- `def _filter_symbols(self, symbols: list[Symbol]) -> list[Symbol]`
+- `def _get_key_symbols(self, symbols: list[Symbol]) -> list[Symbol]`
+- `def _extract_module_description(self, dir_path: Path, output_file: str = "README_AI.md") -> str`
+- `def _extract_module_description_semantic(
+        self,
+        dir_path: Path,
+        parse_result: Optional[ParseResult] = None
+    ) -> str`
+- `def _truncate_content(self, content: str, max_size: int) -> tuple[str, bool]`
+**Functions:**
+- `def determine_level(
+    dir_path: Path,
+    root_path: Path,
+    has_children: bool,
+    config: IndexingConfig,
+) -> LevelType`
+### symbol_index.py
+_Global symbol index generator for PROJECT_SYMBOLS.md._
+**class** `class SymbolEntry`
+> A symbol entry in the global index.
+**class** `class GlobalSymbolIndex`
+> Generates a global symbol index (PROJECT_SYMBOLS.md) for a project.
+    Collects all classes, funct
+**Methods:**
+- `def collect_symbols(self, quiet: bool = False) -> dict`
+- `def generate_index(self, output_file: str = "PROJECT_SYMBOLS.md") -> Path`
+- `def _group_by_type(self) -> dict[str, list[SymbolEntry]]`
+### symbol_scorer.py
+_Symbol importance scoring system.
+This module provides functionality to score symbols based on their importance,
+helping to prioritize which symbols _
+**class** `class ScoringContext`
+> Scoring context for symbols.
+    Attributes:
+        framework: The framework being used (e.g., 'th
+**class** `class SymbolImportanceScorer`
+> Score symbols by importance for inclusion in documentation.
+    This scorer evaluates symbols acros
+**Methods:**
+- `def _score_visibility(self, symbol: Symbol) -> float`
+- `def _score_semantics(self, symbol: Symbol) -> float`
+- `def _score_documentation(self, symbol: Symbol) -> float`
+- `def _score_complexity(self, symbol: Symbol) -> float`
+- `def _score_naming_pattern(self, symbol: Symbol) -> float`
+- `def score(self, symbol: Symbol) -> float`
+### tech_debt.py
+_Technical debt detection for code analysis.
+This module provides tools to detect and analyze technical debt in codebases,
+including file size issues,_
+**class** `class DebtSeverity(IntEnum)`
+> Severity levels for technical debt issues.
+    Lower values indicate higher severity (CRITICAL is m
+**class** `class DebtIssue`
+> Represents a technical debt issue detected in code.
+    Attributes:
+        severity: The severity
+**class** `class DebtAnalysisResult`
+> Result of analyzing a file for technical debt.
+    Attributes:
+        issues: List of detected tec
+**class** `class SymbolOverloadAnalysis`
+> Analysis result of symbol overload detection.
+    Attributes:
+        total_symbols: Total number o
+**class** `class FileReport`
+> Report for a single file's technical debt analysis.
+    Attributes:
+        file_path: Path to the
+**class** `class TechDebtReport`
+> Aggregate report for technical debt across multiple files.
+    Attributes:
+        file_reports: Li
+**class** `class TechDebtReporter`
+> Reporter for aggregating technical debt analysis across multiple files.
+    This class collects ana
+**class** `class TechDebtDetector`
+> Detector for technical debt in code.
+    This class analyzes parsed code to identify technical debt
+**Methods:**
+- `def add_file_result(
+        self,
+        file_path: Path,
+        debt_analysis: DebtAnalysisResult,
+        symbol_analysis: SymbolOverloadAnalysis | None = None,
+    )`
+- `def generate_report(self) -> TechDebtReport`
+- `def analyze_file(
+        self, parse_result: ParseResult, scorer: SymbolImportanceScorer
+    ) -> DebtAnalysisResult`
+- `def _detect_file_size_issues(self, parse_result: ParseResult) -> list[DebtIssue]`
+- `def _detect_god_class(self, parse_result: ParseResult) -> list[DebtIssue]`
+- `def _calculate_quality_score(
+        self, parse_result: ParseResult, issues: list[DebtIssue]
+    ) -> float`
+- `def analyze_symbol_overload(
+        self, parse_result: ParseResult, scorer: SymbolImportanceScorer
+    ) -> tuple[list[DebtIssue], SymbolOverloadAnalysis]`
+_... and 4 more symbols_
+### tech_debt_formatters.py
+_Formatters for technical debt reports.
+This module provides different output formatters for technical debt reports:
+- ConsoleFormatter: Human-readabl_
+**class** `class ReportFormatter(ABC)`
+> Abstract base class for report formatters.
+**class** `class ConsoleFormatter(ReportFormatter)`
+> Formatter for console output with ANSI colors.
+**class** `class MarkdownFormatter(ReportFormatter)`
+> Formatter for Markdown output.
+**class** `class JSONFormatter(ReportFormatter)`
+> Formatter for JSON output.
+**Methods:**
+- `def format(self, report: TechDebtReport) -> str`
+- `def _get_severity_color(self, severity: DebtSeverity) -> str`
+- `def format(self, report: TechDebtReport) -> str`
+- `def _format_issues_table(
+        self, report: TechDebtReport, severity: DebtSeverity
+    ) -> list[str]`
+- `def format(self, report: TechDebtReport) -> str`
+### writer.py
+_Markdown writer for README_AI.md files._
+**class** `class WriteResult`
+> Result of writing a README_AI.md file.
+**Functions:**
+- `def format_symbols_for_prompt(results: list[ParseResult]) -> str`
+- `def format_imports_for_prompt(results: list[ParseResult]) -> str`
+- `def format_files_for_prompt(results: list[ParseResult]) -> str`
+- `def write_readme(
+    dir_path: Path,
+    content: str,
+    output_file: str = "README_AI.md",
+) -> WriteResult`
+- `def generate_fallback_readme(
+    dir_path: Path,
+    results: list[ParseResult],
+    output_file: str = "README_AI.md",
+) -> WriteResult`
+## Dependencies
+- .adaptive_selector
+- .cli_common
+- .cli_config
+- .cli_scan
+- .cli_symbols
+- .cli_tech_debt
+- .config
+- .directory_tree
+- .framework_detect
+- .incremental
+- .invoker
+- .parallel
+- .parser
+- .scanner
+- .semantic_extractor
+- .smart_writer
+- .symbol_index
+- .symbol_scorer
+- .tech_debt
+- .tech_debt_formatters
+_... and 29 more_
+**Commit `d9c40ec`**: feat(json): implement ParseResult serialization (Story 1)
+Changed files:
+- `parser.py`
+**Commit `5a89ba2`**: feat(json): add --output json to scan and scan-all commands (Stories 2 & 3)
+Changed files:
+- `cli_scan.py`
+- `scanner.py`
+**Commit ``**:
+Changed files:
+- `cli_scan.py`
+- `errors.py`
+- `parser.py`
+---
+## Recent Changes
+**Commit ``**:
+Changed files:
+- `config.py`