codedocent 0.2.1__tar.gz → 0.4.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- codedocent-0.4.0/PKG-INFO +69 -0
- codedocent-0.4.0/README.md +59 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/analyzer.py +75 -18
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/cli.py +14 -4
- codedocent-0.4.0/codedocent/editor.py +173 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/gui.py +28 -9
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/parser.py +20 -2
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/quality.py +7 -55
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/renderer.py +6 -4
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/scanner.py +10 -2
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/server.py +179 -37
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/templates/interactive.html +24 -15
- codedocent-0.4.0/codedocent.egg-info/PKG-INFO +69 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/pyproject.toml +3 -2
- {codedocent-0.2.1 → codedocent-0.4.0}/tests/test_analyzer.py +130 -125
- codedocent-0.4.0/tests/test_editor.py +278 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/tests/test_parser.py +18 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/tests/test_renderer.py +40 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/tests/test_scanner.py +22 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/tests/test_server.py +270 -13
- codedocent-0.2.1/PKG-INFO +0 -16
- codedocent-0.2.1/README.md +0 -183
- codedocent-0.2.1/codedocent/editor.py +0 -89
- codedocent-0.2.1/codedocent.egg-info/PKG-INFO +0 -16
- codedocent-0.2.1/tests/test_editor.py +0 -113
- {codedocent-0.2.1 → codedocent-0.4.0}/LICENSE +0 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/__init__.py +0 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/__main__.py +0 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/ollama_utils.py +0 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent/templates/base.html +0 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent.egg-info/SOURCES.txt +0 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent.egg-info/dependency_links.txt +0 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent.egg-info/entry_points.txt +0 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent.egg-info/requires.txt +4 -4
- {codedocent-0.2.1 → codedocent-0.4.0}/codedocent.egg-info/top_level.txt +0 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/setup.cfg +0 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/tests/test_cli.py +0 -0
- {codedocent-0.2.1 → codedocent-0.4.0}/tests/test_gui.py +0 -0
|
@@ -0,0 +1,69 @@
|
|
|
1
|
+
Metadata-Version: 2.1
|
|
2
|
+
Name: codedocent
|
|
3
|
+
Version: 0.4.0
|
|
4
|
+
Summary: Code visualization for non-programmers
|
|
5
|
+
License: MIT
|
|
6
|
+
Requires-Python: >=3.10
|
|
7
|
+
Description-Content-Type: text/markdown
|
|
8
|
+
Provides-Extra: dev
|
|
9
|
+
License-File: LICENSE
|
|
10
|
+
|
|
11
|
+
# codedocent
|
|
12
|
+
|
|
13
|
+
<img width="1658" height="2158" alt="Screenshot_2026-02-09_13-17-06" src="https://github.com/user-attachments/assets/ff097ead-69ec-4618-b7b7-2b99c60ac57e" />
|
|
14
|
+
|
|
15
|
+
**Code visualization for non-programmers.**
|
|
16
|
+
|
|
17
|
+
A docent is a guide who explains things to people who aren't experts. Codedocent does that for code.
|
|
18
|
+
|
|
19
|
+
## The problem
|
|
20
|
+
|
|
21
|
+
You're staring at a codebase you didn't write — maybe thousands of files across dozens of directories — and you need to understand what it does. Reading every file isn't realistic. You need a way to visualize the code structure, get a high-level map of what's where, and drill into the parts that matter without losing context.
|
|
22
|
+
|
|
23
|
+
Codedocent parses the codebase into a navigable, visual block structure and explains each piece in plain English. It's an AI code analysis tool that runs entirely on your machine — no API keys, no cloud, no data leaving your laptop. Point it at any codebase and get a structural overview you can explore interactively, understand quickly, and share as a static HTML file.
|
|
24
|
+
|
|
25
|
+
## Who this is for
|
|
26
|
+
|
|
27
|
+
- **Developers onboarding onto an unfamiliar codebase** — get oriented in minutes instead of days
|
|
28
|
+
- **Non-programmers** (managers, designers, PMs) who need to understand what code does without reading it
|
|
29
|
+
- **Solo developers inheriting legacy code** — map out the structure before making changes
|
|
30
|
+
- **Code reviewers** who want a high-level overview before diving into details
|
|
31
|
+
- **Security reviewers** who need a structural map of an application
|
|
32
|
+
- **Students** learning to read and navigate real-world codebases
|
|
33
|
+
|
|
34
|
+
## What you see
|
|
35
|
+
|
|
36
|
+
Nested, color-coded blocks representing directories, files, classes, and functions — the entire structure of a codebase laid out visually. Each block shows a plain English summary, a pseudocode translation, and quality warnings (green/yellow/red). Click any block to drill down; breadcrumbs navigate you back up. You can export code from any block or paste replacement code back into the source file. All AI runs locally through Ollama — nothing leaves your machine.
|
|
37
|
+
|
|
38
|
+
## Install
|
|
39
|
+
|
|
40
|
+
```bash
|
|
41
|
+
pip install codedocent
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
Requires Python 3.10+ and [Ollama](https://ollama.com) running locally for AI features. Works without AI too (`--no-ai`).
|
|
45
|
+
|
|
46
|
+
## Quick start
|
|
47
|
+
|
|
48
|
+
```bash
|
|
49
|
+
codedocent # setup wizard — walks you through everything
|
|
50
|
+
codedocent /path/to/code # interactive mode (recommended)
|
|
51
|
+
codedocent /path/to/code --full # full analysis, static HTML output
|
|
52
|
+
codedocent --gui # graphical launcher
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
## How it works
|
|
56
|
+
|
|
57
|
+
Parses code structure with tree-sitter, scores quality with static analysis, and sends individual blocks to a local Ollama model for plain English summaries and pseudocode. Interactive mode analyzes on click — typically 1-2 seconds per block. Full mode analyzes everything upfront into a self-contained HTML file you can share.
|
|
58
|
+
|
|
59
|
+
## Why local
|
|
60
|
+
|
|
61
|
+
All AI processing runs through Ollama on your machine. Your code is never uploaded, transmitted, or stored anywhere external. No API keys, no accounts, no cloud services. This matters when you're working with proprietary code, client projects, or anything you can't share — codedocent works fully air-gapped. The `--no-ai` mode removes the AI dependency entirely while keeping the structural visualization and quality scoring.
|
|
62
|
+
|
|
63
|
+
## Supported languages
|
|
64
|
+
|
|
65
|
+
Full AST parsing for Python and JavaScript/TypeScript (functions, classes, methods, imports). File-level detection for 23 extensions including C, C++, Rust, Go, Java, Ruby, PHP, Swift, Kotlin, Scala, HTML, CSS, and config formats.
|
|
66
|
+
|
|
67
|
+
## License
|
|
68
|
+
|
|
69
|
+
MIT
|
|
@@ -0,0 +1,59 @@
|
|
|
1
|
+
# codedocent
|
|
2
|
+
|
|
3
|
+
<img width="1658" height="2158" alt="Screenshot_2026-02-09_13-17-06" src="https://github.com/user-attachments/assets/ff097ead-69ec-4618-b7b7-2b99c60ac57e" />
|
|
4
|
+
|
|
5
|
+
**Code visualization for non-programmers.**
|
|
6
|
+
|
|
7
|
+
A docent is a guide who explains things to people who aren't experts. Codedocent does that for code.
|
|
8
|
+
|
|
9
|
+
## The problem
|
|
10
|
+
|
|
11
|
+
You're staring at a codebase you didn't write — maybe thousands of files across dozens of directories — and you need to understand what it does. Reading every file isn't realistic. You need a way to visualize the code structure, get a high-level map of what's where, and drill into the parts that matter without losing context.
|
|
12
|
+
|
|
13
|
+
Codedocent parses the codebase into a navigable, visual block structure and explains each piece in plain English. It's an AI code analysis tool that runs entirely on your machine — no API keys, no cloud, no data leaving your laptop. Point it at any codebase and get a structural overview you can explore interactively, understand quickly, and share as a static HTML file.
|
|
14
|
+
|
|
15
|
+
## Who this is for
|
|
16
|
+
|
|
17
|
+
- **Developers onboarding onto an unfamiliar codebase** — get oriented in minutes instead of days
|
|
18
|
+
- **Non-programmers** (managers, designers, PMs) who need to understand what code does without reading it
|
|
19
|
+
- **Solo developers inheriting legacy code** — map out the structure before making changes
|
|
20
|
+
- **Code reviewers** who want a high-level overview before diving into details
|
|
21
|
+
- **Security reviewers** who need a structural map of an application
|
|
22
|
+
- **Students** learning to read and navigate real-world codebases
|
|
23
|
+
|
|
24
|
+
## What you see
|
|
25
|
+
|
|
26
|
+
Nested, color-coded blocks representing directories, files, classes, and functions — the entire structure of a codebase laid out visually. Each block shows a plain English summary, a pseudocode translation, and quality warnings (green/yellow/red). Click any block to drill down; breadcrumbs navigate you back up. You can export code from any block or paste replacement code back into the source file. All AI runs locally through Ollama — nothing leaves your machine.
|
|
27
|
+
|
|
28
|
+
## Install
|
|
29
|
+
|
|
30
|
+
```bash
|
|
31
|
+
pip install codedocent
|
|
32
|
+
```
|
|
33
|
+
|
|
34
|
+
Requires Python 3.10+ and [Ollama](https://ollama.com) running locally for AI features. Works without AI too (`--no-ai`).
|
|
35
|
+
|
|
36
|
+
## Quick start
|
|
37
|
+
|
|
38
|
+
```bash
|
|
39
|
+
codedocent # setup wizard — walks you through everything
|
|
40
|
+
codedocent /path/to/code # interactive mode (recommended)
|
|
41
|
+
codedocent /path/to/code --full # full analysis, static HTML output
|
|
42
|
+
codedocent --gui # graphical launcher
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
## How it works
|
|
46
|
+
|
|
47
|
+
Parses code structure with tree-sitter, scores quality with static analysis, and sends individual blocks to a local Ollama model for plain English summaries and pseudocode. Interactive mode analyzes on click — typically 1-2 seconds per block. Full mode analyzes everything upfront into a self-contained HTML file you can share.
|
|
48
|
+
|
|
49
|
+
## Why local
|
|
50
|
+
|
|
51
|
+
All AI processing runs through Ollama on your machine. Your code is never uploaded, transmitted, or stored anywhere external. No API keys, no accounts, no cloud services. This matters when you're working with proprietary code, client projects, or anything you can't share — codedocent works fully air-gapped. The `--no-ai` mode removes the AI dependency entirely while keeping the structural visualization and quality scoring.
|
|
52
|
+
|
|
53
|
+
## Supported languages
|
|
54
|
+
|
|
55
|
+
Full AST parsing for Python and JavaScript/TypeScript (functions, classes, methods, imports). File-level detection for 23 extensions including C, C++, Rust, Go, Java, Ruby, PHP, Swift, Kotlin, Scala, HTML, CSS, and config formats.
|
|
56
|
+
|
|
57
|
+
## License
|
|
58
|
+
|
|
59
|
+
MIT
|
|
@@ -7,6 +7,7 @@ import json
|
|
|
7
7
|
import os
|
|
8
8
|
import re
|
|
9
9
|
import sys
|
|
10
|
+
import tempfile
|
|
10
11
|
import threading
|
|
11
12
|
import time
|
|
12
13
|
from concurrent.futures import ThreadPoolExecutor, as_completed
|
|
@@ -28,6 +29,14 @@ MAX_SOURCE_LINES = 200
|
|
|
28
29
|
MIN_LINES_FOR_AI = 3
|
|
29
30
|
|
|
30
31
|
|
|
32
|
+
def _md5(data: bytes) -> "hashlib._Hash":
    """Create an MD5 hash, tolerating FIPS-mode Python builds.

    MD5 is used here only for cache keys and node ids, never for
    security, so ``usedforsecurity=False`` is passed when the
    interpreter supports it (required on FIPS-enabled builds).
    """
    try:
        # Python 3.9+ accepts the keyword; FIPS builds require it.
        return hashlib.md5(data, usedforsecurity=False)
    except TypeError:
        # Older interpreters: the keyword does not exist.
        return hashlib.md5(data)  # nosec B324
|
|
38
|
+
|
|
39
|
+
|
|
31
40
|
def _count_nodes(node: CodeNode) -> int:
|
|
32
41
|
"""Recursive count of all nodes in tree."""
|
|
33
42
|
return 1 + sum(_count_nodes(c) for c in node.children)
|
|
@@ -103,15 +112,34 @@ def _parse_ai_response(text: str) -> tuple[str, str]:
|
|
|
103
112
|
return summary, pseudocode
|
|
104
113
|
|
|
105
114
|
|
|
115
|
+
_AI_TIMEOUT = 120
|
|
116
|
+
|
|
117
|
+
|
|
106
118
|
def _summarize_with_ai(
|
|
107
119
|
node: CodeNode, model: str
|
|
108
|
-
) -> tuple[str, str]:
|
|
109
|
-
"""Call ollama to get summary and pseudocode for a node.
|
|
120
|
+
) -> tuple[str, str] | None:
|
|
121
|
+
"""Call ollama to get summary and pseudocode for a node.
|
|
122
|
+
|
|
123
|
+
Returns ``None`` if the AI call times out.
|
|
124
|
+
"""
|
|
110
125
|
prompt = _build_prompt(node, model)
|
|
111
|
-
|
|
112
|
-
|
|
126
|
+
pool = ThreadPoolExecutor(max_workers=1)
|
|
127
|
+
future = pool.submit(
|
|
128
|
+
ollama.chat,
|
|
129
|
+
model=model,
|
|
130
|
+
messages=[{"role": "user", "content": prompt}],
|
|
113
131
|
)
|
|
114
|
-
|
|
132
|
+
try:
|
|
133
|
+
response = future.result(timeout=_AI_TIMEOUT)
|
|
134
|
+
except TimeoutError:
|
|
135
|
+
future.cancel()
|
|
136
|
+
pool.shutdown(wait=False, cancel_futures=True)
|
|
137
|
+
return None
|
|
138
|
+
pool.shutdown(wait=False)
|
|
139
|
+
msg = getattr(response, "message", None)
|
|
140
|
+
if msg is None:
|
|
141
|
+
raise ValueError("Unexpected Ollama response format")
|
|
142
|
+
raw = getattr(msg, "content", None) or ""
|
|
115
143
|
raw = _strip_think_tags(raw)
|
|
116
144
|
# Garbage response fallback: empty or very short after stripping
|
|
117
145
|
if not raw or len(raw) < 10:
|
|
@@ -130,9 +158,7 @@ def _summarize_with_ai(
|
|
|
130
158
|
|
|
131
159
|
def _cache_key(node: CodeNode) -> str:
    """Generate a cache key based on filepath, name, and source hash.

    Hashing the node source means the key changes (and the cached
    summary is invalidated) whenever the underlying code changes.
    """
    digest = _md5(node.source.encode()).hexdigest()
    return f"{node.filepath}::{node.name}::{digest}"
|
|
137
163
|
|
|
138
164
|
|
|
@@ -149,12 +175,34 @@ def _load_cache(path: str) -> dict:
|
|
|
149
175
|
|
|
150
176
|
|
|
151
177
|
def _save_cache(path: str, data: dict) -> None:
|
|
152
|
-
"""Save cache to JSON file."""
|
|
178
|
+
"""Save cache to JSON file atomically."""
|
|
179
|
+
parent = os.path.dirname(os.path.abspath(path))
|
|
180
|
+
tmp_path: str | None = None
|
|
153
181
|
try:
|
|
154
|
-
|
|
155
|
-
|
|
182
|
+
fd = tempfile.NamedTemporaryFile( # pylint: disable=consider-using-with # noqa: E501
|
|
183
|
+
mode="w", encoding="utf-8",
|
|
184
|
+
dir=parent, delete=False, suffix=".tmp",
|
|
185
|
+
)
|
|
186
|
+
tmp_path = fd.name
|
|
187
|
+
try:
|
|
188
|
+
json.dump(data, fd, indent=2)
|
|
189
|
+
fd.flush()
|
|
190
|
+
os.fsync(fd.fileno())
|
|
191
|
+
finally:
|
|
192
|
+
fd.close()
|
|
193
|
+
os.replace(tmp_path, path)
|
|
194
|
+
tmp_path = None # success — don't clean up
|
|
156
195
|
except OSError as e:
|
|
157
|
-
print(
|
|
196
|
+
print(
|
|
197
|
+
f"Warning: could not save cache: {e}",
|
|
198
|
+
file=sys.stderr,
|
|
199
|
+
)
|
|
200
|
+
finally:
|
|
201
|
+
if tmp_path is not None:
|
|
202
|
+
try:
|
|
203
|
+
os.unlink(tmp_path)
|
|
204
|
+
except OSError:
|
|
205
|
+
pass
|
|
158
206
|
|
|
159
207
|
|
|
160
208
|
# ---------------------------------------------------------------------------
|
|
@@ -171,9 +219,7 @@ def assign_node_ids(root: CodeNode) -> dict[str, CodeNode]:
|
|
|
171
219
|
|
|
172
220
|
def _walk(node: CodeNode, path_parts: list[str]) -> None:
|
|
173
221
|
key = "::".join(path_parts)
|
|
174
|
-
node_id =
|
|
175
|
-
key.encode(), usedforsecurity=False
|
|
176
|
-
).hexdigest()[:12]
|
|
222
|
+
node_id = _md5(key.encode()).hexdigest()[:12]
|
|
177
223
|
node.node_id = node_id
|
|
178
224
|
lookup[node_id] = node
|
|
179
225
|
for child in node.children:
|
|
@@ -228,12 +274,19 @@ def analyze_single_node(node: CodeNode, model: str, cache_dir: str) -> None:
|
|
|
228
274
|
return
|
|
229
275
|
|
|
230
276
|
try:
|
|
231
|
-
|
|
277
|
+
result = _summarize_with_ai(node, model)
|
|
278
|
+
if result is None:
|
|
279
|
+
node.summary = "Summary timed out"
|
|
280
|
+
return
|
|
281
|
+
summary, pseudocode = result
|
|
232
282
|
node.summary = summary
|
|
233
283
|
node.pseudocode = pseudocode
|
|
234
284
|
cache["entries"][key] = {"summary": summary, "pseudocode": pseudocode}
|
|
235
285
|
_save_cache(cache_path, cache)
|
|
236
|
-
except (
|
|
286
|
+
except (
|
|
287
|
+
ConnectionError, RuntimeError, ValueError,
|
|
288
|
+
OSError, AttributeError, TypeError,
|
|
289
|
+
) as e:
|
|
237
290
|
node.summary = f"Summary generation failed: {e}"
|
|
238
291
|
|
|
239
292
|
|
|
@@ -331,7 +384,11 @@ def _run_ai_batch(
|
|
|
331
384
|
return
|
|
332
385
|
_progress(f"Analyzing {node.name}")
|
|
333
386
|
try:
|
|
334
|
-
|
|
387
|
+
result = _summarize_with_ai(node, model)
|
|
388
|
+
if result is None:
|
|
389
|
+
node.summary = "Summary timed out"
|
|
390
|
+
return
|
|
391
|
+
summary, pseudocode = result
|
|
335
392
|
with cache_lock:
|
|
336
393
|
node.summary = summary
|
|
337
394
|
node.pseudocode = pseudocode
|
|
@@ -4,6 +4,7 @@ from __future__ import annotations
|
|
|
4
4
|
|
|
5
5
|
import argparse
|
|
6
6
|
import os
|
|
7
|
+
import sys
|
|
7
8
|
|
|
8
9
|
from codedocent.ollama_utils import check_ollama, fetch_ollama_models
|
|
9
10
|
from codedocent.parser import CodeNode, parse_directory
|
|
@@ -15,6 +16,15 @@ _check_ollama = check_ollama
|
|
|
15
16
|
_fetch_ollama_models = fetch_ollama_models
|
|
16
17
|
|
|
17
18
|
|
|
19
|
+
def _safe_input(prompt: str) -> str:
    """Wrap input() to handle EOF gracefully.

    When stdin is closed (piped input exhausted, Ctrl-D), exit with
    status 0 instead of crashing with an EOFError traceback.
    """
    try:
        return input(prompt)
    except EOFError:
        print("\nInput closed. Exiting.")
        sys.exit(0)
|
|
26
|
+
|
|
27
|
+
|
|
18
28
|
def print_tree(node: CodeNode, indent: int = 0) -> None:
|
|
19
29
|
"""Print a text representation of the code tree."""
|
|
20
30
|
prefix = " " * indent
|
|
@@ -44,7 +54,7 @@ def print_tree(node: CodeNode, indent: int = 0) -> None:
|
|
|
44
54
|
def _ask_folder() -> str:
|
|
45
55
|
"""Prompt for a valid folder path, re-asking on invalid input."""
|
|
46
56
|
while True:
|
|
47
|
-
path =
|
|
57
|
+
path = _safe_input("What folder do you want to analyze? ").strip()
|
|
48
58
|
path = os.path.expanduser(path)
|
|
49
59
|
if os.path.isdir(path):
|
|
50
60
|
file_count = len(list(scan_directory(path)))
|
|
@@ -55,7 +65,7 @@ def _ask_folder() -> str:
|
|
|
55
65
|
|
|
56
66
|
def _ask_no_ai_fallback() -> bool:
|
|
57
67
|
"""Ask user whether to continue without AI. Returns True for no-ai."""
|
|
58
|
-
fallback =
|
|
68
|
+
fallback = _safe_input("Continue without AI? [Y/n]: ").strip().lower()
|
|
59
69
|
if fallback in ("", "y", "yes"):
|
|
60
70
|
return True
|
|
61
71
|
raise SystemExit(0)
|
|
@@ -66,7 +76,7 @@ def _pick_model(models: list[str]) -> str:
|
|
|
66
76
|
print("Available models:")
|
|
67
77
|
for i, m in enumerate(models, 1):
|
|
68
78
|
print(f" {i}. {m}")
|
|
69
|
-
choice =
|
|
79
|
+
choice = _safe_input("Which model? [1]: ").strip()
|
|
70
80
|
if choice == "":
|
|
71
81
|
return models[0]
|
|
72
82
|
try:
|
|
@@ -106,7 +116,7 @@ def _run_wizard() -> argparse.Namespace:
|
|
|
106
116
|
print(" 1. Interactive \u2014 browse in browser [default]")
|
|
107
117
|
print(" 2. Full export \u2014 analyze everything, save HTML")
|
|
108
118
|
print(" 3. Text tree \u2014 plain text in terminal")
|
|
109
|
-
mode_choice =
|
|
119
|
+
mode_choice = _safe_input("Choice [1]: ").strip()
|
|
110
120
|
|
|
111
121
|
text = mode_choice == "3"
|
|
112
122
|
full = mode_choice == "2"
|
|
@@ -0,0 +1,173 @@
|
|
|
1
|
+
"""Code replacement: write modified source back into a file."""
|
|
2
|
+
|
|
3
|
+
from __future__ import annotations
|
|
4
|
+
|
|
5
|
+
import os
|
|
6
|
+
import shutil
|
|
7
|
+
import tempfile
|
|
8
|
+
from datetime import datetime
|
|
9
|
+
|
|
10
|
+
|
|
11
|
+
def _read_and_validate(
    filepath: str, start_line: int, end_line: int,
) -> tuple[list[str] | None, str | None, tuple[int, int], str]:
    """Read *filepath* and validate the line range.

    Returns ``(lines, None, (mtime_ns, size), line_ending)`` on success,
    or ``(None, error_message, (0, 0), "\\n")`` on failure.
    """
    # Existence check first so the error message is specific.
    if not os.path.isfile(filepath):
        return (None, f"File not found: {filepath}", (0, 0), "\n")
    # Reject non-int, non-positive, or inverted ranges up front.
    if (
        not isinstance(start_line, int)
        or not isinstance(end_line, int)
        or start_line < 1
        or end_line < 1
        or start_line > end_line
    ):
        return (
            None,
            f"Invalid line range: {start_line}-{end_line}",
            (0, 0), "\n",
        )
    try:
        with open(filepath, "rb") as f:
            raw = f.read()
        # Snapshot (mtime_ns, size) so the writer can later detect
        # external modification between read and write.
        _stat = os.stat(filepath)
        file_stamp = (_stat.st_mtime_ns, _stat.st_size)
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return (None, "File is not valid UTF-8 text", (0, 0), "\n")

    # Detect line ending style: CRLF vs LF
    # (majority vote — mixed files follow whichever style dominates).
    crlf_count = text.count("\r\n")
    lf_count = text.count("\n") - crlf_count
    line_ending = "\r\n" if crlf_count > lf_count else "\n"

    # keepends=True so joins/writes preserve each line's own ending.
    lines = text.splitlines(True)
    if end_line > len(lines):
        return (
            None,
            f"end_line {end_line} exceeds file length"
            f" ({len(lines)} lines)",
            (0, 0), "\n",
        )
    return (lines, None, file_stamp, line_ending)
|
|
56
|
+
|
|
57
|
+
|
|
58
|
+
def _write_with_backup(
    filepath: str, lines: list[str], file_stamp: tuple[int, int],
) -> None:
    """Create a timestamped ``.bak`` backup and write *lines* back.

    *file_stamp* is ``(st_mtime_ns, st_size)`` from the initial read.
    Raises ``OSError`` if the file was modified externally since the
    last read, if the backup could not be created, or on write failure.
    """
    # Abort if the file changed since _read_and_validate snapshotted it.
    _stat = os.stat(filepath)
    if (_stat.st_mtime_ns, _stat.st_size) != file_stamp:
        raise OSError("File was modified externally since last read")

    # Backup name carries a microsecond timestamp for uniqueness.
    now = datetime.now()
    backup_path = (
        filepath + ".bak."
        + now.strftime("%Y%m%dT%H%M%S") + f".{now.microsecond:06d}"
    )

    # Reserve the backup name exclusively (O_EXCL) so concurrent runs
    # cannot clobber each other; O_NOFOLLOW (where available) prevents
    # writing through a planted symlink.
    flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY
    if hasattr(os, "O_NOFOLLOW"):
        flags |= os.O_NOFOLLOW
    try:
        fd_bak = os.open(backup_path, flags, 0o600)
        os.close(fd_bak)
    except FileExistsError:
        # Timestamp collision: probe numbered suffixes .1 … .99.
        for i in range(1, 100):
            candidate = backup_path + f".{i}"
            try:
                fd_bak = os.open(candidate, flags, 0o600)
                os.close(fd_bak)
                backup_path = candidate
                break
            except FileExistsError:
                continue
        else:
            raise OSError("Cannot create unique backup path") from None

    # copy2 preserves metadata (mtime, permissions) on the backup.
    shutil.copy2(filepath, backup_path)

    if not os.path.exists(backup_path):
        raise OSError(
            "Backup creation failed: "
            f"{backup_path} does not exist"
        )

    # Atomic write: temp file in the same directory, then os.replace
    # (same-filesystem rename, so readers never see a partial file).
    parent_dir = os.path.dirname(os.path.abspath(filepath))
    fd = tempfile.NamedTemporaryFile(  # pylint: disable=consider-using-with
        mode="wb",
        dir=parent_dir, delete=False, suffix=".tmp",
    )
    tmp_path = fd.name
    try:
        for line in lines:
            fd.write(line.encode("utf-8"))
        fd.flush()
        os.fsync(fd.fileno())
        fd.close()
        # Carry the original file mode over to the replacement.
        orig_mode = os.stat(filepath).st_mode
        os.chmod(tmp_path, orig_mode)
        os.replace(tmp_path, filepath)
    except BaseException:
        # Clean up the temp file on any failure, incl. KeyboardInterrupt.
        fd.close()
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
|
|
126
|
+
|
|
127
|
+
|
|
128
|
+
def replace_block_source(
    filepath: str,
    start_line: int,
    end_line: int,
    new_source: str,
) -> dict:
    """Replace lines *start_line* through *end_line* (1-indexed, inclusive).

    Creates a timestamped ``.bak`` backup before writing. Returns a
    result dict with ``success``, ``lines_before``, ``lines_after`` on
    success, or ``success=False`` and ``error`` on failure.
    """
    if not isinstance(new_source, str):
        return {"success": False, "error": "new_source must be a string"}

    lines, error, file_stamp, line_ending = _read_and_validate(
        filepath, start_line, end_line,
    )
    if lines is None:
        return {"success": False, "error": error}

    try:
        # Normalize each replacement line to the file's ending style;
        # empty source deletes the range outright.
        if new_source:
            replacement = [
                chunk.rstrip("\r\n") + line_ending
                for chunk in new_source.splitlines(True)
            ]
        else:
            replacement = []

        # Splice in place (slice assignment handles grow/shrink).
        lines[start_line - 1:end_line] = replacement

        _write_with_backup(filepath, lines, file_stamp)
    except OSError as exc:
        return {"success": False, "error": str(exc)}

    return {
        "success": True,
        "lines_before": end_line - start_line + 1,
        "lines_after": len(replacement),
    }
|
|
@@ -4,6 +4,7 @@ from __future__ import annotations
|
|
|
4
4
|
|
|
5
5
|
import subprocess # nosec B404
|
|
6
6
|
import sys
|
|
7
|
+
import threading
|
|
7
8
|
|
|
8
9
|
from codedocent.ollama_utils import check_ollama, fetch_ollama_models
|
|
9
10
|
|
|
@@ -41,19 +42,35 @@ def _create_folder_row(frame: ttk.Frame) -> tk.StringVar:
|
|
|
41
42
|
return folder_var
|
|
42
43
|
|
|
43
44
|
|
|
44
|
-
def _create_model_row(
|
|
45
|
+
def _create_model_row(
|
|
46
|
+
frame: ttk.Frame, root: tk.Tk,
|
|
47
|
+
) -> tk.StringVar:
|
|
45
48
|
"""Create the model-dropdown row and return the StringVar."""
|
|
46
49
|
ttk.Label(frame, text="Model:").grid(
|
|
47
50
|
row=2, column=0, sticky="w", pady=(12, 4),
|
|
48
51
|
)
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
model_var = tk.StringVar(value=model_values[0])
|
|
53
|
-
ttk.Combobox(
|
|
54
|
-
frame, textvariable=model_var, values=model_values,
|
|
52
|
+
model_var = tk.StringVar(value="Checking...")
|
|
53
|
+
combo = ttk.Combobox(
|
|
54
|
+
frame, textvariable=model_var, values=["Checking..."],
|
|
55
55
|
state="readonly", width=37,
|
|
56
|
-
)
|
|
56
|
+
)
|
|
57
|
+
combo.grid(row=3, column=0, columnspan=2, sticky="ew")
|
|
58
|
+
|
|
59
|
+
def _bg_fetch() -> None:
|
|
60
|
+
try:
|
|
61
|
+
ollama_ok = _check_ollama()
|
|
62
|
+
models = _fetch_ollama_models() if ollama_ok else []
|
|
63
|
+
except Exception: # pylint: disable=broad-exception-caught
|
|
64
|
+
models = []
|
|
65
|
+
model_values = models if models else ["No AI"]
|
|
66
|
+
|
|
67
|
+
def _update_ui() -> None:
|
|
68
|
+
combo["values"] = model_values
|
|
69
|
+
model_var.set(model_values[0])
|
|
70
|
+
|
|
71
|
+
root.after(0, _update_ui)
|
|
72
|
+
|
|
73
|
+
threading.Thread(target=_bg_fetch, daemon=True).start()
|
|
57
74
|
return model_var
|
|
58
75
|
|
|
59
76
|
|
|
@@ -90,6 +107,8 @@ def _create_go_button(
|
|
|
90
107
|
cmd = [sys.executable, "-m", "codedocent", folder]
|
|
91
108
|
|
|
92
109
|
selected_model = model_var.get()
|
|
110
|
+
if selected_model == "Checking...":
|
|
111
|
+
return
|
|
93
112
|
if selected_model == "No AI":
|
|
94
113
|
cmd.append("--no-ai")
|
|
95
114
|
else:
|
|
@@ -119,7 +138,7 @@ def _build_gui() -> None:
|
|
|
119
138
|
frame.grid(row=0, column=0, sticky="nsew")
|
|
120
139
|
|
|
121
140
|
folder_var = _create_folder_row(frame)
|
|
122
|
-
model_var = _create_model_row(frame)
|
|
141
|
+
model_var = _create_model_row(frame, root)
|
|
123
142
|
mode_var = _create_mode_row(frame)
|
|
124
143
|
_create_go_button(frame, root, folder_var, model_var, mode_var)
|
|
125
144
|
|
|
@@ -66,6 +66,19 @@ _METHOD_TYPES: dict[str, dict[str, str]] = {
|
|
|
66
66
|
}
|
|
67
67
|
|
|
68
68
|
|
|
69
|
+
def _unwrap_exports(root_node) -> list:
|
|
70
|
+
"""Yield top-level children, unwrapping export_statement nodes."""
|
|
71
|
+
result = []
|
|
72
|
+
for child in root_node.children:
|
|
73
|
+
if child.type == "export_statement":
|
|
74
|
+
for inner in child.children:
|
|
75
|
+
if inner.type not in ("export", "default", ",", ";"):
|
|
76
|
+
result.append(inner)
|
|
77
|
+
else:
|
|
78
|
+
result.append(child)
|
|
79
|
+
return result
|
|
80
|
+
|
|
81
|
+
|
|
69
82
|
def _rules_for(language: str) -> dict[str, tuple[str, str]]:
|
|
70
83
|
"""Return AST extraction rules for the given language."""
|
|
71
84
|
if language == "python":
|
|
@@ -126,7 +139,7 @@ def _extract_arrow_functions(root_node, language: str) -> list[CodeNode]:
|
|
|
126
139
|
if language not in ("javascript", "typescript", "tsx"):
|
|
127
140
|
return []
|
|
128
141
|
results: list[CodeNode] = []
|
|
129
|
-
for child in root_node
|
|
142
|
+
for child in _unwrap_exports(root_node):
|
|
130
143
|
if child.type != "lexical_declaration":
|
|
131
144
|
continue
|
|
132
145
|
for decl in child.children:
|
|
@@ -210,7 +223,12 @@ def _extract_top_level_nodes(
|
|
|
210
223
|
) -> list[CodeNode]:
|
|
211
224
|
"""Walk AST top-level children, create CodeNodes, attach methods."""
|
|
212
225
|
children: list[CodeNode] = []
|
|
213
|
-
|
|
226
|
+
top_children = (
|
|
227
|
+
_unwrap_exports(root_node)
|
|
228
|
+
if language in ("javascript", "typescript", "tsx")
|
|
229
|
+
else root_node.children
|
|
230
|
+
)
|
|
231
|
+
for child in top_children:
|
|
214
232
|
if child.type in rules:
|
|
215
233
|
our_type, name_child = rules[child.type]
|
|
216
234
|
node = CodeNode(
|