graphifyy 0.2.0 (tar.gz) → 0.2.2 (tar.gz)

Files changed (54)

{graphifyy-0.2.0 → graphifyy-0.2.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: graphifyy
-Version: 0.2.0
+Version: 0.2.2
 Summary: Claude Code skill - turn any folder of code, docs, papers, images, or tweets into a queryable knowledge graph
 License: MIT
 Project-URL: Homepage, https://github.com/safishamsi/graphify
@@ -10,7 +10,6 @@ Keywords: claude,claude-code,knowledge-graph,rag,graphrag,obsidian,community-det
 Requires-Python: >=3.10
 Description-Content-Type: text/markdown
 Requires-Dist: networkx
-Requires-Dist: graspologic
 Requires-Dist: tree-sitter
 Requires-Dist: tree-sitter-python
 Requires-Dist: tree-sitter-javascript
@@ -34,16 +33,20 @@ Requires-Dist: pypdf; extra == "pdf"
 Requires-Dist: html2text; extra == "pdf"
 Provides-Extra: watch
 Requires-Dist: watchdog; extra == "watch"
+Provides-Extra: leiden
+Requires-Dist: graspologic; extra == "leiden"
 Provides-Extra: all
 Requires-Dist: mcp; extra == "all"
 Requires-Dist: neo4j; extra == "all"
 Requires-Dist: pypdf; extra == "all"
 Requires-Dist: html2text; extra == "all"
 Requires-Dist: watchdog; extra == "all"
+Requires-Dist: graspologic; extra == "all"
 # graphify
-[![CI](https://github.com/safishamsi/graphify/actions/workflows/ci.yml/badge.svg?branch=v1)](https://github.com/safishamsi/graphify/actions/workflows/ci.yml)
+[![CI](https://github.com/safishamsi/graphify/actions/workflows/ci.yml/badge.svg?branch=v2)](https://github.com/safishamsi/graphify/actions/workflows/ci.yml)
+[![PyPI](https://img.shields.io/pypi/v/graphifyy)](https://pypi.org/project/graphifyy/)
 **A Claude Code skill.** Type `/graphify` in Claude Code - it reads your files, builds a knowledge graph, and gives you back structure you didn't know was there. Understand a codebase faster. Find the "why" behind architectural decisions.
@@ -63,6 +66,12 @@ graphify-out/
 └── cache/           SHA256 cache - re-runs only process changed files
 ```
+## How it works
+graphify runs in two passes. First, a deterministic AST pass extracts structure from code files (classes, functions, imports, call graphs, docstrings, rationale comments) with no LLM needed. Second, Claude subagents run in parallel over docs, papers, and images to extract concepts, relationships, and design rationale. The results are merged into a NetworkX graph, clustered with Leiden community detection, and exported as interactive HTML, queryable JSON, and a plain-language audit report.
+Every relationship is tagged `EXTRACTED` (found directly in source), `INFERRED` (reasonable inference, with a confidence score), or `AMBIGUOUS` (flagged for review). You always know what was found vs guessed.
 ## Install
 **Requires:** [Claude Code](https://claude.ai/code) and Python 3.10+
@@ -79,12 +88,30 @@ Then open Claude Code in any directory and type:
 /graphify .
 ```
+### Make Claude always use the graph (recommended)
+After building a graph, run this once in your project:
+```bash
+graphify claude install
+```
+This does two things:
+1. **CLAUDE.md rules** - tells Claude to read `graphify-out/GRAPH_REPORT.md` before answering architecture questions, and to rebuild the graph after editing code files.
+2. **PreToolUse hook** (`settings.json`) - fires automatically before every Glob and Grep call. If a knowledge graph exists, Claude sees: _"graphify: Knowledge graph exists. Read GRAPH_REPORT.md for god nodes and community structure before searching raw files."_ This means Claude navigates via the graph instead of grepping through every file - faster answers, fewer wasted tool calls, and responses grounded in the actual structure of your codebase rather than keyword matches.
+Without this, Claude will grep raw files by default even when a graph exists. With it, the graph becomes the first thing Claude reaches for.
+Uninstall with `graphify claude uninstall`.
 <details>
 <summary>Manual install (curl)</summary>
 ```bash
 mkdir -p ~/.claude/skills/graphify
-curl -fsSL https://raw.githubusercontent.com/safishamsi/graphify/v1/skills/graphify/skill.md \
+curl -fsSL https://raw.githubusercontent.com/safishamsi/graphify/v2/graphify/skill.md \
   > ~/.claude/skills/graphify/SKILL.md
 ```
@@ -121,14 +148,14 @@ When the user types `/graphify`, invoke the Skill tool with `skill: "graphify"`
 /graphify ./raw --mcp              # start MCP stdio server
 graphify hook install              # git hooks - rebuilds graph on commit and branch switch
-graphify claude install            # write graphify rules to local CLAUDE.md + install PreToolUse hook
+graphify claude install            # always-on: CLAUDE.md + PreToolUse hook for this project
 ```
 Works with any mix of file types:
 | Type | Extensions | Extraction |
 |------|-----------|------------|
-| Code | `.py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php` | AST via tree-sitter + call-graph pass + docstring/comment rationale |
+| Code | `.py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php` | AST via tree-sitter + call-graph + docstring/comment rationale |
 | Docs | `.md .txt .rst` | Concepts + relationships + design rationale via Claude |
 | Papers | `.pdf` | Citation mining + concept extraction |
 | Images | `.png .jpg .webp .gif` | Claude vision - screenshots, diagrams, any language |
@@ -145,7 +172,7 @@ Works with any mix of file types:
 **Confidence scores** - every INFERRED edge has a `confidence_score` (0.0-1.0). You know not just what was guessed but how confident the model was. EXTRACTED edges are always 1.0.
-**Semantic similarity edges** - cross-file conceptual links that have no structural connection. Two functions solving the same problem without calling each other, a class in code and a concept in a paper describing the same algorithm.
+**Semantic similarity edges** - cross-file conceptual links with no structural connection. Two functions solving the same problem without calling each other, a class in code and a concept in a paper describing the same algorithm.
 **Hyperedges** - group relationships connecting 3+ nodes that pairwise edges can't express. All classes implementing a shared protocol, all functions in an auth flow, all concepts from a paper section forming one idea.
@@ -155,12 +182,8 @@ Works with any mix of file types:
 **Git hooks** (`graphify hook install`) - installs post-commit and post-checkout hooks. Graph rebuilds automatically after every commit and every branch switch. No background process needed.
-**Always-on for Claude** (`graphify claude install`) - writes a `CLAUDE.md` section so Claude checks the graph before answering architecture questions, plus a `.claude/settings.json` PreToolUse hook that fires before every Glob/Grep - Claude is reminded to check the graph before searching raw files.
 **Wiki** (`--wiki`) - Wikipedia-style markdown articles per community and god node, with an `index.md` entry point. Point any agent at `index.md` and it can navigate the knowledge base by reading files instead of parsing JSON.
-Every edge is tagged `EXTRACTED`, `INFERRED`, or `AMBIGUOUS` - you always know what was found vs guessed.
 ## Worked examples
 | Corpus | Files | Reduction | Output |
@@ -171,6 +194,10 @@ Every edge is tagged `EXTRACTED`, `INFERRED`, or `AMBIGUOUS` - you always know w
 Token reduction scales with corpus size. 6 files fits in a context window anyway, so graph value there is structural clarity, not compression. At 52 files (code + papers + images) you get 71x+. Each `worked/` folder has the raw input files and the actual output (`GRAPH_REPORT.md`, `graph.json`) so you can run it yourself and verify the numbers.
+## Privacy
+graphify sends file contents to the Claude API (Anthropic) for semantic extraction of docs, papers, and images. Code files are processed locally via tree-sitter AST — no file contents leave your machine for code. No telemetry, usage tracking, or analytics of any kind. The only network calls are to Anthropic's API during extraction, using your own API key via Claude Code.
 ## Tech stack
 NetworkX + Leiden (graspologic) + tree-sitter + Claude + vis.js. No Neo4j required, no server, runs entirely locally.

{graphifyy-0.2.0 → graphifyy-0.2.2}/README.md RENAMED

@@ -1,6 +1,7 @@
 # graphify
-[![CI](https://github.com/safishamsi/graphify/actions/workflows/ci.yml/badge.svg?branch=v1)](https://github.com/safishamsi/graphify/actions/workflows/ci.yml)
+[![CI](https://github.com/safishamsi/graphify/actions/workflows/ci.yml/badge.svg?branch=v2)](https://github.com/safishamsi/graphify/actions/workflows/ci.yml)
+[![PyPI](https://img.shields.io/pypi/v/graphifyy)](https://pypi.org/project/graphifyy/)
 **A Claude Code skill.** Type `/graphify` in Claude Code - it reads your files, builds a knowledge graph, and gives you back structure you didn't know was there. Understand a codebase faster. Find the "why" behind architectural decisions.
@@ -20,6 +21,12 @@ graphify-out/
 └── cache/           SHA256 cache - re-runs only process changed files
 ```
+## How it works
+graphify runs in two passes. First, a deterministic AST pass extracts structure from code files (classes, functions, imports, call graphs, docstrings, rationale comments) with no LLM needed. Second, Claude subagents run in parallel over docs, papers, and images to extract concepts, relationships, and design rationale. The results are merged into a NetworkX graph, clustered with Leiden community detection, and exported as interactive HTML, queryable JSON, and a plain-language audit report.
+Every relationship is tagged `EXTRACTED` (found directly in source), `INFERRED` (reasonable inference, with a confidence score), or `AMBIGUOUS` (flagged for review). You always know what was found vs guessed.
 ## Install
 **Requires:** [Claude Code](https://claude.ai/code) and Python 3.10+
@@ -36,12 +43,30 @@ Then open Claude Code in any directory and type:
 /graphify .
 ```
+### Make Claude always use the graph (recommended)
+After building a graph, run this once in your project:
+```bash
+graphify claude install
+```
+This does two things:
+1. **CLAUDE.md rules** - tells Claude to read `graphify-out/GRAPH_REPORT.md` before answering architecture questions, and to rebuild the graph after editing code files.
+2. **PreToolUse hook** (`settings.json`) - fires automatically before every Glob and Grep call. If a knowledge graph exists, Claude sees: _"graphify: Knowledge graph exists. Read GRAPH_REPORT.md for god nodes and community structure before searching raw files."_ This means Claude navigates via the graph instead of grepping through every file - faster answers, fewer wasted tool calls, and responses grounded in the actual structure of your codebase rather than keyword matches.
+Without this, Claude will grep raw files by default even when a graph exists. With it, the graph becomes the first thing Claude reaches for.
+Uninstall with `graphify claude uninstall`.
 <details>
 <summary>Manual install (curl)</summary>
 ```bash
 mkdir -p ~/.claude/skills/graphify
-curl -fsSL https://raw.githubusercontent.com/safishamsi/graphify/v1/skills/graphify/skill.md \
+curl -fsSL https://raw.githubusercontent.com/safishamsi/graphify/v2/graphify/skill.md \
   > ~/.claude/skills/graphify/SKILL.md
 ```
@@ -78,14 +103,14 @@ When the user types `/graphify`, invoke the Skill tool with `skill: "graphify"`
 /graphify ./raw --mcp              # start MCP stdio server
 graphify hook install              # git hooks - rebuilds graph on commit and branch switch
-graphify claude install            # write graphify rules to local CLAUDE.md + install PreToolUse hook
+graphify claude install            # always-on: CLAUDE.md + PreToolUse hook for this project
 ```
 Works with any mix of file types:
 | Type | Extensions | Extraction |
 |------|-----------|------------|
-| Code | `.py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php` | AST via tree-sitter + call-graph pass + docstring/comment rationale |
+| Code | `.py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php` | AST via tree-sitter + call-graph + docstring/comment rationale |
 | Docs | `.md .txt .rst` | Concepts + relationships + design rationale via Claude |
 | Papers | `.pdf` | Citation mining + concept extraction |
 | Images | `.png .jpg .webp .gif` | Claude vision - screenshots, diagrams, any language |
@@ -102,7 +127,7 @@ Works with any mix of file types:
 **Confidence scores** - every INFERRED edge has a `confidence_score` (0.0-1.0). You know not just what was guessed but how confident the model was. EXTRACTED edges are always 1.0.
-**Semantic similarity edges** - cross-file conceptual links that have no structural connection. Two functions solving the same problem without calling each other, a class in code and a concept in a paper describing the same algorithm.
+**Semantic similarity edges** - cross-file conceptual links with no structural connection. Two functions solving the same problem without calling each other, a class in code and a concept in a paper describing the same algorithm.
 **Hyperedges** - group relationships connecting 3+ nodes that pairwise edges can't express. All classes implementing a shared protocol, all functions in an auth flow, all concepts from a paper section forming one idea.
@@ -112,12 +137,8 @@ Works with any mix of file types:
 **Git hooks** (`graphify hook install`) - installs post-commit and post-checkout hooks. Graph rebuilds automatically after every commit and every branch switch. No background process needed.
-**Always-on for Claude** (`graphify claude install`) - writes a `CLAUDE.md` section so Claude checks the graph before answering architecture questions, plus a `.claude/settings.json` PreToolUse hook that fires before every Glob/Grep - Claude is reminded to check the graph before searching raw files.
 **Wiki** (`--wiki`) - Wikipedia-style markdown articles per community and god node, with an `index.md` entry point. Point any agent at `index.md` and it can navigate the knowledge base by reading files instead of parsing JSON.
-Every edge is tagged `EXTRACTED`, `INFERRED`, or `AMBIGUOUS` - you always know what was found vs guessed.
 ## Worked examples
 | Corpus | Files | Reduction | Output |
@@ -128,6 +149,10 @@ Every edge is tagged `EXTRACTED`, `INFERRED`, or `AMBIGUOUS` - you always know w
 Token reduction scales with corpus size. 6 files fits in a context window anyway, so graph value there is structural clarity, not compression. At 52 files (code + papers + images) you get 71x+. Each `worked/` folder has the raw input files and the actual output (`GRAPH_REPORT.md`, `graph.json`) so you can run it yourself and verify the numbers.
+## Privacy
+graphify sends file contents to the Claude API (Anthropic) for semantic extraction of docs, papers, and images. Code files are processed locally via tree-sitter AST — no file contents leave your machine for code. No telemetry, usage tracking, or analytics of any kind. The only network calls are to Anthropic's API during extraction, using your own API key via Claude Code.
 ## Tech stack
 NetworkX + Leiden (graspologic) + tree-sitter + Claude + vis.js. No Neo4j required, no server, runs entirely locally.
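
The README text above says every relationship carries one of three tags and that `INFERRED` edges have a `confidence_score` between 0.0 and 1.0. A minimal sketch of how a consumer might filter on those fields follows. The edge dictionaries, the `confidence` key name, and the node names are assumptions for illustration; the actual `graph.json` schema is not shown in this diff.

```python
# Hypothetical edge records. Only the three tag values and the
# `confidence_score` field name come from the README; everything else is assumed.
edges = [
    {"source": "auth.login", "target": "db.get_user",
     "confidence": "EXTRACTED", "confidence_score": 1.0},
    {"source": "auth.login", "target": "paper.oauth_flow",
     "confidence": "INFERRED", "confidence_score": 0.82},
    {"source": "cache.evict", "target": "paper.lru",
     "confidence": "INFERRED", "confidence_score": 0.41},
    {"source": "utils.retry", "target": "docs.backoff",
     "confidence": "AMBIGUOUS"},
]


def trusted_edges(edges, min_score=0.7):
    """Keep EXTRACTED edges plus INFERRED edges at or above min_score."""
    return [
        e for e in edges
        if e["confidence"] == "EXTRACTED"
        or (e["confidence"] == "INFERRED" and e["confidence_score"] >= min_score)
    ]


def needs_review(edges):
    """Return the edges flagged AMBIGUOUS."""
    return [e for e in edges if e["confidence"] == "AMBIGUOUS"]


trusted = trusted_edges(edges)
review = needs_review(edges)
```

The threshold of 0.7 is arbitrary; the point is only that found and guessed relationships can be separated mechanically.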

{graphifyy-0.2.0 → graphifyy-0.2.2}/graphify/__main__.py RENAMED

@@ -171,12 +171,11 @@ def claude_uninstall(project_dir: Path | None = None) -> None:
     ).rstrip()
     if cleaned:
         target.write_text(cleaned + "\n")
+        print(f"graphify section removed from {target.resolve()}")
     else:
         target.unlink()
         print(f"CLAUDE.md was empty after removal - deleted {target.resolve()}")
-        return
-    print(f"graphify section removed from {target.resolve()}")
     _uninstall_claude_hook(project_dir or Path("."))
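
The hunk above removes the early `return`, so the hook cleanup now also runs when `CLAUDE.md` is deleted outright. A minimal sketch of that control flow is below. The section markers, `remove_section`, and the callback are hypothetical stand-ins; the real delimiters and `_uninstall_claude_hook` body are not shown in this diff.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical markers; the real section delimiters are not shown in the diff.
SECTION_RE = re.compile(
    r"<!-- graphify:start -->.*?<!-- graphify:end -->", re.DOTALL
)


def remove_section(target: Path, cleanup_hook) -> str:
    """Strip the marked section, then always run cleanup_hook.

    Returns "rewritten" or "deleted" depending on what happened to the file.
    """
    cleaned = SECTION_RE.sub("", target.read_text()).rstrip()
    if cleaned:
        target.write_text(cleaned + "\n")
        outcome = "rewritten"
    else:
        target.unlink()
        outcome = "deleted"
    # Neither branch returns early, so cleanup runs in both cases.
    cleanup_hook()
    return outcome


calls = []
with tempfile.TemporaryDirectory() as tmp:
    only_section = Path(tmp) / "CLAUDE.md"
    only_section.write_text(
        "<!-- graphify:start -->\nrules\n<!-- graphify:end -->\n"
    )
    result = remove_section(only_section, lambda: calls.append("hook removed"))
    file_still_exists = only_section.exists()
```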

{graphifyy-0.2.0 → graphifyy-0.2.2}/graphify/cluster.py RENAMED

@@ -1,8 +1,25 @@
-"""Leiden community detection on NetworkX graphs. Splits oversized communities. Returns cohesion scores."""
+"""Community detection on NetworkX graphs. Uses Leiden (graspologic) if available, falls back to Louvain (networkx). Splits oversized communities. Returns cohesion scores."""
 from __future__ import annotations
 import networkx as nx
+def _partition(G: nx.Graph) -> dict[str, int]:
+    """Run community detection. Returns {node_id: community_id}.
+    Tries Leiden (graspologic) first — best quality.
+    Falls back to Louvain (built into networkx) if graspologic is not installed.
+    """
+    try:
+        from graspologic.partition import leiden
+        return leiden(G)
+    except ImportError:
+        pass
+    # Fallback: networkx louvain (available since networkx 2.7)
+    communities = nx.community.louvain_communities(G, seed=42)
+    return {node: cid for cid, nodes in enumerate(communities) for node in nodes}
 def build_graph(nodes: list[dict], edges: list[dict]) -> nx.Graph:
     """Build a NetworkX graph from graphify node/edge dicts.
@@ -36,8 +53,6 @@ def cluster(G: nx.Graph) -> dict[int, list[str]]:
     if G.number_of_edges() == 0:
         return {i: [n] for i, n in enumerate(sorted(G.nodes))}
-    from graspologic.partition import leiden  # lazy - avoids 15s numba JIT on import
     # Leiden warns and drops isolates - handle them separately
     isolates = [n for n in G.nodes() if G.degree(n) == 0]
     connected_nodes = [n for n in G.nodes() if G.degree(n) > 0]
@@ -45,7 +60,7 @@ def cluster(G: nx.Graph) -> dict[int, list[str]]:
     raw: dict[int, list[str]] = {}
     if connected.number_of_nodes() > 0:
-        partition: dict[str, int] = leiden(connected)
+        partition = _partition(connected)
         for node, cid in partition.items():
             raw.setdefault(cid, []).append(node)
@@ -76,13 +91,11 @@ def _split_community(G: nx.Graph, nodes: list[str]) -> list[list[str]]:
         # No edges - split into individual nodes
         return [[n] for n in sorted(nodes)]
     try:
-        from graspologic.partition import leiden
-        sub_partition: dict[str, int] = leiden(subgraph)
+        sub_partition = _partition(subgraph)
         sub_communities: dict[int, list[str]] = {}
         for node, cid in sub_partition.items():
             sub_communities.setdefault(cid, []).append(node)
         if len(sub_communities) <= 1:
-            # Leiden couldn't split it - return as-is
             return [sorted(nodes)]
         return [sorted(v) for v in sub_communities.values()]
     except Exception:
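
The new `_partition` helper does two things: it picks a backend depending on whether graspologic is installed, and it converts Louvain's list of node sets into the `{node_id: community_id}` shape that Leiden already returns. The sketch below isolates both steps using only the standard library. It uses `importlib.util.find_spec` instead of the `try`/`except ImportError` in the diff, which is a swapped-in technique that checks availability without importing the package. The function names are hypothetical.

```python
import importlib.util


def pick_backend(package: str = "graspologic") -> str:
    """Return "leiden" if the optional package is importable, else "louvain"."""
    found = importlib.util.find_spec(package) is not None
    return "leiden" if found else "louvain"


def to_partition(communities: list[set[str]]) -> dict[str, int]:
    """Invert a list of node sets into {node_id: community_id}."""
    return {node: cid for cid, nodes in enumerate(communities) for node in nodes}


def to_groups(partition: dict[str, int]) -> dict[int, list[str]]:
    """Group node ids by community id, as cluster() does with setdefault."""
    groups: dict[int, list[str]] = {}
    for node, cid in partition.items():
        groups.setdefault(cid, []).append(node)
    return {cid: sorted(nodes) for cid, nodes in groups.items()}


# A list of node sets, the shape that louvain_communities returns.
communities = [{"a", "b"}, {"c"}]
partition = to_partition(communities)
groups = to_groups(partition)
backend_when_missing = pick_backend("no_such_package_for_this_sketch")
```

Because both backends end up in the same dictionary shape, the callers `cluster` and `_split_community` do not need to know which one ran.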

{graphifyy-0.2.0 → graphifyy-0.2.2}/graphify/skill.md RENAMED

@@ -411,9 +411,11 @@ print('Report updated with community labels')
 Replace `LABELS_DICT` with the actual dict you constructed (e.g. `{0: "Attention Mechanism", 1: "Training Pipeline"}`).
 Replace INPUT_PATH with the actual path.
-### Step 6 - Generate Obsidian vault (default) + optional HTML
+### Step 6 - Generate Obsidian vault (opt-in) + HTML
-**Generate HTML always** (unless `--no-viz`). **Obsidian vault only if `--obsidian` was given** — it generates one file per node which creates thousands of files in large repos. Skip it by default.
+**Generate HTML always** (unless `--no-viz`). **Obsidian vault only if `--obsidian` was explicitly given** — skip it otherwise, it generates one file per node.
+If `--obsidian` was given:
 ```bash
 python3 -c "
@@ -444,7 +446,7 @@ print('  _COMMUNITY_* - overview notes with cohesion scores and dataview queries
 "
 ```
-Also generate the HTML graph (always, unless `--no-viz`):
+Generate the HTML graph (always, unless `--no-viz`):
 ```bash
 python3 -c "
@@ -631,22 +633,14 @@ rm -f .graphify_detect.json .graphify_extract.json .graphify_ast.json .graphify_
 rm -f graphify-out/.needs_update 2>/dev/null || true
 ```
-Tell the user:
+Tell the user (omit the obsidian line unless --obsidian was given):
 ```
-Graph complete. Outputs are in a hidden folder called graphify-out/ inside the directory you ran this on.
-The folder is hidden (dot prefix) so it won't show in Finder or a normal ls.
-To see it:
-  Mac/Linux:  ls -la graphify-out/
-  VS Code:    the Explorer panel shows hidden files by default
-  Finder:     Cmd+Shift+. to toggle hidden files
-What's inside:
-  graphify-out/obsidian/        - open this folder as a vault in Obsidian (File > Open Vault)
-  graphify-out/GRAPH_REPORT.md  - full audit report, also readable here in Claude
-  graphify-out/graph.json       - persistent graph, query it later with /graphify query "..."
+Graph complete. Outputs in PATH_TO_DIR/graphify-out/
-Full path: PATH_TO_DIR/graphify-out/
+  graph.html            - interactive graph, open in browser
+  GRAPH_REPORT.md       - audit report
+  graph.json            - raw graph data
+  obsidian/             - Obsidian vault (only if --obsidian was given)
 ```
 Replace PATH_TO_DIR with the actual absolute path of the directory that was processed.

{graphifyy-0.2.0 → graphifyy-0.2.2}/graphifyy.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: graphifyy
-Version: 0.2.0
+Version: 0.2.2
 Summary: Claude Code skill - turn any folder of code, docs, papers, images, or tweets into a queryable knowledge graph
 License: MIT
 Project-URL: Homepage, https://github.com/safishamsi/graphify
@@ -10,7 +10,6 @@ Keywords: claude,claude-code,knowledge-graph,rag,graphrag,obsidian,community-det
 Requires-Python: >=3.10
 Description-Content-Type: text/markdown
 Requires-Dist: networkx
-Requires-Dist: graspologic
 Requires-Dist: tree-sitter
 Requires-Dist: tree-sitter-python
 Requires-Dist: tree-sitter-javascript
@@ -34,16 +33,20 @@ Requires-Dist: pypdf; extra == "pdf"
 Requires-Dist: html2text; extra == "pdf"
 Provides-Extra: watch
 Requires-Dist: watchdog; extra == "watch"
+Provides-Extra: leiden
+Requires-Dist: graspologic; extra == "leiden"
 Provides-Extra: all
 Requires-Dist: mcp; extra == "all"
 Requires-Dist: neo4j; extra == "all"
 Requires-Dist: pypdf; extra == "all"
 Requires-Dist: html2text; extra == "all"
 Requires-Dist: watchdog; extra == "all"
+Requires-Dist: graspologic; extra == "all"
 # graphify
-[![CI](https://github.com/safishamsi/graphify/actions/workflows/ci.yml/badge.svg?branch=v1)](https://github.com/safishamsi/graphify/actions/workflows/ci.yml)
+[![CI](https://github.com/safishamsi/graphify/actions/workflows/ci.yml/badge.svg?branch=v2)](https://github.com/safishamsi/graphify/actions/workflows/ci.yml)
+[![PyPI](https://img.shields.io/pypi/v/graphifyy)](https://pypi.org/project/graphifyy/)
 **A Claude Code skill.** Type `/graphify` in Claude Code - it reads your files, builds a knowledge graph, and gives you back structure you didn't know was there. Understand a codebase faster. Find the "why" behind architectural decisions.
@@ -63,6 +66,12 @@ graphify-out/
 └── cache/           SHA256 cache - re-runs only process changed files
 ```
+## How it works
+graphify runs in two passes. First, a deterministic AST pass extracts structure from code files (classes, functions, imports, call graphs, docstrings, rationale comments) with no LLM needed. Second, Claude subagents run in parallel over docs, papers, and images to extract concepts, relationships, and design rationale. The results are merged into a NetworkX graph, clustered with Leiden community detection, and exported as interactive HTML, queryable JSON, and a plain-language audit report.
+Every relationship is tagged `EXTRACTED` (found directly in source), `INFERRED` (reasonable inference, with a confidence score), or `AMBIGUOUS` (flagged for review). You always know what was found vs guessed.
 ## Install
 **Requires:** [Claude Code](https://claude.ai/code) and Python 3.10+
@@ -79,12 +88,30 @@ Then open Claude Code in any directory and type:
 /graphify .
 ```
+### Make Claude always use the graph (recommended)
+After building a graph, run this once in your project:
+```bash
+graphify claude install
+```
+This does two things:
+1. **CLAUDE.md rules** - tells Claude to read `graphify-out/GRAPH_REPORT.md` before answering architecture questions, and to rebuild the graph after editing code files.
+2. **PreToolUse hook** (`settings.json`) - fires automatically before every Glob and Grep call. If a knowledge graph exists, Claude sees: _"graphify: Knowledge graph exists. Read GRAPH_REPORT.md for god nodes and community structure before searching raw files."_ This means Claude navigates via the graph instead of grepping through every file - faster answers, fewer wasted tool calls, and responses grounded in the actual structure of your codebase rather than keyword matches.
+Without this, Claude will grep raw files by default even when a graph exists. With it, the graph becomes the first thing Claude reaches for.
+Uninstall with `graphify claude uninstall`.
 <details>
 <summary>Manual install (curl)</summary>
 ```bash
 mkdir -p ~/.claude/skills/graphify
-curl -fsSL https://raw.githubusercontent.com/safishamsi/graphify/v1/skills/graphify/skill.md \
+curl -fsSL https://raw.githubusercontent.com/safishamsi/graphify/v2/graphify/skill.md \
   > ~/.claude/skills/graphify/SKILL.md
 ```
@@ -121,14 +148,14 @@ When the user types `/graphify`, invoke the Skill tool with `skill: "graphify"`
 /graphify ./raw --mcp              # start MCP stdio server
 graphify hook install              # git hooks - rebuilds graph on commit and branch switch
-graphify claude install            # write graphify rules to local CLAUDE.md + install PreToolUse hook
+graphify claude install            # always-on: CLAUDE.md + PreToolUse hook for this project
 ```
 Works with any mix of file types:
 | Type | Extensions | Extraction |
 |------|-----------|------------|
-| Code | `.py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php` | AST via tree-sitter + call-graph pass + docstring/comment rationale |
+| Code | `.py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php` | AST via tree-sitter + call-graph + docstring/comment rationale |
 | Docs | `.md .txt .rst` | Concepts + relationships + design rationale via Claude |
 | Papers | `.pdf` | Citation mining + concept extraction |
 | Images | `.png .jpg .webp .gif` | Claude vision - screenshots, diagrams, any language |
@@ -145,7 +172,7 @@ Works with any mix of file types:
 **Confidence scores** - every INFERRED edge has a `confidence_score` (0.0-1.0). You know not just what was guessed but how confident the model was. EXTRACTED edges are always 1.0.
-**Semantic similarity edges** - cross-file conceptual links that have no structural connection. Two functions solving the same problem without calling each other, a class in code and a concept in a paper describing the same algorithm.
+**Semantic similarity edges** - cross-file conceptual links with no structural connection. Two functions solving the same problem without calling each other, a class in code and a concept in a paper describing the same algorithm.
 **Hyperedges** - group relationships connecting 3+ nodes that pairwise edges can't express. All classes implementing a shared protocol, all functions in an auth flow, all concepts from a paper section forming one idea.
@@ -155,12 +182,8 @@ Works with any mix of file types:
 **Git hooks** (`graphify hook install`) - installs post-commit and post-checkout hooks. Graph rebuilds automatically after every commit and every branch switch. No background process needed.
-**Always-on for Claude** (`graphify claude install`) - writes a `CLAUDE.md` section so Claude checks the graph before answering architecture questions, plus a `.claude/settings.json` PreToolUse hook that fires before every Glob/Grep - Claude is reminded to check the graph before searching raw files.
 **Wiki** (`--wiki`) - Wikipedia-style markdown articles per community and god node, with an `index.md` entry point. Point any agent at `index.md` and it can navigate the knowledge base by reading files instead of parsing JSON.
-Every edge is tagged `EXTRACTED`, `INFERRED`, or `AMBIGUOUS` - you always know what was found vs guessed.
 ## Worked examples
 | Corpus | Files | Reduction | Output |
@@ -171,6 +194,10 @@ Every edge is tagged `EXTRACTED`, `INFERRED`, or `AMBIGUOUS` - you always know w
 Token reduction scales with corpus size. 6 files fits in a context window anyway, so graph value there is structural clarity, not compression. At 52 files (code + papers + images) you get 71x+. Each `worked/` folder has the raw input files and the actual output (`GRAPH_REPORT.md`, `graph.json`) so you can run it yourself and verify the numbers.
+## Privacy
+graphify sends file contents to the Claude API (Anthropic) for semantic extraction of docs, papers, and images. Code files are processed locally via tree-sitter AST — no file contents leave your machine for code. No telemetry, usage tracking, or analytics of any kind. The only network calls are to Anthropic's API during extraction, using your own API key via Claude Code.
 ## Tech stack
 NetworkX + Leiden (graspologic) + tree-sitter + Claude + vis.js. No Neo4j required, no server, runs entirely locally.

{graphifyy-0.2.0 → graphifyy-0.2.2}/graphifyy.egg-info/SOURCES.txt RENAMED

@@ -42,6 +42,7 @@ tests/test_ingest.py
 tests/test_languages.py
 tests/test_multilang.py
 tests/test_pipeline.py
+tests/test_rationale.py
 tests/test_report.py
 tests/test_security.py
 tests/test_semantic_similarity.py

{graphifyy-0.2.0 → graphifyy-0.2.2}/graphifyy.egg-info/requires.txt RENAMED

@@ -1,5 +1,4 @@
 networkx
-graspologic
 tree-sitter
 tree-sitter-python
 tree-sitter-javascript
@@ -21,6 +20,10 @@ neo4j
 pypdf
 html2text
 watchdog
+graspologic
+[leiden]
+graspologic
 [mcp]
 mcp

{graphifyy-0.2.0 → graphifyy-0.2.2}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "graphifyy"
-version = "0.2.0"
+version = "0.2.2"
 description = "Claude Code skill - turn any folder of code, docs, papers, images, or tweets into a queryable knowledge graph"
 readme = "README.md"
 license = { text = "MIT" }
@@ -12,7 +12,6 @@ keywords = ["claude", "claude-code", "knowledge-graph", "rag", "graphrag", "obsi
 requires-python = ">=3.10"
 dependencies = [
     "networkx",
-    "graspologic",
     "tree-sitter",
     "tree-sitter-python",
     "tree-sitter-javascript",
@@ -39,7 +38,8 @@ mcp = ["mcp"]
 neo4j = ["neo4j"]
 pdf = ["pypdf", "html2text"]
 watch = ["watchdog"]
-all = ["mcp", "neo4j", "pypdf", "html2text", "watchdog"]
+leiden = ["graspologic"]
+all = ["mcp", "neo4j", "pypdf", "html2text", "watchdog", "graspologic"]
 [project.scripts]
 graphify = "graphify.__main__:main"
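
The packaging change above moves graspologic out of the core dependencies and into the `leiden` and `all` extras, so a plain install no longer pulls it in while `pip install "graphifyy[leiden]"` does. The sketch below reads an excerpt of the 0.2.2 metadata from this diff and separates core requirements from extras. The marker handling is deliberately naive and only understands the `extra == "name"` form seen here; `split_requirements` is a hypothetical helper, not part of the package.

```python
from email.parser import Parser

# Excerpt of the 0.2.2 metadata as shown in this diff.
PKG_INFO = """\
Metadata-Version: 2.4
Name: graphifyy
Version: 0.2.2
Requires-Dist: networkx
Requires-Dist: tree-sitter
Requires-Dist: watchdog; extra == "watch"
Requires-Dist: graspologic; extra == "leiden"
Requires-Dist: graspologic; extra == "all"
"""


def split_requirements(pkg_info: str):
    """Return (core, extras), where extras maps extra name -> [requirement]."""
    meta = Parser().parsestr(pkg_info)
    core, extras = [], {}
    for line in meta.get_all("Requires-Dist", []):
        name, _, marker = line.partition(";")
        marker = marker.strip()
        if marker.startswith("extra =="):
            # Naive parse of the simple `extra == "name"` marker form only.
            extra = marker.split("==", 1)[1].strip().strip('"')
            extras.setdefault(extra, []).append(name.strip())
        else:
            core.append(name.strip())
    return core, extras


core, extras = split_requirements(PKG_INFO)
```

For real tooling, a proper marker parser is a safer choice than string splitting.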

{graphifyy-0.2.0 → graphifyy-0.2.2}/tests/test_claude_md.py RENAMED

@@ -95,3 +95,42 @@ def test_uninstall_no_op_when_no_file(tmp_path, capsys):
     claude_uninstall(tmp_path)
     out = capsys.readouterr().out
     assert "No CLAUDE.md" in out or "nothing to do" in out
+# ---------------------------------------------------------------------------
+# settings.json PreToolUse hook
+# ---------------------------------------------------------------------------
+def test_install_creates_settings_json(tmp_path):
+    """claude_install also writes .claude/settings.json with PreToolUse hook."""
+    import json
+    claude_install(tmp_path)
+    settings_path = tmp_path / ".claude" / "settings.json"
+    assert settings_path.exists()
+    settings = json.loads(settings_path.read_text())
+    hooks = settings.get("hooks", {}).get("PreToolUse", [])
+    assert any("Glob|Grep" in h.get("matcher", "") for h in hooks)
+def test_install_settings_json_idempotent(tmp_path):
+    """Running claude_install twice does not duplicate the PreToolUse hook."""
+    import json
+    claude_install(tmp_path)
+    claude_install(tmp_path)
+    settings_path = tmp_path / ".claude" / "settings.json"
+    settings = json.loads(settings_path.read_text())
+    hooks = settings.get("hooks", {}).get("PreToolUse", [])
+    glob_grep_hooks = [h for h in hooks if "Glob|Grep" in h.get("matcher", "")]
+    assert len(glob_grep_hooks) == 1
+def test_uninstall_removes_settings_hook(tmp_path):
+    """claude_uninstall removes the PreToolUse hook from settings.json."""
+    import json
+    claude_install(tmp_path)
+    claude_uninstall(tmp_path)
+    settings_path = tmp_path / ".claude" / "settings.json"
+    if settings_path.exists():
+        settings = json.loads(settings_path.read_text())
+        hooks = settings.get("hooks", {}).get("PreToolUse", [])
+        assert not any("Glob|Grep" in h.get("matcher", "") for h in hooks)
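
The three tests above pin down observable behaviour only: install writes a `PreToolUse` entry whose `matcher` contains `Glob|Grep`, a second install does not duplicate it, and uninstall removes it. A minimal sketch of a merge that would satisfy those checks, assuming the settings file is plain JSON. The helper names `add_hook` and `remove_hook`, the inner `hooks` list, and the placeholder command are assumptions, not the package's actual implementation.

```python
import json
from pathlib import Path

MATCHER = "Glob|Grep"  # matcher string the tests look for


def _load(path: Path) -> dict:
    """Read existing settings, or start from an empty dict."""
    return json.loads(path.read_text()) if path.exists() else {}


def add_hook(root: Path, command: str = "echo placeholder") -> Path:
    """Add one PreToolUse entry; do nothing if a matching entry exists.

    The command string is a hypothetical placeholder.
    """
    path = root / ".claude" / "settings.json"
    settings = _load(path)
    hooks = settings.setdefault("hooks", {}).setdefault("PreToolUse", [])
    if not any(MATCHER in h.get("matcher", "") for h in hooks):
        hooks.append(
            {"matcher": MATCHER, "hooks": [{"type": "command", "command": command}]}
        )
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2))
    return path


def remove_hook(root: Path) -> None:
    """Drop every PreToolUse entry whose matcher contains MATCHER."""
    path = root / ".claude" / "settings.json"
    if not path.exists():
        return
    settings = _load(path)
    hooks = settings.get("hooks", {}).get("PreToolUse", [])
    kept = [h for h in hooks if MATCHER not in h.get("matcher", "")]
    settings.setdefault("hooks", {})["PreToolUse"] = kept
    path.write_text(json.dumps(settings, indent=2))
```

Checking for an existing matcher before appending is what keeps a repeated install idempotent, and filtering rather than deleting the file leaves any unrelated settings untouched.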

{graphifyy-0.2.0 → graphifyy-0.2.2}/tests/test_hooks.py RENAMED

@@ -2,7 +2,7 @@
 import subprocess
 from pathlib import Path
 import pytest
-from graphify.hooks import install, uninstall, status, _HOOK_MARKER
+from graphify.hooks import install, uninstall, status, _HOOK_MARKER, _CHECKOUT_MARKER
 def _make_git_repo(tmp_path: Path) -> Path:
@@ -78,3 +78,35 @@ def test_status_not_installed(tmp_path):
 def test_no_git_repo_raises(tmp_path):
     with pytest.raises(RuntimeError, match="No git repository"):
         install(tmp_path / "not_a_repo")
+def test_install_creates_post_checkout_hook(tmp_path):
+    repo = _make_git_repo(tmp_path)
+    install(repo)
+    hook = repo / ".git" / "hooks" / "post-checkout"
+    assert hook.exists()
+    assert _CHECKOUT_MARKER in hook.read_text()
+def test_install_post_checkout_is_executable(tmp_path):
+    repo = _make_git_repo(tmp_path)
+    install(repo)
+    hook = repo / ".git" / "hooks" / "post-checkout"
+    assert hook.stat().st_mode & 0o111
+def test_uninstall_removes_post_checkout_hook(tmp_path):
+    repo = _make_git_repo(tmp_path)
+    install(repo)
+    uninstall(repo)
+    hook = repo / ".git" / "hooks" / "post-checkout"
+    assert not hook.exists()
+def test_status_shows_both_hooks(tmp_path):
+    repo = _make_git_repo(tmp_path)
+    install(repo)
+    result = status(repo)
+    assert "post-commit" in result
+    assert "post-checkout" in result
+    assert result.count("installed") >= 2
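
The new tests expect `install` to create an executable `.git/hooks/post-checkout` file containing `_CHECKOUT_MARKER`, and `uninstall` to remove it. A minimal sketch of that pattern on a POSIX filesystem. The marker value, the script body, and the function names below are hypothetical, since the diff does not show the contents of `graphify.hooks`.

```python
import stat
from pathlib import Path

# Hypothetical marker; the real _CHECKOUT_MARKER value is not shown in the diff.
MARKER = "# example-post-checkout-marker"


def install_post_checkout(repo: Path) -> Path:
    """Write a marked post-checkout hook and set its executable bits."""
    git_dir = repo / ".git"
    if not git_dir.is_dir():
        raise RuntimeError("No git repository found")
    hooks_dir = git_dir / "hooks"
    hooks_dir.mkdir(exist_ok=True)
    hook = hooks_dir / "post-checkout"
    hook.write_text(f"#!/bin/sh\n{MARKER}\nexit 0\n")
    # Add execute permission for user, group, and others.
    mode = hook.stat().st_mode
    hook.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook


def uninstall_post_checkout(repo: Path) -> None:
    """Remove the hook only if it carries our marker."""
    hook = repo / ".git" / "hooks" / "post-checkout"
    if hook.exists() and MARKER in hook.read_text():
        hook.unlink()
```

Checking for the marker before deleting avoids removing a hook that the user wrote by hand.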

graphifyy-0.2.2/tests/test_rationale.py ADDED

@@ -0,0 +1,89 @@
+"""Tests for rationale/docstring extraction in extract.py."""
+import textwrap
+from pathlib import Path
+import pytest
+from graphify.extract import extract_python
+def _write_py(tmp_path: Path, code: str) -> Path:
+    p = tmp_path / "sample.py"
+    p.write_text(textwrap.dedent(code))
+    return p
+def test_module_docstring_extracted(tmp_path):
+    path = _write_py(tmp_path, '''
+        """This module handles authentication because legacy sessions were insecure."""
+        def login(): pass
+    ''')
+    result = extract_python(path)
+    rationale = [n for n in result["nodes"] if n.get("file_type") == "rationale"]
+    assert len(rationale) >= 1
+    assert any("authentication" in n["label"] for n in rationale)
+def test_function_docstring_extracted(tmp_path):
+    path = _write_py(tmp_path, '''
+        def process():
+            """We use chunked processing here because the full dataset exceeds RAM."""
+            pass
+    ''')
+    result = extract_python(path)
+    rationale = [n for n in result["nodes"] if n.get("file_type") == "rationale"]
+    assert any("chunked" in n["label"] for n in rationale)
+def test_class_docstring_extracted(tmp_path):
+    path = _write_py(tmp_path, '''
+        class Cache:
+            """Chosen over Redis because we need zero external dependencies in the test env."""
+            pass
+    ''')
+    result = extract_python(path)
+    rationale = [n for n in result["nodes"] if n.get("file_type") == "rationale"]
+    assert any("Redis" in n["label"] for n in rationale)
+def test_rationale_comment_extracted(tmp_path):
+    path = _write_py(tmp_path, '''
+        def build():
+            # NOTE: must run before compile() or linker will fail
+            pass
+    ''')
+    result = extract_python(path)
+    rationale = [n for n in result["nodes"] if n.get("file_type") == "rationale"]
+    assert any("NOTE" in n["label"] for n in rationale)
+def test_rationale_for_edges_present(tmp_path):
+    path = _write_py(tmp_path, '''
+        """Module docstring explaining the why."""
+        def foo():
+            """Function docstring with rationale."""
+            pass
+    ''')
+    result = extract_python(path)
+    rationale_edges = [e for e in result["edges"] if e.get("relation") == "rationale_for"]
+    assert len(rationale_edges) >= 1
+def test_short_docstring_ignored(tmp_path):
+    """Trivial docstrings under 20 chars should not become rationale nodes."""
+    path = _write_py(tmp_path, '''
+        def foo():
+            """Constructor."""
+            pass
+    ''')
+    result = extract_python(path)
+    rationale = [n for n in result["nodes"] if n.get("file_type") == "rationale"]
+    assert len(rationale) == 0
+def test_rationale_confidence_is_extracted(tmp_path):
+    path = _write_py(tmp_path, '''
+        """This module exists because we needed a standalone parser."""
+        def parse(): pass
+    ''')
+    result = extract_python(path)
+    rationale_edges = [e for e in result["edges"] if e.get("relation") == "rationale_for"]
+    assert all(e.get("confidence") == "EXTRACTED" for e in rationale_edges)
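
Taken together, these tests describe the expected output shape: docstrings of at least 20 characters become nodes with `file_type == "rationale"`, each linked by a `rationale_for` edge with confidence `EXTRACTED`. A minimal sketch of that behaviour using the standard-library `ast` module rather than whatever parser `extract_python` actually uses. The function name `extract_rationale` and the `id`, `source`, and `target` keys are assumptions, and comment extraction (the `# NOTE:` case) is left out.

```python
import ast

MIN_LEN = 20  # threshold stated in test_short_docstring_ignored


def extract_rationale(source: str, module_name: str = "sample") -> dict:
    """Collect non-trivial docstrings as rationale nodes and edges."""
    tree = ast.parse(source)
    # The module itself plus every function and class definition can own a docstring.
    owners = [(module_name, tree)]
    for item in ast.walk(tree):
        if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            owners.append((item.name, item))

    nodes, edges = [], []
    for owner_name, owner in owners:
        doc = ast.get_docstring(owner)
        if not doc or len(doc) < MIN_LEN:
            continue  # skip missing or trivial docstrings
        node_id = f"rationale:{owner_name}"  # hypothetical id scheme
        nodes.append({"id": node_id, "label": doc, "file_type": "rationale"})
        edges.append(
            {
                "source": node_id,
                "target": owner_name,
                "relation": "rationale_for",
                "confidence": "EXTRACTED",
            }
        )
    return {"nodes": nodes, "edges": edges}
```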