npm - @oriro/orirocli - Versions diffs - 0.1.9 → 0.1.12 - Mend

@oriro/orirocli 0.1.9 → 0.1.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (166)

package/README.md +16 -18
package/dist/cli.js +4776 -2964
package/package.json +2 -2
package/skills/craft/ai-engineering/SKILL.md +2 -2
package/skills/graphify/SKILL.md +0 -619
package/skills/graphify/__init__.py +0 -28
package/skills/graphify/__main__.py +0 -4582
package/skills/graphify/affected.py +0 -154
package/skills/graphify/always_on/agents-md.md +0 -12
package/skills/graphify/always_on/antigravity-rules.md +0 -14
package/skills/graphify/always_on/claude-md.md +0 -9
package/skills/graphify/always_on/gemini-md.md +0 -9
package/skills/graphify/always_on/kiro-steering.md +0 -5
package/skills/graphify/always_on/vscode-instructions.md +0 -17
package/skills/graphify/analyze.py +0 -724
package/skills/graphify/benchmark.py +0 -155
package/skills/graphify/build.py +0 -487
package/skills/graphify/cache.py +0 -417
package/skills/graphify/callflow_html.py +0 -2020
package/skills/graphify/cluster.py +0 -272
package/skills/graphify/command-kilo.md +0 -15
package/skills/graphify/dedup.py +0 -429
package/skills/graphify/detect.py +0 -1379
package/skills/graphify/diagnostics.py +0 -390
package/skills/graphify/export.py +0 -1408
package/skills/graphify/extract.py +0 -11570
package/skills/graphify/global_graph.py +0 -159
package/skills/graphify/google_workspace.py +0 -223
package/skills/graphify/hooks.py +0 -457
package/skills/graphify/ingest.py +0 -331
package/skills/graphify/llm.py +0 -1896
package/skills/graphify/manifest.py +0 -4
package/skills/graphify/mcp_ingest.py +0 -392
package/skills/graphify/multigraph_compat.py +0 -212
package/skills/graphify/pg_introspect.py +0 -142
package/skills/graphify/prs.py +0 -748
package/skills/graphify/querylog.py +0 -70
package/skills/graphify/report.py +0 -218
package/skills/graphify/scip_ingest.py +0 -363
package/skills/graphify/security.py +0 -336
package/skills/graphify/semantic_cleanup.py +0 -319
package/skills/graphify/serve.py +0 -1309
package/skills/graphify/skill-aider.md +0 -1246
package/skills/graphify/skill-amp.md +0 -613
package/skills/graphify/skill-claw.md +0 -616
package/skills/graphify/skill-codex.md +0 -613
package/skills/graphify/skill-copilot.md +0 -616
package/skills/graphify/skill-devin.md +0 -1372
package/skills/graphify/skill-droid.md +0 -613
package/skills/graphify/skill-kilo.md +0 -625
package/skills/graphify/skill-kiro.md +0 -615
package/skills/graphify/skill-opencode.md +0 -608
package/skills/graphify/skill-pi.md +0 -615
package/skills/graphify/skill-trae.md +0 -614
package/skills/graphify/skill-vscode.md +0 -612
package/skills/graphify/skill-windows.md +0 -651
package/skills/graphify/skills/amp/references/add-watch.md +0 -56
package/skills/graphify/skills/amp/references/exports.md +0 -71
package/skills/graphify/skills/amp/references/extraction-spec.md +0 -68
package/skills/graphify/skills/amp/references/github-and-merge.md +0 -46
package/skills/graphify/skills/amp/references/hooks.md +0 -33
package/skills/graphify/skills/amp/references/query.md +0 -249
package/skills/graphify/skills/amp/references/transcribe.md +0 -48
package/skills/graphify/skills/amp/references/update.md +0 -179
package/skills/graphify/skills/claude/references/add-watch.md +0 -56
package/skills/graphify/skills/claude/references/exports.md +0 -71
package/skills/graphify/skills/claude/references/extraction-spec.md +0 -68
package/skills/graphify/skills/claude/references/github-and-merge.md +0 -46
package/skills/graphify/skills/claude/references/hooks.md +0 -33
package/skills/graphify/skills/claude/references/query.md +0 -103
package/skills/graphify/skills/claude/references/transcribe.md +0 -48
package/skills/graphify/skills/claude/references/update.md +0 -179
package/skills/graphify/skills/claw/references/add-watch.md +0 -56
package/skills/graphify/skills/claw/references/exports.md +0 -71
package/skills/graphify/skills/claw/references/extraction-spec.md +0 -29
package/skills/graphify/skills/claw/references/github-and-merge.md +0 -46
package/skills/graphify/skills/claw/references/hooks.md +0 -33
package/skills/graphify/skills/claw/references/query.md +0 -249
package/skills/graphify/skills/claw/references/transcribe.md +0 -48
package/skills/graphify/skills/claw/references/update.md +0 -179
package/skills/graphify/skills/codex/references/add-watch.md +0 -56
package/skills/graphify/skills/codex/references/exports.md +0 -71
package/skills/graphify/skills/codex/references/extraction-spec.md +0 -29
package/skills/graphify/skills/codex/references/github-and-merge.md +0 -46
package/skills/graphify/skills/codex/references/hooks.md +0 -33
package/skills/graphify/skills/codex/references/query.md +0 -249
package/skills/graphify/skills/codex/references/transcribe.md +0 -48
package/skills/graphify/skills/codex/references/update.md +0 -179
package/skills/graphify/skills/copilot/references/add-watch.md +0 -56
package/skills/graphify/skills/copilot/references/exports.md +0 -71
package/skills/graphify/skills/copilot/references/extraction-spec.md +0 -68
package/skills/graphify/skills/copilot/references/github-and-merge.md +0 -46
package/skills/graphify/skills/copilot/references/hooks.md +0 -33
package/skills/graphify/skills/copilot/references/query.md +0 -249
package/skills/graphify/skills/copilot/references/transcribe.md +0 -48
package/skills/graphify/skills/copilot/references/update.md +0 -179
package/skills/graphify/skills/droid/references/add-watch.md +0 -56
package/skills/graphify/skills/droid/references/exports.md +0 -71
package/skills/graphify/skills/droid/references/extraction-spec.md +0 -68
package/skills/graphify/skills/droid/references/github-and-merge.md +0 -46
package/skills/graphify/skills/droid/references/hooks.md +0 -33
package/skills/graphify/skills/droid/references/query.md +0 -249
package/skills/graphify/skills/droid/references/transcribe.md +0 -48
package/skills/graphify/skills/droid/references/update.md +0 -179
package/skills/graphify/skills/kilo/references/add-watch.md +0 -56
package/skills/graphify/skills/kilo/references/exports.md +0 -71
package/skills/graphify/skills/kilo/references/extraction-spec.md +0 -68
package/skills/graphify/skills/kilo/references/github-and-merge.md +0 -46
package/skills/graphify/skills/kilo/references/hooks.md +0 -33
package/skills/graphify/skills/kilo/references/query.md +0 -249
package/skills/graphify/skills/kilo/references/transcribe.md +0 -48
package/skills/graphify/skills/kilo/references/update.md +0 -179
package/skills/graphify/skills/kiro/references/add-watch.md +0 -56
package/skills/graphify/skills/kiro/references/exports.md +0 -71
package/skills/graphify/skills/kiro/references/extraction-spec.md +0 -29
package/skills/graphify/skills/kiro/references/github-and-merge.md +0 -46
package/skills/graphify/skills/kiro/references/hooks.md +0 -33
package/skills/graphify/skills/kiro/references/query.md +0 -249
package/skills/graphify/skills/kiro/references/transcribe.md +0 -48
package/skills/graphify/skills/kiro/references/update.md +0 -179
package/skills/graphify/skills/opencode/references/add-watch.md +0 -56
package/skills/graphify/skills/opencode/references/exports.md +0 -71
package/skills/graphify/skills/opencode/references/extraction-spec.md +0 -68
package/skills/graphify/skills/opencode/references/github-and-merge.md +0 -46
package/skills/graphify/skills/opencode/references/hooks.md +0 -33
package/skills/graphify/skills/opencode/references/query.md +0 -249
package/skills/graphify/skills/opencode/references/transcribe.md +0 -48
package/skills/graphify/skills/opencode/references/update.md +0 -179
package/skills/graphify/skills/pi/references/add-watch.md +0 -56
package/skills/graphify/skills/pi/references/exports.md +0 -71
package/skills/graphify/skills/pi/references/extraction-spec.md +0 -29
package/skills/graphify/skills/pi/references/github-and-merge.md +0 -46
package/skills/graphify/skills/pi/references/hooks.md +0 -33
package/skills/graphify/skills/pi/references/query.md +0 -249
package/skills/graphify/skills/pi/references/transcribe.md +0 -48
package/skills/graphify/skills/pi/references/update.md +0 -179
package/skills/graphify/skills/trae/references/add-watch.md +0 -56
package/skills/graphify/skills/trae/references/exports.md +0 -71
package/skills/graphify/skills/trae/references/extraction-spec.md +0 -68
package/skills/graphify/skills/trae/references/github-and-merge.md +0 -46
package/skills/graphify/skills/trae/references/hooks.md +0 -35
package/skills/graphify/skills/trae/references/query.md +0 -249
package/skills/graphify/skills/trae/references/transcribe.md +0 -48
package/skills/graphify/skills/trae/references/update.md +0 -179
package/skills/graphify/skills/vscode/references/add-watch.md +0 -56
package/skills/graphify/skills/vscode/references/exports.md +0 -71
package/skills/graphify/skills/vscode/references/extraction-spec.md +0 -68
package/skills/graphify/skills/vscode/references/github-and-merge.md +0 -46
package/skills/graphify/skills/vscode/references/hooks.md +0 -33
package/skills/graphify/skills/vscode/references/query.md +0 -249
package/skills/graphify/skills/vscode/references/transcribe.md +0 -48
package/skills/graphify/skills/vscode/references/update.md +0 -179
package/skills/graphify/skills/windows/references/add-watch.md +0 -56
package/skills/graphify/skills/windows/references/exports.md +0 -71
package/skills/graphify/skills/windows/references/extraction-spec.md +0 -68
package/skills/graphify/skills/windows/references/github-and-merge.md +0 -46
package/skills/graphify/skills/windows/references/hooks.md +0 -33
package/skills/graphify/skills/windows/references/query.md +0 -249
package/skills/graphify/skills/windows/references/transcribe.md +0 -48
package/skills/graphify/skills/windows/references/update.md +0 -179
package/skills/graphify/symbol_resolution.py +0 -538
package/skills/graphify/transcribe.py +0 -184
package/skills/graphify/tree_html.py +0 -582
package/skills/graphify/validate.py +0 -72
package/skills/graphify/watch.py +0 -898
package/skills/graphify/wiki.py +0 -282

package/skills/graphify/skill-kiro.md DELETED

@@ -1,615 +0,0 @@
----
-name: graphify
-description: "Use for any question about a codebase, its architecture, file relationships, or project content — especially when graphify-out/ exists, where the question should be treated as a graphify query first. Turns any input (code, docs, papers, images, videos) into a persistent knowledge graph with god nodes, community detection, and query/path/explain tools."
----
-# /graphify
-Turn any folder of files into a navigable knowledge graph with community detection, an honest audit trail, and three outputs: interactive HTML, GraphRAG-ready JSON, and a plain-language GRAPH_REPORT.md.
-## Usage
-```
-/graphify                                             # full pipeline on current directory → Obsidian vault
-/graphify <path>                                      # full pipeline on specific path
-/graphify https://github.com/<owner>/<repo>           # clone repo then run full pipeline on it
-/graphify https://github.com/<owner>/<repo> --branch <branch>  # clone a specific branch
-/graphify <url1> <url2> ...                           # clone multiple repos, build each, merge into one cross-repo graph
-/graphify <path> --mode deep                          # thorough extraction, richer INFERRED edges
-/graphify <path> --update                             # incremental - re-extract only new/changed files
-/graphify <path> --directed                            # build directed graph (preserves edge direction: source→target)
-/graphify <path> --whisper-model medium                # use a larger Whisper model for better transcription accuracy
-/graphify <path> --cluster-only                       # rerun clustering on existing graph
-/graphify <path> --no-viz                             # skip visualization, just report + JSON
-/graphify <path> --html                               # (HTML is generated by default - this flag is a no-op)
-/graphify <path> --svg                                # also export graph.svg (embeds in Notion, GitHub)
-/graphify <path> --graphml                            # export graph.graphml (Gephi, yEd)
-/graphify <path> --neo4j                              # generate graphify-out/cypher.txt for Neo4j
-/graphify <path> --neo4j-push bolt://localhost:7687   # push directly to Neo4j
-/graphify <path> --mcp                                # start MCP stdio server for agent access
-/graphify <path> --watch                              # watch folder, auto-rebuild on code changes (no LLM needed)
-/graphify <path> --wiki                               # build agent-crawlable wiki (index.md + one article per community)
-/graphify <path> --obsidian --obsidian-dir ~/vaults/my-project  # write vault to custom path (e.g. existing vault)
-/graphify add <url>                                   # fetch URL, save to ./raw, update graph
-/graphify add <url> --author "Name"                   # tag who wrote it
-/graphify add <url> --contributor "Name"              # tag who added it to the corpus
-/graphify query "<question>"                          # BFS traversal - broad context
-/graphify query "<question>" --dfs                    # DFS - trace a specific path
-/graphify query "<question>" --budget 1500            # cap answer at N tokens
-/graphify path "AuthModule" "Database"                # shortest path between two concepts
-/graphify explain "SwinTransformer"                   # plain-language explanation of a node
-```
-## What graphify is for
-Drop any folder of code, docs, papers, images, or video into graphify and get a queryable knowledge graph. Persistent across sessions, honest audit trail (EXTRACTED/INFERRED/AMBIGUOUS), community detection surfaces cross-document connections you wouldn't think to ask about.
-## What You Must Do When Invoked
-If the user invoked `/graphify --help` or `/graphify -h` (with no other arguments), print the contents of the `## Usage` section above verbatim and stop. Do not run any commands, do not detect files, do not default the path to `.`. Just print the Usage block and return.
-**Fast path — existing graph:** Before doing anything else, check whether `graphify-out/graph.json` exists. The expected location is `graphify-out/graph.json` relative to the **current working directory** (i.e. the project root where you are running commands). If it exists AND the user's request is a natural-language question about the codebase (e.g. "How does X work?", "What calls Y?", "Trace the data flow through Z") and NOT an explicit rebuild command (`--update`, `--cluster-only`, or a bare path/URL that implies fresh extraction): **skip Steps 1–5 entirely and jump straight to `## For /graphify query`.** Run `graphify query "<question>"` immediately. Do not run detect. Do not check corpus size. Do not ask the user to narrow. The graph is already built — use it.
-If no path was given, use `.` (current directory). Do not ask the user for a path.
-If the path argument starts with `https://github.com/` or `http://github.com/`, treat it as a GitHub URL - run Step 0 before anything else, then continue with the resolved local path.
-Follow these steps in order. Do not skip steps.
-### Step 0 - GitHub repos and multi-path merge (only if a URL or several paths)
-Only when the path is one or more `https://github.com/...` URLs, or several local subfolders to merge. See `references/github-and-merge.md` for the clone, cross-repo merge, and monorepo flow, then continue with the resolved local path. A plain local path skips this step.
-### Step 1 - Ensure graphify is installed
-```bash
-# Detect the correct Python interpreter (handles uv tool, pipx, venv, system installs)
-PYTHON=""
-GRAPHIFY_BIN=$(which graphify 2>/dev/null)
-# 1. uv tool installs — most reliable on modern Mac/Linux
-if [ -z "$PYTHON" ] && command -v uv >/dev/null 2>&1; then
-    _UV_PY=$(uv tool run graphifyy python -c "import sys; print(sys.executable)" 2>/dev/null)
-    if [ -n "$_UV_PY" ]; then PYTHON="$_UV_PY"; fi
-fi
-# 2. Read shebang from graphify binary (pipx and direct pip installs)
-if [ -z "$PYTHON" ] && [ -n "$GRAPHIFY_BIN" ]; then
-    _SHEBANG=$(head -1 "$GRAPHIFY_BIN" | tr -d '#!')
-    case "$_SHEBANG" in
-        *[!a-zA-Z0-9/_.-]*) ;;
-        *) "$_SHEBANG" -c "import graphify" 2>/dev/null && PYTHON="$_SHEBANG" ;;
-    esac
-fi
-# 3. Fall back to python3
-if [ -z "$PYTHON" ]; then PYTHON="python3"; fi
-if ! "$PYTHON" -c "import graphify" 2>/dev/null; then
-    if command -v uv >/dev/null 2>&1; then
-        uv tool install --upgrade graphifyy -q 2>&1 | tail -3
-        _UV_PY=$(uv tool run graphifyy python -c "import sys; print(sys.executable)" 2>/dev/null)
-        if [ -n "$_UV_PY" ]; then PYTHON="$_UV_PY"; fi
-    else
-        "$PYTHON" -m pip install graphifyy -q 2>/dev/null \
-          || "$PYTHON" -m pip install graphifyy -q --break-system-packages 2>&1 | tail -3
-    fi
-fi
-# Write interpreter path for all subsequent steps (persists across invocations)
-mkdir -p graphify-out
-"$PYTHON" -c "import sys; open('graphify-out/.graphify_python', 'w', encoding='utf-8').write(sys.executable)"
-# Save scan root so `graphify update` (no args) knows where to look next time
-echo "$(cd INPUT_PATH && pwd)" > graphify-out/.graphify_root
-```
-If the import succeeds, print nothing and move straight to Step 2.
-**In every subsequent bash block, replace `python3` with `$(cat graphify-out/.graphify_python)` to use the correct interpreter.**
-### Step 2 - Detect files
-```bash
-$(cat graphify-out/.graphify_python) -c "
-import json
-from graphify.detect import detect
-from pathlib import Path
-result = detect(Path('INPUT_PATH'))
-print(json.dumps(result, ensure_ascii=False))
-" > graphify-out/.graphify_detect.json
-```
-Replace INPUT_PATH with the actual path the user provided. Do NOT cat or print the JSON - read it silently and present a clean summary instead:
-```
-Corpus: X files · ~Y words
-  code:     N files (.py .ts .go ...)
-  docs:     N files (.md .txt ...)
-  papers:   N files (.pdf ...)
-  images:   N files
-  video:    N files (.mp4 .mp3 ...)
-```
-Omit any category with 0 files from the summary.
-Then act on it:
-- If `total_files` is 0: stop with "No supported files found in [path]."
-- If `skipped_sensitive` is non-empty: mention file count skipped, not the file names.
-- If `total_words` > 2,000,000 OR `total_files` > 500: show the warning. Then compute the top 5 first-level subdirectories by file count:
-  - Read `scan_root` from the detect JSON (always an absolute path to the resolved INPUT_PATH).
-  - Concatenate all file lists across all types (`code`, `document`, `paper`, `image`, `video`).
-  - Filter out any path that starts with `scan_root + "/graphify-out/"` to exclude converted sidecars.
-  - For each file, strip the `scan_root` prefix and take the first path component. Files directly in `scan_root` with no subdirectory count as `(root)`.
-  - If all files are in `(root)` with no subdirectories, do not ask to narrow — no subfolders exist. Instead suggest `--no-cluster` to skip the expensive clustering step and proceed.
-  - Otherwise rank by count, show the top 5 with file counts, then ask which subfolder to run on. Wait for the user's answer before proceeding.
-- Otherwise: proceed directly to Step 2.5 if video files were detected, or Step 3 if not.
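The subdirectory ranking described in the large-corpus bullet above can be sketched in Python. This is an illustrative sketch only, not code from the graphify package: the function name `top_subdirs` and its parameters are hypothetical, and it assumes POSIX-style absolute paths plus a mapping from file type to file list shaped like the `files` object in the detect JSON.

```python
from collections import Counter
from pathlib import PurePosixPath


def top_subdirs(scan_root: str, files_by_type: dict[str, list[str]], limit: int = 5):
    """Rank first-level subdirectories of scan_root by file count (hypothetical helper)."""
    root = scan_root.rstrip("/")
    sidecar_prefix = root + "/graphify-out/"
    counts = Counter()
    # Concatenate the per-type lists (code, document, paper, image, video).
    for paths in files_by_type.values():
        for path in paths:
            if path.startswith(sidecar_prefix):
                continue  # skip converted sidecars under graphify-out/
            # Strip the scan_root prefix, then look at the first path component.
            rel = path[len(root):].lstrip("/") if path.startswith(root + "/") else path
            parts = PurePosixPath(rel).parts
            # A file sitting directly in scan_root has no subdirectory component.
            counts[parts[0] if len(parts) > 1 else "(root)"] += 1
    return counts.most_common(limit)
```

If the only key returned is `(root)`, there are no subfolders to narrow to, which is the case where the text suggests `--no-cluster` instead of asking the user to pick a subfolder.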
-### Step 2.5 - Video and audio (only if video files detected)
-Skip this step entirely if `detect` returned zero `video` files. When the corpus has video or audio, see `references/transcribe.md` to transcribe them to text first, then treat the transcripts as doc files in Step 3.
-### Step 3 - Extract entities and relationships
-**Before starting:** note whether `--mode deep` was given. You must pass `DEEP_MODE=true` to every subagent in Step B2 if it was. Track this from the original invocation - do not lose it.
-This step has two parts: **structural extraction** (deterministic, free) and **semantic extraction** (LLM, costs tokens).
-**Before dispatching subagents:** check whether `GEMINI_API_KEY` or `GOOGLE_API_KEY` is set. If neither is set, print this one-liner to the user:
-> Tip: set `GEMINI_API_KEY` or `GOOGLE_API_KEY` to use Gemini for semantic extraction (`pip install 'graphifyy[gemini]'`).
-Print it once, then continue. If `GEMINI_API_KEY` or `GOOGLE_API_KEY` IS set, use `graphify.llm.extract_corpus_parallel(files, backend="gemini")` for semantic extraction instead of dispatching Claude subagents. The default Gemini model is `gemini-3-flash-preview`; set `GRAPHIFY_GEMINI_MODEL` or pass `--model` in headless CLI flows to override it.
-> **No other API keys are read.** If `GEMINI_API_KEY`/`GOOGLE_API_KEY` are unset, fall straight through to Claude Code subagent dispatch (Part B below) — the host session itself is the LLM. graphify does **not** read `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, or any other provider key from the environment. If a host agent prompts the user for `ANTHROPIC_API_KEY` to run extraction, that prompt is a misread of this skill — ignore it and dispatch subagents as written.
-**Run Part A (AST) and Part B (semantic) in parallel. Dispatch all semantic subagents AND start AST extraction in the same message. Both can run simultaneously since they operate on different file types. Merge results in Part C as before.**
-Note: Parallelizing AST + semantic saves 5-15s on large corpora. AST is deterministic and fast; start it while subagents are processing docs/papers.
-#### Part A - Structural extraction for code files
-For any code files detected, run AST extraction in parallel with Part B subagents:
-```bash
-$(cat graphify-out/.graphify_python) -c "
-import sys, json
-from graphify.extract import collect_files, extract
-from pathlib import Path
-code_files = []
-detect = json.loads(Path('graphify-out/.graphify_detect.json').read_text(encoding=\"utf-8\"))
-for f in detect.get('files', {}).get('code', []):
-    code_files.extend(collect_files(Path(f)) if Path(f).is_dir() else [Path(f)])
-if code_files:
-    result = extract(code_files, cache_root=Path('.'))
-    Path('graphify-out/.graphify_ast.json').write_text(json.dumps(result, indent=2, ensure_ascii=False), encoding=\"utf-8\")
-    print(f'AST: {len(result[\"nodes\"])} nodes, {len(result[\"edges\"])} edges')
-else:
-    Path('graphify-out/.graphify_ast.json').write_text(json.dumps({'nodes':[],'edges':[],'input_tokens':0,'output_tokens':0}, ensure_ascii=False), encoding=\"utf-8\")
-    print('No code files - skipping AST extraction')
-"
-```
-#### Part B - Semantic extraction (parallel subagents)
-**Fast path:** If detection found zero docs, papers, and images (code-only corpus), skip Part B entirely and go straight to Part C. AST handles code - there is nothing for semantic subagents to do.
-**MANDATORY: You MUST use the Agent tool here. Reading files yourself one-by-one is forbidden - it is 5-10x slower. If you do not use the Agent tool you are doing this wrong.**
-Before dispatching subagents, print a timing estimate:
-- Load `total_words` and file counts from `graphify-out/.graphify_detect.json`
-- Estimate agents needed: `ceil(uncached_non_code_files / 22)` (chunk size is 20-25)
-- Estimate time: ~45s per agent batch (they run in parallel, so total ≈ 45s × ceil(agents/parallel_limit))
-- Print: "Semantic extraction: ~N files → X agents, estimated ~Ys"
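The arithmetic behind this estimate can be worked through with a short Python sketch. The function name and the value passed for `parallel_limit` are assumptions for illustration; the source gives the formula but does not state what the parallel limit is.

```python
import math


def estimate_semantic_run(uncached_non_code_files: int, parallel_limit: int,
                          files_per_agent: int = 22, seconds_per_batch: int = 45):
    """Return (agents, estimated_seconds) using the formula from the text above."""
    # One agent per chunk of roughly 22 files (chunk size is 20-25).
    agents = math.ceil(uncached_non_code_files / files_per_agent)
    # Agents run in parallel, so the wall-clock time scales with the number of batches.
    batches = math.ceil(agents / parallel_limit)
    return agents, seconds_per_batch * batches


# parallel_limit=4 is an assumed value, not one taken from the skill.
agents, seconds = estimate_semantic_run(100, parallel_limit=4)
print(f"Semantic extraction: ~100 files → {agents} agents, estimated ~{seconds}s")
```

With 100 uncached files this gives 5 agents in 2 batches, so roughly 90 seconds under the assumed limit of 4.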
-**Step B0 - Check extraction cache first**
-Before dispatching any subagents, check which files already have cached extraction results:
-```bash
-$(cat graphify-out/.graphify_python) -c "
-import json
-from graphify.cache import check_semantic_cache
-from pathlib import Path
-detect = json.loads(Path('graphify-out/.graphify_detect.json').read_text(encoding=\"utf-8\"))
-all_files = [f for files in detect['files'].values() for f in files]
-cached_nodes, cached_edges, cached_hyperedges, uncached = check_semantic_cache(all_files)
-if cached_nodes or cached_edges or cached_hyperedges:
-    Path('graphify-out/.graphify_cached.json').write_text(json.dumps({'nodes': cached_nodes, 'edges': cached_edges, 'hyperedges': cached_hyperedges}, ensure_ascii=False), encoding=\"utf-8\")
-Path('graphify-out/.graphify_uncached.txt').write_text('\n'.join(uncached), encoding=\"utf-8\")
-print(f'Cache: {len(all_files)-len(uncached)} files hit, {len(uncached)} files need extraction')
-"
-```
-Only dispatch subagents for files listed in `graphify-out/.graphify_uncached.txt`. If all files are cached, skip to Part C directly.
-**Step B1 - Split into chunks**
-Load files from `graphify-out/.graphify_uncached.txt`. Split into chunks of 20-25 files each. Each image gets its own chunk (vision needs separate context). When splitting, group files from the same directory together so related artifacts land in the same chunk and cross-file relationships are more likely to be extracted.
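One possible way to implement the splitting rule above is sketched below. It is a minimal sketch under stated assumptions: `split_into_chunks` and `IMAGE_SUFFIXES` are hypothetical names, the suffix list is a guess at what counts as an image, and the fixed cap of 22 means the final chunk may hold fewer than 20 files.

```python
from pathlib import PurePosixPath

# Assumed set of image extensions; the skill does not list them here.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def _is_image(path: str) -> bool:
    return PurePosixPath(path).suffix.lower() in IMAGE_SUFFIXES


def split_into_chunks(files: list[str], max_chunk: int = 22) -> list[list[str]]:
    """Group non-image files by directory into chunks; give each image its own chunk."""
    images = [f for f in files if _is_image(f)]
    others = [f for f in files if not _is_image(f)]
    # Sorting by parent directory keeps files from the same folder adjacent,
    # so they tend to land in the same chunk.
    others.sort(key=lambda f: (str(PurePosixPath(f).parent), f))
    chunks = [others[i:i + max_chunk] for i in range(0, len(others), max_chunk)]
    # Each image is dispatched alone because vision needs separate context.
    chunks.extend([img] for img in images)
    return chunks
```

The length of the returned list is the number of Agent calls that Step B2 would dispatch in a single message.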
-**Step B2 - Dispatch ALL subagents in a single message**
-Call the Agent tool multiple times IN THE SAME RESPONSE - one call per chunk. This is the only way they run in parallel. If you make one Agent call, wait, then make another, you are doing it sequentially and defeating the purpose.
-**IMPORTANT - subagent type:** Always use `subagent_type="general-purpose"`. Do NOT use `Explore` - it is read-only and cannot write chunk files to disk, which silently drops extraction results. General-purpose has Write and Bash access which the subagent needs.
-Concrete example for 3 chunks:
-```
-[Agent tool call 1: files 1-22, subagent_type="general-purpose"]
-[Agent tool call 2: files 23-44, subagent_type="general-purpose"]
-[Agent tool call 3: files 45-66, subagent_type="general-purpose"]
-```
-All three in one message. Not three separate messages.
-Each subagent receives this exact prompt (substitute FILE_LIST, CHUNK_NUM, TOTAL_CHUNKS, DEEP_MODE, and CHUNK_PATH).
-CHUNK_PATH must be an **absolute** path — derive it before dispatching:
-```bash
-PROJECT_ROOT=$(cat graphify-out/.graphify_root)
-# Then for chunk N: CHUNK_PATH="${PROJECT_ROOT}/graphify-out/.graphify_chunk_0N.json"
-```
-Subagent prompt template:
-See `references/extraction-spec.md` for the exact subagent prompt (JSON schema, node-ID rules, confidence rubric, frontmatter, hyperedge, and vision rules). Load it only here, only when at least one chunk holds a doc, paper, or image; a pure-code corpus has skipped Part B and never reads it. Pass each subagent that prompt verbatim with FILE_LIST, CHUNK_NUM, TOTAL_CHUNKS, DEEP_MODE, and CHUNK_PATH substituted, and have it write the result to CHUNK_PATH.
-**Step B3 - Collect, cache, and merge**
-Wait for all subagents. For each result:
-- Check that `graphify-out/.graphify_chunk_NN.json` exists on disk — this is the success signal
-- If the file exists and contains valid JSON with `nodes` and `edges`, include it and save to cache
-- If the file is missing, the subagent was likely dispatched as read-only (Explore type) — print a warning: "chunk N missing from disk — subagent may have been read-only. Re-run with general-purpose agent." Do not silently skip.
-- If a subagent failed or returned invalid JSON, print a warning and skip that chunk - do not abort
-If more than half the chunks failed or are missing, stop and tell the user to re-run and ensure `subagent_type="general-purpose"` is used.
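The collection rules above, including the more-than-half stop condition, could be checked with something like the following Python sketch. The helper name `collect_chunks` and its return shape are hypothetical; only the chunk file naming pattern and the `nodes`/`edges` requirement come from the text.

```python
import json
from pathlib import Path


def collect_chunks(out_dir: str, total_chunks: int):
    """Return (valid_chunks, warnings, should_stop) for chunk files in out_dir."""
    valid, warnings = [], []
    for n in range(1, total_chunks + 1):
        path = Path(out_dir) / f".graphify_chunk_{n:02d}.json"
        if not path.exists():
            # A missing file usually means the subagent could not write to disk.
            warnings.append(f"chunk {n} missing from disk")
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            warnings.append(f"chunk {n} is not valid JSON")
            continue
        if not isinstance(data, dict) or "nodes" not in data or "edges" not in data:
            warnings.append(f"chunk {n} lacks 'nodes' or 'edges'")
            continue
        valid.append(data)
    # Stop only when more than half of the chunks failed or are missing.
    should_stop = len(warnings) > total_chunks / 2
    return valid, warnings, should_stop
```

Every entry in `warnings` should be surfaced to the user rather than skipped silently; `should_stop` signals the case where a re-run with `subagent_type="general-purpose"` is advised.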
-Merge all chunk files into `.graphify_semantic_new.json`. **After each Agent call completes, read the real token counts from the Agent tool result's `usage` field and write them back into the chunk JSON before merging** — the chunk JSON itself always has placeholder zeros. Then run:
-```bash
-$(cat graphify-out/.graphify_python) -c "
-import json, glob
-from pathlib import Path
-chunks = sorted(glob.glob('graphify-out/.graphify_chunk_*.json'))
-all_nodes, all_edges, all_hyperedges = [], [], []
-total_in, total_out = 0, 0
-for c in chunks:
-    d = json.loads(Path(c).read_text(encoding=\"utf-8\"))
-    all_nodes += d.get('nodes', [])
-    all_edges += d.get('edges', [])
-    all_hyperedges += d.get('hyperedges', [])
-    total_in += d.get('input_tokens', 0)
-    total_out += d.get('output_tokens', 0)
-Path('graphify-out/.graphify_semantic_new.json').write_text(json.dumps({
-    'nodes': all_nodes, 'edges': all_edges, 'hyperedges': all_hyperedges,
-    'input_tokens': total_in, 'output_tokens': total_out,
-}, indent=2, ensure_ascii=False), encoding=\"utf-8\")
-print(f'Merged {len(chunks)} chunks: {total_in:,} in / {total_out:,} out tokens')
-"
-```
-Save new results to cache:
-```bash
-$(cat graphify-out/.graphify_python) -c "
-import json
-from graphify.cache import save_semantic_cache
-from pathlib import Path
-new = json.loads(Path('graphify-out/.graphify_semantic_new.json').read_text(encoding=\"utf-8\")) if Path('graphify-out/.graphify_semantic_new.json').exists() else {'nodes':[],'edges':[],'hyperedges':[]}
-saved = save_semantic_cache(new.get('nodes', []), new.get('edges', []), new.get('hyperedges', []))
-print(f'Cached {saved} files')
-"
-```
-Merge cached + new results into `graphify-out/.graphify_semantic.json`:
-```bash
-$(cat graphify-out/.graphify_python) -c "
-import json
-from pathlib import Path
-cached = json.loads(Path('graphify-out/.graphify_cached.json').read_text(encoding=\"utf-8\")) if Path('graphify-out/.graphify_cached.json').exists() else {'nodes':[],'edges':[],'hyperedges':[]}
-new = json.loads(Path('graphify-out/.graphify_semantic_new.json').read_text(encoding=\"utf-8\")) if Path('graphify-out/.graphify_semantic_new.json').exists() else {'nodes':[],'edges':[],'hyperedges':[]}
-all_nodes = cached['nodes'] + new.get('nodes', [])
-all_edges = cached['edges'] + new.get('edges', [])
-all_hyperedges = cached.get('hyperedges', []) + new.get('hyperedges', [])
-seen = set()
-deduped = []
-for n in all_nodes:
-    if n['id'] not in seen:
-        seen.add(n['id'])
-        deduped.append(n)
-merged = {
-    'nodes': deduped,
-    'edges': all_edges,
-    'hyperedges': all_hyperedges,
-    'input_tokens': new.get('input_tokens', 0),
-    'output_tokens': new.get('output_tokens', 0),
-}
-Path('graphify-out/.graphify_semantic.json').write_text(json.dumps(merged, indent=2, ensure_ascii=False), encoding=\"utf-8\")
-print(f'Extraction complete - {len(deduped)} nodes, {len(all_edges)} edges ({len(cached[\"nodes\"])} from cache, {len(new.get(\"nodes\",[]))} new)')
-"
-```
-Clean up temp files: `rm -f graphify-out/.graphify_cached.json graphify-out/.graphify_uncached.txt graphify-out/.graphify_semantic_new.json`
-#### Part C - Merge AST + semantic into final extraction
-```bash
-$(cat graphify-out/.graphify_python) -c "
-import sys, json
-from pathlib import Path
-ast = json.loads(Path('graphify-out/.graphify_ast.json').read_text(encoding=\"utf-8\"))
-sem = json.loads(Path('graphify-out/.graphify_semantic.json').read_text(encoding=\"utf-8\"))
-# Merge: AST nodes first, semantic nodes deduplicated by id
-seen = {n['id'] for n in ast['nodes']}
-merged_nodes = list(ast['nodes'])
-for n in sem['nodes']:
-    if n['id'] not in seen:
-        merged_nodes.append(n)
-        seen.add(n['id'])
-merged_edges = ast['edges'] + sem['edges']
-merged_hyperedges = sem.get('hyperedges', [])
-merged = {
-    'nodes': merged_nodes,
-    'edges': merged_edges,
-    'hyperedges': merged_hyperedges,
-    'input_tokens': sem.get('input_tokens', 0),
-    'output_tokens': sem.get('output_tokens', 0),
-}
-Path('graphify-out/.graphify_extract.json').write_text(json.dumps(merged, indent=2, ensure_ascii=False), encoding=\"utf-8\")
-total = len(merged_nodes)
-edges = len(merged_edges)
-print(f'Merged: {total} nodes, {edges} edges ({len(ast[\"nodes\"])} AST + {len(sem[\"nodes\"])} semantic)')
-"
-```
-### Step 4 - Build graph, cluster, analyze, generate outputs
-**Before starting:** note whether `--directed` was given. If so, pass `directed=True` to `build_from_json()` in the code block below. This builds a `DiGraph` that preserves edge direction (source→target) instead of the default undirected `Graph`.
-```bash
-mkdir -p graphify-out
-$(cat graphify-out/.graphify_python) -c "
-import sys, json
-from graphify.build import build_from_json
-from graphify.cluster import cluster, score_all
-from graphify.analyze import god_nodes, surprising_connections, suggest_questions
-from graphify.report import generate
-from graphify.export import to_json
-from pathlib import Path
-extraction = json.loads(Path('graphify-out/.graphify_extract.json').read_text(encoding=\"utf-8\"))
-detection  = json.loads(Path('graphify-out/.graphify_detect.json').read_text(encoding=\"utf-8\"))
-G = build_from_json(extraction)
-communities = cluster(G)
-cohesion = score_all(G, communities)
-tokens = {'input': extraction.get('input_tokens', 0), 'output': extraction.get('output_tokens', 0)}
-gods = god_nodes(G)
-surprises = surprising_connections(G, communities)
-labels = {cid: 'Community ' + str(cid) for cid in communities}
-# Placeholder questions - regenerated with real labels in Step 5
-questions = suggest_questions(G, communities, labels)
-report = generate(G, communities, cohesion, labels, gods, surprises, detection, tokens, '.', suggested_questions=questions)
-Path('graphify-out/GRAPH_REPORT.md').write_text(report, encoding=\"utf-8\")
-to_json(G, communities, 'graphify-out/graph.json')
-analysis = {
-    'communities': {str(k): v for k, v in communities.items()},
-    'cohesion': {str(k): v for k, v in cohesion.items()},
-    'gods': gods,
-    'surprises': surprises,
-    'questions': questions,
-}
-Path('graphify-out/.graphify_analysis.json').write_text(json.dumps(analysis, indent=2, ensure_ascii=False), encoding=\"utf-8\")
-if G.number_of_nodes() == 0:
-    print('ERROR: Graph is empty - extraction produced no nodes.')
-    print('Possible causes: all files were skipped, binary-only corpus, or extraction failed.')
-    raise SystemExit(1)
-print(f'Graph: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges, {len(communities)} communities')
-"
-```
-If this step prints `ERROR: Graph is empty`, stop and tell the user what happened - do not proceed to labeling or visualization.
-Replace `INPUT_PATH`, if it appears in the block above, with the actual path.
-### Step 5 - Label communities
-Read `graphify-out/.graphify_analysis.json`. For each community key, look at its node labels and write a 2-5 word plain-language name (e.g. "Attention Mechanism", "Training Pipeline", "Data Loading").
-Then regenerate the report and save the labels for the visualizer:
-```bash
-$(cat graphify-out/.graphify_python) -c "
-import sys, json
-from graphify.build import build_from_json
-from graphify.cluster import score_all
-from graphify.analyze import god_nodes, surprising_connections, suggest_questions
-from graphify.report import generate
-from pathlib import Path
-extraction = json.loads(Path('graphify-out/.graphify_extract.json').read_text(encoding=\"utf-8\"))
-detection  = json.loads(Path('graphify-out/.graphify_detect.json').read_text(encoding=\"utf-8\"))
-analysis   = json.loads(Path('graphify-out/.graphify_analysis.json').read_text(encoding=\"utf-8\"))
-G = build_from_json(extraction)
-communities = {int(k): v for k, v in analysis['communities'].items()}
-cohesion = {int(k): v for k, v in analysis['cohesion'].items()}
-tokens = {'input': extraction.get('input_tokens', 0), 'output': extraction.get('output_tokens', 0)}
-# LABELS - replace these with the names you chose above
-labels = LABELS_DICT
-# Regenerate questions with real community labels (labels affect question phrasing)
-questions = suggest_questions(G, communities, labels)
-report = generate(G, communities, cohesion, labels, analysis['gods'], analysis['surprises'], detection, tokens, '.', suggested_questions=questions)
-Path('graphify-out/GRAPH_REPORT.md').write_text(report, encoding=\"utf-8\")
-Path('graphify-out/.graphify_labels.json').write_text(json.dumps({str(k): v for k, v in labels.items()}, ensure_ascii=False), encoding=\"utf-8\")
-print('Report updated with community labels')
-"
-```
-Replace `LABELS_DICT` with the actual dict you constructed (e.g. `{0: "Attention Mechanism", 1: "Training Pipeline"}`).
-Replace `INPUT_PATH`, if it appears in the block above, with the actual path.
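Steps 4 and 5 convert community ids with `str(k)` before writing JSON and with `int(k)` after reading it back. JSON object keys are always strings, so integer ids do not survive a round trip unchanged. A small sketch with made-up labels:

```python
import json

labels = {0: "Attention Mechanism", 1: "Training Pipeline"}

# Write with explicit string keys, as the Step 5 block does for .graphify_labels.json.
encoded = json.dumps({str(k): v for k, v in labels.items()}, ensure_ascii=False)

# After loading, the keys are the strings "0" and "1", not integers.
decoded = json.loads(encoded)

# Convert back to int keys before using them as community ids.
restored = {int(k): v for k, v in decoded.items()}
```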
-### Step 6 - Generate Obsidian vault (opt-in) + HTML
-**Generate HTML always** (unless `--no-viz`). **Generate the Obsidian vault only if `--obsidian` was explicitly given**; skip it otherwise, because it generates one file per node.
-If `--obsidian` was given:
-- If `--obsidian-dir <path>` was also given, pass it via `--dir`. Otherwise the output directory defaults to `graphify-out/obsidian`.
-```bash
-graphify export obsidian
-# or with custom dir: graphify export obsidian --dir ~/vaults/my-project
-```
-Generate the HTML graph (always, unless `--no-viz`):
-```bash
-graphify export html  # auto-aggregates to community view if graph > 5000 nodes
-# or: graphify export html --no-viz
-```
-### Steps 6b-8 - Wiki, Neo4j, SVG, GraphML, MCP, benchmark (only on their flags)
-These run only when their flag is present (`--wiki`, `--neo4j`/`--neo4j-push`, `--svg`, `--graphml`, `--mcp`) or, for the token-reduction benchmark, when `total_words` exceeds 5,000. A default run with no export flags skips all of them. See `references/exports.md` for each one. Run any `--wiki` export before Step 9 cleanup so `.graphify_labels.json` is still available.
----
-### Step 9 - Save manifest, update cost tracker, clean up, and report
-```bash
-$(cat graphify-out/.graphify_python) -c "
-import json
-from pathlib import Path
-from datetime import datetime, timezone
-from graphify.detect import save_manifest
-# Save manifest for --update
-detect = json.loads(Path('graphify-out/.graphify_detect.json').read_text(encoding=\"utf-8\"))
-# In --update mode, 'all_files' carries the full corpus; 'files' is the changed
-# subset. Full-rebuild mode populates only 'files', so the fallback handles that.
-save_manifest(detect.get('all_files') or detect['files'])
-# Update cumulative cost tracker
-extract = json.loads(Path('graphify-out/.graphify_extract.json').read_text(encoding=\"utf-8\"))
-input_tok = extract.get('input_tokens', 0)
-output_tok = extract.get('output_tokens', 0)
-cost_path = Path('graphify-out/cost.json')
-if cost_path.exists():
-    cost = json.loads(cost_path.read_text(encoding=\"utf-8\"))
-else:
-    cost = {'runs': [], 'total_input_tokens': 0, 'total_output_tokens': 0}
-cost['runs'].append({
-    'date': datetime.now(timezone.utc).isoformat(),
-    'input_tokens': input_tok,
-    'output_tokens': output_tok,
-    'files': detect.get('total_files', 0),
-})
-cost['total_input_tokens'] += input_tok
-cost['total_output_tokens'] += output_tok
-cost_path.write_text(json.dumps(cost, indent=2, ensure_ascii=False), encoding=\"utf-8\")
-print(f'This run: {input_tok:,} input tokens, {output_tok:,} output tokens')
-print(f'All time: {cost[\"total_input_tokens\"]:,} input, {cost[\"total_output_tokens\"]:,} output ({len(cost[\"runs\"])} runs)')
-"
-rm -f graphify-out/.graphify_detect.json graphify-out/.graphify_extract.json graphify-out/.graphify_ast.json graphify-out/.graphify_semantic.json graphify-out/.graphify_analysis.json
-find graphify-out -maxdepth 1 -name '.graphify_chunk_*.json' -delete 2>/dev/null
-rm -f graphify-out/.needs_update 2>/dev/null || true
-```
-Tell the user (omit the obsidian line unless `--obsidian` was given):
-```
-Graph complete. Outputs in PATH_TO_DIR/graphify-out/
-  graph.html            - interactive graph, open in browser
-  GRAPH_REPORT.md       - audit report
-  graph.json            - raw graph data
-  obsidian/             - Obsidian vault (only if --obsidian was given)
-```
-If graphify saved you time, consider supporting it: https://github.com/sponsors/safishamsi
-Replace PATH_TO_DIR with the actual absolute path of the directory that was processed.
-Then paste these sections from GRAPH_REPORT.md directly into the chat:
-- God Nodes
-- Surprising Connections
-- Suggested Questions
-Do NOT paste the full report - just those three sections. Keep it concise.
-Then immediately offer to explore. Pick the single most interesting suggested question from the report - the one that crosses the most community boundaries or has the most surprising bridge node - and ask:
-> "The most interesting question this graph can answer: **[question]**. Want me to trace it?"
-If the user says yes, run `/graphify query "[question]"` on the graph and walk them through the answer using the graph structure - which nodes connect, which community boundaries get crossed, what the path reveals. Keep going as long as they want to explore. Each answer should end with a natural follow-up ("this connects to X - want to go deeper?") so the session feels like navigation, not a one-shot report.
-The graph is the map. Your job after the pipeline is to be the guide.
----
-## Interpreter guard for subcommands
-Before running any subcommand below (`--update`, `--cluster-only`, `query`, `path`, `explain`, `add`), check that `graphify-out/.graphify_python` exists. If it is missing (e.g. the user deleted `graphify-out/`), re-resolve the interpreter first:
-```bash
-if [ ! -f graphify-out/.graphify_python ]; then
-    GRAPHIFY_BIN=$(which graphify 2>/dev/null)
-    if [ -n "$GRAPHIFY_BIN" ]; then
-        PYTHON=$(head -1 "$GRAPHIFY_BIN" | tr -d '#!')
-        case "$PYTHON" in *[!a-zA-Z0-9/_.-]*) PYTHON="python3" ;; esac
-    else
-        PYTHON="python3"
-    fi
-    mkdir -p graphify-out
-    "$PYTHON" -c "import sys; open('graphify-out/.graphify_python', 'w', encoding='utf-8').write(sys.executable)"
-fi
-```
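The guard above reads the shebang of the `graphify` launcher, rejects anything containing characters outside `a-zA-Z0-9/_.-`, and falls back to `python3`. The same logic can be sketched in Python. The function names here are hypothetical. Unlike the shell version, this sketch also falls back when the first line is empty.

```python
import re
import shutil

# Same character allow-list as the shell `case` pattern in the guard.
SAFE_PATH = re.compile(r"[A-Za-z0-9/_.-]+")


def interpreter_from_shebang(first_line, default="python3"):
    """Return the interpreter path from a shebang line, or `default` if it looks unsafe."""
    # Mirrors `tr -d '#!'`: every '#' and '!' is removed, not only the leading pair.
    candidate = first_line.rstrip("\n").replace("#", "").replace("!", "")
    return candidate if SAFE_PATH.fullmatch(candidate) else default


def resolve_interpreter(command="graphify"):
    """Locate `command` on PATH and read its shebang; fall back to python3."""
    launcher = shutil.which(command)
    if launcher is None:
        return "python3"
    try:
        with open(launcher, "r", encoding="utf-8", errors="replace") as fh:
            return interpreter_from_shebang(fh.readline())
    except OSError:
        return "python3"
```

A shebang such as `#!/usr/bin/env python3` contains a space, so both the shell guard and this sketch fall back to `python3` for it.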
-## For --update and --cluster-only
-Both are non-default subcommands. `--update` re-extracts only new or changed files; `--cluster-only` reruns clustering on the existing graph. See `references/update.md` for both flows.
----
-## For /graphify query
-When `graphify-out/graph.json` already exists and the user asks a question about the corpus, answer from the graph rather than rebuilding it:
-```bash
-graphify query "<question>"
-```
-If the `graphify query` CLI is unavailable, fall back to an inline NetworkX traversal of `graphify-out/graph.json`. Answer using only what the graph output contains, and quote `source_location` when citing a specific fact. For the BFS/DFS traversal modes, the `--budget` cap, the NetworkX fallback, `save-result` feedback, and the `/graphify path` and `/graphify explain` flows, see `references/query.md`.
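As a rough illustration of a breadth-first fallback traversal, the sketch below walks a graph loaded from JSON using only the standard library rather than NetworkX. It rests on assumptions: the layout of `graphify-out/graph.json` is not shown in this file, so the sketch assumes edges are dicts with `source` and `target` keys under an `edges` or `links` list. The `budget` parameter is a hypothetical node-count cap and may not match what `--budget` means in `references/query.md`.

```python
from collections import deque


def bfs_neighbourhood(graph, start, budget=10):
    """Breadth-first walk from `start`, collecting at most `budget` nodes."""
    # Build an undirected adjacency map from the assumed edge layout.
    adjacency = {}
    for edge in graph.get("edges") or graph.get("links") or []:
        adjacency.setdefault(edge["source"], []).append(edge["target"])
        adjacency.setdefault(edge["target"], []).append(edge["source"])

    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue and len(visited) < budget:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen and len(visited) < budget:
                seen.add(neighbour)
                visited.append(neighbour)
                queue.append(neighbour)
    return visited


# Made-up sample graph; a real run would json.load() the graph file instead.
sample = {"edges": [
    {"source": "a", "target": "b"},
    {"source": "b", "target": "c"},
    {"source": "a", "target": "d"},
]}
order = bfs_neighbourhood(sample, "a", budget=3)
```

With a budget of 3 the walk stops after the start node and its two direct neighbours, before reaching `c`.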
----
-## For /graphify add and --watch
-Neither is part of the default build. When the user runs `/graphify add <url>` to fetch a URL into the corpus, or passes `--watch` to auto-rebuild on file changes, see `references/add-watch.md`.
----
-## For the commit hook and native CLAUDE.md integration
-When the user asks to install the post-commit auto-rebuild hook or wire graphify into a project's CLAUDE.md, see `references/hooks.md`.
----
-## Honesty Rules
-- Never invent an edge. If unsure, use AMBIGUOUS.
-- Never skip the corpus check warning.
-- Always show token cost in the report.
-- Never hide cohesion scores behind symbols - show the raw number.
-- Never run HTML viz on a graph with more than 5,000 nodes without warning the user.