feed-the-machine 1.3.0 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (45)

package/ftm-audit/SKILL.md +383 -57
package/ftm-brainstorm/SKILL.md +119 -51
package/ftm-config/SKILL.md +1 -1
package/ftm-council/SKILL.md +259 -31
package/ftm-dashboard/SKILL.md +10 -10
package/ftm-debug/SKILL.md +861 -54
package/ftm-diagram/SKILL.md +1 -1
package/ftm-executor/SKILL.md +6 -6
package/ftm-git/SKILL.md +208 -22
package/ftm-inbox/bin/start.sh +1 -1
package/ftm-inbox/bin/status.sh +1 -1
package/ftm-inbox/bin/stop.sh +1 -1
package/ftm-intent/SKILL.md +0 -1
package/ftm-map/SKILL.md +46 -14
package/ftm-map/scripts/db.py +439 -118
package/ftm-map/scripts/index.py +128 -54
package/ftm-map/scripts/parser.py +89 -320
package/ftm-map/scripts/queries/go-tags.scm +20 -0
package/ftm-map/scripts/queries/javascript-tags.scm +19 -7
package/ftm-map/scripts/queries/python-tags.scm +22 -8
package/ftm-map/scripts/queries/ruby-tags.scm +19 -0
package/ftm-map/scripts/queries/rust-tags.scm +37 -0
package/ftm-map/scripts/queries/typescript-tags.scm +20 -8
package/ftm-map/scripts/query.py +176 -24
package/ftm-map/scripts/ranker.py +377 -0
package/ftm-map/scripts/requirements.txt +3 -0
package/ftm-map/scripts/setup.sh +11 -0
package/ftm-map/scripts/test_db.py +355 -115
package/ftm-map/scripts/test_parser.py +169 -101
package/ftm-map/scripts/test_query.py +178 -61
package/ftm-map/scripts/test_ranker.py +199 -0
package/ftm-map/scripts/views.py +107 -61
package/ftm-mind/SKILL.md +861 -11
package/ftm-mind/references/event-registry.md +20 -0
package/ftm-pause/SKILL.md +256 -37
package/ftm-resume/SKILL.md +380 -75
package/ftm-retro/SKILL.md +164 -27
package/ftm-upgrade/SKILL.md +4 -4
package/hooks/ftm-blackboard-enforcer.sh +2 -4
package/install.sh +6 -1
package/package.json +1 -1
package/ftm-map/scripts/tests/fixtures/__init__.py +0 -0
package/ftm-map/scripts/tests/fixtures/sample_project/api.ts +0 -16
package/ftm-map/scripts/tests/fixtures/sample_project/auth.py +0 -15
package/ftm-map/scripts/tests/fixtures/sample_project/utils.js +0 -16

package/ftm-diagram/SKILL.md CHANGED

@@ -18,6 +18,7 @@ Two-level diagram system: a root subway map of modules and per-module street map
 ---
 ## Graph-Powered Mode (ftm-map integration)
 Before running the standard analysis, check if the project has a code knowledge graph:
@@ -225,7 +226,6 @@ Add `sendEmail` node with edges to `src/notifications/DIAGRAM.mmd`.
 **View current architecture:**
 > "show architecture"
 Read and display `ARCHITECTURE.mmd` + list available module diagrams.
 ---
 ### Auto-Invocation by ftm-executor

package/ftm-executor/SKILL.md CHANGED

@@ -23,10 +23,10 @@ description: Autonomous plan execution engine. Takes any plan document and execu
 Before starting, load context from the blackboard:
-1. Read `/Users/kioja.kudumu/.claude/ftm-state/blackboard/context.json` — check current_task, recent_decisions, active_constraints
-2. Read `/Users/kioja.kudumu/.claude/ftm-state/blackboard/experiences/index.json` — filter entries by task_type matching plan tasks and tags overlapping with the plan domain
+1. Read `~/.claude/ftm-state/blackboard/context.json` — check current_task, recent_decisions, active_constraints
+2. Read `~/.claude/ftm-state/blackboard/experiences/index.json` — filter entries by task_type matching plan tasks and tags overlapping with the plan domain
 3. Load top 3-5 matching experience files for relevant lessons on agent performance and timing
-4. Read `/Users/kioja.kudumu/.claude/ftm-state/blackboard/patterns.json` — check execution_patterns for agent performance and timing accuracy patterns
+4. Read `~/.claude/ftm-state/blackboard/patterns.json` — check execution_patterns for agent performance and timing accuracy patterns
 If index.json is empty or no matches found, proceed normally without experience-informed shortcuts.
@@ -648,12 +648,12 @@ Use `ftm-executor` when: human says "just go" and trusts the plan.
 After completing, update the blackboard:
-1. Update `/Users/kioja.kudumu/.claude/ftm-state/blackboard/context.json`:
+1. Update `~/.claude/ftm-state/blackboard/context.json`:
    - Set current_task status to "complete"
    - Append decision summary to recent_decisions (cap at 10)
    - Update session_metadata.skills_invoked and last_updated
-2. Write an experience file to `/Users/kioja.kudumu/.claude/ftm-state/blackboard/experiences/YYYY-MM-DD_task-slug.json` capturing task_type, agent team used, wave count, audit outcomes, and lessons learned
-3. Update `/Users/kioja.kudumu/.claude/ftm-state/blackboard/experiences/index.json` with the new entry
+2. Write an experience file to `~/.claude/ftm-state/blackboard/experiences/YYYY-MM-DD_task-slug.json` capturing task_type, agent team used, wave count, audit outcomes, and lessons learned
+3. Update `~/.claude/ftm-state/blackboard/experiences/index.json` with the new entry
 4. Emit `task_completed` event
 ## Requirements
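
The blackboard update described in this hunk (mark the task complete, append a decision capped at 10, refresh session metadata) could be sketched roughly as follows. This is only an illustration: the diff names the fields `current_task`, `recent_decisions`, `session_metadata.skills_invoked` and `last_updated`, but the exact JSON layout of `context.json` and the helper names `complete_task` and `save_context` are assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Path from the diff; "~" is expanded at runtime rather than hardcoding a home directory.
BLACKBOARD = Path("~/.claude/ftm-state/blackboard").expanduser()

MAX_RECENT_DECISIONS = 10  # "cap at 10" in the skill text


def complete_task(context: dict, decision: str, skill: str = "ftm-executor") -> dict:
    """Return an updated copy of a context.json document (layout is assumed)."""
    updated = json.loads(json.dumps(context))  # deep copy that works for JSON data

    # Set current_task status to "complete".
    task = updated.get("current_task") or {}
    task["status"] = "complete"
    updated["current_task"] = task

    # Append the decision summary and keep only the newest entries.
    decisions = list(updated.get("recent_decisions") or [])
    decisions.append(decision)
    updated["recent_decisions"] = decisions[-MAX_RECENT_DECISIONS:]

    # Update session_metadata.skills_invoked and last_updated.
    meta = updated.setdefault("session_metadata", {})
    invoked = meta.setdefault("skills_invoked", [])
    if skill not in invoked:
        invoked.append(skill)
    meta["last_updated"] = datetime.now(timezone.utc).isoformat()
    return updated


def save_context(context: dict, path: Path = BLACKBOARD / "context.json") -> None:
    """Write the context document, creating the blackboard directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(context, indent=2))
```

Keeping `complete_task` free of file I/O makes the capping logic easy to check without touching the real `~/.claude` directory.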

package/ftm-git/SKILL.md CHANGED

@@ -38,7 +38,7 @@ Yesterday we pushed API keys to the repo. That's the kind of mistake that leads
 ## Phase -1: Install Git Hook (First Invocation Only)
-The first time ftm-git runs in a repo, install a pre-commit hook as a hard safety net. This hook runs independently of Claude — it's a shell script that blocks `git commit` if staged files contain Tier 1 secret patterns. Even if this skill is not invoked, or someone runs git directly from the terminal, the hook catches it.
+The first time ftm-git runs in a repo, install a pre-commit hook as a hard safety net. This hook runs independently of Claude — it's a shell script that blocks `git commit` if staged files contain Tier 1 secret patterns. Even if Claude forgets to invoke this skill, or someone runs git directly from the terminal, the hook catches it.
 **Check if the hook is already installed:**
@@ -87,30 +87,104 @@ Before scanning, figure out what needs scanning and why you were invoked.
 Scan the in-scope files using regex patterns. The goal is zero false negatives — a few false positives are acceptable and will be filtered in Phase 2.
-Read `references/patterns/SECRET-PATTERNS.md` for the full Tier 1 and Tier 2 pattern library, the false positive suppression list, severity classifications, and per-finding record format.
+### Tier 1: High-Confidence Patterns (almost certainly real secrets)
-**Core Tier 1 patterns** (the most common — memorize these, consult the reference for the full set):
+These patterns have distinctive prefixes or structures that make false positives rare:
 ```
-AKIA[0-9A-Z]{16}                           # AWS Access Key ID
-ghp_[A-Za-z0-9_]{36}                       # GitHub PAT (classic)
-sk_live_[0-9a-zA-Z]{24,}                   # Stripe secret key (live)
-AIza[0-9A-Za-z\-_]{35}                     # Google API key
-xoxb-[0-9]{10,13}-[0-9]{10,13}-[a-zA-Z0-9]{24}  # Slack bot token
------BEGIN (RSA|DSA|EC|OPENSSH|PGP) PRIVATE KEY-----  # Private keys
+# AWS
+AKIA[0-9A-Z]{16}                                          # AWS Access Key ID
+amzn\.mws\.[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}  # AWS MWS
+# GitHub
+ghp_[A-Za-z0-9_]{36}                                      # GitHub PAT (classic)
+gho_[A-Za-z0-9_]{36}                                      # GitHub OAuth
+ghu_[A-Za-z0-9_]{36}                                      # GitHub user token
+ghs_[A-Za-z0-9_]{36}                                      # GitHub server token
+github_pat_[A-Za-z0-9_]{82}                                # GitHub fine-grained PAT
+# Slack
+xoxb-[0-9]{10,13}-[0-9]{10,13}-[a-zA-Z0-9]{24}           # Slack bot token
+xoxp-[0-9]{10,13}-[0-9]{10,13}-[a-zA-Z0-9]{24,34}        # Slack user token
+xoxa-[0-9]{10,13}-[0-9]{10,13}-[a-zA-Z0-9]{24,34}        # Slack app token
+xoxr-[0-9]{10,13}-[0-9]{10,13}-[a-zA-Z0-9]{24,34}        # Slack refresh token
+# Google
+AIza[0-9A-Za-z\-_]{35}                                    # Google API key
+# Stripe
+sk_live_[0-9a-zA-Z]{24,}                                  # Stripe secret key (live)
+sk_test_[0-9a-zA-Z]{24,}                                  # Stripe secret key (test)
+rk_live_[0-9a-zA-Z]{24,}                                  # Stripe restricted key
+# Other services
+SG\.[A-Za-z0-9\-_]{22}\.[A-Za-z0-9\-_]{43}               # SendGrid
+SK[0-9a-fA-F]{32}                                          # Twilio
+npm_[A-Za-z0-9]{36}                                        # npm token
+pypi-[A-Za-z0-9\-_]{100,}                                 # PyPI token
+glpat-[A-Za-z0-9\-_]{20,}                                 # GitLab PAT
+-----BEGIN (RSA|DSA|EC|OPENSSH|PGP) PRIVATE KEY-----      # Private keys
 ```
-Run Tier 1 patterns in parallel since they're independent. For Tier 2, check surrounding context before confirming.
+### Tier 2: Context-Dependent Patterns (need surrounding context to confirm)
+These match common assignment patterns. Check that the value isn't a placeholder, empty string, or env var reference before flagging:
+```
+# Generic key/secret assignments — flag if value looks real (not placeholder)
+(api_key|apikey|api-key)\s*[:=]\s*["']?[A-Za-z0-9\-_]{16,}["']?
+(secret|secret_key|client_secret)\s*[:=]\s*["']?[A-Za-z0-9\-_]{16,}["']?
+(password|passwd|pwd)\s*[:=]\s*["']?[^\s"']{8,}["']?
+(token|access_token|auth_token)\s*[:=]\s*["']?[A-Za-z0-9\-_.]{16,}["']?
+(database_url|db_url|connection_string)\s*[:=]\s*["']?[^\s"']{20,}["']?
+# Bearer tokens in code
+bearer\s+[A-Za-z0-9\-._~+/]{20,}
+# Webhook URLs with tokens
+https://hooks\.slack\.com/services/T[A-Z0-9]{8,}/B[A-Z0-9]{8,}/[a-zA-Z0-9]{24}
+```
+### What to Ignore (false positive suppression)
+Skip matches that are clearly not real secrets:
+- Values that are `""`, `''`, `None`, `null`, `undefined`, `TODO`, `CHANGEME`, `your-key-here`, `xxx`, `placeholder`, `example`, `test`, `dummy`, `fake`, `sample`
+- References to environment variables: `os.environ[`, `process.env.`, `ENV[`, `${`, `os.getenv(`
+- Lines that are comments (`#`, `//`, `/*`, `--`)
+- Files in `node_modules/`, `.git/`, `vendor/`, `__pycache__/`, `dist/`, `build/`
+- Files that are themselves `.env.example`, `.env.sample`, `.env.template`
+- Lock files (`package-lock.json`, `yarn.lock`, `Gemfile.lock`, `poetry.lock`)
+- Test fixtures where the "secret" is obviously fake (e.g., `test_api_key = "sk_test_abc123"` in a test file — but still flag `sk_live_*` in test files, those are real)
+### Running the Scan
+Use the Grep tool to search in-scope files for each pattern. Run Tier 1 patterns in parallel since they're independent. For Tier 2, check surrounding context before confirming.
+For each finding, record:
+- **file**: absolute path
+- **line**: line number
+- **pattern**: which pattern matched
+- **tier**: 1 or 2
+- **value_preview**: first 8 chars + `...` + last 4 chars (never log the full secret)
+- **context**: the surrounding code (with the secret value masked)
 ## Phase 2: Validate Findings
 For each Tier 2 match, read the surrounding context (5 lines before and after) and determine:
-1. **Is the value a real secret or a placeholder?** — Check against the ignore list in `references/patterns/SECRET-PATTERNS.md`.
+1. **Is the value a real secret or a placeholder?** — Check against the ignore list above.
 2. **Is it already using an env var?** — If the code does `key = os.environ.get("API_KEY", "sk_live_abc...")`, the hardcoded value is a fallback default. Still a finding — fallback defaults with real secrets are dangerous.
 3. **Is it in a file that should be gitignored?** — If the secret is in `.env` and `.env` is in `.gitignore`, it's fine. If `.env` is NOT in `.gitignore`, that's a separate finding.
-After validation, produce a findings list sorted by severity (CRITICAL → HIGH → MEDIUM → LOW). See `references/patterns/SECRET-PATTERNS.md` for the severity table.
+After validation, produce a findings list sorted by severity:
+| Severity | Meaning |
+|---|---|
+| **CRITICAL** | Tier 1 match (high-confidence secret) in a tracked or staged file |
+| **HIGH** | Tier 2 confirmed match in a tracked or staged file |
+| **MEDIUM** | `.env` file not in `.gitignore`, or secret in a fallback default |
+| **LOW** | Secret in a gitignored file but the gitignore rule might be fragile |
 If zero findings after validation: emit `secrets_clear` and proceed. The commit/push is safe.
@@ -120,23 +194,135 @@ If any CRITICAL or HIGH findings: **STOP. The commit/push is BLOCKED.** Say this
 ftm-git: BLOCKED — <N> secret(s) found. Commit/push halted. Attempting auto-remediation...
 ```
-Then proceed to Phase 3. The commit/push does NOT happen until Phase 3 completes and a re-scan comes back clean.
+Then proceed to Phase 3 to fix. The commit/push does NOT happen until Phase 3 completes and a re-scan in Phase 3 Step 5 comes back clean. This is non-negotiable — even if you can fix the secrets, the user needs to see that the operation was blocked and why.
 ## Phase 3: Auto-Remediate
-Read `references/protocols/REMEDIATION.md` for the full step-by-step remediation protocol, language-specific env var patterns, report formats (clean/remediated/blocked), and the Phase 5 git history deep scan procedure.
+For each finding, apply the appropriate fix automatically. The goal is to make the code safe without breaking functionality.
+### Step 1: Ensure .env infrastructure exists
-**Summary of steps:**
-1. Ensure `.env` and `.gitignore` infrastructure exists
-2. Extract each secret to `.env` with a SCREAMING_SNAKE_CASE var name
-3. Add placeholder to `.env.example`
-4. Refactor source files to reference the env var (match language pattern)
-5. Unstage `.env`, re-stage refactored source files
-6. Verify: re-run Phase 1 on refactored files — do not proceed until clean
+Check for a `.env` file in the project root. If it doesn't exist, create one with a header comment:
+```
+# Environment variables — DO NOT COMMIT THIS FILE
+# Copy .env.example for the template, fill in real values locally
+```
+Check `.gitignore` for `.env` coverage. If missing, add:
+```
+# Environment files with secrets
+.env
+.env.local
+.env.production
+.env.staging
+.env.*.local
+```
+### Step 2: Extract secrets to .env
+For each finding:
+1. **Choose an env var name** — derive it from the context. If the code says `STRIPE_API_KEY = "sk_live_..."`, the env var is `STRIPE_API_KEY`. If it says `api_key: "AIza..."`, infer from the file/service context (e.g., `GOOGLE_API_KEY`). Use SCREAMING_SNAKE_CASE.
+2. **Add to .env** — append `VAR_NAME=<actual-secret-value>` to `.env`. If the var already exists, don't duplicate it.
+3. **Add to .env.example** — create or update `.env.example` with `VAR_NAME=your-value-here` so other developers know the variable exists without seeing the real value.
+### Step 3: Refactor source files
+Replace the hardcoded secret with an env var reference. Match the language/framework:
+| Language | Pattern |
+|---|---|
+| Python | `os.environ["VAR_NAME"]` or `os.getenv("VAR_NAME")` (match existing style in file) |
+| JavaScript/TypeScript | `process.env.VAR_NAME` |
+| Ruby | `ENV["VAR_NAME"]` or `ENV.fetch("VAR_NAME")` |
+| Go | `os.Getenv("VAR_NAME")` |
+| Java | `System.getenv("VAR_NAME")` |
+| Shell/Bash | `$VAR_NAME` or `${VAR_NAME}` |
+| YAML/JSON config | `${VAR_NAME}` (if the framework supports interpolation) or add a comment pointing to the env var |
+If the file doesn't already import the env-reading module (e.g., `import os` in Python, `require('dotenv').config()` in Node), add the import. Check if the project uses `python-dotenv`, `dotenv` (Node), or similar — if so, use the project's existing pattern for loading env vars.
+### Step 4: Unstage remediated files
+After refactoring, make sure the `.env` file (with real secrets) is NOT staged:
+```bash
+git reset HEAD .env 2>/dev/null  # unstage if accidentally staged
+```
+Stage the refactored source files (which now reference env vars instead of hardcoded secrets):
+```bash
+git add <refactored-files>
+```
+### Step 5: Verify the fix
+Re-run Phase 1 scan on the refactored files to confirm the secrets are gone. If any remain, loop back and fix. Do not proceed until the scan is clean.
 ## Phase 4: Report
-After remediation or clean scan, produce the summary. Read `references/protocols/REMEDIATION.md` for the exact report formats.
+After remediation (or if the scan was clean from the start), produce a summary:
+**Clean scan:**
+```
+ftm-git: Clean scan. 0 secrets found in <N> files scanned. Safe to commit.
+```
+**After remediation:**
+```
+ftm-git: Found <N> hardcoded secrets. Auto-remediated:
+  CRITICAL: sk_live_**** in src/payments.py:42 -> STRIPE_SECRET_KEY
+  HIGH:     AIza**** in config/google.ts:18 -> GOOGLE_API_KEY
+  MEDIUM:   .env was not in .gitignore -> added
+Actions taken:
+  - Extracted <N> secrets to .env (gitignored)
+  - Created/updated .env.example with placeholder vars
+  - Refactored <N> source files to use env var references
+  - Updated .gitignore
+Verify the app still works with the new env var setup, then commit.
+```
+**Blocked (auto-fix not possible):**
+Some secrets can't be auto-fixed — for example, a private key embedded in a binary file, or a secret in a format the skill can't safely refactor. In these cases:
+```
+ftm-git: BLOCKED. Found secrets that require manual remediation:
+  CRITICAL: Private key in assets/cert.pem:1
+            -> Move this file outside the repo and reference via path env var
+  Action required: Fix the above manually, then run ftm-git again.
+```
+## Phase 5: Git History Check (Manual Invocation Only)
+When explicitly asked to do a deep scan (e.g., "scan the repo history for secrets"), also check past commits. This is expensive so it only runs on explicit request, not as part of the pre-commit gate.
+```bash
+git log --all --diff-filter=A --name-only --pretty=format:"%H" -- "*.env" "*.pem" "*.key" "*credentials*" "*secret*"
+```
+For each historically added sensitive file, check if it's still in the current tree. If it was added and later removed, warn that the secret is still in git history and suggest:
+1. Rotate the credential immediately (it's compromised)
+2. Use `git filter-repo` or BFG Repo Cleaner to purge from history if needed
+## Operating Principles
+1. **Block first, fix second.** Never let a secret through while figuring out the fix. The commit waits.
+2. **Zero false negatives over zero false positives.** It's better to flag something that turns out to be harmless than to miss a real key.
+3. **Never log full secrets.** In all output, mask secret values. Show only enough to identify which secret it is (first 8 + last 4 chars).
+4. **Env vars are the escape hatch.** The remediation pattern is always: secret goes to gitignored .env, code references the env var.
+5. **Existing patterns win.** If the project already uses dotenv, Vault, AWS Secrets Manager, or any other secret management system, match that pattern rather than introducing a new one.
+6. **Test files are not exempt.** A real `sk_live_*` key in a test file is just as dangerous as one in production code. Only `sk_test_*` with obviously fake values get a pass.
 ## Integration Points
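
To make the Phase 1 scan and the per-finding record more concrete, here is a minimal sketch using a handful of the patterns added in this diff. It is not the skill's implementation (the skill text says to use the Grep tool); the function names `scan` and `preview` are hypothetical, the Tier 2 pattern gains a capture group around the value so it can be masked, and applying the comment and env-var suppression only to Tier 2 matches is an assumption based on the Phase 2 wording.

```python
import re

# A few Tier 1 patterns copied from the diff; the full list there is longer.
TIER1 = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "github_pat_classic": re.compile(r"ghp_[A-Za-z0-9_]{36}"),
    "stripe_live_secret": re.compile(r"sk_live_[0-9a-zA-Z]{24,}"),
}

# One Tier 2 pattern from the diff, with a capture group added around the value.
TIER2 = {
    "api_key_assignment": re.compile(
        r"""(api_key|apikey|api-key)\s*[:=]\s*["']?([A-Za-z0-9\-_]{16,})["']?"""
    ),
}

# Suppression markers taken from the "What to Ignore" list.
ENV_MARKERS = ("os.environ[", "process.env.", "ENV[", "${", "os.getenv(")
COMMENT_PREFIXES = ("#", "//", "/*", "--")


def preview(secret: str) -> str:
    """First 8 chars + '...' + last 4 chars, so the full value is never logged."""
    return f"{secret[:8]}...{secret[-4:]}"


def scan(text: str, path: str = "<memory>") -> list[dict]:
    """Return findings with the file, line, pattern, tier and value_preview fields."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Tier 1: distinctive prefixes, flagged wherever they appear.
        for name, rx in TIER1.items():
            for m in rx.finditer(line):
                findings.append({"file": path, "line": lineno, "pattern": name,
                                 "tier": 1, "value_preview": preview(m.group(0))})
        # Tier 2: skip comment lines and env var references before matching.
        if line.lstrip().startswith(COMMENT_PREFIXES):
            continue
        if any(marker in line for marker in ENV_MARKERS):
            continue
        for name, rx in TIER2.items():
            for m in rx.finditer(line):
                findings.append({"file": path, "line": lineno, "pattern": name,
                                 "tier": 2, "value_preview": preview(m.group(2))})
    return findings
```

A caller would pass the contents of each in-scope file to `scan` and then sort the combined findings by severity as described in Phase 2.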

package/ftm-inbox/bin/start.sh CHANGED

@@ -1,6 +1,6 @@
 #!/bin/bash
 # Start ftm-inbox backend + pollers
-cd "$(dirname "$0")/.."
+cd "$(dirname "$0")/.." || exit
 PORT=${FTM_INBOX_PORT:-8042}
 echo "Starting ftm-inbox on port $PORT..."
 python3 -m uvicorn backend.main:app --host 0.0.0.0 --port $PORT &

package/ftm-inbox/bin/status.sh CHANGED

@@ -1,5 +1,5 @@
 #!/bin/bash
-if [ -f /tmp/ftm-inbox.pid ] && kill -0 $(cat /tmp/ftm-inbox.pid) 2>/dev/null; then
+if [ -f /tmp/ftm-inbox.pid ] && kill -0 "$(cat /tmp/ftm-inbox.pid)" 2>/dev/null; then
     echo "ftm-inbox is running (PID: $(cat /tmp/ftm-inbox.pid))"
     # Show last poll times from DB if available
     CONFIG_DIR="$HOME/.claude/ftm-inbox"

package/ftm-inbox/bin/stop.sh CHANGED

@@ -1,6 +1,6 @@
 #!/bin/bash
 if [ -f /tmp/ftm-inbox.pid ]; then
-    kill $(cat /tmp/ftm-inbox.pid) 2>/dev/null
+    kill "$(cat /tmp/ftm-inbox.pid)" 2>/dev/null
     rm /tmp/ftm-inbox.pid
     echo "ftm-inbox stopped."
 else

package/ftm-intent/SKILL.md CHANGED

@@ -190,7 +190,6 @@ When updating after changes:
 4. Write updates — add missing entries, remove stale entries, update changed fields
 5. If new modules were added, create their INTENT.md and add rows to root module map
 6. Report: list of files updated, entries added, entries removed, entries modified
 ---
 ### Auto-Invocation by ftm-executor

package/ftm-map/SKILL.md CHANGED

@@ -1,17 +1,17 @@
 ---
 name: ftm-map
-description: Persistent code knowledge graph powered by tree-sitter and SQLite with FTS5 full-text search. Builds structural dependency graphs for blast radius analysis, dependency chains, and keyword search. Use when user asks "what breaks if I change X", "blast radius", "what depends on", "where do we handle", "map codebase", "index project", "what calls", "dependency chain", "ftm-map".
+description: Persistent code knowledge graph powered by tree-sitter and SQLite with FTS5 full-text search. Uses a v2 hybrid architecture combining file-level PageRank with symbol-level blast radius analysis. Builds structural dependency graphs for blast radius, dependency chains, context selection, and keyword search. Use when user asks "what breaks if I change X", "blast radius", "what depends on", "where do we handle", "map codebase", "index project", "what calls", "dependency chain", "what's relevant for", "context for", "ftm-map".
 ---
 # ftm-map
-Persistent code knowledge graph powered by tree-sitter and SQLite with FTS5 full-text search. Parses the local codebase into a structural dependency graph stored in `.ftm-map/map.db`, then answers structural queries (blast radius, dependency chains, symbol lookup) and keyword searches without re-reading the source tree on every question.
+Persistent code knowledge graph powered by tree-sitter and SQLite with FTS5 full-text search. Uses a v2 hybrid architecture: file-level PageRank (via fast-pagerank with scipy sparse matrices) for broad relevance ranking, combined with symbol-level blast radius for precise impact analysis. Parses the local codebase using Aider-style def/ref extraction with tags.scm into a 5-table schema (files, symbols, refs, file_edges, symbol_edges) stored in `.ftm-map/map.db`, then answers structural queries (blast radius, dependency chains, context selection, symbol lookup) and keyword searches without re-reading the source tree on every question.
 ## Events
 ### Emits
 - `map_updated` — when the graph database has been updated (bootstrap or incremental)
-  - Payload: `{ project_path, symbols_count, edges_count, files_parsed, duration_ms, mode }`
+  - Payload: `{ project_path, symbols_count, edges_count, file_edges_count, reference_count, files_parsed, duration_ms, mode }`
 - `task_completed` — when any ftm-map operation finishes
 ### Listens To
@@ -43,8 +43,9 @@ Bootstrap:    "map this codebase" / "index this project" / no map.db exists yet
 Incremental:  Triggered by code_committed event or PostToolUse hook
               Parses only changed files and updates their graph entries.
-Query:        Structural or keyword question about existing graph
+Query:        Structural, keyword, or context question about existing graph
               Detects query type and runs appropriate script.
+              Includes context selection for token-budgeted file retrieval.
 ```
 If `.ftm-map/map.db` does not exist when a query arrives, fall back to offering bootstrap (see Graceful Degradation below).
@@ -98,15 +99,24 @@ Trigger: user asks a structural or keyword question about the codebase.
 | "find X in the codebase" | FTS5 keyword search | `--search "X"` |
 | "tell me about function X" | symbol info | `--info X` |
 | "show dependencies for X" | dependency chain | `--deps X` |
+| "what's relevant for X" | context selection | `--context --seed-keywords X` |
+| "context for X" | context selection | `--context --seed-keywords X` |
+| "important files for X" | context selection | `--context --seed-files X` |
+| "what should I look at for X" | context selection | `--context --seed-keywords X` |
+| "show stats" / "how big is the index" | statistics | `--stats` |
 ### Execution
 Run the appropriate query script with the venv python:
 ```
-ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --blast-radius <symbol>
-ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --deps <symbol>
-ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --search "<keywords>"
-ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --info <symbol>
+ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --blast-radius <symbol> --project-root .
+ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --deps <symbol> --project-root .
+ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --search "<keywords>" --project-root .
+ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --info <symbol> --project-root .
+ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --context --seed-files src/auth.py --token-budget 4000 --project-root .
+ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --context --seed-keywords authenticate --project-root .
+ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --context --seed-symbols handleAuth --token-budget 8000 --project-root .
+ftm-map/scripts/.venv/bin/python3 ftm-map/scripts/query.py --stats --project-root .
 ```
 ### Output Formatting
@@ -151,9 +161,29 @@ Symbol: authenticateUser
   Signature:  authenticateUser(token: string, opts?: AuthOptions) → Promise<Session>
   Callers:    3 direct, 5 transitive
   Callees:    validateToken, decodeJWT, createSession
+  References: 12 across codebase
   Dependents: 8 symbols total
 ```
+**Context selection** — PageRank-ranked files with token budget:
+```
+Context for "authenticate" (budget: 4000 tokens):
+  1. src/auth/index.ts          score: 0.142   tokens: 850
+  2. src/handlers/auth.ts       score: 0.098   tokens: 620
+  3. src/middleware/session.ts   score: 0.076   tokens: 540
+  Total: 2010 / 4000 tokens
+```
+**Stats** — database overview:
+```
+Index statistics:
+  Files:        42
+  Symbols:      318
+  References:   1204
+  File edges:   86
+  Symbol edges: 542
+```
 ## Graceful Degradation
 If `.ftm-map/map.db` does not exist when a query is requested:
@@ -170,11 +200,12 @@ All heavy lifting is done by Python scripts in `ftm-map/scripts/`. The skill orc
 | Script | Purpose |
 |--------|---------|
 | `setup.sh` | Creates virtualenv, installs tree-sitter and dependencies |
-| `db.py` | SQLite schema, CRUD operations, graph traversal queries |
-| `parser.py` | tree-sitter parsing and symbol/edge extraction |
-| `index.py` | Full bootstrap scan and incremental file indexing |
-| `query.py` | Blast radius, dependency chain, FTS5 keyword search, symbol info |
-| `views.py` | INTENT.md and .mmd generation from graph data |
+| `db.py` | 5-table SQLite schema (files, symbols, refs, file_edges, symbol_edges), CRUD, graph traversal |
+| `parser.py` | Aider-style def/ref extraction via tree-sitter tags.scm queries |
+| `index.py` | Full bootstrap scan and incremental file indexing with Aider weight heuristics |
+| `query.py` | Blast radius, dependency chain, FTS5 search, symbol info, context selection, stats |
+| `ranker.py` | PageRank-based file ranking with fast-pagerank and scipy sparse matrices |
+| `views.py` | INTENT.md and ARCHITECTURE.mmd generation from the 5-table graph |
 Always use the venv python — never the system python — to ensure tree-sitter bindings are available:
 ```
@@ -215,6 +246,7 @@ After `map_updated` or session end:
 - tool: `ftm-map/scripts/index.py` | required | bootstrap and incremental indexer
 - tool: `ftm-map/scripts/query.py` | required | blast radius, dependency, and FTS5 search queries
 - tool: `ftm-map/scripts/views.py` | required | INTENT.md and .mmd diagram generation from graph
+- tool: `ftm-map/scripts/ranker.py` | required | PageRank file ranking with fast-pagerank and scipy
 - tool: `git` | optional | changed file detection for incremental mode
 - config: `~/.claude/ftm-config.yml` | optional | model profile and skills.ftm-map.enabled flag
@@ -255,5 +287,5 @@ After `map_updated` or session end:
 ### task_completed
 - skill: string — "ftm-map"
 - operation: string — "bootstrap" | "incremental" | "query"
-- query_type: string | null — "blast-radius" | "deps" | "search" | "info" (for query mode)
+- query_type: string | null — "blast-radius" | "deps" | "search" | "info" | "context" | "stats" (for query mode)
 - duration_ms: number — total operation duration
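
The context selection added in this version (file-level PageRank plus a token budget) can be illustrated with a small dependency-free sketch. The package itself is described as using fast-pagerank with scipy sparse matrices; the plain power iteration below is a stand-in for that, and the function names, the edge representation, and the greedy "skip files that do not fit" policy are all assumptions rather than the behaviour of `ranker.py`.

```python
def pagerank(nodes, edges, seeds=(), damping=0.85, iterations=50):
    """Power-iteration PageRank over weighted file edges {(src, dst): weight}."""
    # Teleport vector: concentrated on seed files when given, uniform otherwise.
    targets = [s for s in seeds if s in nodes] or list(nodes)
    teleport = {f: (1.0 / len(targets) if f in targets else 0.0) for f in nodes}

    # Total outgoing weight per file, used to normalise each edge.
    out_weight = {f: 0.0 for f in nodes}
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight

    rank = dict(teleport)
    for _ in range(iterations):
        # Files without outgoing edges hand their rank back to the teleport vector.
        dangling = sum(rank[f] for f in nodes if out_weight[f] == 0.0)
        nxt = {f: (1 - damping + damping * dangling) * teleport[f] for f in nodes}
        for (src, dst), weight in edges.items():
            nxt[dst] += damping * rank[src] * weight / out_weight[src]
        rank = nxt
    return rank


def select_context(rank, token_counts, budget):
    """Greedily take the highest-ranked files that still fit in the token budget."""
    chosen, used = [], 0
    for path in sorted(rank, key=rank.get, reverse=True):
        cost = token_counts[path]
        if used + cost <= budget:
            chosen.append(path)
            used += cost
    return chosen, used


# Hypothetical three-file project: a.py and c.py both reference b.py, b.py references a.py.
files = ["a.py", "b.py", "c.py"]
file_edges = {("a.py", "b.py"): 1.0, ("c.py", "b.py"): 1.0, ("b.py", "a.py"): 1.0}
scores = pagerank(files, file_edges)
picked, total = select_context(scores, {"a.py": 500, "b.py": 800, "c.py": 300}, budget=1200)
```

Passing `seeds` biases the ranking toward files near the seed, which is one plausible reading of the `--seed-files` option shown in the diff.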