@heytherevibin/skillforge 0.8.0 → 0.10.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -6,61 +6,78 @@
6
6
  <a href="https://github.com/heytherevibin/skillforge/actions/workflows/ci.yml"><img src="https://github.com/heytherevibin/skillforge/actions/workflows/ci.yml/badge.svg" alt="CI" /></a>
7
7
  </p>
8
8
 
9
- **Skillforge** is a **skill orchestration co-tool for Claude** (and other MCP hosts). It keeps a catalog of **`SKILL.md`** skills, **routes** the few that match each task using **local embeddings** and an optional **Haiku** step, and returns their bodies for **injection into the host model**. Optional **SQLite** learning improves routing over time.
9
+ **Skillforge** is a **local-first orchestration layer** for agent workflows: it maintains a catalog of **`SKILL.md`** documents, **routes** a small subset per task using **embedding-first retrieval**, optional **hybrid** sparse signals and **LLM** stages, optional **project-scoped** policies and notes, and returns **structured context** for downstream models. The **primary integration** is **stdio MCP**; the **CLI** provides parity, operations, and automation hooks.
10
10
 
11
- **Primary interface:** **stdio MCP** (`skillforge mcp`) add it to Claude Desktop, Cursor, or Claude Code.
12
-
13
- **Observability:** run **`skillforge events --watch`** in a terminal (top skills, active sessions, and live **route** / **feedback** lines from SQLite).
11
+ **Published version:** see **`package.json`** and the npm badge above (they should match after each release). **Change history:** [CHANGELOG.md](CHANGELOG.md). **Product direction:** [STRATEGY.md](STRATEGY.md). **Vulnerability reporting:** [SECURITY.md](SECURITY.md). **Release process:** [RELEASING.md](RELEASING.md).
14
12
 
15
13
  ---
16
14
 
17
15
  ## Table of contents
18
16
 
19
- - [Why Skillforge](#why-skillforge)
17
+ - [What Skillforge provides](#what-skillforge-provides)
18
+ - [Architecture at a glance](#architecture-at-a-glance)
20
19
  - [Requirements](#requirements)
21
20
  - [Quick start](#quick-start)
22
21
  - [Installation](#installation)
23
- - [Usage](#usage)
24
- - [Run modes](#run-modes)
25
- - [Model Context Protocol (MCP)](#model-context-protocol-mcp)
26
- - [MCP response contract](#mcp-response-contract)
27
- - [Skills and packs](#skills-and-packs)
22
+ - [Operational interfaces](#operational-interfaces)
23
+ - [Model Context Protocol (MCP)](#model-context-protocol-mcp)
24
+ - [MCP response contract](#mcp-response-contract)
28
25
  - [Routing pipeline](#routing-pipeline)
26
+ - [Route policies and project overlay](#route-policies-and-project-overlay)
27
+ - [Project RAG](#project-rag)
28
+ - [Learning, weights, and portability](#learning-weights-and-portability)
29
+ - [Skills and packs](#skills-and-packs)
29
30
  - [Configuration](#configuration)
30
- - [Local data and operations](#local-data-and-operations)
31
- - [Security considerations](#security-considerations)
31
+ - [Local data and paths](#local-data-and-paths)
32
+ - [Security](#security)
32
33
  - [Contributing and governance](#contributing-and-governance)
33
- - [Releases and maintainers](#releases-and-maintainers)
34
34
  - [License](#license)
35
35
 
36
36
  ---
37
37
 
38
- ## Why Skillforge
38
+ ## What Skillforge provides
39
39
 
40
- | Capability | Description |
41
- |------------|-------------|
42
- | **Focused context** | Injects only a small set of skill documents per turn instead of the full catalog. |
43
- | **Hybrid routing** | Embedding shortlist plus a fast **Claude Haiku** routing step for final selection. |
44
- | **Adaptation** | Re-routes when the conversation topic shifts (configurable threshold). |
45
- | **Learning loop** | Optional weights from usage and explicit feedback improve routing over time. |
46
- | **Observability** | **`skillforge events`**: snapshots of **usage** + **active sessions**, **`--watch`** for realtime; **`--verbose`** for route detail. No browser UI. |
47
- | **Project bootstrap** | MCP tools **`materialize_project`** and **`skillforge_bootstrap`** write `.cursor/rules`, **`docs/SKILLFORGE-PRD.md`**, and a **`CLAUDE.md`** section (map **`/skillforge`** in rules to MCP tools). |
48
- | **Extensibility** | Custom skills, git-based **packs**, and overrides under a single user config directory. |
49
- | **Deployment flexibility** | **MCP stdio** as the default integration; **CLI** helpers (`route`, `events`, `index`). |
40
+ | Area | Capability |
41
+ |------|------------|
42
+ | **Context control** | Returns only **relevant** skill (and optional project) chunks instead of an entire catalog. |
43
+ | **Routing** | Dense embeddings on skill **cards** (title, description, optional triggers); optional **keyword / BM25** fusion; optional **LLM** rerank and final pick—or **embedding-only** or **host-delegated** selection. |
44
+ | **Conversation-aware retrieval** | Recent turns can influence the **shortlist query** when enabled via environment (see [Configuration](#configuration)). |
45
+ | **Governance** | Regex **policies** to append skills after routing; **project overlay** for excludes, score boosts, and **project notes** (notes require a declared **project root**). |
46
+ | **Adaptation** | Per-user **SQLite** statistics and explicit feedback adjust routing over time (portable via **export/import**). |
47
+ | **Project grounding** | Optional **index** of repository text into the same SQLite DB used for sessions (**project RAG**). |
48
+ | **Observability** | Versioned **`_meta`** on MCP responses; route **events** in SQLite; **`skillforge events`** for operators. |
49
+ | **Reliability hooks** | **`skillforge health`** (preflight) and **`skillforge route-eval`** (fixture-driven smoke checks; used in CI). |
50
50
 
51
- Bundled content includes **200+** curated skills (coding, security, research, frontend/backend patterns, and more). Exact counts are validated in CI.
51
+ ---
52
+
53
+ ## Architecture at a glance
54
+
55
+ ```
56
+ Host (Claude / Cursor / Claude Code / …)
57
+ │ MCP JSON-RPC (stdio)
58
+
59
+ ┌───────────────────────────────────────────────────────────┐
60
+ │ skillforge mcp → Python: embed → shortlist → optional │
61
+ │ LLM stages → policies/overlay → context assembly │
62
+ └───────────────────────────────────────────────────────────┘
63
+
64
+ ├── SQLite (global ~/.skillforge or <project>/.skillforge)
65
+ └── Optional: Anthropic API (Haiku) in-process when enabled
66
+ ```
67
+
68
+ **Trust boundary:** Skillforge runs **on the operator’s machine** (or your CI runner). Prompts and retrieved text should be handled per your org’s data policy. See [Security](#security).
52
69
 
53
70
  ---
54
71
 
55
72
  ## Requirements
56
73
 
57
- | Dependency | Version | Notes |
58
- |------------|---------|--------|
59
- | **Node.js** | **>= 18** | Required for the CLI bootstrapper. Continuous integration runs on **Node 22**. |
60
- | **Python** | **>= 3.10** | Used on the host PATH for embeddings and routing (via the CLI-spawned **venv**). |
61
- | **Anthropic API** | | **`ANTHROPIC_API_KEY`** enables the **full** (Haiku) router when you want it. **MCP** can run **without** it using **embedding-only** routing (default when the key is omitted; see [MCP](#model-context-protocol-mcp)). |
74
+ | Dependency | Notes |
75
+ |------------|--------|
76
+ | **Node.js** | **≥ 18** (CLI bootstrap). CI validates on **Node 22** (see `.github/workflows/ci.yml`). |
77
+ | **Python** | **≥ 3.10**; the CLI creates **`~/.skillforge/venv`** and installs `python/requirements.txt`. |
78
+ | **Anthropic API** | **Optional.** Without **`ANTHROPIC_API_KEY`**, routing stays **embedding-first** unless you delegate picks to the host. |
62
79
 
63
- **First run:** The CLI creates **`~/.skillforge/`**, a dedicated **Python venv**, installs Python dependencies, and caches the default embedding model (typically on the order of one to two minutes once; subsequent starts are fast).
80
+ **First run** installs the virtualenv and Python dependencies and may download the default sentence-transformer model once; subsequent starts are typically fast.
64
81
 
65
82
  ---
66
83
 
@@ -70,15 +87,25 @@ Bundled content includes **200+** curated skills (coding, security, research, fr
70
87
  npx --yes @heytherevibin/skillforge --help
71
88
  ```
72
89
 
73
- Add Skillforge to your MCP config (see [MCP](#model-context-protocol-mcp)). No `ANTHROPIC_API_KEY` is required for **embedding-only** routing.
90
+ Configure **MCP** in your host (see [Model Context Protocol](#model-context-protocol-mcp)). **Embedding-only** operation does not require an Anthropic key.
74
91
 
75
- Live log (usage + routes): **`skillforge events --watch`**.
92
+ **Operator visibility:**
93
+
94
+ ```bash
95
+ skillforge events --watch
96
+ ```
97
+
98
+ **Preflight (after install):**
99
+
100
+ ```bash
101
+ skillforge health --quick
102
+ ```
76
103
 
77
104
  ---
78
105
 
79
106
  ## Installation
80
107
 
81
- **One-shot (recommended for evaluation)**
108
+ **Evaluate without global install**
82
109
 
83
110
  ```bash
84
111
  npx --yes @heytherevibin/skillforge --help
@@ -91,280 +118,261 @@ npm install -g @heytherevibin/skillforge
91
118
  skillforge --help
92
119
  ```
93
120
 
94
- Package on npm: [@heytherevibin/skillforge](https://www.npmjs.com/package/@heytherevibin/skillforge).
95
- Source and issues: [github.com/heytherevibin/skillforge](https://github.com/heytherevibin/skillforge).
121
+ - **npm:** [@heytherevibin/skillforge](https://www.npmjs.com/package/@heytherevibin/skillforge)
122
+ - **Source / issues:** [github.com/heytherevibin/skillforge](https://github.com/heytherevibin/skillforge)
96
123
 
97
124
  ---
98
125
 
99
- ## Usage
100
-
101
- ### Run modes
102
-
103
- | Command | Purpose |
104
- |---------|---------|
105
- | `skillforge --help` | Recommended first step; **MCP** is the main integration. |
106
- | `skillforge mcp` | **stdio** MCP server (Claude, Cursor, …). |
107
- | `skillforge events [--watch]` | **Terminal** log: usage snapshot + routes; see **`skillforge events --help`**. |
108
- | `skillforge route […]` | **Terminal** routing — same pipeline as MCP **`route_skills`** (loads embed model); see **`skillforge route --help`**. |
109
- | `skillforge mcp config [--local] [--with-anthropic]` | **stdout**: JSON snippet for **`mcp.json`** (merge manually). |
110
-
111
- ### Model Context Protocol (MCP)
112
-
113
- You can run the MCP server **without** `ANTHROPIC_API_KEY`: routing uses **embeddings + shortlist only** (no Haiku call). The host still uses its own billing for the conversation.
114
-
115
- | `SKILLFORGE_ROUTER_MODE` | `ANTHROPIC_API_KEY` | MCP routing |
116
- |--------------------------|---------------------|-------------|
117
- | *(unset)* — auto | omitted | Embedding-only (keyless) |
118
- | *(unset)* — auto | set | Full router (Haiku) |
119
- | `embedding` | either | Embedding-only |
120
- | `full` | set recommended | Full router (Haiku); falls back on API errors |
126
+ ## Operational interfaces
121
127
 
122
- Add to your MCP host configuration (paths vary by product). Example for Claude Desktop on macOS (`claude_desktop_config.json`) **without** an extra API key:
128
+ Skillforge is organized around a small **CLI** surface (implemented in **Node** spawning **Python** modules). Use **`skillforge <command> --help`** for flags.
123
129
 
124
- ```json
125
- {
126
- "mcpServers": {
127
- "skillforge": {
128
- "command": "npx",
129
- "args": ["-y", "@heytherevibin/skillforge", "mcp"]
130
- }
131
- }
132
- }
133
- ```
130
+ | Group | Commands | Purpose |
131
+ |-------|----------|--------|
132
+ | **Core** | `mcp`, `route`, `events`, `index` | Primary routing, logs, project indexing. |
133
+ | **Reliability** | `health`, `route-eval` | Preflight checks; embedding-mode fixture evaluation (CI uses both). |
134
+ | **Learning portability** | `weights export`, `weights import` | Snapshot / restore **`skill_weights`** rows (JSON). |
135
+ | **Catalog** | `skills`, `pack` | User skills and git-backed **packs**. |
136
+ | **Setup** | `install`, `hosts init`, `reset` | Bootstrap venv, global `/skillforge` commands, wipe local DB state. |
134
137
 
135
- With **Haiku** routing (uses your Anthropic key in the MCP process):
138
+ **MCP config snippet (stdout):**
136
139
 
137
- ```json
138
- {
139
- "mcpServers": {
140
- "skillforge": {
141
- "command": "npx",
142
- "args": ["-y", "@heytherevibin/skillforge", "mcp"],
143
- "env": {
144
- "ANTHROPIC_API_KEY": "sk-ant-..."
145
- }
146
- }
147
- }
148
- }
140
+ ```bash
141
+ skillforge mcp config
142
+ # Optional: --local (checkout), --with-anthropic (env placeholder)
149
143
  ```
150
144
 
151
- **If the server shows as connected but lists “No tools”:** the MCP host only understands **JSON-RPC on stdout**. Older Skillforge builds printed setup text to **stdout**, which breaks **`tools/list`**. Update the npm package, run **`skillforge install`** once if needed, then **fully quit and reopen** Claude / Cursor. The CLI now sends banners and pip output to **stderr** only.
152
-
153
- **MCP tools exposed**
145
+ **Important:** MCP requires a **clean stdout** stream (JSON-RPC). Logs belong on **stderr**. If tools do not appear in the host, update the package and fully restart the host after setup.
154
146
 
155
- | Tool | Purpose |
156
- |------|---------|
157
- | `route_skills` | Returns routed **`SKILL.md`** context (chunks or full body). Pass **`project_root`** for per-repo SQLite under **`.skillforge/orchestrator.db`**. Optional **`include_project_rag`** (after **`skillforge index --project-root=…`**), **`session_id`**, **`user_id`** / **`SKILLFORGE_MCP_USER_ID`**, or env **`SKILLFORGE_PROJECT_ROOT`**. Route **`event.policy`** in SQLite logs policy merge audit when rules apply. |
158
- | `search_skills` | Embedding-only shortlist for a **`query`** (scores + description snippets); does not run Haiku or mutate sessions. Optional **`limit`** (max 50). |
159
- | `explain_route` | Same routing signal as **`route_skills`** without writing SQLite (**`picked_before_policy`**, **`picked_after_policy`**, shortlist facets, policy audit). For debugging. |
160
- | `get_skill` | Fetch one catalog skill by **`skill_name`**; **`format`**: **`full`** or **`summary`**; optional **`max_chars`**. |
161
- | `list_skills` | Catalog overview; optional **`user_id`** scopes usage stats. |
162
- | `skill_feedback` | Feedback for the learning loop; optional **`user_id`**, **`session_id`** (stored with events). |
163
- | `skill_referenced` | Mark a routed skill as **used** in the reply (increments **`referenced`** + weight; optional **`user_id`**). |
164
- | `disable_skill` | Toggle skills; optional **`user_id`**. |
165
- | `materialize_project` | Writes **`.cursor/rules/skillforge.mdc`**, **`docs/SKILLFORGE-PRD.md`**, updates **`CLAUDE.md`** (Skillforge block). Args: **`project_root`**, **`skill_names`** from **`route_skills`**. |
166
- | `skillforge_bootstrap` | **`route_skills`** + **`materialize_project`** in one call (needs **`project_root`**). |
147
+ ---
167
148
 
168
- ### MCP response contract
149
+ ## Model Context Protocol (MCP)
169
150
 
170
- Structured diagnostics for tool **`route_skills`** live in **`result._meta`** (hosts may ignore or log them):
151
+ ### Router modes (`SKILLFORGE_ROUTER_MODE`)
171
152
 
172
- - **`schema_version`**: **`1.4`** same as **1.3** plus optional **`context_redaction`** (`enabled`, `secret_hits`, `path_hits`) on **`route_skills`** success.
173
- - **`sources`**: citations; each item has **`kind`** (`skill` or **`file`**), **`ref`**, **`line_start`** / **`line_end`**, **`score`**, optional **`mmr_rank`** after fusion.
174
- - **`fusion`**: present when MMR fusion ran (**`enabled`: true**): **`lambda`**, **`budget_chars`**, **`pool_skill`**, **`pool_project`**, **`mmr_trace`**,
175
- - **`context_redaction`**: optional; when scrubbing is enabled, reports **`enabled`**, **`secret_hits`**, **`path_hits`** for exported context.
176
- - **`budget`**: **`chars_skill_bodies`**, **`chars_project_chunks`**, **`chars_context_items_total`**, **`chars_response_total`**, **`est_tokens_approx`** (rough `chars/4`).
177
- - **`candidates_preview`**: up to 15 shortlist entries **`{ name, score }`** for debugging.
178
- - **`picked`**, **`reasoning`**, **`session_id`**, **`user_id`**, **`rerouted`**, **`change_pct`**, **`route_ms`**, **`orchestrator_db`**.
153
+ | Mode / default | Anthropic key | Behavior |
154
+ |----------------|---------------|----------|
155
+ | *(unset)* **auto** | optional | Embedding-first when key absent; full LLM routing when key present. |
156
+ | `embedding` | ignored for routing | No in-process LLM pick; top candidates drive selection. |
157
+ | `full` | recommended | LLM final pick; falls back per implementation on errors. |
158
+ | `host` | optional for pick | **Two-step:** first `route_skills` returns shortlist; second call passes **`picked_names`**. |
179
159
 
180
- On validation errors (e.g. empty **`prompt`**), the tool returns **`isError`: true** and **`_meta.error`** (e.g. **`empty_prompt`**), still with **`schema_version`** and **`sources`: `[]`**.
160
+ ### MCP tools (summary)
181
161
 
182
- `/skillforge` is not registered by npm installs; add a **Cursor rule** or **CLAUDE.md** instruction so the agent calls these tools when the user asks.
162
+ | Tool | Role |
163
+ |------|------|
164
+ | `route_skills` | Main routing: **prompt**, optional **conversation**, **`project_root`**, **`include_project_rag`**, **`session_id`**, **`user_id`**, **`picked_names`** (host or override). |
165
+ | `search_skills` | Embedding shortlist for a **query** (read-only). |
166
+ | `explain_route` | Diagnostics: shortlist + picks + policy audit **without** writing sessions. |
167
+ | `get_skill`, `list_skills` | Catalog access. |
168
+ | `skill_feedback`, `skill_referenced`, `disable_skill` | Learning loop and toggles. |
169
+ | `materialize_project`, `skillforge_bootstrap` | Project file materialization (bootstrap **errors** in `host` mode by design—use two-step routing + materialize). |
183
170
 
184
- Route events go to **`~/.skillforge/data/orchestrator.db`** (or per-repo **`.skillforge/orchestrator.db`** when **`project_root`** is set); use **`skillforge events`** to inspect them.
171
+ Full argument lists: tool definitions in **`python/app/mcp_server.py`** (source of truth).
185
172
 
186
173
  ---
187
174
 
188
- ## Skills and packs
189
-
190
- **Bundled skills** ship inside the package. List them:
191
-
192
- ```bash
193
- skillforge skills list
194
- ```
195
-
196
- **Custom skill** layout: a directory containing **`SKILL.md`** with YAML frontmatter at minimum:
175
+ ## MCP response contract
197
176
 
198
- ```yaml
199
- ---
200
- name: my-skill
201
- description: Clear trigger conditions—used by the router.
202
- triggers: When the user asks about X or mentions Y.
203
- anti_triggers: Not for production deploy checks.
204
- ---
205
- # My Skill
206
- ```
177
+ Successful **`route_skills`** responses include **`_meta`** built in **`python/app/mcp_contract.py`**. The **`schema_version`** string tracks additive JSON shape changes (hosts may rely on it for parsing).
207
178
 
208
- Optional **`triggers`** / **`anti_triggers`** strings are embedded with the summary card and shown to the Haiku router (they do not change chunk RAG, which still keys off the current user message).
179
+ **Authoritative version:** constant **`MCP_RESPONSE_SCHEMA_VERSION`** in **`app/mcp_contract.py`** (do not rely on this README if the two drift).
209
180
 
210
- Register with `skillforge skills add ./my-skill` or copy the folder to **`~/.skillforge/skills/`**.
181
+ **Notable `_meta` fields (non-exhaustive):**
211
182
 
212
- **Skill packs** are git repositories with a root **`skillforge.json`** manifest listing skill folder names. Install:
183
+ | Field | Description |
184
+ |-------|-------------|
185
+ | `schema_version` | Contract version string. |
186
+ | `sources`, `budget` | Chunk citations and size accounting. |
187
+ | `fusion` | Present when MMR-style fusion ran. |
188
+ | `context_redaction` | Redaction hit counts when enabled. |
189
+ | `route_quality` | Shortlist / router / policy / session telemetry for calibration. |
190
+ | `feedback_effect` | Per-pick **learned weight** snapshot (uses / thumbs / reference rate). |
191
+ | `routing_overlay` | Audit of **exclude** / **boost** / **project notes** application when configured. |
192
+ | `host_pick_shortlist`, `host_pick_candidates` | Host-pick phase payloads. |
213
193
 
214
- ```bash
215
- skillforge pack install <org/repo>
216
- skillforge pack install https://example.com/repo.git
217
- skillforge pack list
218
- skillforge pack update <name>
219
- skillforge pack remove <name>
220
- ```
194
+ Structured errors (e.g. empty prompt) return **`isError`: true** with **`_meta.error`** and **`schema_version`**.
221
195
 
222
196
  ---
223
197
 
224
198
  ## Routing pipeline
225
199
 
226
200
  ```
227
- User prompt (+ optional recent conversation for the shortlist query)
228
- Local embeddings (sentence-transformers) on skill **cards** (title, description, optional triggers)
229
- Cosine similarity ± hybrid keyword/BM25 fusion + per-user weights
230
- Top-K candidates
231
- → Optional Haiku **rerank** on the shortlist (`SKILLFORGE_HAIKU_RERANK`)
232
- Router model (Haiku) selects final active skills *or* embedding-only mode takes top-N from candidates
233
- Skill bodies injected; response model answers (e.g. Opus)
234
- → Usage signals update weights (optional)
201
+ User prompt (+ optional conversation-aware routing query)
202
+ Encode routing query (skill cards + optional hybrid sparse signal)
203
+ Fuse scores + per-user weights + optional project overlay boosts
204
+ Shortlist (top-K)
205
+ → Optional LLM rerank / final pick (or embedding / host selection)
206
+ Assemble context (skill chunks ± project chunks, optional fusion)
207
+ Return markdown + _meta; optional SQLite events
235
208
  ```
236
209
 
237
- Re-route: when overlap between successive active sets falls below a configurable threshold, the pipeline selects a new set for the next turn. Events are stored in SQLite; stream them with **`skillforge events --watch`**.
210
+ Re-routing when the active skill set changes significantly is controlled by **`SKILLFORGE_REROUTE_THRESHOLD`** (see [Configuration](#configuration)).
238
211
 
239
212
  ---
240
213
 
241
- ## Route policies (optional)
214
+ ## Route policies and project overlay
215
+
216
+ ### Regex policies (post-pick merge)
217
+
218
+ Rules match the user **`prompt`** with **`re.search`** (**`re.DOTALL`**). Matched **`include`** skills append after the router, capped by **`SKILLFORGE_MAX_ACTIVE`**. Audit lands on route **events** under **`policy`**.
242
219
 
243
- Rules use **`if_text_matches`** as a Python **`re.search`** pattern (with **`re.DOTALL`**) on the user **`prompt`**. **`include`** is a skill name or list of names. Matched skills are **appended** after Haiku/embedding picks until **`SKILLFORGE_MAX_ACTIVE`**.
220
+ **Load order:** **`SKILLFORGE_ROUTE_POLICIES`** (inline JSON) **`SKILLFORGE_ROUTE_POLICIES_FILE`** **`<project_root>/.skillforge/policies.json`** **`<project_root>/skillforge-policies.json`**.
244
221
 
245
- **Load order:** env **`SKILLFORGE_ROUTE_POLICIES`** (inline JSON) → **`SKILLFORGE_ROUTE_POLICIES_FILE`** → **`<project_root>/.skillforge/policies.json`** → **`<project_root>/skillforge-policies.json`**.
222
+ ### Project routing overlay (same JSON document)
246
223
 
247
- Example **`.skillforge/policies.json`**:
224
+ Optional keys alongside **`rules`**:
225
+
226
+ | Key | Aliases | Purpose |
227
+ |-----|---------|--------|
228
+ | `exclude_skills` | `host_exclude`, `denylist` | Remove skills from the embedding shortlist. |
229
+ | `routing_boosts` | `skill_boosts` | Additive score delta after learned weight (clamped; see **route_policies** module). |
230
+ | `project_notes` | `routing_notes`, `rag_notes` | Free text **prepended** to the internal routing query when **`project_root`** is set (not applied without a project root—mitigates accidental global injection from shared policy files). |
231
+
232
+ **Example fragment** (illustrative—adjust skill ids to your catalog):
248
233
 
249
234
  ```json
250
235
  {
251
236
  "rules": [
252
237
  {
253
- "if_text_matches": "(?i)(auth|oauth|jwt|password|login)",
238
+ "if_text_matches": "(?i)(auth|oauth|jwt)",
254
239
  "include": ["security-review"]
255
240
  }
256
- ]
241
+ ],
242
+ "project_notes": "Service stack and conventions for this repo (short, factual).",
243
+ "routing_boosts": { "python-testing": 0.15 },
244
+ "exclude_skills": ["legacy-skill-id"]
257
245
  }
258
246
  ```
259
247
 
260
248
  ---
261
249
 
250
+ ## Project RAG
251
+
252
+ 1. **Index** repository text into **`<project>/.skillforge/orchestrator.db`**:
253
+
254
+ ```bash
255
+ skillforge index --project-root=/path/to/repo
256
+ ```
257
+
258
+ 2. Call **`route_skills`** with **`project_root`** and **`include_project_rag`** (or CLI **`--include-project-rag`**) when embeddings and schema match (see **`project_index.py`** for model/dimension guards).
259
+
260
+ Chunk caps and ignore rules are **environment-driven** (see configuration table).
261
+
262
+ ---
263
+
264
+ ## Learning, weights, and portability
265
+
266
+ - **Signals:** route **`uses`**, **`skill_referenced`**, **`skill_feedback`** (thumbs), and **`disable_skill`** feed **SQLite** **`skill_weights`**.
267
+ - **Transparency:** **`_meta.feedback_effect`** summarizes per-pick weight context after the route’s **`uses`** increment.
268
+ - **Portability:**
269
+
270
+ ```bash
271
+ skillforge weights export -o weights.json
272
+ skillforge weights import weights.json
273
+ ```
274
+
275
+ Use **`--project-root`** / **`--user-id`** / **`--replace-user`** as documented in **`skillforge weights --help`** (implemented in **`python/app/weights_cli.py`**).
276
+
277
+ ---
278
+
279
+ ## Skills and packs
280
+
281
+ **Bundled catalog** ships inside the npm package. **CI** enforces a **minimum** bundled **`SKILL.md`** count so releases cannot silently ship an empty tree—the threshold lives in **`ci/bundle-gate.json`** (`minSkillMdFiles`); **`.github/workflows/ci.yml`** reads that file at build time.
282
+
283
+ **Custom skills:** directory with **`SKILL.md`** and YAML frontmatter (`name`, `description`; optional **`triggers`** / **`anti_triggers`**).
284
+
285
+ ```bash
286
+ skillforge skills add ./path/to/skill
287
+ ```
288
+
289
+ **Packs:** repositories with **`skillforge.json`**:
290
+
291
+ ```bash
292
+ skillforge pack install <org/repo>
293
+ skillforge pack list
294
+ ```
295
+
296
+ ---
297
+
262
298
  ## Configuration
263
299
 
264
- Environment variables (see also inline help and server defaults):
265
-
266
- | Variable | Default | Role |
267
- |----------|---------|------|
268
- | `ANTHROPIC_API_KEY` | | **Optional** for MCP: omit for embedding-only routing (default when unset); set for **full** (Haiku) routing. |
269
- | `SKILLFORGE_ROUTER_MODE` | *(auto)* | `full` = always use Haiku for final pick (requires key). `embedding` = skip Haiku; top `SKILLFORGE_MAX_ACTIVE` from shortlist. Unset = **auto**: embedding-only when `ANTHROPIC_API_KEY` is absent, else full. |
270
- | `SKILLFORGE_EMBED_MODEL` | `all-MiniLM-L6-v2` | Embedding model id. |
271
- | `SKILLFORGE_ROUTER_MODEL` | `claude-haiku-4-5-20251001` | Routing model (Haiku). |
272
- | `SKILLFORGE_TOP_K` | `15` | Embedding shortlist size. |
273
- | `SKILLFORGE_MAX_ACTIVE` | `7` | Maximum skills injected per turn. |
274
- | `SKILLFORGE_REROUTE_THRESHOLD` | `0.4` | Re-route sensitivity (Jaccard distance). |
275
- | `SKILLFORGE_ROUTER_CONV_MAX_TURNS` | `0` | Include this many recent **conversation** messages in the **embedding shortlist** query (`0` = current user message only, legacy). |
276
- | `SKILLFORGE_ROUTER_CONV_MSG_CHARS` | `320` | Max characters per message when building the shortlist query. |
277
- | `SKILLFORGE_ROUTER_HYBRID` | `off` | `off` = dense cosine only. `keyword` = fuse with token overlap on skill cards. `bm25` = fuse with **BM25** (requires **`rank-bm25`**; falls back to keyword if missing). |
278
- | `SKILLFORGE_ROUTER_HYBRID_ALPHA` | `0.72` | Hybrid weight on **dense** similarity (`1` = dense only; `0` = sparse only). |
279
- | `SKILLFORGE_ROUTER_PROMPT_HISTORY_MSGS` | `8` | Max conversation turns sent to the **Haiku** router and reranker. |
280
- | `SKILLFORGE_ROUTER_PROMPT_HISTORY_CHARS` | `360` | Max characters per turn in router / rerank prompts. |
281
- | `SKILLFORGE_ROUTER_CATALOG_PREVIEW_CHARS` | `280` | Max characters of each skill **routing card** in the Haiku pick prompt. |
282
- | `SKILLFORGE_HAIKU_RERANK` | `0` | Set **`1`** / **`true`** to rerank the Top-K shortlist with Haiku before the final pick (extra API call). |
283
- | `SKILLFORGE_HAIKU_RERANK_MAX` | `SKILLFORGE_TOP_K` | Max candidates passed to the reranker. |
284
- | `SKILLFORGE_HAIKU_RERANK_MODEL` | *(same as router)* | Model id for reranking when set; otherwise **`SKILLFORGE_ROUTER_MODEL`**. |
285
- | `SKILLFORGE_CONTEXT_MODE` | `chunks` | `chunks` = embed **line-bounded chunks** from each picked skill body (RAG) up to **`SKILLFORGE_ROUTE_MAX_CHARS`**. `full_body` = inject entire **SKILL.md** per pick (legacy). |
286
- | `SKILLFORGE_CHUNK_MAX_CHARS` | `1200` | Max characters per chunk (before overlap split). |
287
- | `SKILLFORGE_CHUNK_OVERLAP` | `200` | Character overlap when hard-splitting an oversized section. |
288
- | `SKILLFORGE_ROUTE_MAX_CHARS` | `60000` | Skill chunk char cap when **`SKILLFORGE_CONTEXT_FUSION`** is off (append path); also part of default unified budget sum when fusion is on. |
289
- | `SKILLFORGE_PROJECT_RAG_MAX_CHARS` | `24000` | Project chunk char cap when fusion is off (append path); part of default unified budget when fusion is on. |
290
- | `SKILLFORGE_CONTEXT_BUDGET_CHARS` | *(route + project RAG defaults)* | Single cap for **MMR-fused** skill + project context. |
291
- | `SKILLFORGE_CONTEXT_FUSION` | `1` | **`0`** / **`false`**: disable MMR fusion; append project chunks after skills. |
292
- | `SKILLFORGE_CONTEXT_MMR_LAMBDA` | `0.7` | MMR tradeoff: higher ⇒ query relevance; lower ⇒ diversity vs. already-selected chunks. |
293
- | `SKILLFORGE_FUSION_POOL_SKILL` | `96` | Max skill chunks in the fusion candidate pool. |
294
- | `SKILLFORGE_FUSION_POOL_PROJECT` | `96` | Max project chunks in the fusion candidate pool. |
295
- | `SKILLFORGE_FUSION_FULL_BODY_PREVIEW_CHARS` | `4000` | **`SKILL.md`** prefix length for embedding full-body / fallback fusion rows. |
296
- | `SKILLFORGE_PROJECT_RAG_MAX_CHUNKS` | `20000` | Max **project_chunks** rows loaded for one retrieval. |
297
- | `SKILLFORGE_INDEX_MAX_FILE_BYTES` | `524288` | Skip indexing files larger than this (bytes). |
298
- | `SKILLFORGE_INDEX_IGNORE_DIRS` | `""` | Extra comma-separated directory **basename** ignores (e.g. `out,tmp`). |
299
- | `SKILLFORGE_REDACT_CONTEXT` | `1` | When **`1`** (default), scrub common secret shapes and (with home path scrub) exported context before MCP/CLI output and route events. |
300
- | `SKILLFORGE_REDACT_HOME_IN_PATHS` | `1` | Replace resolved home-directory prefixes in chunk paths / DB path hints with **`[HOME]`**. |
301
- | `SKILLFORGE_MCP_USER_ID` | `""` | Default logical **user id** for MCP tool calls when arguments omit `user_id` (weights, sessions, events). |
302
- | `SKILLFORGE_PROJECT_ROOT` | `""` | Default workspace root when MCP **`project_root`** is omitted: events/weights/sessions live in **`<root>/.skillforge/orchestrator.db`**. Prefer passing **`project_root`** on each tool call from the host. |
303
- | `SKILLFORGE_SKILL_HOT_RELOAD` | `1` | When **`0`** / **`false`**, disable **SKILL.md** hot-reload; restart the MCP process to refresh the catalog. |
304
- | `SKILLFORGE_WATCH_SKILLS_INTERVAL` | `30` | Seconds between background catalog checks when hot reload is on. **`0`**: no background polling and no MCP **`tools.listChanged`**; **`tools/list`** and **`tools/call`** still reload when files change. |
305
- | `SKILLFORGE_MCP_LIST_CHANGED` | `1` | When **`0`** / **`false`**, never emit **`notifications/tools/list_changed`** (and **`listChanged`** is not advertised), even if a background interval is set. |
306
- | `SKILLFORGE_ROUTE_POLICIES` | `""` | Optional inline JSON policies document (see [Route policies](#route-policies-optional)). |
307
- | `SKILLFORGE_ROUTE_POLICIES_FILE` | `""` | Path to a policies JSON file. |
300
+ Environment variables tune routing, context budgets, redaction, MCP defaults, and file watchers. **Authoritative defaults and parsing** live in **`python/app/main.py`** and related modules—treat the table below as **operator reference**, not a legal spec.
301
+
302
+ | Variable | Role |
303
+ |----------|------|
304
+ | `ANTHROPIC_API_KEY` | Enables in-process **Haiku** routing / rerank when configured. |
305
+ | `SKILLFORGE_ROUTER_MODE` | `full` · `embedding` · `host` · auto. |
306
+ | `SKILLFORGE_EMBED_MODEL`, `SKILLFORGE_ROUTER_MODEL` | Model identifiers for embeddings / routing LLM. |
307
+ | `SKILLFORGE_TOP_K`, `SKILLFORGE_MAX_ACTIVE` | Shortlist size and max simultaneous skills. |
308
+ | `SKILLFORGE_REROUTE_THRESHOLD` | Re-route sensitivity (Jaccard distance). |
309
+ | `SKILLFORGE_ROUTER_CONV_MAX_TURNS`, `SKILLFORGE_ROUTER_CONV_MSG_CHARS` | Conversation-aware routing query. |
310
+ | `SKILLFORGE_ROUTER_HYBRID`, `SKILLFORGE_ROUTER_HYBRID_ALPHA` | Hybrid sparse/dense fusion. |
311
+ | `SKILLFORGE_HAIKU_RERANK`, `SKILLFORGE_HAIKU_RERANK_MAX`, `SKILLFORGE_HAIKU_RERANK_MODEL` | Optional rerank stage. |
312
+ | `SKILLFORGE_CONTEXT_MODE`, `SKILLFORGE_ROUTE_MAX_CHARS`, chunk envs | Skill body chunking vs full-body legacy. |
313
+ | `SKILLFORGE_CONTEXT_FUSION`, `SKILLFORGE_CONTEXT_BUDGET_CHARS`, `SKILLFORGE_CONTEXT_MMR_LAMBDA`, pool sizes | Skill + project **MMR** fusion. |
314
+ | `SKILLFORGE_PROJECT_RAG_MAX_CHARS`, `SKILLFORGE_PROJECT_RAG_MAX_CHUNKS` | Project chunk retrieval caps. |
315
+ | `SKILLFORGE_PROJECT_NOTES_MAX_CHARS` | Cap for **`project_notes`** prepended to routing query. |
316
+ | `SKILLFORGE_REDACT_CONTEXT`, `SKILLFORGE_REDACT_HOME_IN_PATHS` | Output redaction behavior. |
317
+ | `SKILLFORGE_MCP_USER_ID`, `SKILLFORGE_PROJECT_ROOT` | MCP defaults for user scoping and DB resolution. |
318
+ | `SKILLFORGE_ROUTE_POLICIES`, `SKILLFORGE_ROUTE_POLICIES_FILE` | Inline or file-backed policy JSON. |
319
+ | `SKILLFORGE_HOST_PICK_MAX`, `SKILLFORGE_HOST_PICK_LINE_CHARS` | Host-mode shortlist sizing / formatting. |
320
+ | Hot reload | `SKILLFORGE_SKILL_HOT_RELOAD`, `SKILLFORGE_WATCH_SKILLS_INTERVAL`, `SKILLFORGE_MCP_LIST_CHANGED`. |
321
+ | Install hooks | `SKILLFORGE_SKIP_*`, `SKILLFORGE_*_GLOBAL_COMMAND`, etc. |
308
322
 
309
323
  ---
310
324
 
311
- ## Local data and operations
325
+ ## Local data and paths
312
326
 
313
- Optional **per-project** state (when **`project_root`** or **`SKILLFORGE_PROJECT_ROOT`** is set, or MCP passes **`project_root`** on tools):
327
+ **Per-project** (when **`project_root`** / **`SKILLFORGE_PROJECT_ROOT`** is used):
314
328
 
315
329
  ```
316
330
  <workspace>/.skillforge/
317
- ├── orchestrator.db # SQLite: sessions, weights, events, **project_chunks** (after `skillforge index`)
318
- ├── policies.json # Optional route policies (see README)
319
- └── last_route.json # Last route_skills snapshot (after a routed call)
331
+ ├── orchestrator.db # SQLite: sessions, weights, events, project_chunks (after index)
332
+ ├── policies.json # Optional policies + overlay (or repo-root skillforge-policies.json)
333
+ └── last_route.json # Last CLI route snapshot (when applicable)
320
334
  ```
321
335
 
322
- Global default when no project root:
336
+ **Global default:**
323
337
 
324
338
  ```
325
339
  ~/.skillforge/
326
- ├── venv/ # Python virtual environment
327
- ├── data/orchestrator.db # SQLite (sessions, weights, events)
328
- ├── skills/ # User-added skills
329
- ├── packs/<hash>/ # Cloned pack repositories
330
- ├── packs.json # Pack registry
340
+ ├── venv/
341
+ ├── data/orchestrator.db
342
+ ├── skills/ # User skills
343
+ └── packs/ # Pack working copies
331
344
  ```
332
345
 
333
- | Command | Effect |
334
- |---------|--------|
335
- | `skillforge events` | Prints a **usage** snapshot and recent **`route`** / **`feedback`** rows; **`--watch`**, **`--project-root`** (per-repo DB), **`--user`**, **`--verbose`** (see **`--help`**). |
336
- | `skillforge index` | Chunk/embed text files under **`--project-root`** into **`project_chunks`**. **`--reset`**, **`--stats-only`**, **`--quiet`** (see **`--help`**). |
337
- | `skillforge reset` | Clears learning state and event history in the database. |
338
- | `skillforge install` | Re-runs bootstrap (venv and dependencies). |
339
- | `rm -rf ~/.skillforge` | Full removal of local state and venv. |
346
+ | Command | Role |
347
+ |---------|------|
348
+ | `skillforge events` | Usage + recent **`route`** / **`feedback`** rows (`--watch`, `--project-root`, `--user`, `--verbose`). |
349
+ | `skillforge index` | (Re)build **`project_chunks`**. |
350
+ | `skillforge reset` | Clears learning + events in the target DB. |
351
+ | `skillforge health` | Validates paths, catalog discovery, optional deep router load. |
352
+ | `skillforge route-eval` | Runs JSON fixtures (CI). |
340
353
 
341
354
  ---
342
355
 
343
- ## Security considerations
356
+ ## Security
344
357
 
345
- - **Redaction is best-effort regex scrubbing**, not a guarantee. Do not paste production secrets into prompts; treat routed context like **untrusted text** until reviewed.
346
- - Treat **`ANTHROPIC_API_KEY`** as a **secret** when you set it (e.g. for Haiku routing in MCP). Prefer environment injection or secret stores, not committed files.
347
- - Vulnerability disclosure: see **[SECURITY.md](SECURITY.md)**.
358
+ - **Redaction is best-effort.** Do not treat scrubbed output as a certified wipe of secrets.
359
+ - **Secrets:** keep **`ANTHROPIC_API_KEY`** and workspace tokens out of VCS; inject via host or OS secret stores.
360
+ - **Project notes** intentionally **do not apply** without **`project_root`** to reduce cross-talk from global policy config.
361
+ - **Disclosure:** follow **[SECURITY.md](SECURITY.md)** (private channels for undisclosed issues).
348
362
 
349
363
  ---
350
364
 
351
365
  ## Contributing and governance
352
366
 
353
- - **[CONTRIBUTING.md](CONTRIBUTING.md)** workflow, local checks, branch policy expectations.
354
- - **[CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md)** — community standards.
355
- - **[SECURITY.md](SECURITY.md)** reporting security issues.
356
-
357
- ---
358
-
359
- ## Releases and maintainers
360
-
361
- - **Continuous integration:** `.github/workflows/ci.yml` (push and pull request to **`main`**).
362
- - **Release & npm publish:** **Skillforge release** runs when you push tag **`vX.Y.Z`** and **`package.json`** **`version`** is exactly **`X.Y.Z`** (e.g. **`v0.2.1`** ↔ **`0.2.1`**). That same number is what **`npm publish`** ships. GitHub releases are titled **`Skillforge <tag>`**.
363
- - **Procedure and npm tokens:** **[RELEASING.md](RELEASING.md)** (granular npm access tokens, **Bypass 2FA** for CI publish where applicable).
364
- - **License:** MIT — see **[LICENSE](LICENSE)**.
365
-
366
- ---
367
+ | Document | Purpose |
368
+ |----------|---------|
369
+ | [CONTRIBUTING.md](CONTRIBUTING.md) | Workflow and local checks. |
370
+ | [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) | Community standards. |
371
+ | [SECURITY.md](SECURITY.md) | Reporting vulnerabilities. |
372
+ | [RELEASING.md](RELEASING.md) | Tags, **`NPM_TOKEN`**, npm **2FA** / granular tokens, CI vs release workflows. |
373
+ | [CHANGELOG.md](CHANGELOG.md) | Version-by-version changes. |
374
+ | [STRATEGY.md](STRATEGY.md) | Product direction and non-goals. |
367
375
 
368
- ## License
376
+ **Continuous integration:** `.github/workflows/ci.yml` (**push** / **PR** to **`main`**).
369
377
 
370
- MIT © see [LICENSE](LICENSE).
378
+ **License:** [LICENSE](LICENSE) (MIT).