@karmaniverous/jeeves-watcher 0.4.4 → 0.5.0-1

@@ -0,0 +1,200 @@
1
+ ---
2
+ name: jeeves-watcher-admin
3
+ description: >
4
+ Instance management for a jeeves-watcher deployment. Use when you need to
5
+ author or validate config, trigger reindexing, diagnose embedding failures,
6
+ or manage helper registrations.
7
+ ---
8
+
9
+ # jeeves-watcher — Instance Administration
10
+
11
+ ## Tools
12
+
13
+ ### `watcher_validate`
14
+ Validate config and optionally test file paths.
15
+ - `config` (object, optional) — candidate config (partial or full). Omit to validate current config.
16
+ - `testPaths` (string[], optional) — file paths to test against the config
17
+
18
+ Partial configs merge with current config by rule name. If `config` is omitted, tests against the running config.
19
+
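A minimal sketch of what "merge by rule name" could look like. It assumes rules live in an array such as `inferenceRules` (the key used by the JSONPath patterns later in this file) and that a candidate rule replaces the running rule of the same name wholesale. The real merge in jeeves-watcher may be field-wise or otherwise different; `Rule` and `mergeRulesByName` are hypothetical names used only for illustration.

```typescript
// Hypothetical rule shape: only `name` is taken from the skill text.
interface Rule {
  name: string;
  [key: string]: unknown;
}

// Assumption: a candidate rule replaces the current rule with the same name,
// and rules with previously unseen names are appended at the end.
function mergeRulesByName(current: Rule[], partial: Rule[]): Rule[] {
  const merged = new Map<string, Rule>();
  for (const rule of current) merged.set(rule.name, rule);
  // Re-setting an existing key keeps its original position in the Map.
  for (const rule of partial) merged.set(rule.name, rule);
  return Array.from(merged.values());
}
```

Under these assumptions a partial config only needs to carry the rules being changed; untouched rules come from the running config.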
20
+ ### `watcher_config_apply`
21
+ Apply config changes atomically.
22
+ - `config` (object, required) — full or partial config to apply
23
+
24
+ Validates, writes to disk, and triggers configured reindex behavior. Returns validation errors if invalid.
25
+
26
+ ### `watcher_reindex`
27
+ Trigger a reindex.
28
+ - `scope` (string, optional) — `"rules"` (default) or `"full"`
29
+
30
+ Rules scope re-applies inference rules without re-embedding (lightweight). Full scope re-processes all files.
31
+
32
+ ### `watcher_issues`
33
+ Get runtime embedding failures. Returns `{ filePath: IssueRecord }` showing files that failed and why.
34
+
35
+ ### `watcher_query`
36
+ Query config and runtime state via JSONPath (same tool as consumer skill).
37
+
38
+ ### `watcher_status`
39
+ Service health check including reindex progress.
40
+
41
+ ## Qdrant Filter Syntax
42
+
43
+ Filters use Qdrant's native JSON filter format, passed as the `filter` parameter to `watcher_search`.
44
+
45
+ ### Basic Patterns
46
+
47
+ **Match exact value:**
48
+ ```json
49
+ { "must": [{ "key": "domain", "match": { "value": "email" } }] }
50
+ ```
51
+
52
+ **Match text (full-text search within field):**
53
+ ```json
54
+ { "must": [{ "key": "chunk_text", "match": { "text": "authentication" } }] }
55
+ ```
56
+
57
+ **Combine conditions (AND):**
58
+ ```json
59
+ {
60
+ "must": [
61
+ { "key": "domain", "match": { "value": "jira" } },
62
+ { "key": "status", "match": { "value": "In Progress" } }
63
+ ]
64
+ }
65
+ ```
66
+
67
+ **Exclude (NOT):**
68
+ ```json
69
+ {
70
+ "must_not": [{ "key": "domain", "match": { "value": "repos" } }]
71
+ }
72
+ ```
73
+
74
+ **Any of (OR):**
75
+ ```json
76
+ {
77
+ "should": [
78
+ { "key": "domain", "match": { "value": "email" } },
79
+ { "key": "domain", "match": { "value": "slack" } }
80
+ ]
81
+ }
82
+ ```
83
+
84
+ **Nested (combine AND + NOT):**
85
+ ```json
86
+ {
87
+ "must": [{ "key": "domain", "match": { "value": "jira" } }],
88
+ "must_not": [{ "key": "status", "match": { "value": "Done" } }]
89
+ }
90
+ ```
91
+
92
+ ### Key Differences
93
+ - `match.value` — exact match (case-sensitive, for keyword fields like `domain`, `status`)
94
+ - `match.text` — full-text match (for text fields like `chunk_text`)
95
+
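The filter patterns above can be assembled with a couple of small typed helpers. This is a sketch, not part of the package: `MatchCondition`, `QdrantFilter`, `matchValue`, and `matchText` are hypothetical names, and the types cover only the string `match.value` / `match.text` forms shown in this section, not the full Qdrant filter grammar.

```typescript
// Condition and filter shapes mirror the JSON examples in this section.
type MatchCondition = {
  key: string;
  match: { value: string } | { text: string };
};

interface QdrantFilter {
  must?: MatchCondition[];
  must_not?: MatchCondition[];
  should?: MatchCondition[];
}

// Exact match, for keyword fields such as `domain` or `status`.
const matchValue = (key: string, value: string): MatchCondition => ({
  key,
  match: { value },
});

// Full-text match, for text fields such as `chunk_text`.
const matchText = (key: string, text: string): MatchCondition => ({
  key,
  match: { text },
});

// Jira items that are not Done: the same filter as the "Nested" example.
const openJira: QdrantFilter = {
  must: [matchValue("domain", "jira")],
  must_not: [matchValue("status", "Done")],
};

// A full-text condition on the chunk body.
const mentionsAuth: QdrantFilter = {
  must: [matchText("chunk_text", "authentication")],
};
```

Either object can be passed unchanged as the `filter` parameter of `watcher_search`, since it serializes to the same JSON as the hand-written examples.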
96
+ ## Search Result Shape
97
+
98
+ Each result from `watcher_search` contains:
99
+
100
+ | Field | Type | Description |
101
+ |-------|------|-------------|
102
+ | `id` | string | Qdrant point ID |
103
+ | `score` | number | Similarity score (0-1, higher = more relevant) |
104
+ | `payload.file_path` | string | Source file path |
105
+ | `payload.chunk_text` | string | The matched text chunk |
106
+ | `payload.chunk_index` | number | Chunk position within the file |
107
+ | `payload.total_chunks` | number | Total chunks for this file |
108
+ | `payload.content_hash` | string | Hash of the full document content |
109
+ | `payload.matched_rules` | string[] | Names of inference rules that matched |
110
+
111
+ Additional metadata fields depend on the deployment's inference rules (e.g., `domain`, `status`, `author`). Use `watcher_query` to discover available fields.
112
+
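The result shape in the table can be written down as a type, which also makes it easy to collect chunks that belong to the same file. The interface below is transcribed from the table; `groupByFile` is a hypothetical helper, shown as one possible way to consume results rather than anything the package ships.

```typescript
// Transcribed from the result table; extra payload fields depend on the rules.
interface SearchResult {
  id: string;
  score: number;
  payload: {
    file_path: string;
    chunk_text: string;
    chunk_index: number;
    total_chunks: number;
    content_hash: string;
    matched_rules: string[];
    [field: string]: unknown; // rule-dependent metadata, e.g. `domain`
  };
}

// Group chunks by source file and order each group by chunk position.
function groupByFile(results: SearchResult[]): Map<string, SearchResult[]> {
  const groups = new Map<string, SearchResult[]>();
  for (const result of results) {
    const key = result.payload.file_path;
    const list = groups.get(key) ?? [];
    list.push(result);
    groups.set(key, list);
  }
  groups.forEach((list) =>
    list.sort((a, b) => a.payload.chunk_index - b.payload.chunk_index),
  );
  return groups;
}
```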
113
+ ## JSONPath Patterns for Schema Discovery
114
+
115
+ Use `watcher_query` to explore the merged virtual document. Common patterns:
116
+
117
+ ### Orientation
118
+ ```
119
+ $.inferenceRules[*].['name','description'] — List all rules with descriptions
120
+ $.search.scoreThresholds — Score interpretation thresholds
121
+ $.slots — Named filter patterns (e.g., memory)
122
+ ```
123
+
124
+ ### Schema Discovery
125
+ ```
126
+ $.inferenceRules[?(@.name=='jira-issue')] — Full rule details
127
+ $.inferenceRules[?(@.name=='jira-issue')].values — Distinct values for a rule
128
+ $.inferenceRules[?(@.name=='jira-issue')].values.status — Values for a specific field
129
+ ```
130
+
131
+ ### Helper Enumeration
132
+ ```
133
+ $.mapHelpers — All JsonMap helper namespaces
134
+ $.mapHelpers.slack.exports — Exports from the 'slack' helper
135
+ $.templateHelpers — All Handlebars helper namespaces
136
+ ```
137
+
138
+ ### Issues
139
+ ```
140
+ $.issues — All runtime embedding failures
141
+ ```
142
+
143
+ ### Full Config Introspection
144
+ ```
145
+ $.schemas — Global named schemas
146
+ $.maps — Named JsonMap transforms
147
+ $.templates — Named Handlebars templates
148
+ ```
149
+
150
+ ## Config Authoring
151
+
152
+ ### Rule Structure
153
+ Each inference rule has:
154
+ - `name` (required) — unique identifier
155
+ - `description` (optional) — human-readable purpose
156
+ - `match` — JSON Schema with picomatch glob for path matching
157
+ - `set` — metadata fields to set on match
158
+ - `map` (optional) — named JsonMap transform
159
+ - `template` (optional) — named Handlebars template
160
+
161
+ ### Config Workflow
162
+ 1. Edit config (or build partial config object)
163
+ 2. Validate: `watcher_validate` with optional `testPaths` for dry-run preview
164
+ 3. Apply: `watcher_config_apply` — validates, writes, triggers reindex
165
+ 4. Monitor: `watcher_issues` for runtime embedding failures
166
+
167
+ ### When to Reindex
168
+ - **Rules scope** (`"rules"`): Changed rule matching patterns, set expressions, schema mappings. No re-embedding needed.
169
+ - **Full scope** (`"full"`): Changed embedding config, added watch paths, broad schema restructuring. Re-embeds everything.
170
+
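The scope guidance above reduces to a simple rule: if any change touches embedding config, watch paths, or broad schema structure, use `"full"`; otherwise `"rules"` is enough. A sketch under that reading follows. The `ConfigChange` labels and `reindexScopeFor` are hypothetical names invented for this example; only the two scope strings come from the tool description.

```typescript
type ReindexScope = "rules" | "full";

// Hypothetical labels for the kinds of change listed above.
type ConfigChange =
  | "rule-match"
  | "rule-set"
  | "schema-mapping"
  | "embedding-config"
  | "watch-paths"
  | "schema-restructure";

// Changes that require re-embedding, per the "Full scope" bullet.
const FULL_SCOPE_CHANGES: ConfigChange[] = [
  "embedding-config",
  "watch-paths",
  "schema-restructure",
];

// Pick the lightest scope that still covers every change.
function reindexScopeFor(changes: ConfigChange[]): ReindexScope {
  const needsFull = changes.some((c) => FULL_SCOPE_CHANGES.indexOf(c) !== -1);
  return needsFull ? "full" : "rules";
}
```

The returned string is what would be passed as the `scope` parameter of `watcher_reindex`.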
171
+ ## Diagnostics
172
+
173
+ ### Escalation Path
174
+ 1. `watcher_status` — is the service healthy? Is a reindex running?
175
+ 2. `watcher_issues` — what files are failing and why?
176
+ 3. `watcher_query` with `$.issues` — same data via JSONPath
177
+ 4. Check logs at the configured log path
178
+
179
+ ### Error Categories
180
+ - `type_collision` — metadata field type mismatch during extraction
181
+ - `interpolation` — template/set expression failed to resolve
182
+ - `read_failure` — file couldn't be read (permissions, encoding)
183
+ - `embedding` — embedding API error
184
+
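When triaging a large `watcher_issues` response, a per-category tally is often a useful first look. The sketch below assumes each `IssueRecord` carries its category in a `type` field alongside a `message`; the skill text does not document the record's fields, so both names are assumptions to adjust against real output. `countByCategory` is likewise a hypothetical helper.

```typescript
// The four categories listed above.
type ErrorCategory = "type_collision" | "interpolation" | "read_failure" | "embedding";

// ASSUMED shape: the field names `type` and `message` are not documented here.
interface IssueRecord {
  type: ErrorCategory;
  message: string;
}

// Tally issues per category from a `{ filePath: IssueRecord }` map.
function countByCategory(
  issues: Record<string, IssueRecord>,
): Record<ErrorCategory, number> {
  const counts: Record<ErrorCategory, number> = {
    type_collision: 0,
    interpolation: 0,
    read_failure: 0,
    embedding: 0,
  };
  for (const filePath of Object.keys(issues)) {
    counts[issues[filePath].type] += 1;
  }
  return counts;
}
```

A spike in `embedding` would point at the embedding API, while `interpolation` or `type_collision` counts point back at rule config.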
185
+ ## Helper Management
186
+
187
+ Helpers use namespace prefixing: config key becomes prefix. A helper named `slack` exports `slack_extractParticipants`.
188
+
189
+ Enumerate loaded helpers:
190
+ ```
191
+ $.mapHelpers — JsonMap helper namespaces with exports
192
+ $.templateHelpers — Handlebars helper namespaces with exports
193
+ ```
194
+
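The prefixing convention described in this section can be illustrated in a few lines. This is only a sketch of the naming rule (config key, underscore, export name) inferred from the `slack_extractParticipants` example; `prefixExports` is a hypothetical function, not the package's loader.

```typescript
// Derive the prefixed helper names for one namespace.
function prefixExports(namespace: string, exportNames: string[]): string[] {
  return exportNames.map((name) => `${namespace}_${name}`);
}

// The export name comes from the example above.
const slackHelpers = prefixExports("slack", ["extractParticipants"]);
```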
195
+ ## CLI Fallbacks
196
+
197
+ If the watcher API is down:
198
+ - `jeeves-watcher status` — check if the service is running
199
+ - `jeeves-watcher validate` — validate config from CLI
200
+ - Restart via NSSM (Windows) or systemctl (Linux)
package/package.json CHANGED
@@ -33,6 +33,7 @@
33
33
  "ignore": "^7.0.5",
34
34
  "js-yaml": "*",
35
35
  "json5": "*",
36
+ "jsonpath-plus": "^10.4.0",
36
37
  "mammoth": "^1.11.0",
37
38
  "mdast-util-from-adf": "^2.2.0",
38
39
  "mdast-util-to-markdown": "^2.1.2",
@@ -171,7 +172,8 @@
171
172
  },
172
173
  "scripts": {
173
174
  "generate:schema": "tsx src/config/generate-schema.ts",
174
- "build": "npm run generate:schema && rimraf dist && cross-env NO_COLOR=1 rollup --config rollup.config.ts --configPlugin @rollup/plugin-typescript && node -e \"const fs=require('fs-extra');fs.copySync('plugin/openclaw.plugin.json','dist/plugin/openclaw.plugin.json');fs.copySync('plugin/skill','dist/plugin/skill');\"",
175
+ "build:skills": "node scripts/build-skills.js",
176
+ "build": "npm run generate:schema && rimraf dist && cross-env NO_COLOR=1 rollup --config rollup.config.ts --configPlugin @rollup/plugin-typescript && npm run build:skills && node -e \"const fs=require('fs-extra');fs.copySync('plugin/openclaw.plugin.json','dist/plugin/openclaw.plugin.json');\"",
175
177
  "changelog": "auto-changelog",
176
178
  "diagrams": "cd diagrams && plantuml -tpng -o ../assets -r .",
177
179
  "docs": "typedoc",
@@ -185,5 +187,5 @@
185
187
  },
186
188
  "type": "module",
187
189
  "types": "dist/index.d.ts",
188
- "version": "0.4.4"
190
+ "version": "0.5.0-1"
189
191
  }
@@ -1,149 +0,0 @@
1
- ---
2
- name: jeeves-watcher
3
- description: >
4
- Semantic search and metadata enrichment via a jeeves-watcher instance.
5
- Use as your primary method for finding relevant files and context before
6
- falling back to filesystem browsing.
7
- ---
8
-
9
- # Jeeves Watcher — Context Discovery
10
-
11
- ## Behavioral Directive
12
-
13
- **Use `watcher_search` as your first step when you need to find information.** Before browsing the filesystem, before grepping, before guessing at file paths — search the index. It covers far more ground than manual exploration and finds things you wouldn't know to look for.
14
-
15
- **When search doesn't find what you need, fall back to your usual methods.** The index covers configured watch paths but not everything on the system. Active repos, system tools, and files outside the watch scope won't appear. Absence of results means the content may not be indexed, not that it doesn't exist.
16
-
17
- ## Workflow
18
-
19
- ### 1. Discover (once per session)
20
-
21
- Call `watcher_status` early in your session to learn what's available:
22
-
23
- ```json
24
- {}
25
- ```
26
-
27
- This returns collection stats and — critically — the set of payload fields with their types. Cache this mentally; these fields won't change during a session. Use them to construct targeted filters.
28
-
29
- ### 2. Search (primary context discovery)
30
-
31
- Use `watcher_search` to find relevant files:
32
-
33
- ```json
34
- { "query": "authentication flow", "limit": 5 }
35
- ```
36
-
37
- Results include `chunk_text` in the payload. For quick context, the chunks may be sufficient without reading the full file. Only load the file when you need complete content or plan to edit it.
38
-
39
- ### 3. Read (when needed)
40
-
41
- Use the `file_path` from search results to read the actual file. Group results by `file_path` when multiple chunks come from the same document.
42
-
43
- ### 4. Fall back (when search misses)
44
-
45
- If search returns nothing useful or low-scoring results (below ~0.3), the content likely isn't indexed. Fall back to filesystem browsing, directory listing, or grep. This is expected — not everything is in the index.
46
-
47
- ## Tools
48
-
49
- ### `watcher_status`
50
-
51
- Get service health, collection stats, and discover available payload fields.
52
-
53
- | Parameter | Type | Required | Description |
54
- | --------- | ---- | -------- | ----------- |
55
- | _(none)_ | | | |
56
-
57
- **Returns:** `status`, `uptime`, `collection` (name, pointCount, dimensions), `payloadFields` (field names with types).
58
-
59
- ### `watcher_search`
60
-
61
- Semantic similarity search with optional Qdrant filters.
62
-
63
- | Parameter | Type | Required | Description |
64
- | --------- | ------ | -------- | ------------------------------------ |
65
- | `query` | string | yes | Natural-language search query |
66
- | `limit` | number | no | Max results to return (default: 10) |
67
- | `filter` | object | no | Qdrant filter object (see below) |
68
-
69
- **Plain search:**
70
-
71
- ```json
72
- { "query": "error handling", "limit": 5 }
73
- ```
74
-
75
- **Filtered search:**
76
-
77
- ```json
78
- {
79
- "query": "error handling",
80
- "limit": 10,
81
- "filter": {
82
- "must": [{ "key": "domain", "match": { "value": "backend" } }]
83
- }
84
- }
85
- ```
86
-
87
- ### `watcher_enrich`
88
-
89
- Set or update metadata on a document by file path.
90
-
91
- | Parameter | Type | Required | Description |
92
- | ---------- | ------ | -------- | ----------------------------------- |
93
- | `path` | string | yes | File path of the document |
94
- | `metadata` | object | yes | Key-value metadata to set |
95
-
96
- ```json
97
- {
98
- "path": "docs/auth.md",
99
- "metadata": { "domain": "auth", "reviewed": true }
100
- }
101
- ```
102
-
103
- ## Qdrant Filter Patterns
104
-
105
- Build filters using fields discovered via `watcher_status`.
106
-
107
- **Exact match:**
108
-
109
- ```json
110
- { "must": [{ "key": "domain", "match": { "value": "email" } }] }
111
- ```
112
-
113
- **Multiple conditions:**
114
-
115
- ```json
116
- {
117
- "must": [
118
- { "key": "domain", "match": { "value": "codebase" } },
119
- { "key": "file_path", "match": { "text": "auth" } }
120
- ]
121
- }
122
- ```
123
-
124
- **Exclude results:**
125
-
126
- ```json
127
- {
128
- "must_not": [{ "key": "domain", "match": { "value": "codebase" } }]
129
- }
130
- ```
131
-
132
- **Full-text match** (tokenized, for longer text fields):
133
-
134
- ```json
135
- { "must": [{ "key": "chunk_text", "match": { "text": "authentication" } }] }
136
- ```
137
-
138
- ## Score Interpretation
139
-
140
- - **0.7+** — Strong semantic match. Trust these results.
141
- - **0.4–0.7** — Relevant but may need verification. Worth reading.
142
- - **Below 0.3** — Likely noise. The content you need may not be indexed.
143
-
144
- ## Tips
145
-
146
- - **Start broad, then narrow.** A plain query without filters shows you what's available. Add filters once you know which payload field values are relevant.
147
- - **Group by file.** Multiple chunks from the same file appear as separate results. Look at `file_path` to see when you're getting multiple views of one document.
148
- - **Chunk text is a preview.** It's useful for quick triage but may be truncated or split mid-sentence. Read the actual file for complete context.
149
- - **Enrich after analysis.** When you review a document and learn something about it, use `watcher_enrich` to tag it. Future searches can filter on those tags.