@astrofoundry/grimoire 1.2.4 → 1.3.0

Files changed (2)
  1. package/README.md +9 -122
  2. package/package.json +1 -1
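
Going by the title and the `package.json` hunk, the version moves from 1.2.4 to 1.3.0, which under semantic versioning is a minor bump. A minimal sketch of that classification follows. `parseVersion` and `classifyBump` are hypothetical helper names, not part of grimoire, and pre-release or build suffixes are deliberately not handled.

```typescript
type Bump = "major" | "minor" | "patch" | "none";

// Parses a plain "MAJOR.MINOR.PATCH" string (hypothetical helper).
// Pre-release and build metadata are not supported in this sketch.
function parseVersion(v: string): [number, number, number] {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (!m) throw new Error(`not a plain x.y.z version: ${v}`);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Reports the first component that differs between the two versions.
// It does not check direction, so a downgrade is classified the same way.
function classifyBump(from: string, to: string): Bump {
  const [major1, minor1, patch1] = parseVersion(from);
  const [major2, minor2, patch2] = parseVersion(to);
  if (major2 !== major1) return "major";
  if (minor2 !== minor1) return "minor";
  if (patch2 !== patch1) return "patch";
  return "none";
}

console.log(classifyBump("1.2.4", "1.3.0")); // prints "minor"
```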
package/README.md CHANGED
@@ -1,148 +1,35 @@
  # grimoire
 
- Documentation RAG System scrape docs, embed, search with reranking.
+ Documentation search powered by vector embeddings.
 
- ## Consumer Setup
+ ## Install
 
  ```bash
  npm install -g @astrofoundry/grimoire
  grimoire init
- # Enter API URL and API key (provided by admin)
- grimoire search "how to query firestore"
  ```
 
- ## Admin Setup
+ ## Usage
 
  ```bash
- pnpm install
- pnpm build
- pnpm link --global
- ```
-
- ### Firebase / GCP
-
- ```bash
- # Authenticate for Firestore access (grimoire-docs project)
- gcloud auth application-default login --project=grimoire-docs
- ```
-
- ### Vector indexes (one-time, before first search)
-
- ```bash
- gcloud firestore indexes composite create \
- --collection-group=grimoire_chunks \
- --query-scope=COLLECTION \
- --field-config='field-path=embedding,vector-config={"dimension":"768","flat":{}}' \
- --database="(default)" \
- --project=grimoire-docs
-
- gcloud firestore indexes composite create \
- --collection-group=grimoire_chunks \
- --query-scope=COLLECTION \
- --field-config='field-path=source,order=ASCENDING' \
- --field-config='field-path=embedding,vector-config={"dimension":"768","flat":{}}' \
- --database="(default)" \
- --project=grimoire-docs
+ grimoire search "<query>"
+ grimoire search "<query>" --source <name>
+ grimoire list
+ grimoire stats
  ```
 
- ## Commands
+ ## Admin
 
  ```bash
- # Add a documentation source (interactive)
- grimoire add <name> --url <start_url>
-
- # Refresh a source (scrape → convert → chunk → embed → store)
+ grimoire add <name> --url <url>
  grimoire refresh <source>
-
- # Full refresh (purge all data, re-scrape everything)
  grimoire refresh <source> --full
-
- # Re-run from cached HTML (skip scraping)
  grimoire refresh <source> --from-raw
-
- # Re-store from cached embeddings (skip scraping + embedding)
  grimoire refresh <source> --from-store
-
- # Override concurrency (default: 10)
  grimoire refresh <source> --concurrency 20
-
- # Refresh all sources
  grimoire refresh --all
-
- # Search across all sources
- grimoire search "<query>"
-
- # Search within a specific source
- grimoire search "<query>" --source <name>
-
- # List all configured sources
- grimoire list
-
- # Show statistics
- grimoire stats
-
- # Export source as JSON
  grimoire export <source>
-
- # API key management (admin only)
  grimoire apikey create <name>
  grimoire apikey list
  grimoire apikey revoke <name>
  ```
-
- ## Configuration
-
- Sources are defined in `config/sources.yaml`. Each source needs site-specific cleanup config.
-
- ```yaml
- sources:
-   my-source:
-     name: My Docs # Display name
-     start_url: https://example.com/docs
-     nav_selector: nav # CSS selector for navigation element
-     content_selector: article # CSS selector for main content
-     include_patterns: # URL patterns to include
-       - /docs
-     exclude_patterns: # URL patterns to exclude (optional)
-       - /docs/legacy
-     remove_selectors: # CSS selectors to strip from content (site-specific)
-       - footer
-       - nav
-       - .sidebar
-     remove_text_patterns: # Regex patterns to strip from markdown (site-specific)
-       - "^Cookie notice.*$"
-     concurrency: 10 # Parallel browser tabs (default: 10)
-     rate_limit_ms: 1000 # Delay between requests (optional)
- ```
-
- The converter only strips `style`, `script`, `noscript`, `iframe`, `svg` by default. All other cleanup (nav, footer, banners, site-specific UI elements) must be configured per source via `remove_selectors` and `remove_text_patterns`.
-
- See `config/sources.yaml` for the Firebase Firestore example with full cleanup config.
-
- ## Environment Variables
-
- Set in `.env` at project root (auto-loaded by CLI):
-
- ```bash
- GOOGLE_CLOUD_PROJECT=grimoire-docs # Firebase/GCP project ID
- GEMINI_API_KEY=... # Google Gemini API key
- RERANKER_URL=... # llama-cpp reranker endpoint
- ```
-
- ## Releasing
-
- ```bash
- pnpm release:patch # bump, commit, tag, push → GH Actions deploys functions + publishes npm
- pnpm release:minor
- pnpm release:major
- ```
-
- ## Development
-
- ```bash
- pnpm test # Run tests
- pnpm lint # ESLint
- pnpm check # Typecheck + lint + test
- pnpm build # Compile TypeScript
- pnpm build:watch # Watch mode
- ```
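
The removed Configuration section lists `include_patterns` and `exclude_patterns` per source but does not say how they are matched. One plausible reading is sketched below under stated assumptions: patterns are plain substrings of the URL path, an exclude match overrides an include match, and only URLs on the same origin as `start_url` are followed. `SourceFilter` and `shouldCrawl` are hypothetical names for illustration; this is not grimoire's actual scraper code.

```typescript
// Hypothetical shape mirroring three keys from the removed sources.yaml example.
interface SourceFilter {
  start_url: string;
  include_patterns: string[];
  exclude_patterns?: string[];
}

// Assumptions (not confirmed by the README): substring matching on the path,
// exclude wins over include, and cross-origin links are never followed.
function shouldCrawl(url: string, source: SourceFilter): boolean {
  const start = new URL(source.start_url);
  const target = new URL(url, start); // resolves relative links against start_url
  if (target.origin !== start.origin) return false;

  const path = target.pathname;
  const included = source.include_patterns.some((p) => path.includes(p));
  const excluded = (source.exclude_patterns ?? []).some((p) => path.includes(p));
  return included && !excluded;
}

// Values taken from the removed YAML example.
const exampleSource: SourceFilter = {
  start_url: "https://example.com/docs",
  include_patterns: ["/docs"],
  exclude_patterns: ["/docs/legacy"],
};

console.log(shouldCrawl("https://example.com/docs/intro", exampleSource));
console.log(shouldCrawl("https://example.com/docs/legacy/v1", exampleSource));
```

Under these assumptions the first call accepts the page and the second rejects it because the exclude pattern matches. A real implementation might use globs or regular expressions instead of substrings.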
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@astrofoundry/grimoire",
-   "version": "1.2.4",
+   "version": "1.3.0",
    "description": "Documentation RAG System",
    "keywords": [],
    "author": "",