@ragieai/skills 0.1.0

@@ -0,0 +1,8 @@
1
+ {
2
+ "name": "ragieai",
3
+ "description": "Ragie AI — managed RAG platform for document ingestion, search, and retrieval. Provides skills for integrating Ragie into applications and using the Ragie MCP server.",
4
+ "author": {
5
+ "name": "Ragie",
6
+ "url": "https://ragie.ai"
7
+ }
8
+ }
package/.mcp.json ADDED
@@ -0,0 +1,11 @@
1
+ {
2
+ "mcpServers": {
3
+ "ragie": {
4
+ "type": "http",
5
+ "url": "https://api.ragie.ai/mcp/${RAGIE_PARTITION}",
6
+ "headers": {
7
+ "Authorization": "Bearer ${RAGIE_API_KEY}"
8
+ }
9
+ }
10
+ }
11
+ }
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Ragie Corp
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,88 @@
1
+ # @ragieai/skills
2
+
3
+ Ragie skills for AI coding agents. Install once and your agent understands how to ingest documents, run retrievals, build RAG pipelines, configure the MCP server, and handle multi-tenancy with Ragie.
4
+
5
+ Works with Claude Code, Cursor, Cline, Copilot, Windsurf, and [40+ other agents](https://skills.sh).
6
+
7
+ ## Installation
8
+
9
+ ### Via skills CLI (recommended)
10
+
11
+ Installs into your current project for whichever agent you use:
12
+
13
+ ```bash
14
+ npx skills add ragieai/skills
15
+ ```
16
+
17
+ ### Via npm
18
+
19
+ For programmatic access to skill and reference content:
20
+
21
+ ```bash
22
+ npm install @ragieai/skills
23
+ ```
24
+
25
+ ### Local development
26
+
27
+ Symlink the skill directory directly into Claude Code's skills folder so edits are reflected immediately:
28
+
29
+ ```bash
30
+ ln -s /path/to/ragieai/skills/skills/ragie ~/.claude/skills/ragie
31
+ ```
32
+
33
+ ## What's Included
34
+
35
+ ### Skill
36
+
37
+ One skill — `ragie` — that activates when you ask about integrating Ragie, building RAG pipelines, ingesting documents, configuring the MCP server, or working with retrievals.
38
+
39
+ ### References
40
+
41
+ | File | Content |
42
+ |------|---------|
43
+ | `quickstart.md` | Install SDK, ingest a document, run first retrieval |
44
+ | `ingestion.md` | Files, URLs, raw text, readiness polling, webhooks, bulk ingest |
45
+ | `retrieval.md` | Search options, `topK`, reranking, hybrid search, metadata filters |
46
+ | `mcp.md` | MCP server URL pattern, configuration, the `retrieve` tool |
47
+ | `partitions.md` | Multi-tenancy, partition isolation, partition management |
48
+ | `metadata-filtering.md` | Tagging documents, filtering at retrieval time |
49
+ | `rag-patterns.md` | RAG responses, streaming, citations, tool use, production checklist |
50
+ | `api-reference.md` | Full REST endpoint reference, error codes |
51
+ | `python.md` | Python SDK equivalents for all patterns |
52
+
53
+ References are loaded on demand — only what's relevant to the current task is pulled into context.
54
+
55
+ ## MCP Server
56
+
57
+ The Ragie MCP server exposes a `retrieve` tool scoped to a partition, letting your agent search your knowledge base directly. Configure it with two environment variables:
58
+
59
+ ```bash
60
+ export RAGIE_API_KEY=ragie_...
61
+ export RAGIE_PARTITION=your-partition
62
+ ```
63
+
64
+ The plugin's `.mcp.json` handles the rest. See `mcp.md` for multi-partition setup.
65
+
66
+ ## Programmatic API
67
+
68
+ ```typescript
69
+ import {
70
+ getSkill,
71
+ getReference,
72
+ getSkillPath,
73
+ getReferencePath,
74
+ getSkillsDir,
75
+ } from "@ragieai/skills";
76
+
77
+ // Get skill or reference content as a string
78
+ const skill = getSkill("ragie");
79
+ const quickstart = getReference("quickstart");
80
+
81
+ // Get absolute file paths
82
+ const skillPath = getSkillPath("ragie");
83
+ const refPath = getReferencePath("rag-patterns");
84
+ ```
85
+
86
+ ## License
87
+
88
+ MIT
package/dist/index.cjs ADDED
@@ -0,0 +1,55 @@
1
+ "use strict";
2
+ var __defProp = Object.defineProperty;
3
+ var __getOwnPropDesc = Object.getOwnPropertyDescriptor;
4
+ var __getOwnPropNames = Object.getOwnPropertyNames;
5
+ var __hasOwnProp = Object.prototype.hasOwnProperty;
6
+ var __export = (target, all) => {
7
+ for (var name in all)
8
+ __defProp(target, name, { get: all[name], enumerable: true });
9
+ };
10
+ var __copyProps = (to, from, except, desc) => {
11
+ if (from && typeof from === "object" || typeof from === "function") {
12
+ for (let key of __getOwnPropNames(from))
13
+ if (!__hasOwnProp.call(to, key) && key !== except)
14
+ __defProp(to, key, { get: () => from[key], enumerable: !(desc = __getOwnPropDesc(from, key)) || desc.enumerable });
15
+ }
16
+ return to;
17
+ };
18
+ var __toCommonJS = (mod) => __copyProps(__defProp({}, "__esModule", { value: true }), mod);
19
+
20
+ // src/index.ts
21
+ var index_exports = {};
22
+ __export(index_exports, {
23
+ getReference: () => getReference,
24
+ getReferencePath: () => getReferencePath,
25
+ getSkill: () => getSkill,
26
+ getSkillPath: () => getSkillPath,
27
+ getSkillsDir: () => getSkillsDir
28
+ });
29
+ module.exports = __toCommonJS(index_exports);
30
+ var import_fs = require("fs");
31
+ var import_path = require("path");
32
+ var root = (0, import_path.join)(__dirname, "..");
33
+ function getReferencePath(name) {
34
+ return (0, import_path.join)(root, "skills", "ragie", "references", `${name}.md`);
35
+ }
36
+ function getSkillPath(name) {
37
+ return (0, import_path.join)(root, "skills", name, "SKILL.md");
38
+ }
39
+ function getSkillsDir() {
40
+ return (0, import_path.join)(root, "skills");
41
+ }
42
+ function getReference(name) {
43
+ return (0, import_fs.readFileSync)(getReferencePath(name), "utf-8");
44
+ }
45
+ function getSkill(name) {
46
+ return (0, import_fs.readFileSync)(getSkillPath(name), "utf-8");
47
+ }
48
+ // Annotate the CommonJS export names for ESM import in node:
49
+ 0 && (module.exports = {
50
+ getReference,
51
+ getReferencePath,
52
+ getSkill,
53
+ getSkillPath,
54
+ getSkillsDir
55
+ });
@@ -0,0 +1,16 @@
1
+ declare const REFERENCES: readonly ["quickstart", "ingestion", "retrieval", "mcp", "partitions", "metadata-filtering", "rag-patterns", "api-reference", "python"];
2
+ declare const SKILLS: readonly ["ragie"];
3
+ type ReferenceName = (typeof REFERENCES)[number];
4
+ type SkillName = (typeof SKILLS)[number];
5
+ /** Returns the absolute path to a reference file. */
6
+ declare function getReferencePath(name: ReferenceName): string;
7
+ /** Returns the absolute path to a skill's SKILL.md. */
8
+ declare function getSkillPath(name: SkillName): string;
9
+ /** Returns the absolute path to the skills directory. */
10
+ declare function getSkillsDir(): string;
11
+ /** Returns the content of a reference file as a string. */
12
+ declare function getReference(name: ReferenceName): string;
13
+ /** Returns the content of a skill's SKILL.md as a string. */
14
+ declare function getSkill(name: SkillName): string;
15
+
16
+ export { type ReferenceName, type SkillName, getReference, getReferencePath, getSkill, getSkillPath, getSkillsDir };
@@ -0,0 +1,16 @@
1
+ declare const REFERENCES: readonly ["quickstart", "ingestion", "retrieval", "mcp", "partitions", "metadata-filtering", "rag-patterns", "api-reference", "python"];
2
+ declare const SKILLS: readonly ["ragie"];
3
+ type ReferenceName = (typeof REFERENCES)[number];
4
+ type SkillName = (typeof SKILLS)[number];
5
+ /** Returns the absolute path to a reference file. */
6
+ declare function getReferencePath(name: ReferenceName): string;
7
+ /** Returns the absolute path to a skill's SKILL.md. */
8
+ declare function getSkillPath(name: SkillName): string;
9
+ /** Returns the absolute path to the skills directory. */
10
+ declare function getSkillsDir(): string;
11
+ /** Returns the content of a reference file as a string. */
12
+ declare function getReference(name: ReferenceName): string;
13
+ /** Returns the content of a skill's SKILL.md as a string. */
14
+ declare function getSkill(name: SkillName): string;
15
+
16
+ export { type ReferenceName, type SkillName, getReference, getReferencePath, getSkill, getSkillPath, getSkillsDir };
package/dist/index.js ADDED
@@ -0,0 +1,26 @@
1
+ // src/index.ts
2
+ import { readFileSync } from "fs";
3
+ import { join } from "path";
4
+ var root = join(import.meta.dirname, ".."); // __dirname is undefined in ES modules; import.meta.dirname requires Node 20.11+
5
+ function getReferencePath(name) {
6
+ return join(root, "skills", "ragie", "references", `${name}.md`);
7
+ }
8
+ function getSkillPath(name) {
9
+ return join(root, "skills", name, "SKILL.md");
10
+ }
11
+ function getSkillsDir() {
12
+ return join(root, "skills");
13
+ }
14
+ function getReference(name) {
15
+ return readFileSync(getReferencePath(name), "utf-8");
16
+ }
17
+ function getSkill(name) {
18
+ return readFileSync(getSkillPath(name), "utf-8");
19
+ }
20
+ export {
21
+ getReference,
22
+ getReferencePath,
23
+ getSkill,
24
+ getSkillPath,
25
+ getSkillsDir
26
+ };
package/package.json ADDED
@@ -0,0 +1,43 @@
1
+ {
2
+ "name": "@ragieai/skills",
3
+ "version": "0.1.0",
4
+ "description": "Ragie skills for AI coding agents — document ingestion, retrieval, RAG patterns, and MCP integration",
5
+ "keywords": [
6
+ "ragie",
7
+ "rag",
8
+ "retrieval",
9
+ "ai",
10
+ "skills",
11
+ "claude",
12
+ "cursor"
13
+ ],
14
+ "license": "MIT",
15
+ "type": "module",
16
+ "main": "./dist/index.cjs",
17
+ "module": "./dist/index.js",
18
+ "types": "./dist/index.d.ts",
19
+ "exports": {
20
+ ".": {
21
+ "types": "./dist/index.d.ts",
22
+ "import": "./dist/index.js",
23
+ "require": "./dist/index.cjs"
24
+ }
25
+ },
26
+ "files": [
27
+ "dist",
28
+ "skills",
29
+ ".claude-plugin",
30
+ ".mcp.json"
31
+ ],
32
+ "scripts": {
33
+ "build": "tsup src/index.ts --format esm,cjs --dts",
34
+ "dev": "tsup src/index.ts --format esm,cjs --dts --watch",
35
+ "test": "vitest run"
36
+ },
37
+ "devDependencies": {
38
+ "@types/node": "^22.0.0",
39
+ "tsup": "^8.0.0",
40
+ "typescript": "^5.0.0",
41
+ "vitest": "^4.1.4"
42
+ }
43
+ }
@@ -0,0 +1,50 @@
1
+ ---
2
+ name: ragie
3
+ description: >
4
+ This skill should be used when the user wants to "add Ragie to my project", "integrate Ragie",
5
+ "use Ragie for RAG", "ingest documents with Ragie", "search documents with Ragie",
6
+ "set up document retrieval", "build a RAG pipeline", "use the Ragie MCP", "query my knowledge base",
7
+ "connect Ragie to Claude", or mentions Ragie in the context of document search, retrieval-augmented
8
+ generation, or knowledge base management. Provides end-to-end guidance for the Ragie managed RAG
9
+ platform: SDK setup, document ingestion, retrieval, MCP usage, and RAG application patterns.
10
+ version: "1.0.0"
11
+ ---
12
+
13
+ # Ragie
14
+
15
+ Ragie is a fully managed RAG (Retrieval-Augmented Generation) platform. It handles document ingestion, chunking, embedding, and retrieval — available via REST API, Python/TypeScript SDKs, and an MCP server.
16
+
17
+ ## References
18
+
19
+ Load the relevant reference for the task at hand:
20
+
21
+ | Reference | When to load |
22
+ |-----------|--------------|
23
+ | `references/quickstart.md` | Getting started, first integration, install instructions |
24
+ | `references/ingestion.md` | Uploading files/URLs/text, readiness polling, webhooks, bulk ingest |
25
+ | `references/retrieval.md` | Search options, `top_k`, reranking, hybrid search, filters |
26
+ | `references/mcp.md` | MCP server setup, `retrieve` tool, URL pattern, multi-partition config |
27
+ | `references/partitions.md` | Multi-tenancy, partition isolation, partition management |
28
+ | `references/metadata-filtering.md` | Tagging documents, filtering at retrieval time |
29
+ | `references/rag-patterns.md` | Building RAG responses, streaming, citations, tool use, production checklist |
30
+ | `references/api-reference.md` | Full REST endpoint reference, error codes, SDK auth |
31
+ | `references/python.md` | Python SDK equivalents for all patterns |
32
+
33
+ ## Core Concepts
34
+
35
+ | Concept | Description |
36
+ |---------|-------------|
37
+ | **Document** | Any ingested file, URL, or raw text. Processed asynchronously. |
38
+ | **Chunk** | A segment produced by Ragie's splitting pipeline. The unit of retrieval. |
39
+ | **Retrieval** | Hybrid semantic + keyword search across chunks. Returns `scored_chunks`. |
40
+ | **Partition** | Logical namespace for isolation (multi-tenancy, environments). |
41
+ | **Metadata** | Key-value pairs on documents; used for filtering at retrieval time. |
42
+
43
+ ## Quick Decision Guide
44
+
45
+ - **New to Ragie?** → `references/quickstart.md`
46
+ - **Uploading documents?** → `references/ingestion.md`
47
+ - **Searching / querying?** → `references/retrieval.md`
48
+ - **Using Claude Code's MCP tool?** → `references/mcp.md`
49
+ - **Building a RAG app end-to-end?** → `references/rag-patterns.md`
50
+ - **Multiple tenants or environments?** → `references/partitions.md`
@@ -0,0 +1,203 @@
1
+ # Ragie REST API Reference
2
+
3
+ Base URL: `https://api.ragie.ai`
4
+
5
+ Authentication: `Authorization: Bearer <RAGIE_API_KEY>` on every request.
6
+
7
+ ---
8
+
9
+ ## Documents
10
+
11
+ ### Create document from URL
12
+
13
+ ```
14
+ POST /documents
15
+ Content-Type: application/json
16
+
17
+ {
18
+ "url": "https://example.com/page",
19
+ "name": "My Doc", // optional display name
20
+ "partition": "tenant-id", // optional partition
21
+ "metadata": {} // optional key-value metadata
22
+ }
23
+ ```
24
+
25
+ ### Create document from raw bytes
26
+
27
+ ```
28
+ POST /documents/raw
29
+ Content-Type: multipart/form-data
30
+
31
+ content <file bytes>
32
+ content_type application/pdf | text/plain | text/markdown | ...
33
+ name <string>
34
+ partition <string> (optional)
35
+ metadata <json> (optional)
36
+ ```
37
+
38
+ ### Get document
39
+
40
+ ```
41
+ GET /documents/{document_id}
42
+ ```
43
+
44
+ Response fields: `id`, `name`, `status` (`pending` | `partitioning` | `partitioned` | `refined` | `chunked` | `indexed` | `summary_indexed` | `keyword_indexed` | `ready` | `failed`), `metadata`, `partition`, `created_at`, `updated_at`.
45
+
46
+ ### List documents
47
+
48
+ ```
49
+ GET /documents?page_size=<n>&cursor=<c>&filter=<json>
50
+ Partition: <partition> ← partition is a header, not a query param
51
+ ```
52
+
53
+ Returns `{ "documents": [...], "pagination": { "next_cursor": "..." } }`.
54
+
55
+ ### Update document metadata
56
+
57
+ ```
58
+ PATCH /documents/{document_id}/metadata
59
+ Content-Type: application/json
60
+
61
+ {
62
+ "metadata": { "key": "value" }
63
+ }
64
+ ```
65
+
66
+ Performs a partial update. Keys set to `null` are deleted.
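
The partial-update behavior described above can be modeled locally as follows. This is a minimal sketch of how a patch with `null` deletions behaves, not Ragie's server-side implementation. `applyMetadataPatch`, `Metadata`, and `MetadataPatch` are hypothetical names introduced for illustration, and the allowed value types are an assumption.

```typescript
// Hypothetical local model of PATCH /documents/{document_id}/metadata semantics.
// Assumption: metadata values are strings, numbers, or booleans.
type MetadataValue = string | number | boolean;
type Metadata = Record<string, MetadataValue>;
type MetadataPatch = Record<string, MetadataValue | null>;

function applyMetadataPatch(current: Metadata, patch: MetadataPatch): Metadata {
  // Copy first so the caller's object is not mutated.
  const next: Metadata = { ...current };
  for (const [key, value] of Object.entries(patch)) {
    if (value === null) {
      delete next[key]; // null removes the key
    } else {
      next[key] = value; // any other value adds or overwrites the key
    }
  }
  return next;
}

const before: Metadata = { type: "report", year: "2024", draft: true };
const after = applyMetadataPatch(before, { draft: null, reviewed: "true" });
console.log(after); // "type" and "year" are kept because the patch does not mention them
```

Keys absent from the patch are left untouched, which is what distinguishes a partial update from a full replacement.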
67
+
68
+ ### Delete document
69
+
70
+ ```
71
+ DELETE /documents/{document_id}
72
+ ```
73
+
74
+ ---
75
+
76
+ ## Retrievals
77
+
78
+ ### Retrieve chunks
79
+
80
+ ```
81
+ POST /retrievals
82
+ Content-Type: application/json
83
+
84
+ {
85
+ "query": "What are the rate limits?",
86
+ "top_k": 8, // default 8
87
+ "rerank": true, // cross-encoder rerank (recommended)
88
+ "partition": "tenant-id", // scope to partition (optional)
89
+ "filter": { "product": "api" }, // metadata filter (optional)
90
+ "max_chunks_per_document": 2, // limit chunks per source doc (optional)
91
+ "recency_bias": false // favor recently ingested docs (optional)
92
+ }
93
+ ```
94
+
95
+ Response:
96
+
97
+ ```json
98
+ {
99
+ "scored_chunks": [
100
+ {
101
+ "id": "chunk_...",
102
+ "index": 0,
103
+ "text": "...",
104
+ "score": 0.92,
105
+ "document_id": "doc_...",
106
+ "document_name": "API Docs",
107
+ "document_metadata": {},
108
+ "metadata": {}
109
+ }
110
+ ]
111
+ }
112
+ ```
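
As an illustration of consuming this response, the sketch below types the `scored_chunks` payload shown above and turns it into a numbered context string for a prompt. The `formatContext` helper, the `minScore` cutoff, the output layout, and the sample data are assumptions for illustration, not part of the Ragie API.

```typescript
// Shape taken from the response example above; only the fields used here are listed.
interface ScoredChunk {
  id: string;
  text: string;
  score: number;
  document_id: string;
  document_name: string;
}

interface RetrievalResponse {
  scored_chunks: ScoredChunk[];
}

// Hypothetical helper: keep chunks at or above minScore, best first,
// and label each with its source document so the answer can cite it.
function formatContext(response: RetrievalResponse, minScore = 0): string {
  return response.scored_chunks
    .filter((chunk) => chunk.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .map((chunk, i) => `[${i + 1}] ${chunk.document_name}: ${chunk.text}`)
    .join("\n");
}

// Made-up sample data in the documented shape.
const sample: RetrievalResponse = {
  scored_chunks: [
    { id: "chunk_2", text: "Example passage B", score: 0.31, document_id: "doc_2", document_name: "Notes" },
    { id: "chunk_1", text: "Example passage A", score: 0.92, document_id: "doc_1", document_name: "API Docs" },
  ],
};

console.log(formatContext(sample, 0.5)); // prints: [1] API Docs: Example passage A
```

A score cutoff is optional. With `rerank` enabled, the order returned by the API may already be sufficient.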
113
+
114
+ ---
115
+
116
+ ## Partitions
117
+
118
+ ### List partitions
119
+
120
+ ```
121
+ GET /partitions
122
+ ```
123
+
124
+ ### Create partition
125
+
126
+ ```
127
+ POST /partitions
128
+ Content-Type: application/json
129
+
130
+ { "name": "tenant-42", "description": "optional" }
131
+ ```
132
+
133
+ ### Get partition (usage metrics)
134
+
135
+ ```
136
+ GET /partitions/{partition_id}
137
+ ```
138
+
139
+ Returns the partition's document count and its pages processed and pages hosted, each as monthly and total figures.
140
+
141
+ ### Delete partition (and all its documents)
142
+
143
+ ```
144
+ DELETE /partitions/{partition_id}
145
+ ```
146
+
147
+ ---
148
+
149
+ ## Webhooks
150
+
151
+ Register a webhook endpoint to receive document status change events:
152
+
153
+ ```
154
+ POST /webhook_endpoints
155
+ Content-Type: application/json
156
+
157
+ { "url": "https://your-server.com/ragie-webhook" }
158
+ ```
159
+
160
+ Ragie sends `document_status_updated` events when documents reach `indexed`, `keyword_indexed`, `ready`, or `failed` states.
161
+
162
+ ---
163
+
164
+ ## Error Codes
165
+
166
+ | HTTP | Meaning |
167
+ |------|---------|
168
+ | 401 | Invalid or missing API key |
169
+ | 402 | Usage limit exceeded |
170
+ | 404 | Document / partition not found |
171
+ | 422 | Validation error — response body has `detail` array |
172
+ | 429 | Rate limited — retry with exponential back-off |
173
+ | 5xx | Server error — retry |
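
The retry guidance in the table can be sketched as a small wrapper. This is one possible approach under stated assumptions, not the SDKs' built-in retry logic: `HttpError`, `isRetryable`, `backoffDelayMs`, and `withRetry` are hypothetical names, and the 500 ms base delay, 8 s cap, and four-attempt limit are arbitrary choices.

```typescript
// Hypothetical error type carrying the HTTP status of a failed call.
class HttpError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

// Per the table: 429 and 5xx are worth retrying; 401, 402, 404, and 422 are not.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Exponential back-off: base, 2x base, 4x base, and so on, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Calls `call` until it succeeds, fails with a non-retryable error,
// or has been tried maxAttempts times.
async function withRetry<T>(call: () => Promise<T>, maxAttempts = 4): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const retryable = err instanceof HttpError && isRetryable(err.status);
      if (!retryable || attempt + 1 >= maxAttempts) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```

The `instanceof HttpError` check would need adapting to whatever error type your HTTP client actually throws. Adding random jitter to the delay is a common refinement when many clients retry at once.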
174
+
175
+ ---
176
+
177
+ ## SDK Install & Auth
178
+
179
+ ### TypeScript / Node
180
+
181
+ ```bash
182
+ npm install ragie
183
+ ```
184
+
185
+ ```typescript
186
+ import { Ragie } from "ragie";
187
+ const client = new Ragie({ auth: process.env.RAGIE_API_KEY });
188
+ ```
189
+
190
+ ### Python
191
+
192
+ ```bash
193
+ pip install ragie
194
+ ```
195
+
196
+ ```python
197
+ import os
+ from ragie import Ragie
198
+ client = Ragie(auth=os.environ["RAGIE_API_KEY"])
199
+ ```
200
+
201
+ All SDK methods mirror the REST endpoints and return typed response objects. The SDKs handle pagination, retries, and multipart uploads automatically.
202
+
203
+ Note: The TypeScript SDK uses camelCase (`createDocumentFromUrl`, `topK`, `scoredChunks`, `patchMetadata`). The REST API and Python SDK use snake_case (`create_document_from_url`, `top_k`, `scored_chunks`, `patch_metadata`).
@@ -0,0 +1,127 @@
1
+ # Ragie Document Ingestion
2
+
3
+ > Python user? See `references/python.md` for Python equivalents.
4
+
5
+ ## Ingestion Methods
6
+
7
+ | Method | SDK call | Use when |
8
+ |--------|----------|----------|
9
+ | File upload | `documents.create()` | Uploading files — supports all file types (PDF, DOCX, PPTX, images, …) |
10
+ | In-memory data | `documents.createRaw()` | Creating documents from in-memory text or JSON (scraped content, generated text, structured data) |
11
+ | URL | `documents.createDocumentFromUrl()` | Web pages, public S3/GCS links |
12
+
13
+ **Prefer `documents.create()`** when uploading files, as it supports all file types including binary formats. **Prefer `createRaw()`** when your data is already in memory as a string or object — it is simpler and avoids unnecessary file/Blob wrapping, but only handles text and JSON.
14
+
15
+ ## From a File
16
+
17
+ Use `documents.create()` with a `Blob`. This is the only method that supports all file types including binary formats (PDF, DOCX, images, etc.).
18
+
19
+ ```typescript
20
+ import { openAsBlob } from "fs";
21
+
22
+ const doc = await client.documents.create({
23
+ file: await openAsBlob("doc.pdf"),
24
+ name: "doc.pdf",
25
+ partition: "tenant-42", // optional
26
+ metadata: { type: "report", year: "2024" }, // optional
27
+ });
28
+ ```
29
+
30
+ ## From In-Memory Data (Raw Text or JSON)
31
+
32
+ **This is the preferred method when your data is already in memory** (e.g., scraped content, generated text, API responses). It accepts strings and plain objects — not binary data.
33
+
34
+ ```typescript
35
+ const doc = await client.documents.createRaw({
36
+ data: "Your text content here...", // string or plain object (not binary)
37
+ name: "my-note",
38
+ partition: "tenant-42", // optional
39
+ });
40
+ ```
41
+
42
+ ## From a URL
43
+
44
+ ```typescript
45
+ const doc = await client.documents.createDocumentFromUrl({
46
+ url: "https://example.com/report.pdf",
47
+ name: "Q4 Report", // optional display name
48
+ partition: "tenant-42", // optional partition
49
+ metadata: { type: "report", year: "2024" }, // optional
50
+ });
51
+ ```
52
+
53
+ ## Document Lifecycle
54
+
55
+ Documents are processed asynchronously through several stages:
56
+
57
+ `pending` → `partitioning` → `partitioned` → `refined` → `chunked` → `indexed` → `summary_indexed` → `keyword_indexed` → `ready` (or `failed`)
58
+
59
+ For polling, check `status === "ready"` or `status === "failed"`. Intermediate stages are informational.
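
The stages above can be captured in a union type so that a polling loop branches only on the two terminal states. This is a sketch: `STATUS_ORDER`, `DocumentStatus`, `isTerminal`, and `stageFraction` are hypothetical names, and the SDK may already export its own status type.

```typescript
// Stages in the order listed above; "failed" can occur instead of "ready".
const STATUS_ORDER = [
  "pending",
  "partitioning",
  "partitioned",
  "refined",
  "chunked",
  "indexed",
  "summary_indexed",
  "keyword_indexed",
  "ready",
] as const;

type DocumentStatus = (typeof STATUS_ORDER)[number] | "failed";

// Only these two states should end a polling loop.
function isTerminal(status: DocumentStatus): boolean {
  return status === "ready" || status === "failed";
}

// Position in the pipeline as a fraction from 0 ("pending") to 1 ("ready").
// This counts stages only; it says nothing about how long each stage takes.
function stageFraction(status: DocumentStatus): number | null {
  if (status === "failed") return null;
  return STATUS_ORDER.indexOf(status) / (STATUS_ORDER.length - 1);
}

console.log(isTerminal("chunked"), stageFraction("chunked")); // prints: false 0.5
```

Treating the intermediate stages as informational keeps the loop correct even if the pipeline gains or reorders stages later.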
60
+
61
+ ### Polling for Readiness
62
+
63
+ ```typescript
64
+ async function waitForReady(
65
+ client: Ragie,
66
+ docId: string,
67
+ timeoutMs = 120_000
68
+ ): Promise<void> {
69
+ const start = Date.now();
70
+ while (Date.now() - start < timeoutMs) {
71
+ const doc = await client.documents.get({ documentId: docId });
72
+ if (doc.status === "ready") return;
73
+ if (doc.status === "failed") throw new Error(`Document ${docId} failed`);
74
+ await new Promise((r) => setTimeout(r, 3000));
75
+ }
76
+ throw new Error(`Document ${docId} not ready after ${timeoutMs}ms`);
77
+ }
78
+
79
+ const doc = await client.documents.createDocumentFromUrl({ url });
80
+ await waitForReady(client, doc.id);
81
+ // now safe to retrieve
82
+ ```
83
+
84
+ ### Webhooks
85
+
86
+ Ragie can POST to your server when a document's status changes. Register a webhook endpoint via the Ragie dashboard or `POST /webhook_endpoints`. Ragie sends `document_status_updated` events when documents reach `indexed`, `keyword_indexed`, `ready`, or `failed` states.
87
+
88
+ Use polling during local development; register a webhook endpoint for production.
89
+
90
+ ## Bulk Ingestion
91
+
92
+ ```typescript
93
+ const docs = await Promise.all(
94
+ urls.map((url) =>
95
+ client.documents.createDocumentFromUrl({ url, partition: "my-partition" })
96
+ )
97
+ );
98
+ ```
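
`Promise.all` over a large list starts every request at once, which can run into the 429 rate limit listed in the API reference. One way to bound concurrency is to ingest in fixed-size batches. This is a sketch under assumptions: `chunk`, `ingestInBatches`, and the default batch size of 5 are hypothetical, and the `ingest` callback stands in for a call such as `client.documents.createDocumentFromUrl`.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `ingest` over all items with at most `batchSize` requests in flight.
// Results come back in the same order as `items`.
async function ingestInBatches<T, R>(
  items: readonly T[],
  ingest: (item: T) => Promise<R>,
  batchSize = 5
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, batchSize)) {
    // Requests within a batch run concurrently; the next batch waits for this one.
    results.push(...(await Promise.all(batch.map((item) => ingest(item)))));
  }
  return results;
}
```

With the client from the examples above, a call might look like `await ingestInBatches(urls, (url) => client.documents.createDocumentFromUrl({ url, partition: "my-partition" }))`. As with `Promise.all`, one rejected request rejects the whole run, so wrap `ingest` if partial success should be tolerated.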
99
+
100
+ ## Document Management
101
+
102
+ ```typescript
103
+ // Get a document
104
+ const doc = await client.documents.get({ documentId: docId });
105
+
106
+ // List documents (returns a PageIterator — async iterable)
107
+ for await (const page of client.documents.list({ partition: "tenant-42", pageSize: 50 })) {
108
+ for (const doc of page.result.documents) {
109
+ console.log(doc.id, doc.name, doc.status);
110
+ }
111
+ }
112
+
113
+ // Update metadata (partial update — keys set to null are deleted)
114
+ await client.documents.patchMetadata({
115
+ documentId: docId,
116
+ patchDocumentMetadataParams: { metadata: { reviewed: "true", version: "v4" } },
117
+ });
118
+
119
+ // Delete a document
120
+ await client.documents.delete({ documentId: docId });
121
+ ```
122
+
123
+ ## Gotchas
124
+
125
+ - Always check `status === "ready"` before querying — newly ingested documents are not immediately searchable.
126
+ - **Prefer `createRaw()` for in-memory data** — it's simpler when you already have a string or object. **Prefer `documents.create()` for file uploads** — it supports all file types. `createRaw()` only handles text and JSON (`data: string | object`); binary files (PDF, DOCX, etc.) must use `documents.create({ file: blob })`.
127
+ - Supported file types include PDF, DOCX, PPTX, TXT, MD, HTML, and more. Check the dashboard for the full list.