@lloyal-labs/rig 1.2.0
- package/README.md +188 -0
- package/dist/index.d.ts +19 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +36 -0
- package/dist/index.js.map +1 -0
- package/dist/reranker.d.ts +22 -0
- package/dist/reranker.d.ts.map +1 -0
- package/dist/reranker.js +76 -0
- package/dist/reranker.js.map +1 -0
- package/dist/resources/files.d.ts +28 -0
- package/dist/resources/files.d.ts.map +1 -0
- package/dist/resources/files.js +98 -0
- package/dist/resources/files.js.map +1 -0
- package/dist/resources/index.d.ts +9 -0
- package/dist/resources/index.d.ts.map +1 -0
- package/dist/resources/index.js +13 -0
- package/dist/resources/index.js.map +1 -0
- package/dist/resources/types.d.ts +39 -0
- package/dist/resources/types.d.ts.map +1 -0
- package/dist/resources/types.js +3 -0
- package/dist/resources/types.js.map +1 -0
- package/dist/sources/corpus-research.md +14 -0
- package/dist/sources/corpus.d.ts +48 -0
- package/dist/sources/corpus.d.ts.map +1 -0
- package/dist/sources/corpus.js +91 -0
- package/dist/sources/corpus.js.map +1 -0
- package/dist/sources/extract.md +5 -0
- package/dist/sources/index.d.ts +10 -0
- package/dist/sources/index.d.ts.map +1 -0
- package/dist/sources/index.js +14 -0
- package/dist/sources/index.js.map +1 -0
- package/dist/sources/search-extract.md +6 -0
- package/dist/sources/types.d.ts +28 -0
- package/dist/sources/types.d.ts.map +1 -0
- package/dist/sources/types.js +3 -0
- package/dist/sources/types.js.map +1 -0
- package/dist/sources/web-research.md +12 -0
- package/dist/sources/web.d.ts +78 -0
- package/dist/sources/web.d.ts.map +1 -0
- package/dist/sources/web.js +319 -0
- package/dist/sources/web.js.map +1 -0
- package/dist/tools/fetch-page.d.ts +26 -0
- package/dist/tools/fetch-page.d.ts.map +1 -0
- package/dist/tools/fetch-page.js +72 -0
- package/dist/tools/fetch-page.js.map +1 -0
- package/dist/tools/grep.d.ts +30 -0
- package/dist/tools/grep.d.ts.map +1 -0
- package/dist/tools/grep.js +79 -0
- package/dist/tools/grep.js.map +1 -0
- package/dist/tools/index.d.ts +39 -0
- package/dist/tools/index.d.ts.map +1 -0
- package/dist/tools/index.js +49 -0
- package/dist/tools/index.js.map +1 -0
- package/dist/tools/plan.d.ts +76 -0
- package/dist/tools/plan.d.ts.map +1 -0
- package/dist/tools/plan.js +98 -0
- package/dist/tools/plan.js.map +1 -0
- package/dist/tools/read-file.d.ts +62 -0
- package/dist/tools/read-file.d.ts.map +1 -0
- package/dist/tools/read-file.js +123 -0
- package/dist/tools/read-file.js.map +1 -0
- package/dist/tools/report.d.ts +22 -0
- package/dist/tools/report.d.ts.map +1 -0
- package/dist/tools/report.js +26 -0
- package/dist/tools/report.js.map +1 -0
- package/dist/tools/research.d.ts +57 -0
- package/dist/tools/research.d.ts.map +1 -0
- package/dist/tools/research.js +117 -0
- package/dist/tools/research.js.map +1 -0
- package/dist/tools/search.d.ts +34 -0
- package/dist/tools/search.d.ts.map +1 -0
- package/dist/tools/search.js +69 -0
- package/dist/tools/search.js.map +1 -0
- package/dist/tools/types.d.ts +84 -0
- package/dist/tools/types.d.ts.map +1 -0
- package/dist/tools/types.js +3 -0
- package/dist/tools/types.js.map +1 -0
- package/dist/tools/web-research.d.ts +60 -0
- package/dist/tools/web-research.d.ts.map +1 -0
- package/dist/tools/web-research.js +136 -0
- package/dist/tools/web-research.js.map +1 -0
- package/dist/tools/web-search.d.ts +42 -0
- package/dist/tools/web-search.d.ts.map +1 -0
- package/dist/tools/web-search.js +83 -0
- package/dist/tools/web-search.js.map +1 -0
- package/package.json +45 -0
package/README.md
ADDED
|
@@ -0,0 +1,188 @@
|
|
|
1
|
+
# @lloyal-labs/rig
|
|
2
|
+
|
|
3
|
+
Retrieval-Interleaved Generation for [lloyal-agents](../agents).
|
|
4
|
+
|
|
5
|
+
```bash
|
|
6
|
+
npm i @lloyal-labs/rig @lloyal-labs/lloyal-agents @lloyal-labs/lloyal.node
|
|
7
|
+
```
|
|
8
|
+
|
|
9
|
+
## RIG vs RAG
|
|
10
|
+
|
|
11
|
+
RAG retrieves first, then generates. A retrieval step runs upfront — query the vector DB, get top-k passages, inject them into the prompt, call the model once. The model sees static context. Retrieval and generation are separate phases.
|
|
12
|
+
|
|
13
|
+
RIG interleaves retrieval and generation inside the decode loop. Agents generate reasoning, decide to search, process results, reason further, fetch a page, form hypotheses from the content, search again with refined queries. Retrieval decisions emerge from ongoing generation — each search query is informed by everything the agent has already discovered.
|
|
14
|
+
|
|
15
|
+
The difference is observable in tool call inputs. A RAG system constructs search queries from the original user question. A RIG agent constructs queries from hypotheses formed during generation:
|
|
16
|
+
|
|
17
|
+
```
|
|
18
|
+
grep(/memory leak/) → 3 matches in 2 files
|
|
19
|
+
read_file(pool.ts L40-80) → reads allocation logic, spots missing cleanup
|
|
20
|
+
search("resource cleanup on connection close") → finds teardown handler
|
|
21
|
+
read_file(server.ts L120-155) → discovers close handler never calls pool.drain()
|
|
22
|
+
grep(/drain|dispose|cleanup/) → 8 matches, confirms drain exists but is unused
|
|
23
|
+
search("pool drain connection lifecycle interaction") → targets the gap
|
|
24
|
+
report(findings)
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
The last search — `"pool drain connection lifecycle interaction"` — is the signature behavior. The agent read the allocation logic, discovered the drain method existed but was never called on connection close, and constructed a search specifically targeting that interaction. This is multi-hop reasoning: not "search and report" but "search, form hypothesis, search for confirmation."
|
|
28
|
+
|
|
29
|
+
### Why it's emergent
|
|
30
|
+
|
|
31
|
+
This behavior is not prompted or engineered. It emerges from the concurrency semantics of `lloyal-agents`.
|
|
32
|
+
|
|
33
|
+
The four-phase tick loop creates a clean decision boundary between each tool call and the next generation step. Around a single tool call, the sequence is:
|
|
34
|
+
|
|
35
|
+
1. Agent generates tokens, hits stop token, tool call extracted
|
|
36
|
+
2. Tool executes to completion — agent is suspended
|
|
37
|
+
3. Tool result fully prefilled into the agent's KV cache
|
|
38
|
+
4. Grammar state resets — clean slate for next decision
|
|
39
|
+
5. Agent resumes generating with the complete result as the last thing in context
|
|
40
|
+
|
|
41
|
+
Step 5 is the critical moment. The model's next-token prediction operates on a context where the tool result is fully present and the grammar is clean. The model makes a fresh decision: call another tool, call the same tool with different arguments, or report findings. This decision is informed by everything the agent has seen — all prior tool results are physically present in the branch's KV cache.
|
|
42
|
+
|
|
43
|
+
An agent that greps with a narrow pattern and gets 0 matches will broaden the pattern on its next grep — not because it's prompted to retry, but because the 0-match result is in context and the model naturally adjusts. An agent that reads a section and discovers an unexpected connection will construct a search query targeting that specific connection — the read result is in context, and the model forms a hypothesis from it.
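The sequential behavior described above can be sketched as a toy, synchronous simulation. Nothing here is the `lloyal-agents` API: `Decision`, `TurnRecord`, `decide`, `runAgent`, and the in-memory `files` corpus are hypothetical stand-ins, and a scripted function plays the role of the model. The only point illustrated is that each decision is made after the previous tool result has been fully recorded, so a 0-match grep can lead to a broader pattern on the next turn.

```typescript
// Toy simulation of sequential tool dispatch. All names are hypothetical;
// this is not the lloyal-agents API.
type Decision =
  | { tool: "grep"; pattern: string }
  | { tool: "report"; findings: string };

interface TurnRecord {
  call: Decision;
  result: string;
}

// In-memory stand-in for a corpus.
const files: Record<string, string> = {
  "pool.ts": "function drain() { /* release connections */ }",
  "server.ts": "socket.on('close', () => log('closed'));",
};

function grep(pattern: string): string {
  const re = new RegExp(pattern);
  const hits = Object.keys(files).filter((name) => re.test(files[name]));
  return `${hits.length} matches: ${hits.join(", ")}`;
}

// Scripted "model": decides from the full history, the way an agent
// decides from everything already present in its context.
function decide(history: TurnRecord[]): Decision {
  const last = history[history.length - 1];
  if (!last) return { tool: "grep", pattern: "memory leak" };
  if (/^0 matches/.test(last.result)) {
    // The empty result is visible, so the next pattern is broader.
    return { tool: "grep", pattern: "drain|dispose|cleanup" };
  }
  return { tool: "report", findings: last.result };
}

function runAgent(maxTurns: number): TurnRecord[] {
  const history: TurnRecord[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const call = decide(history);
    if (call.tool === "report") {
      history.push({ call, result: call.findings });
      break;
    }
    // The tool runs to completion and its whole result is recorded
    // before the next decision is made.
    history.push({ call, result: grep(call.pattern) });
  }
  return history;
}

const trace = runAgent(6);
```

With a budget of one turn, `runAgent(1)` stops after the first narrow grep, which mirrors the single-shot behavior the README attributes to low `maxTurns` values.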
|
|
44
|
+
|
|
45
|
+
Under a concurrent dispatch model where tool results arrive mid-generation, the agent is already producing tokens when results land. The result gets incorporated, but there's no clean pause for hypothesis formation. The observable effect: sequential dispatch produces progressively more specific queries; concurrent dispatch produces variations on the original question.
|
|
46
|
+
|
|
47
|
+
Depth scales with `maxTurns`. At 2 turns, agents do single-shot retrieval. At 6 turns, agents do 3-4 rounds of iterative refinement. At 20 turns, agents go deep — following citation chains, cross-referencing claims, building evidence maps. The quality difference is in the later tool call inputs.
|
|
48
|
+
|
|
49
|
+
## Sources
|
|
50
|
+
|
|
51
|
+
`@lloyal-labs/rig` provides two `Source` implementations (extending the base class from `lloyal-agents`):
|
|
52
|
+
|
|
53
|
+
**CorpusSource** — local files with grep, semantic search, read_file, and recursive research tools. Agents investigate a knowledge base by pattern matching, reading sections in context, and spawning sub-agents for deeper investigation.
|
|
54
|
+
|
|
55
|
+
**WebSource** — web search via [Tavily](https://tavily.com), page fetching with attention-based content extraction, and recursive web_research tools. `BufferingFetchPage` wraps fetch results — full content goes to the agent for reasoning, while a parallel buffer stores content for post-research reranking. Content extraction uses `generate({ parent })` to attend over the fetched page and extract summary + links via grammar-constrained generation, then prunes the fork — zero net KV cost per extraction.
|
|
56
|
+
|
|
57
|
+
Sources are composable. A pipeline can use one source, both, or custom implementations:
|
|
58
|
+
|
|
59
|
+
```typescript
|
|
60
|
+
import {
|
|
61
|
+
CorpusSource,
|
|
62
|
+
WebSource,
|
|
63
|
+
TavilyProvider,
|
|
64
|
+
loadResources,
|
|
65
|
+
chunkResources,
|
|
66
|
+
} from "@lloyal-labs/rig";
|
|
67
|
+
|
|
68
|
+
const sources = [];
|
|
69
|
+
|
|
70
|
+
// Local knowledge base
|
|
71
|
+
if (corpusDir) {
|
|
72
|
+
const resources = loadResources(corpusDir);
|
|
73
|
+
const chunks = chunkResources(resources);
|
|
74
|
+
sources.push(new CorpusSource(resources, chunks));
|
|
75
|
+
}
|
|
76
|
+
|
|
77
|
+
// Web search
|
|
78
|
+
if (process.env.TAVILY_API_KEY) {
|
|
79
|
+
sources.push(new WebSource(new TavilyProvider()));
|
|
80
|
+
}
|
|
81
|
+
```
|
|
82
|
+
|
|
83
|
+
When multiple sources are used, they run sequentially — each source gets the full KV budget. After source N completes, its inner branches are pruned and KV is freed for source N+1.
|
|
84
|
+
|
|
85
|
+
### Bridge
|
|
86
|
+
|
|
87
|
+
Between sources, a bridge exit gate structures discoveries from the completed source as durable context for the next source's investigation:
|
|
88
|
+
|
|
89
|
+
```
|
|
90
|
+
Corpus research → Bridge → Web research → Synthesize
|
|
91
|
+
```
|
|
92
|
+
|
|
93
|
+
The bridge extracts three tiers of discovery:
|
|
94
|
+
|
|
95
|
+
1. **What was established** — specific data points, study details, statistics, quotes. Evidence preserved verbatim.
|
|
96
|
+
2. **Where evidence is incomplete** — acknowledged limitations, absent study designs, uncertain mechanisms. These are well-researched claims with identified evidence gaps.
|
|
97
|
+
3. **What was not covered** — topics mentioned but not substantiated, or entirely absent.
|
|
98
|
+
|
|
99
|
+
The distinction between (2) and (3) is critical. A topic with six sections of evidence but no experimental validation is not a gap — it is a well-researched claim with an identified evidence limitation. The bridge flags the limitation, not the topic. This prevents the next source from re-investigating what the previous source already covered, and directs it toward genuine gaps.
|
|
100
|
+
|
|
101
|
+
Bridge discoveries condition the next source's questions:
|
|
102
|
+
|
|
103
|
+
```typescript
|
|
104
|
+
activeQuestions = questions.map(
|
|
105
|
+
(q) => `${q}\n\nPrior research discoveries:\n${discoveries}`,
|
|
106
|
+
);
|
|
107
|
+
```
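A minimal sketch of how the three tiers might be flattened into the `discoveries` string used above. The `BridgeDiscoveries` shape, its field names, and `formatDiscoveries` are assumptions made for illustration; the README does not specify how the bridge output is structured internally.

```typescript
// Hypothetical shape for bridge output. The tier names follow the README;
// the field names and formatting are assumptions.
interface BridgeDiscoveries {
  established: string[];
  incomplete: string[];
  notCovered: string[];
}

function formatDiscoveries(d: BridgeDiscoveries): string {
  const section = (title: string, items: string[]): string =>
    items.length > 0
      ? `${title}:\n${items.map((item) => `- ${item}`).join("\n")}`
      : `${title}: none`;
  return [
    section("Established", d.established),
    section("Evidence incomplete", d.incomplete),
    section("Not covered", d.notCovered),
  ].join("\n\n");
}

// Same conditioning step as the README snippet, wrapped in a function.
function conditionQuestions(
  questions: string[],
  d: BridgeDiscoveries,
): string[] {
  const discoveries = formatDiscoveries(d);
  return questions.map(
    (q) => `${q}\n\nPrior research discoveries:\n${discoveries}`,
  );
}

const conditioned = conditionQuestions(["Why does memory grow under load?"], {
  established: ["pool.drain() exists but is never called on close"],
  incomplete: ["no measurement of leak rate"],
  notCovered: [],
});
```

Keeping tiers 2 and 3 in separate sections preserves the distinction the README calls critical: the next source can see which topics have evidence with a known limitation and which were never covered.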
|
|
108
|
+
|
|
109
|
+
## Pipeline
|
|
110
|
+
|
|
111
|
+
A typical RIG pipeline:
|
|
112
|
+
|
|
113
|
+
```
|
|
114
|
+
Plan → Research → [Bridge →] Synthesize → Eval
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
**Plan.** Grammar-constrained decomposition of the user query into sub-questions with intent classification (`research` vs `clarify`). If the query is focused enough to investigate directly, produces an empty array (passthrough). Uses `generate()` with a JSON schema grammar — the model outputs structured `{ questions: [{ text, intent }] }` in a single generation pass.
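As a rough illustration of the plan output, the sketch below validates a `{ questions: [{ text, intent }] }` payload and applies the passthrough rule. The package exports `PlanResult` and `PlanQuestion` types, but their exact definitions are not shown in this README, so `SketchPlan`, `SketchQuestion`, `parsePlan`, and `questionsToInvestigate` are hypothetical names. Treating an empty array as "investigate the original query" and forwarding only `research` questions are assumptions about how a harness might consume the plan.

```typescript
// Hypothetical types mirroring the documented JSON shape.
type Intent = "research" | "clarify";

interface SketchQuestion {
  text: string;
  intent: Intent;
}

interface SketchPlan {
  questions: SketchQuestion[];
}

function isIntent(value: unknown): value is Intent {
  return value === "research" || value === "clarify";
}

// Parse and validate raw model output against the documented shape.
function parsePlan(raw: string): SketchPlan {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("plan must be an object");
  }
  const questions = (data as { questions?: unknown }).questions;
  if (!Array.isArray(questions)) {
    throw new Error("plan.questions must be an array");
  }
  return {
    questions: questions.map((q: unknown): SketchQuestion => {
      if (typeof q !== "object" || q === null) {
        throw new Error("question must be an object");
      }
      const { text, intent } = q as { text?: unknown; intent?: unknown };
      if (typeof text !== "string") throw new Error("question.text must be a string");
      if (!isIntent(intent)) throw new Error("question.intent must be research or clarify");
      return { text, intent };
    }),
  };
}

// Assumed passthrough rule: an empty plan means the query is researched as-is.
function questionsToInvestigate(query: string, plan: SketchPlan): string[] {
  if (plan.questions.length === 0) return [query];
  return plan.questions
    .filter((q) => q.intent === "research")
    .map((q) => q.text);
}

const plan = parsePlan('{"questions":[]}');
const targets = questionsToInvestigate("Why does the pool leak?", plan);
```

With grammar-constrained generation the validation should rarely fail, but an explicit check keeps the consumer safe if the schema and the consuming code drift apart.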
|
|
118
|
+
|
|
119
|
+
**Research.** Each source's research tool spawns a pool of agents that investigate sub-questions. Agents interleave retrieval and generation — searching, reading, forming hypotheses, searching again. Within each source, all agents run concurrently on shared GPU compute via `useAgentPool`. Sources run sequentially, each getting the full KV budget.
|
|
120
|
+
|
|
121
|
+
Agents that get cut by context pressure (their tool results exceeded KV headroom) are recovered via scratchpad extraction — `generate({ parent: agent.branch })` with a grammar-constrained reporter prompt attends over the agent's accumulated KV and extracts findings. The agent paid the KV cost of reasoning; the extraction recovers the value.
|
|
122
|
+
|
|
123
|
+
**Bridge.** Runs between sources when multiple sources are configured. A single agent with report-only tools structures discoveries from the completed source. The bridge output conditions the next source's sub-questions, directing investigation toward gaps rather than re-covering established ground.
|
|
124
|
+
|
|
125
|
+
**Synthesize.** A synthesis agent integrates findings from all sources into a structured report with source attribution. Research notes provide analytical structure; reranked source passages provide ground truth for citation. The synthesizer cross-references both — using research notes to identify what matters, and source passages for evidence.
|
|
126
|
+
|
|
127
|
+
**Eval.** Multi-branch semantic comparison via `diverge()`. Fork N branches from a shared frontier, generate independently with the same verify prompt, check convergence. Where branches agree, the model is confident. Where they diverge, the answer needs refinement.
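The convergence idea can be sketched without any model at all. The block below is a simplified stand-in, not the `diverge()` API: it compares branch outputs by normalized exact match rather than semantically, and `checkConvergence`, `Convergence`, and the `threshold` parameter are hypothetical names introduced for illustration.

```typescript
// Simplified stand-in for multi-branch agreement checking.
// Real semantic comparison is replaced by normalized string equality.
interface Convergence {
  answer: string; // most common normalized output
  agreement: number; // fraction of branches that produced it
  converged: boolean; // agreement >= threshold
}

function normalize(answer: string): string {
  return answer.trim().toLowerCase().replace(/\s+/g, " ");
}

function checkConvergence(
  branchOutputs: string[],
  threshold = 1,
): Convergence {
  if (branchOutputs.length === 0) {
    throw new Error("need at least one branch output");
  }
  // Count how many branches produced each normalized answer.
  const counts: Record<string, number> = {};
  for (const output of branchOutputs) {
    const key = normalize(output);
    counts[key] = (counts[key] ?? 0) + 1;
  }
  // Pick the answer with the highest count.
  let answer = "";
  let best = 0;
  for (const key of Object.keys(counts)) {
    if (counts[key] > best) {
      best = counts[key];
      answer = key;
    }
  }
  const agreement = best / branchOutputs.length;
  return { answer, agreement, converged: agreement >= threshold };
}

const verdict = checkConvergence(["Paris", " paris ", "PARIS"]);
```

A threshold below 1 tolerates a dissenting branch; whether that is appropriate depends on how costly a wrong answer is for the pipeline.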
|
|
128
|
+
|
|
129
|
+
## Tools
|
|
130
|
+
|
|
131
|
+
### Corpus tools
|
|
132
|
+
|
|
133
|
+
| Tool | Description |
|
|
134
|
+
| -------------- | ----------------------------------------------------------------------- |
|
|
135
|
+
| `SearchTool` | Semantic search over corpus chunks via reranker scoring |
|
|
136
|
+
| `GrepTool` | Exhaustive regex pattern matching across all files |
|
|
137
|
+
| `ReadFileTool` | Read file content at specified line ranges, tracks per-agent read state |
|
|
138
|
+
| `ResearchTool` | Spawn sub-agent pool for deeper investigation of sub-questions |
|
|
139
|
+
| `ReportTool` | Terminal tool — agents call this to submit findings |
|
|
140
|
+
|
|
141
|
+
### Web tools
|
|
142
|
+
|
|
143
|
+
| Tool | Description |
|
|
144
|
+
| ----------------- | --------------------------------------------------------------- |
|
|
145
|
+
| `WebSearchTool` | Web search via configurable provider (Tavily included) |
|
|
146
|
+
| `FetchPageTool` | Fetch URL, extract article text via Readability |
|
|
147
|
+
| `WebResearchTool` | Spawn sub-agent pool with web tools for recursive investigation |
|
|
148
|
+
|
|
149
|
+
### Pipeline tools
|
|
150
|
+
|
|
151
|
+
| Tool | Description |
|
|
152
|
+
| --------------------------- | ------------------------------------------------------------------ |
|
|
153
|
+
| `PlanTool` | Grammar-constrained query decomposition with intent classification |
|
|
154
|
+
| `createTools(opts)` | Build corpus toolkit from resources, chunks, and reranker |
|
|
155
|
+
| `createReranker(modelPath)` | Semantic reranker for chunk scoring and passage selection |
|
|
156
|
+
|
|
157
|
+
## Custom Sources
|
|
158
|
+
|
|
159
|
+
Extend `Source` from `lloyal-agents` to create custom sources:
|
|
160
|
+
|
|
161
|
+
```typescript
|
|
162
|
+
import { Source } from "@lloyal-labs/lloyal-agents";
|
|
163
|
+
import type { Tool } from "@lloyal-labs/lloyal-agents";
|
|
164
|
+
|
|
165
|
+
class DatabaseSource extends Source<SourceContext, Row> {
|
|
166
|
+
readonly name = "database";
|
|
167
|
+
|
|
168
|
+
get researchTool(): Tool {
|
|
169
|
+
return this._researchTool;
|
|
170
|
+
}
|
|
171
|
+
|
|
172
|
+
*bind(ctx: SourceContext) {
|
|
173
|
+
// Set up tools with access to ctx.parent (for generate({ parent })),
|
|
174
|
+
// ctx.reranker, ctx.reporterPrompt, ctx.reportTool
|
|
175
|
+
this._researchTool = new DatabaseResearchTool(/* ... */);
|
|
176
|
+
}
|
|
177
|
+
|
|
178
|
+
getChunks(): Row[] {
|
|
179
|
+
return this._results; // buffered for post-research reranking
|
|
180
|
+
}
|
|
181
|
+
}
|
|
182
|
+
```
|
|
183
|
+
|
|
184
|
+
The `bind()` lifecycle receives a `SourceContext` with the parent branch (for forking), reranker, reporter prompt, and report tool. Your research tool calls `useAgentPool` or `runAgents` internally, the same primitives the built-in sources use.
|
|
185
|
+
|
|
186
|
+
## License
|
|
187
|
+
|
|
188
|
+
Apache-2.0
|
package/dist/index.d.ts
ADDED
|
@@ -0,0 +1,19 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Rig — research infrastructure for the lloyal agent pipeline
|
|
3
|
+
*
|
|
4
|
+
* Provides source implementations ({@link WebSource}, {@link CorpusSource}),
|
|
5
|
+
* resource loading/chunking, reranking, and the tool library used by
|
|
6
|
+
* deep-research harnesses. Sources are composed via the abstract
|
|
7
|
+
* {@link Source} base class from `@lloyal-labs/lloyal-agents`.
|
|
8
|
+
*
|
|
9
|
+
* @packageDocumentation
|
|
10
|
+
* @category Rig
|
|
11
|
+
*/
|
|
12
|
+
export { createTools, reportTool, ResearchTool, WebSearchTool, TavilyProvider, FetchPageTool, WebResearchTool, PlanTool, } from './tools';
|
|
13
|
+
export type { ResearchToolOpts, WebResearchToolOpts, PlanToolOpts, PlanResult, PlanQuestion, SearchProvider, SearchResult, Reranker, ScoredChunk, ScoredResult, } from './tools';
|
|
14
|
+
export { WebSource, CorpusSource } from './sources';
|
|
15
|
+
export type { SourceContext } from './sources';
|
|
16
|
+
export { loadResources, chunkResources } from './resources';
|
|
17
|
+
export type { Resource, Chunk } from './resources';
|
|
18
|
+
export { createReranker } from './reranker';
|
|
19
|
+
//# sourceMappingURL=index.d.ts.map
|
|
package/dist/index.d.ts.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA;;;;;;;;;;GAUG;AAGH,OAAO,EACL,WAAW,EAAE,UAAU,EACvB,YAAY,EAAE,aAAa,EAAE,cAAc,EAAE,aAAa,EAC1D,eAAe,EAAE,QAAQ,GAC1B,MAAM,SAAS,CAAC;AACjB,YAAY,EACV,gBAAgB,EAAE,mBAAmB,EAAE,YAAY,EACnD,UAAU,EAAE,YAAY,EACxB,cAAc,EAAE,YAAY,EAC5B,QAAQ,EAAE,WAAW,EAAE,YAAY,GACpC,MAAM,SAAS,CAAC;AAGjB,OAAO,EAAE,SAAS,EAAE,YAAY,EAAE,MAAM,WAAW,CAAC;AACpD,YAAY,EAAE,aAAa,EAAE,MAAM,WAAW,CAAC;AAG/C,OAAO,EAAE,aAAa,EAAE,cAAc,EAAE,MAAM,aAAa,CAAC;AAC5D,YAAY,EAAE,QAAQ,EAAE,KAAK,EAAE,MAAM,aAAa,CAAC;AAGnD,OAAO,EAAE,cAAc,EAAE,MAAM,YAAY,CAAC"}
|
package/dist/index.js
ADDED
|
@@ -0,0 +1,36 @@
|
|
|
1
|
+
"use strict";
|
|
2
|
+
/**
|
|
3
|
+
* Rig — research infrastructure for the lloyal agent pipeline
|
|
4
|
+
*
|
|
5
|
+
* Provides source implementations ({@link WebSource}, {@link CorpusSource}),
|
|
6
|
+
* resource loading/chunking, reranking, and the tool library used by
|
|
7
|
+
* deep-research harnesses. Sources are composed via the abstract
|
|
8
|
+
* {@link Source} base class from `@lloyal-labs/lloyal-agents`.
|
|
9
|
+
*
|
|
10
|
+
* @packageDocumentation
|
|
11
|
+
* @category Rig
|
|
12
|
+
*/
|
|
13
|
+
Object.defineProperty(exports, "__esModule", { value: true });
|
|
14
|
+
exports.createReranker = exports.chunkResources = exports.loadResources = exports.CorpusSource = exports.WebSource = exports.PlanTool = exports.WebResearchTool = exports.FetchPageTool = exports.TavilyProvider = exports.WebSearchTool = exports.ResearchTool = exports.reportTool = exports.createTools = void 0;
|
|
15
|
+
// Tools
|
|
16
|
+
var tools_1 = require("./tools");
|
|
17
|
+
Object.defineProperty(exports, "createTools", { enumerable: true, get: function () { return tools_1.createTools; } });
|
|
18
|
+
Object.defineProperty(exports, "reportTool", { enumerable: true, get: function () { return tools_1.reportTool; } });
|
|
19
|
+
Object.defineProperty(exports, "ResearchTool", { enumerable: true, get: function () { return tools_1.ResearchTool; } });
|
|
20
|
+
Object.defineProperty(exports, "WebSearchTool", { enumerable: true, get: function () { return tools_1.WebSearchTool; } });
|
|
21
|
+
Object.defineProperty(exports, "TavilyProvider", { enumerable: true, get: function () { return tools_1.TavilyProvider; } });
|
|
22
|
+
Object.defineProperty(exports, "FetchPageTool", { enumerable: true, get: function () { return tools_1.FetchPageTool; } });
|
|
23
|
+
Object.defineProperty(exports, "WebResearchTool", { enumerable: true, get: function () { return tools_1.WebResearchTool; } });
|
|
24
|
+
Object.defineProperty(exports, "PlanTool", { enumerable: true, get: function () { return tools_1.PlanTool; } });
|
|
25
|
+
// Sources
|
|
26
|
+
var sources_1 = require("./sources");
|
|
27
|
+
Object.defineProperty(exports, "WebSource", { enumerable: true, get: function () { return sources_1.WebSource; } });
|
|
28
|
+
Object.defineProperty(exports, "CorpusSource", { enumerable: true, get: function () { return sources_1.CorpusSource; } });
|
|
29
|
+
// Resources
|
|
30
|
+
var resources_1 = require("./resources");
|
|
31
|
+
Object.defineProperty(exports, "loadResources", { enumerable: true, get: function () { return resources_1.loadResources; } });
|
|
32
|
+
Object.defineProperty(exports, "chunkResources", { enumerable: true, get: function () { return resources_1.chunkResources; } });
|
|
33
|
+
// Reranker
|
|
34
|
+
var reranker_1 = require("./reranker");
|
|
35
|
+
Object.defineProperty(exports, "createReranker", { enumerable: true, get: function () { return reranker_1.createReranker; } });
|
|
36
|
+
//# sourceMappingURL=index.js.map
|
|
package/dist/index.js.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"index.js","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AAAA;;;;;;;;;;GAUG;;;AAEH,QAAQ;AACR,iCAIiB;AAHf,oGAAA,WAAW,OAAA;AAAE,mGAAA,UAAU,OAAA;AACvB,qGAAA,YAAY,OAAA;AAAE,sGAAA,aAAa,OAAA;AAAE,uGAAA,cAAc,OAAA;AAAE,sGAAA,aAAa,OAAA;AAC1D,wGAAA,eAAe,OAAA;AAAE,iGAAA,QAAQ,OAAA;AAS3B,UAAU;AACV,qCAAoD;AAA3C,oGAAA,SAAS,OAAA;AAAE,uGAAA,YAAY,OAAA;AAGhC,YAAY;AACZ,yCAA4D;AAAnD,0GAAA,aAAa,OAAA;AAAE,2GAAA,cAAc,OAAA;AAGtC,WAAW;AACX,uCAA4C;AAAnC,0GAAA,cAAc,OAAA"}
|
|
package/dist/reranker.d.ts
ADDED
|
@@ -0,0 +1,22 @@
|
|
|
1
|
+
import type { Reranker } from "./tools/types";
|
|
2
|
+
/**
|
|
3
|
+
* Create a {@link Reranker} backed by a dedicated reranking model context
|
|
4
|
+
*
|
|
5
|
+
* Loads a separate model (typically a cross-encoder) into its own KV cache
|
|
6
|
+
* and exposes `score`, `tokenizeChunks`, and `dispose` methods. The returned
|
|
7
|
+
* `score` method yields {@link ScoredResult} batches as an async iterable,
|
|
8
|
+
* mapping raw indices back to the original {@link Chunk} metadata.
|
|
9
|
+
*
|
|
10
|
+
* @param modelPath - Absolute path to the reranking model file (GGUF)
|
|
11
|
+
* @param opts - Optional context sizing overrides
|
|
12
|
+
* @param opts.nSeqMax - Maximum parallel scoring sequences (default 8)
|
|
13
|
+
* @param opts.nCtx - Context window size for the reranker model (default 4096)
|
|
14
|
+
* @returns A ready-to-use reranker instance; call `dispose()` when finished
|
|
15
|
+
*
|
|
16
|
+
* @category Rig
|
|
17
|
+
*/
|
|
18
|
+
export declare function createReranker(modelPath: string, opts?: {
|
|
19
|
+
nSeqMax?: number;
|
|
20
|
+
nCtx?: number;
|
|
21
|
+
}): Promise<Reranker>;
|
|
22
|
+
//# sourceMappingURL=reranker.d.ts.map
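The "mapping raw indices back to the original chunk metadata" step described in the doc comment can be sketched in isolation. The output field names (`file`, `heading`, `score`, `startLine`, `endLine`) follow the compiled `reranker.js` in this diff; the `RerankChunk`, `RawScore`, and `MappedScore` type names and the out-of-range check are assumptions added for illustration, and the real method streams batches through an async iterable rather than returning an array.

```typescript
// Hypothetical types; only the fields needed for the mapping are modeled.
interface RerankChunk {
  resource: string;
  heading: string;
  startLine: number;
  endLine: number;
}

interface RawScore {
  index: number; // position of the chunk in the array that was scored
  score: number;
}

interface MappedScore {
  file: string;
  heading: string;
  score: number;
  startLine: number;
  endLine: number;
}

// Attach chunk metadata to each raw (index, score) pair.
function mapScores(chunks: RerankChunk[], raw: RawScore[]): MappedScore[] {
  return raw.map((r) => {
    const chunk = chunks[r.index];
    if (!chunk) throw new RangeError(`no chunk at index ${r.index}`);
    return {
      file: chunk.resource,
      heading: chunk.heading,
      score: r.score,
      startLine: chunk.startLine,
      endLine: chunk.endLine,
    };
  });
}

const mapped = mapScores(
  [
    { resource: "pool.md", heading: "Allocation", startLine: 1, endLine: 20 },
    { resource: "server.md", heading: "Teardown", startLine: 5, endLine: 42 },
  ],
  [{ index: 1, score: 0.92 }],
);
```

Because the scorer only sees token arrays, the index is the sole link back to file and line information, so the caller has to pass the same `chunks` array, in the same order, that was scored.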
|
|
package/dist/reranker.d.ts.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"reranker.d.ts","sourceRoot":"","sources":["../src/reranker.ts"],"names":[],"mappings":"AAIA,OAAO,KAAK,EAAE,QAAQ,EAAgB,MAAM,eAAe,CAAC;AAE5D;;;;;;;;;;;;;;;GAeG;AACH,wBAAsB,cAAc,CAClC,SAAS,EAAE,MAAM,EACjB,IAAI,CAAC,EAAE;IAAE,OAAO,CAAC,EAAE,MAAM,CAAC;IAAC,IAAI,CAAC,EAAE,MAAM,CAAA;CAAE,GACzC,OAAO,CAAC,QAAQ,CAAC,CA4DnB"}
|
package/dist/reranker.js
ADDED
|
@@ -0,0 +1,76 @@
|
|
|
1
|
+
"use strict";
|
|
2
|
+
Object.defineProperty(exports, "__esModule", { value: true });
|
|
3
|
+
exports.createReranker = createReranker;
|
|
4
|
+
const lloyal_node_1 = require("@lloyal-labs/lloyal.node");
|
|
5
|
+
const sdk_1 = require("@lloyal-labs/sdk");
|
|
6
|
+
/**
|
|
7
|
+
* Create a {@link Reranker} backed by a dedicated reranking model context
|
|
8
|
+
*
|
|
9
|
+
* Loads a separate model (typically a cross-encoder) into its own KV cache
|
|
10
|
+
* and exposes `score`, `tokenizeChunks`, and `dispose` methods. The returned
|
|
11
|
+
* `score` method yields {@link ScoredResult} batches as an async iterable,
|
|
12
|
+
* mapping raw indices back to the original {@link Chunk} metadata.
|
|
13
|
+
*
|
|
14
|
+
* @param modelPath - Absolute path to the reranking model file (GGUF)
|
|
15
|
+
* @param opts - Optional context sizing overrides
|
|
16
|
+
* @param opts.nSeqMax - Maximum parallel scoring sequences (default 8)
|
|
17
|
+
* @param opts.nCtx - Context window size for the reranker model (default 4096)
|
|
18
|
+
* @returns A ready-to-use reranker instance; call `dispose()` when finished
|
|
19
|
+
*
|
|
20
|
+
* @category Rig
|
|
21
|
+
*/
|
|
22
|
+
async function createReranker(modelPath, opts) {
|
|
23
|
+
const nSeqMax = opts?.nSeqMax ?? 8;
|
|
24
|
+
const nCtx = opts?.nCtx ?? 4096;
|
|
25
|
+
const ctx = await (0, lloyal_node_1.createContext)({
|
|
26
|
+
modelPath,
|
|
27
|
+
nCtx,
|
|
28
|
+
nSeqMax,
|
|
29
|
+
typeK: 'q4_0',
|
|
30
|
+
typeV: 'q4_0',
|
|
31
|
+
});
|
|
32
|
+
const rerank = await sdk_1.Rerank.create(ctx, { nSeqMax, nCtx });
|
|
33
|
+
return {
|
|
34
|
+
score(query, chunks) {
|
|
35
|
+
const inner = rerank.score(query, chunks.map((c) => c.tokens), 10);
|
|
36
|
+
return {
|
|
37
|
+
[Symbol.asyncIterator]() {
|
|
38
|
+
const it = inner[Symbol.asyncIterator]();
|
|
39
|
+
return {
|
|
40
|
+
async next() {
|
|
41
|
+
const { value, done } = await it.next();
|
|
42
|
+
if (done)
|
|
43
|
+
return {
|
|
44
|
+
value: undefined,
|
|
45
|
+
done: true,
|
|
46
|
+
};
|
|
47
|
+
return {
|
|
48
|
+
value: {
|
|
49
|
+
filled: value.filled,
|
|
50
|
+
total: value.total,
|
|
51
|
+
results: value.results.map((r) => ({
|
|
52
|
+
file: chunks[r.index].resource,
|
|
53
|
+
heading: chunks[r.index].heading,
|
|
54
|
+
score: r.score,
|
|
55
|
+
startLine: chunks[r.index].startLine,
|
|
56
|
+
endLine: chunks[r.index].endLine,
|
|
57
|
+
})),
|
|
58
|
+
},
|
|
59
|
+
done: false,
|
|
60
|
+
};
|
|
61
|
+
},
|
|
62
|
+
};
|
|
63
|
+
},
|
|
64
|
+
};
|
|
65
|
+
},
|
|
66
|
+
async tokenizeChunks(chunks) {
|
|
67
|
+
for (const chunk of chunks) {
|
|
68
|
+
chunk.tokens = await rerank.tokenize(chunk.text);
|
|
69
|
+
}
|
|
70
|
+
},
|
|
71
|
+
dispose() {
|
|
72
|
+
rerank.dispose();
|
|
73
|
+
},
|
|
74
|
+
};
|
|
75
|
+
}
|
|
76
|
+
//# sourceMappingURL=reranker.js.map
|
|
package/dist/reranker.js.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"reranker.js","sourceRoot":"","sources":["../src/reranker.ts"],"names":[],"mappings":";;AAsBA,wCA+DC;AArFD,0DAAyD;AACzD,0CAA0C;AAK1C;;;;;;;;;;;;;;;GAeG;AACI,KAAK,UAAU,cAAc,CAClC,SAAiB,EACjB,IAA0C;IAE1C,MAAM,OAAO,GAAG,IAAI,EAAE,OAAO,IAAI,CAAC,CAAC;IACnC,MAAM,IAAI,GAAG,IAAI,EAAE,IAAI,IAAI,IAAI,CAAC;IAChC,MAAM,GAAG,GAAG,MAAM,IAAA,2BAAa,EAAC;QAC9B,SAAS;QACT,IAAI;QACJ,OAAO;QACP,KAAK,EAAE,MAAM;QACb,KAAK,EAAE,MAAM;KACd,CAAC,CAAC;IACH,MAAM,MAAM,GAAG,MAAM,YAAM,CAAC,MAAM,CAAC,GAAgC,EAAE,EAAE,OAAO,EAAE,IAAI,EAAE,CAAC,CAAC;IAExF,OAAO;QACL,KAAK,CAAC,KAAa,EAAE,MAAe;YAClC,MAAM,KAAK,GAAG,MAAM,CAAC,KAAK,CACxB,KAAK,EACL,MAAM,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,MAAM,CAAC,EAC3B,EAAE,CACH,CAAC;YACF,OAAO;gBACL,CAAC,MAAM,CAAC,aAAa,CAAC;oBACpB,MAAM,EAAE,GAAG,KAAK,CAAC,MAAM,CAAC,aAAa,CAAC,EAAE,CAAC;oBACzC,OAAO;wBACL,KAAK,CAAC,IAAI;4BACR,MAAM,EAAE,KAAK,EAAE,IAAI,EAAE,GAAG,MAAM,EAAE,CAAC,IAAI,EAAE,CAAC;4BACxC,IAAI,IAAI;gCACN,OAAO;oCACL,KAAK,EAAE,SAAoC;oCAC3C,IAAI,EAAE,IAAI;iCACX,CAAC;4BACJ,OAAO;gCACL,KAAK,EAAE;oCACL,MAAM,EAAE,KAAK,CAAC,MAAM;oCACpB,KAAK,EAAE,KAAK,CAAC,KAAK;oCAClB,OAAO,EAAE,KAAK,CAAC,OAAO,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC;wCACjC,IAAI,EAAE,MAAM,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,QAAQ;wCAC9B,OAAO,EAAE,MAAM,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,OAAO;wCAChC,KAAK,EAAE,CAAC,CAAC,KAAK;wCACd,SAAS,EAAE,MAAM,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,SAAS;wCACpC,OAAO,EAAE,MAAM,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,OAAO;qCACjC,CAAC,CAAC;iCACJ;gCACD,IAAI,EAAE,KAAK;6BACZ,CAAC;wBACJ,CAAC;qBACF,CAAC;gBACJ,CAAC;aACF,CAAC;QACJ,CAAC;QAED,KAAK,CAAC,cAAc,CAAC,MAAe;YAClC,KAAK,MAAM,KAAK,IAAI,MAAM,EAAE,CAAC;gBAC3B,KAAK,CAAC,MAAM,GAAG,MAAM,MAAM,CAAC,QAAQ,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;YACnD,CAAC;QACH,CAAC;QAED,OAAO;YACL,MAAM,CAAC,OAAO,EAAE,CAAC;QACnB,CAAC;KACF,CAAC;AACJ,CAAC"}
|
|
package/dist/resources/files.d.ts
ADDED
|
@@ -0,0 +1,28 @@
|
|
|
1
|
+
import type { Resource, Chunk } from './types';
|
|
2
|
+
/**
|
|
3
|
+
* Load documents from a directory (or single file) into {@link Resource} objects
|
|
4
|
+
*
|
|
5
|
+
* If `dir` is a file path, returns a single-element array. If it is a
|
|
6
|
+
* directory, reads all `.md` files within it. Exits the process with an
|
|
7
|
+
* error message if the path does not exist or contains no Markdown files.
|
|
8
|
+
*
|
|
9
|
+
* @param dir - Absolute path to a directory of `.md` files or a single file
|
|
10
|
+
* @returns Array of loaded resources with file name and content
|
|
11
|
+
*
|
|
12
|
+
* @category Rig
|
|
13
|
+
*/
|
|
14
|
+
export declare function loadResources(dir: string): Resource[];
|
|
15
|
+
/**
|
|
16
|
+
* Split loaded resources into {@link Chunk} instances for reranking
|
|
17
|
+
*
|
|
18
|
+
* Uses native Markdown heading detection (via `parseMarkdown`) to produce
|
|
19
|
+
* section-level chunks. Falls back to blank-line paragraph splitting for
|
|
20
|
+
* resources with no headings (or fewer than 10 lines of content).
|
|
21
|
+
*
|
|
22
|
+
* @param resources - Resources to chunk (from {@link loadResources})
|
|
23
|
+
* @returns Flat array of chunks across all resources, ready for {@link Reranker.tokenizeChunks}
|
|
24
|
+
*
|
|
25
|
+
* @category Rig
|
|
26
|
+
*/
|
|
27
|
+
export declare function chunkResources(resources: Resource[]): Chunk[];
|
|
28
|
+
//# sourceMappingURL=files.d.ts.map
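The blank-line fallback mentioned in the `chunkResources` doc comment can be illustrated with a typed port of the `chunkByParagraph` helper that appears in the compiled `files.js` in this diff. The `ParagraphResource` and `ParagraphChunk` type names are hypothetical, and typing `tokens` as `number[]` is an assumption; the logic itself follows the compiled helper.

```typescript
// Hypothetical types for the sketch; `tokens` is assumed to be number[].
interface ParagraphResource {
  name: string;
  content: string;
}

interface ParagraphChunk {
  resource: string;
  heading: string;
  text: string;
  tokens: number[];
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
}

// Split text into chunks at blank lines, tracking 1-based line ranges.
function chunkByParagraph(res: ParagraphResource): ParagraphChunk[] {
  const lines = res.content.split("\n");
  const chunks: ParagraphChunk[] = [];
  let start = 0;
  for (let i = 0; i <= lines.length; i++) {
    // Treat end-of-input like a blank line so the last paragraph is flushed.
    const blank = i === lines.length || !lines[i].trim();
    if (blank && i > start) {
      const text = lines.slice(start, i).join("\n").trim();
      if (text) {
        chunks.push({
          resource: res.name,
          // First 60 characters stand in for a heading; ellipsis if truncated.
          heading:
            text.slice(0, 60).replace(/\n/g, " ") +
            (text.length > 60 ? "\u2026" : ""),
          text,
          tokens: [],
          startLine: start + 1,
          endLine: i,
        });
      }
    }
    if (blank) start = i + 1;
  }
  return chunks;
}

const example = chunkByParagraph({ name: "notes.md", content: "a\nb\n\nc" });
```

Here `example` holds two chunks: lines 1 to 2 with the heading `a b`, and line 4 on its own. The `tokens` arrays start empty and are expected to be filled later by the reranker's `tokenizeChunks`.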
|
|
package/dist/resources/files.d.ts.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"files.d.ts","sourceRoot":"","sources":["../../src/resources/files.ts"],"names":[],"mappings":"AAGA,OAAO,KAAK,EAAE,QAAQ,EAAE,KAAK,EAAE,MAAM,SAAS,CAAC;AAK/C;;;;;;;;;;;GAWG;AACH,wBAAgB,aAAa,CAAC,GAAG,EAAE,MAAM,GAAG,QAAQ,EAAE,CAkBrD;AA0BD;;;;;;;;;;;GAWG;AACH,wBAAgB,cAAc,CAAC,SAAS,EAAE,QAAQ,EAAE,GAAG,KAAK,EAAE,CAoB7D"}
|
|
package/dist/resources/files.js
ADDED
|
@@ -0,0 +1,98 @@
|
|
|
1
|
+
"use strict";
|
|
2
|
+
Object.defineProperty(exports, "__esModule", { value: true });
|
|
3
|
+
exports.loadResources = loadResources;
|
|
4
|
+
exports.chunkResources = chunkResources;
|
|
5
|
+
const fs = require("node:fs");
|
|
6
|
+
const path = require("node:path");
|
|
7
|
+
const lloyal_node_1 = require("@lloyal-labs/lloyal.node");
|
|
8
|
+
const { parseMarkdown } = (0, lloyal_node_1.loadBinary)();
|
|
9
|
+
/**
|
|
10
|
+
* Load documents from a directory (or single file) into {@link Resource} objects
|
|
11
|
+
*
|
|
12
|
+
* If `dir` is a file path, returns a single-element array. If it is a
|
|
13
|
+
* directory, reads all `.md` files within it. Exits the process with an
|
|
14
|
+
* error message if the path does not exist or contains no Markdown files.
|
|
15
|
+
*
|
|
16
|
+
* @param dir - Absolute path to a directory of `.md` files or a single file
|
|
17
|
+
* @returns Array of loaded resources with file name and content
|
|
18
|
+
*
|
|
19
|
+
* @category Rig
|
|
20
|
+
*/
|
|
21
|
+
function loadResources(dir) {
|
|
22
|
+
if (!fs.existsSync(dir)) {
|
|
23
|
+
process.stdout.write(`Error: corpus not found: ${dir}\n`);
|
|
24
|
+
process.exit(1);
|
|
25
|
+
}
|
|
26
|
+
const stat = fs.statSync(dir);
|
|
27
|
+
if (stat.isFile()) {
|
|
28
|
+
return [{ name: path.basename(dir), content: fs.readFileSync(dir, 'utf8') }];
|
|
29
|
+
}
|
|
30
|
+
const files = fs.readdirSync(dir).filter((f) => f.endsWith('.md'));
|
|
31
|
+
if (!files.length) {
|
|
32
|
+
process.stdout.write(`Error: no .md files in: ${dir}\n`);
|
|
33
|
+
process.exit(1);
|
|
34
|
+
}
|
|
35
|
+
return files.map((f) => ({
|
|
36
|
+
name: f,
|
|
37
|
+
content: fs.readFileSync(path.join(dir, f), 'utf8'),
|
|
38
|
+
}));
|
|
39
|
+
}
|
|
40
|
+
/** Split plain text into chunks on blank-line paragraph boundaries */
|
|
41
|
+
function chunkByParagraph(res) {
|
|
42
|
+
const lines = res.content.split('\n');
|
|
43
|
+
const chunks = [];
|
|
44
|
+
let start = 0;
|
|
45
|
+
for (let i = 0; i <= lines.length; i++) {
|
|
46
|
+
const blank = i === lines.length || !lines[i].trim();
|
|
47
|
+
if (blank && i > start) {
|
|
48
|
+
const text = lines.slice(start, i).join('\n').trim();
|
|
49
|
+
if (text) {
|
|
50
|
+
chunks.push({
|
|
51
|
+
resource: res.name,
|
|
52
|
+
heading: text.slice(0, 60).replace(/\n/g, ' ') + (text.length > 60 ? '\u2026' : ''),
|
|
53
|
+
text, tokens: [],
|
|
54
|
+
startLine: start + 1,
|
|
55
|
+
endLine: i,
|
|
56
|
+
});
|
|
57
|
+
}
|
|
58
|
+
}
|
|
59
|
+
if (blank)
|
|
60
|
+
start = i + 1;
|
|
61
|
+
}
|
|
62
|
+
return chunks;
|
|
63
|
+
}
|
|
64
|
+
/**
|
|
65
|
+
* Split loaded resources into {@link Chunk} instances for reranking
|
|
66
|
+
*
|
|
67
|
+
* Uses native Markdown heading detection (via `parseMarkdown`) to produce
|
|
68
|
+
* section-level chunks. Falls back to blank-line paragraph splitting for
|
|
69
|
+
* resources with no headings and more than 10 lines of content.
|
|
70
|
+
*
|
|
71
|
+
* @param resources - Resources to chunk (from {@link loadResources})
|
|
72
|
+
* @returns Flat array of chunks across all resources, ready for {@link Reranker.tokenizeChunks}
|
|
73
|
+
*
|
|
74
|
+
* @category Rig
|
|
75
|
+
*/
|
|
76
|
+
function chunkResources(resources) {
|
|
77
|
+
const out = [];
|
|
78
|
+
for (const res of resources) {
|
|
79
|
+
const sections = parseMarkdown(res.content);
|
|
80
|
+
// Single section covering the whole file = no headings found -> paragraph split
|
|
81
|
+
if (sections.length <= 1 && res.content.split('\n').length > 10) {
|
|
82
|
+
out.push(...chunkByParagraph(res));
|
|
83
|
+
continue;
|
|
84
|
+
}
|
|
85
|
+
const lines = res.content.split('\n');
|
|
86
|
+
for (const sec of sections) {
|
|
87
|
+
const text = lines.slice(sec.startLine - 1, sec.endLine).join('\n').trim();
|
|
88
|
+
if (!text)
|
|
89
|
+
continue;
|
|
90
|
+
out.push({
|
|
91
|
+
resource: res.name, heading: sec.heading || res.name, text, tokens: [],
|
|
92
|
+
startLine: sec.startLine, endLine: sec.endLine,
|
|
93
|
+
});
|
|
94
|
+
}
|
|
95
|
+
}
|
|
96
|
+
return out;
|
|
97
|
+
}
|
|
98
|
+
//# sourceMappingURL=files.js.map
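Reviewer note: the paragraph fallback in `files.js` is not exported, so its line accounting is easy to misread. The sketch below is a standalone re-statement of the same blank-line splitting, not the package's code. `splitParagraphs`, `ParagraphChunk`, and the sample input are illustrative names assumed here, and the `tokens` field is omitted for brevity.

```typescript
// Standalone sketch of blank-line paragraph chunking (illustrative names).
interface ParagraphChunk {
  resource: string;
  heading: string;
  text: string;
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
}

function splitParagraphs(name: string, content: string): ParagraphChunk[] {
  const lines = content.split("\n");
  const chunks: ParagraphChunk[] = [];
  let start = 0; // 0-based index of the first line of the current paragraph
  for (let i = 0; i <= lines.length; i++) {
    // The end of input is treated as a blank line so the last paragraph is flushed.
    const blank = i === lines.length || (lines[i] ?? "").trim() === "";
    if (blank && i > start) {
      const text = lines.slice(start, i).join("\n").trim();
      if (text) {
        // The label is a 60-character preview with newlines flattened to spaces.
        const preview = text.slice(0, 60).replace(/\n/g, " ");
        chunks.push({
          resource: name,
          heading: preview + (text.length > 60 ? "\u2026" : ""),
          text,
          startLine: start + 1,
          endLine: i,
        });
      }
    }
    if (blank) start = i + 1;
  }
  return chunks;
}

const sample = "alpha\nbeta\n\ngamma\n\n\ndelta";
const paragraphs = splitParagraphs("notes.md", sample);
console.log(paragraphs.map((c) => `${c.startLine}-${c.endLine}: ${c.heading}`));
// prints [ '1-2: alpha beta', '4-4: gamma', '7-7: delta' ]
```

Consecutive blank lines produce no empty chunks, because a chunk is only emitted when at least one non-blank line sits between `start` and the current blank line.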
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"files.js","sourceRoot":"","sources":["../../src/resources/files.ts"],"names":[],"mappings":";;AAoBA,sCAkBC;AAsCD,wCAoBC;AAhGD,8BAA8B;AAC9B,kCAAkC;AAClC,0DAAsD;AAItD,MAAM,EAAE,aAAa,EAAE,GAAG,IAAA,wBAAU,GAA2D,CAAC;AAEhG;;;;;;;;;;;GAWG;AACH,SAAgB,aAAa,CAAC,GAAW;IACvC,IAAI,CAAC,EAAE,CAAC,UAAU,CAAC,GAAG,CAAC,EAAE,CAAC;QACxB,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,4BAA4B,GAAG,IAAI,CAAC,CAAC;QAC1D,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;IAClB,CAAC;IACD,MAAM,IAAI,GAAG,EAAE,CAAC,QAAQ,CAAC,GAAG,CAAC,CAAC;IAC9B,IAAI,IAAI,CAAC,MAAM,EAAE,EAAE,CAAC;QAClB,OAAO,CAAC,EAAE,IAAI,EAAE,IAAI,CAAC,QAAQ,CAAC,GAAG,CAAC,EAAE,OAAO,EAAE,EAAE,CAAC,YAAY,CAAC,GAAG,EAAE,MAAM,CAAC,EAAE,CAAC,CAAC;IAC/E,CAAC;IACD,MAAM,KAAK,GAAG,EAAE,CAAC,WAAW,CAAC,GAAG,CAAC,CAAC,MAAM,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,QAAQ,CAAC,KAAK,CAAC,CAAC,CAAC;IACnE,IAAI,CAAC,KAAK,CAAC,MAAM,EAAE,CAAC;QAClB,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,2BAA2B,GAAG,IAAI,CAAC,CAAC;QACzD,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;IAClB,CAAC;IACD,OAAO,KAAK,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC;QACvB,IAAI,EAAE,CAAC;QACP,OAAO,EAAE,EAAE,CAAC,YAAY,CAAC,IAAI,CAAC,IAAI,CAAC,GAAG,EAAE,CAAC,CAAC,EAAE,MAAM,CAAC;KACpD,CAAC,CAAC,CAAC;AACN,CAAC;AAED,sEAAsE;AACtE,SAAS,gBAAgB,CAAC,GAAa;IACrC,MAAM,KAAK,GAAG,GAAG,CAAC,OAAO,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;IACtC,MAAM,MAAM,GAAY,EAAE,CAAC;IAC3B,IAAI,KAAK,GAAG,CAAC,CAAC;IACd,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,IAAI,KAAK,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE,CAAC;QACvC,MAAM,KAAK,GAAG,CAAC,KAAK,KAAK,CAAC,MAAM,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,IAAI,EAAE,CAAC;QACrD,IAAI,KAAK,IAAI,CAAC,GAAG,KAAK,EAAE,CAAC;YACvB,MAAM,IAAI,GAAG,KAAK,CAAC,KAAK,CAAC,KAAK,EAAE,CAAC,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC,IAAI,EAAE,CAAC;YACrD,IAAI,IAAI,EAAE,CAAC;gBACT,MAAM,CAAC,IAAI,CAAC;oBACV,QAAQ,EAAE,GAAG,CAAC,IAAI;oBAClB,OAAO,EAAE,IAAI,CAAC,KAAK,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,OAAO,CAAC,KAAK,EAAE,GAAG,CAAC,GAAG,CAAC,IAAI,CAAC,MAAM,GAAG,EAAE,CAAC,CAAC,CAAC,QAAQ,CAAC,CAAC,CAAC,EAAE,CAAC;oBACnF,IAAI,EAAE,MAAM,EAAE,EAAE;oBAChB,SAAS,EAAE,KAAK,GAAG,CAAC;oBA
CpB,OAAO,EAAE,CAAC;iBACX,CAAC,CAAC;YACL,CAAC;QACH,CAAC;QACD,IAAI,KAAK;YAAE,KAAK,GAAG,CAAC,GAAG,CAAC,CAAC;IAC3B,CAAC;IACD,OAAO,MAAM,CAAC;AAChB,CAAC;AAED;;;;;;;;;;;GAWG;AACH,SAAgB,cAAc,CAAC,SAAqB;IAClD,MAAM,GAAG,GAAY,EAAE,CAAC;IACxB,KAAK,MAAM,GAAG,IAAI,SAAS,EAAE,CAAC;QAC5B,MAAM,QAAQ,GAAG,aAAa,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;QAC5C,gFAAgF;QAChF,IAAI,QAAQ,CAAC,MAAM,IAAI,CAAC,IAAI,GAAG,CAAC,OAAO,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC,MAAM,GAAG,EAAE,EAAE,CAAC;YAChE,GAAG,CAAC,IAAI,CAAC,GAAG,gBAAgB,CAAC,GAAG,CAAC,CAAC,CAAC;YACnC,SAAS;QACX,CAAC;QACD,MAAM,KAAK,GAAG,GAAG,CAAC,OAAO,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;QACtC,KAAK,MAAM,GAAG,IAAI,QAAQ,EAAE,CAAC;YAC3B,MAAM,IAAI,GAAG,KAAK,CAAC,KAAK,CAAC,GAAG,CAAC,SAAS,GAAG,CAAC,EAAE,GAAG,CAAC,OAAO,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC,IAAI,EAAE,CAAC;YAC3E,IAAI,CAAC,IAAI;gBAAE,SAAS;YACpB,GAAG,CAAC,IAAI,CAAC;gBACP,QAAQ,EAAE,GAAG,CAAC,IAAI,EAAE,OAAO,EAAE,GAAG,CAAC,OAAO,IAAI,GAAG,CAAC,IAAI,EAAE,IAAI,EAAE,MAAM,EAAE,EAAE;gBACtE,SAAS,EAAE,GAAG,CAAC,SAAS,EAAE,OAAO,EAAE,GAAG,CAAC,OAAO;aAC/C,CAAC,CAAC;QACL,CAAC;IACH,CAAC;IACD,OAAO,GAAG,CAAC;AACb,CAAC"}
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../../src/resources/index.ts"],"names":[],"mappings":"AAAA;;;;;GAKG;AACH,OAAO,EAAE,aAAa,EAAE,cAAc,EAAE,MAAM,SAAS,CAAC;AACxD,YAAY,EAAE,QAAQ,EAAE,KAAK,EAAE,MAAM,SAAS,CAAC"}
|
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
"use strict";
|
|
2
|
+
Object.defineProperty(exports, "__esModule", { value: true });
|
|
3
|
+
exports.chunkResources = exports.loadResources = void 0;
|
|
4
|
+
/**
|
|
5
|
+
* Resource loading and chunking utilities
|
|
6
|
+
*
|
|
7
|
+
* @packageDocumentation
|
|
8
|
+
* @category Rig
|
|
9
|
+
*/
|
|
10
|
+
var files_1 = require("./files");
|
|
11
|
+
Object.defineProperty(exports, "loadResources", { enumerable: true, get: function () { return files_1.loadResources; } });
|
|
12
|
+
Object.defineProperty(exports, "chunkResources", { enumerable: true, get: function () { return files_1.chunkResources; } });
|
|
13
|
+
//# sourceMappingURL=index.js.map
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"index.js","sourceRoot":"","sources":["../../src/resources/index.ts"],"names":[],"mappings":";;;AAAA;;;;;GAKG;AACH,iCAAwD;AAA/C,sGAAA,aAAa,OAAA;AAAE,uGAAA,cAAc,OAAA"}
|
|
@@ -0,0 +1,39 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* A loaded document available for search, read, and grep operations
|
|
3
|
+
*
|
|
4
|
+
* Represents a single file (typically Markdown) loaded into memory.
|
|
5
|
+
* Resources are chunked into {@link Chunk} instances for reranking.
|
|
6
|
+
*
|
|
7
|
+
* @category Rig
|
|
8
|
+
*/
|
|
9
|
+
export interface Resource {
|
|
10
|
+
/** File name (basename, not full path) used as the resource identifier */
|
|
11
|
+
name: string;
|
|
12
|
+
/** Full text content of the file */
|
|
13
|
+
content: string;
|
|
14
|
+
}
|
|
15
|
+
/**
|
|
16
|
+
* A scored passage within a {@link Resource}, used for reranking and retrieval
|
|
17
|
+
*
|
|
18
|
+
* Chunks are produced by {@link chunkResources} (section-based for Markdown)
|
|
19
|
+
* or {@link chunkFetchedPages} (paragraph-based for web content). The
|
|
20
|
+
* {@link tokens} array is populated lazily by {@link Reranker.tokenizeChunks}
|
|
21
|
+
* before scoring.
|
|
22
|
+
*
|
|
23
|
+
* @category Rig
|
|
24
|
+
*/
|
|
25
|
+
export interface Chunk {
|
|
26
|
+
/** Resource identifier (file name or URL) this chunk belongs to */
|
|
27
|
+
resource: string;
|
|
28
|
+
/** Section heading or auto-generated preview used as a label */
|
|
29
|
+
heading: string;
|
|
30
|
+
/** Raw text content of the chunk */
|
|
31
|
+
text: string;
|
|
32
|
+
/** Pre-tokenized representation for the reranker — empty until {@link Reranker.tokenizeChunks} runs */
|
|
33
|
+
tokens: number[];
|
|
34
|
+
/** First line number (1-based) in the source resource */
|
|
35
|
+
startLine: number;
|
|
36
|
+
/** Last line number (1-based) in the source resource */
|
|
37
|
+
endLine: number;
|
|
38
|
+
}
|
|
39
|
+
//# sourceMappingURL=types.d.ts.map
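Reviewer note: `startLine` and `endLine` are documented as 1-based, and `chunkResources` slices with `lines.slice(sec.startLine - 1, sec.endLine)`, which treats the range as inclusive at both ends. The sketch below shows that convention on a hand-built chunk. `sliceChunkLines` is a hypothetical helper, the two interfaces are copied from the declarations above, and the sample document and its heading label are made up for illustration.

```typescript
interface Resource {
  name: string;
  content: string;
}

interface Chunk {
  resource: string;
  heading: string;
  text: string;
  tokens: number[];
  startLine: number;
  endLine: number;
}

// Hypothetical helper: recover a chunk's raw lines from its source resource.
function sliceChunkLines(res: Resource, chunk: Chunk): string {
  // Both bounds are 1-based and inclusive, so only the start index is shifted.
  return res.content
    .split("\n")
    .slice(chunk.startLine - 1, chunk.endLine)
    .join("\n");
}

const doc: Resource = {
  name: "guide.md",
  content: "# Intro\nHello\n\n## Usage\nRun it\n",
};

// Hand-built chunk covering lines 4 and 5 of the sample document.
const usage: Chunk = {
  resource: doc.name,
  heading: "Usage",
  text: "## Usage\nRun it",
  tokens: [], // empty until a tokenizer fills it in
  startLine: 4,
  endLine: 5,
};

const recovered = sliceChunkLines(doc, usage);
console.log(recovered === usage.text); // prints true
```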
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"types.d.ts","sourceRoot":"","sources":["../../src/resources/types.ts"],"names":[],"mappings":"AAAA;;;;;;;GAOG;AACH,MAAM,WAAW,QAAQ;IACvB,0EAA0E;IAC1E,IAAI,EAAE,MAAM,CAAC;IACb,oCAAoC;IACpC,OAAO,EAAE,MAAM,CAAC;CACjB;AAED;;;;;;;;;GASG;AACH,MAAM,WAAW,KAAK;IACpB,mEAAmE;IACnE,QAAQ,EAAE,MAAM,CAAC;IACjB,gEAAgE;IAChE,OAAO,EAAE,MAAM,CAAC;IAChB,oCAAoC;IACpC,IAAI,EAAE,MAAM,CAAC;IACb,uGAAuG;IACvG,MAAM,EAAE,MAAM,EAAE,CAAC;IACjB,yDAAyD;IACzD,SAAS,EAAE,MAAM,CAAC;IAClB,wDAAwD;IACxD,OAAO,EAAE,MAAM,CAAC;CACjB"}
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"types.js","sourceRoot":"","sources":["../../src/resources/types.ts"],"names":[],"mappings":""}
|
|
@@ -0,0 +1,14 @@
|
|
|
1
|
+
You are a research assistant analyzing a knowledge base. Your tools:
|
|
2
|
+
- **grep**: regex pattern matching — use for precise, exhaustive retrieval
|
|
3
|
+
- **search**: semantic relevance ranking — use to discover related content
|
|
4
|
+
- **read_file**: read specific line ranges — use to verify and get context
|
|
5
|
+
- **research**: spawn parallel sub-agents that each run their own grep/search/read_file cycle — call with `{"questions": ["q1", "q2", ...]}`
|
|
6
|
+
- **report**: submit your final findings with evidence
|
|
7
|
+
|
|
8
|
+
Process — follow every step in order:
|
|
9
|
+
1. Grep with short, simple patterns first. Use single keywords or two-word phrases — never combine multiple clauses with `.*`. Run multiple greps if needed.
|
|
10
|
+
2. Use search to discover content that grep may miss (different phrasing, synonyms).
|
|
11
|
+
3. Read every matching line with read_file to verify in context. Do not rely on grep/search summaries alone.
|
|
12
|
+
4. Grep again with a different pattern targeting what you have NOT yet found. This is a completeness check, not confirmation of existing results.
|
|
13
|
+
5. Call research with sub-questions only if your findings point to specific topics you haven't fully investigated.
|
|
14
|
+
6. Report with line numbers and direct quotes as evidence. State what you found and what you checked.
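Reviewer note: the prompt above tells the model to call `research` with `{"questions": ["q1", "q2", ...]}`. A host that receives that call would likely want to validate the payload before spawning sub-agents. The sketch below is one way to do that and is not taken from the package. `ResearchArgs` and `parseResearchArgs` are hypothetical names, and rejecting empty lists and blank questions is an assumption, not documented behavior.

```typescript
// Hypothetical shape, inferred only from the `{"questions": [...]}` example in the prompt.
interface ResearchArgs {
  questions: string[];
}

// Returns the parsed arguments, or null when the payload does not match the shape.
function parseResearchArgs(raw: string): ResearchArgs | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof value !== "object" || value === null) return null;
  const questions = (value as { questions?: unknown }).questions;
  // Assumption: an empty list or a blank question is treated as invalid.
  if (!Array.isArray(questions) || questions.length === 0) return null;
  const allStrings = questions.every(
    (q): q is string => typeof q === "string" && q.trim() !== "",
  );
  if (!allStrings) return null;
  return { questions };
}

console.log(parseResearchArgs('{"questions": ["q1", "q2"]}'));
console.log(parseResearchArgs('{"questions": "q1"}'));
```

Returning `null` rather than throwing keeps the caller in control of how a malformed tool call is reported back to the model.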
|