raglab 0.2.1__tar.gz → 0.2.2__tar.gz

@@ -0,0 +1,78 @@
1
+ # raglab — agent instructions
2
+
3
+ `raglab` is the **agentic-search / RAG orchestration layer**: it turns a
4
+ retrieval substrate into a *Search Agent* (plan → formulate → retrieve →
5
+ evaluate → re-query → rerank → cite) and, later, RAG pipelines on top.
6
+
7
+ > Fresh start (v0.2.0+). This repo took over the `raglab` PyPI name; the old
8
+ > backend lives at `raglab_bak`. Development is just beginning — greenfield.
9
+
10
+ ## The architecture we're building toward
11
+
12
+ The reference design is **`ir_09 — A Composable Search Agent`**, in the **ir
13
+ repo** at `$PP/i/ir/misc/docs/ir_09 -- A Composable Search Agent ...md`, and the
14
+ cross-repo plan is **i2mint/ir epic #38**. Read both before non-trivial work.
15
+
16
+ raglab is the **orchestration layer on top of `ir`** (the retrieval substrate).
17
+ "Structure over concretion": a small set of **roles** (Protocols) wired by a
18
+ control loop, with concrete tools injected at the leaves.
19
+
20
+ ### raglab owns (the *agent*)
21
+
22
+ - **Value types** (frozen, plain data, ir_09 §3): `Query`, `SubTask(goal, sources)`,
23
+ `LowLevelQuery(source, spec)`, `Judgement(relevant, sufficient, refinement)`.
24
+ (`Result` = ir's `SearchHit`/`Disclosure`, reused as-is — do not redefine it.)
25
+ - **Role Protocols** (open-closed strategy seams, injected callables):
26
+ `Planner`, `Formulator`, `Retriever`, `Evaluator`, `Reranker`, `Citer`.
27
+ - **The control loop with the back-edge** (evaluator → reformulate). v1 = an
28
+ imperative `while` loop over a mutable `AgentState` (ir_09 §9), not a
29
+ cyclic-graph runner. This back-edge is what makes it an agent vs a DAG.
30
+ - **Budget governor** (`max_rounds` / `max_sources_per_task` / `max_results_per_task`);
31
+ termination as a **separately injected policy** (ir_09 §9), not folded into the
32
+ evaluator.
33
+ - **Source registry** = a live `Mapping[name, Retriever]` across *heterogeneous*
34
+ backends (ir corpora + web/SQL/graph); cross-source merge + global rerank at
35
+ the fan-in point.
36
+ - **Two orchestrators behind one interface** (ir_09 §7): `SingleContextAgent`
37
+ (default, cheap, one ReAct loop) and `MultiAgentAgent` (one subagent per
38
+ sub-task/source, ~15× cost, breadth-first only). Promotion swaps only the
39
+ orchestrator, keeping role contracts identical.
40
+
41
+ ### What raglab CONSUMES from ir (do not reimplement these)
42
+
43
+ - `ir.as_retriever(corpus)` → register an ir corpus as one `Retriever` key (#33).
44
+ - `ir.registry.retrievers()` → the ir-corpus slice of the source registry (#34).
45
+ - `ir.make_llm_formulator` / the `formulate=` seam → the Formulator role (#32).
46
+ - `ir.Selection` + its derived `sufficient` signal → Evaluator input (#35).
47
+ - `ir.disclose(..., store=...)` + `SearchHit.to_dict()` → pointer-passing / lazy
48
+ deref across the subagent boundary (#36).
49
+
50
+ ### What belongs ELSEWHERE
51
+
52
+ - **Generation / answer synthesis and the Citer/Verifier** (which needs a
53
+ generated claim) sit with the RAG/generation layer (`srag`), not the search
54
+ agent. The agent's deliverable is **pointers + extractions**, not an essay.
55
+
56
+ ## Dependency direction (load-bearing)
57
+
58
+ **`raglab` imports `ir` (and `oa` for LLM strategies); `ir` NEVER imports
59
+ `raglab`.** Keep LLM ops (`oa`) lazy/opt-in so `import raglab` stays offline.
60
+
61
+ ## Build order (ir_09 §8 / epic #38)
62
+
63
+ 1. `Retriever` Protocol + source registry with 2–3 real backends (ir corpora via
64
+ `ir.as_retriever`; one web/SQL).
65
+ 2. `SingleContextAgent` with a trivial planner + pass-through evaluator wrapping
66
+ ir's search/select/disclose — the thin slice, **no loop yet**.
67
+ 3. LLM `Formulator` + `Evaluator` and **turn on the back-edge**.
68
+ 4. Reranker at fan-in; Citer (in `srag`).
69
+ 5. Budget governor + run-log / observability.
70
+ 6. `MultiAgentAgent` — only if breadth justifies the cost.
71
+
72
+ ## House style (i2mint ecosystem)
73
+
74
+ Functional > OOP; SOLID when OOP; facades, SSOT, dependency injection;
75
+ progressive disclosure; keyword-only beyond the 3rd positional; `collections.abc`
76
+ + frozen `dataclass`es; `Protocol`s for the role seams; every module has a
77
+ top-level docstring. Never `pip install` local ecosystem packages (`ir`, `ef`,
78
+ `vd`, `dol`, `oa`, …) — they're local via `.pth`. wads CI auto-publishes on merge.
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: raglab
3
- Version: 0.2.1
3
+ Version: 0.2.2
4
4
  Summary: A medley of tools to make RAG-based applications.
5
5
  Project-URL: Homepage, https://github.com/thorwhalen/raglab
6
6
  Project-URL: Repository, https://github.com/thorwhalen/raglab
@@ -9,6 +9,7 @@ Author: thorwhalen
9
9
  License: mit
10
10
  License-File: LICENSE
11
11
  Requires-Python: >=3.10
12
+ Requires-Dist: ir>=0.1.12
12
13
  Provides-Extra: dev
13
14
  Requires-Dist: pytest-cov>=4.0; extra == 'dev'
14
15
  Requires-Dist: pytest>=7.0; extra == 'dev'
@@ -16,6 +17,8 @@ Requires-Dist: ruff>=0.1.0; extra == 'dev'
16
17
  Provides-Extra: docs
17
18
  Requires-Dist: sphinx-rtd-theme>=1.0; extra == 'docs'
18
19
  Requires-Dist: sphinx>=6.0; extra == 'docs'
20
+ Provides-Extra: llm
21
+ Requires-Dist: oa; extra == 'llm'
19
22
  Description-Content-Type: text/markdown
20
23
 
21
24
  # raglab
@@ -6,7 +6,7 @@ build-backend = "hatchling.build"
6
6
 
7
7
  [project]
8
8
  name = "raglab"
9
- version = "0.2.1"
9
+ version = "0.2.2"
10
10
  description = "A medley of tools to make RAG-based applications."
11
11
  readme = "README.md"
12
12
  requires-python = ">=3.10"
@@ -14,7 +14,12 @@ keywords = []
14
14
  authors = [
15
15
  { name = "thorwhalen" },
16
16
  ]
17
- dependencies = []
17
+ # raglab orchestrates retrieval via `ir` (the retrieval substrate). LLM
18
+ # strategies (Planner/Formulator/Evaluator) use `oa` lazily — the `llm` extra
19
+ # below — so `import raglab` stays offline by default.
20
+ dependencies = [
21
+ "ir>=0.1.12",
22
+ ]
18
23
 
19
24
  [project.license]
20
25
  text = "mit"
@@ -25,6 +30,9 @@ Repository = "https://github.com/thorwhalen/raglab"
25
30
  Documentation = "https://thorwhalen.github.io/raglab"
26
31
 
27
32
  [project.optional-dependencies]
33
+ llm = [
34
+ "oa",
35
+ ]
28
36
  dev = [
29
37
  "pytest>=7.0",
30
38
  "pytest-cov>=4.0",
@@ -50,13 +58,18 @@ exclude = [
50
58
  ]
51
59
 
52
60
  [tool.ruff.lint]
61
+ # Real lint rules from the start (pyflakes / pycodestyle / bugbear), not just
62
+ # docstring presence. E501 (line length) stays off — long descriptive docstrings.
53
63
  select = [
54
64
  "D100",
65
+ "F",
66
+ "E",
67
+ "W",
68
+ "B",
55
69
  ]
56
70
  ignore = [
57
71
  "D203",
58
72
  "E501",
59
- "B905",
60
73
  ]
61
74
 
62
75
  [tool.ruff.lint.pydocstyle]
@@ -0,0 +1,72 @@
1
+ """``raglab`` — the agentic-search / RAG orchestration layer on top of ``ir``.
2
+
3
+ `raglab` turns a retrieval substrate into a *Composable Search Agent* (the ir_09
4
+ architecture): a small set of injected **roles** — Planner, Formulator,
5
+ Retriever, Evaluator, Reranker, Citer — wired by a control loop whose back-edge
6
+ (evaluator → reformulate) is what makes it an agent rather than a DAG. Concrete
7
+ tools live at the leaves: an ``ir`` corpus becomes one ``Retriever`` via
8
+ ``ir.as_retriever``.
9
+
10
+ Quick start (the no-LLM thin slice — runs offline)::
11
+
12
+ import ir
13
+ import raglab
14
+
15
+ # register ir corpora as the agent's sources, then search across them:
16
+ sources = raglab.ir_sources("skills", "reports", mode="hybrid")
17
+ agent = raglab.make_search_agent(sources)
18
+ results = agent("how do I deploy the app") # ranked ir.SearchHits
19
+
20
+ Inject an LLM ``formulator`` (query rewrite/HyDE) and ``evaluator`` (sufficiency
21
+ + refinement) to turn on query understanding and the back-edge. Dependency
22
+ direction is one-way: ``raglab`` imports ``ir``; ``ir`` never imports ``raglab``.
23
+
24
+ > Fresh start (v0.2.0+). This repo took over the ``raglab`` PyPI name; the older
25
+ > backend now lives at ``raglab_bak``. Development is just beginning.
26
+ """
27
+
28
+ from .agent import (
29
+ Budget,
30
+ Citer,
31
+ Evaluator,
32
+ Formulator,
33
+ Judgement,
34
+ LowLevelQuery,
35
+ Planner,
36
+ Query,
37
+ Reranker,
38
+ Result,
39
+ Retriever,
40
+ SingleContextAgent,
41
+ SubTask,
42
+ identity_citer,
43
+ identity_formulator,
44
+ ir_sources,
45
+ make_search_agent,
46
+ passthrough_evaluator,
47
+ score_reranker,
48
+ single_subtask_planner,
49
+ )
50
+
51
+ __all__ = [
52
+ "Query",
53
+ "SubTask",
54
+ "LowLevelQuery",
55
+ "Judgement",
56
+ "Result",
57
+ "Retriever",
58
+ "Planner",
59
+ "Formulator",
60
+ "Evaluator",
61
+ "Reranker",
62
+ "Citer",
63
+ "Budget",
64
+ "SingleContextAgent",
65
+ "make_search_agent",
66
+ "ir_sources",
67
+ "single_subtask_planner",
68
+ "identity_formulator",
69
+ "passthrough_evaluator",
70
+ "score_reranker",
71
+ "identity_citer",
72
+ ]
@@ -0,0 +1,293 @@
1
+ """The Composable Search Agent — roles wired by a control loop (ir_09).
2
+
3
+ `raglab` is the **orchestration layer on top of `ir`** (the retrieval substrate).
4
+ This module is the v1 foundation: the immutable value types, the role *Protocols*
5
+ (open-closed strategy seams), and a `SingleContextAgent` whose fixed control loop
6
+ is fully parametrized by injected roles. Concrete tools live at the leaves — an
7
+ `ir` corpus becomes one `Retriever` via :func:`ir.as_retriever`.
8
+
9
+ The shape follows ir_09 §3/§6: a small set of named roles —
10
+ ``Planner / Formulator / Retriever / Evaluator / Reranker / Citer`` — and a loop
11
+ whose defining feature is the **back-edge** (evaluator → reformulate) that makes
12
+ it an *agent* rather than a DAG. v1 ships the loop with smart defaults (a trivial
13
+ planner + a pass-through evaluator), so the thin slice runs end-to-end with no
14
+ LLM; turning on the back-edge is just injecting an LLM ``Evaluator`` that returns
15
+ a ``refinement`` (ir_09 §8 step 3).
16
+
17
+ Progressive disclosure: :func:`make_search_agent` gives sensible defaults for
18
+ every role, so the simple path is ``make_search_agent(sources)("query")``.
19
+
20
+ Dependency direction is one-way: `raglab` imports `ir`; `ir` never imports
21
+ `raglab`. The ``Result`` type and the ``Retriever`` contract are ir's (SSOT).
22
+ """
23
+
24
+ from __future__ import annotations
25
+
26
+ from collections.abc import Mapping, Sequence
27
+ from dataclasses import dataclass, field
28
+ from typing import Any, Protocol, runtime_checkable
29
+
30
+ # ir owns the retrieval substrate: the Result type and the Retriever leaf
31
+ # contract live there (one-way dependency, ir is the SSOT).
32
+ from ir import Retriever, SearchHit
33
+
34
+ #: A retrieved item — ir's :class:`~ir.base.SearchHit` (ir_09's ``Result``):
35
+ #: a *pointer + snippet* (``text``) with a ``score`` and ``metadata``.
36
+ Result = SearchHit
37
+
38
+ __all__ = [
39
+ "Query",
40
+ "SubTask",
41
+ "LowLevelQuery",
42
+ "Judgement",
43
+ "Result",
44
+ "Retriever",
45
+ "Planner",
46
+ "Formulator",
47
+ "Evaluator",
48
+ "Reranker",
49
+ "Citer",
50
+ "Budget",
51
+ "SingleContextAgent",
52
+ "make_search_agent",
53
+ "ir_sources",
54
+ "single_subtask_planner",
55
+ "identity_formulator",
56
+ "passthrough_evaluator",
57
+ "score_reranker",
58
+ "identity_citer",
59
+ ]
60
+
61
+
62
+ # --------------------------------------------------------------------------- #
63
+ # Value types (immutable, plain data) — ir_09 §3
64
+ # --------------------------------------------------------------------------- #
65
+
66
+
67
+ @dataclass(frozen=True)
68
+ class Query:
69
+ """A user intent: free text plus optional structured constraints."""
70
+
71
+ text: str
72
+ constraints: Mapping[str, Any] = field(default_factory=dict)
73
+
74
+
75
+ @dataclass(frozen=True)
76
+ class SubTask:
77
+ """A planner's unit of work: a sub-goal bound to a set of registered sources."""
78
+
79
+ goal: str
80
+ sources: tuple[str, ...]
81
+
82
+
83
+ @dataclass(frozen=True)
84
+ class LowLevelQuery:
85
+ """One concrete query against one source.
86
+
87
+ ``query`` is the text handed to the source's :data:`Retriever`; ``params``
88
+ are per-call retriever overrides (e.g. ``mode`` / ``filter`` / ``k`` for an
89
+ ir corpus). A formulator turns a :class:`SubTask` into these.
90
+ """
91
+
92
+ source: str
93
+ query: str
94
+ params: Mapping[str, Any] = field(default_factory=dict)
95
+
96
+
97
+ @dataclass(frozen=True)
98
+ class Judgement:
99
+ """An evaluator's verdict over a round's results.
100
+
101
+ ``relevant`` is the kept subset; ``sufficient`` says whether to stop; a
102
+ non-``None`` ``refinement`` is the **back-edge** — the next sub-task to
103
+ re-query with. The pass-through default returns ``sufficient=True`` and no
104
+ refinement (no loop).
105
+ """
106
+
107
+ relevant: Sequence[Result]
108
+ sufficient: bool
109
+ refinement: SubTask | None = None
110
+
111
+
112
+ # --------------------------------------------------------------------------- #
113
+ # Role Protocols (the open-closed strategy seams) — ir_09 §3
114
+ # --------------------------------------------------------------------------- #
115
+
116
+
117
+ @runtime_checkable
118
+ class Planner(Protocol):
119
+ """Decompose a query into sub-tasks and select sources for each."""
120
+
121
+ def __call__(
122
+ self, query: Query, sources: Mapping[str, Retriever]
123
+ ) -> list[SubTask]: ...
124
+
125
+
126
+ @runtime_checkable
127
+ class Formulator(Protocol):
128
+ """Turn a sub-task + one source into concrete low-level queries."""
129
+
130
+ def __call__(self, task: SubTask, source: str) -> list[LowLevelQuery]: ...
131
+
132
+
133
+ @runtime_checkable
134
+ class Evaluator(Protocol):
135
+ """Judge relevance + sufficiency; optionally emit a refinement (back-edge)."""
136
+
137
+ def __call__(self, task: SubTask, results: Sequence[Result]) -> Judgement: ...
138
+
139
+
140
+ @runtime_checkable
141
+ class Reranker(Protocol):
142
+ """Produce the final ordering over the (cross-source) merged results."""
143
+
144
+ def __call__(self, results: Sequence[Result]) -> Sequence[Result]: ...
145
+
146
+
147
+ @runtime_checkable
148
+ class Citer(Protocol):
149
+ """Confirm/annotate that each result supports its use (identity by default)."""
150
+
151
+ def __call__(self, results: Sequence[Result]) -> Sequence[Result]: ...
152
+
153
+
154
+ # --------------------------------------------------------------------------- #
155
+ # Budget governor — ir_09 §4
156
+ # --------------------------------------------------------------------------- #
157
+
158
+
159
+ @dataclass(frozen=True)
160
+ class Budget:
161
+ """Loop bounds: the safety net under the (harder) sufficiency decision."""
162
+
163
+ max_rounds: int = 3
164
+ max_sources_per_task: int = 4
165
+ max_results_per_task: int = 50
166
+
167
+
168
+ # --------------------------------------------------------------------------- #
169
+ # Default role implementations (the no-LLM thin slice) — ir_09 §8 step 2
170
+ # --------------------------------------------------------------------------- #
171
+
172
+
173
+ def single_subtask_planner(
174
+ query: Query, sources: Mapping[str, Retriever]
175
+ ) -> list[SubTask]:
176
+ """Trivial planner: one sub-task over *all* registered sources, no decomposition."""
177
+ return [SubTask(goal=query.text, sources=tuple(sources))]
178
+
179
+
180
+ def identity_formulator(task: SubTask, source: str) -> list[LowLevelQuery]:
181
+ """Identity formulator: the sub-goal verbatim as one query (no rewrite/HyDE)."""
182
+ return [LowLevelQuery(source=source, query=task.goal)]
183
+
184
+
185
+ def passthrough_evaluator(task: SubTask, results: Sequence[Result]) -> Judgement:
186
+ """Pass-through critic: keep everything, declare sufficient, never re-query."""
187
+ return Judgement(relevant=list(results), sufficient=True, refinement=None)
188
+
189
+
190
+ def score_reranker(results: Sequence[Result]) -> Sequence[Result]:
191
+ """Final ordering by descending ``score`` (the cross-source merge, v1).
192
+
193
+ Note: a plain score sort assumes comparable score scales across sources
194
+ (true when they share an embedder + mode). A rank-based (RRF) cross-source
195
+ merge for heterogeneous backends is a documented follow-up.
196
+ """
197
+ return sorted(results, key=lambda r: float(getattr(r, "score", 0.0)), reverse=True)
198
+
199
+
200
+ def identity_citer(results: Sequence[Result]) -> Sequence[Result]:
201
+ """No-op citer (verification needs a generated claim — that lives in srag)."""
202
+ return results
203
+
204
+
205
+ # --------------------------------------------------------------------------- #
206
+ # The orchestrator — a fixed control loop, fully parametrized by roles
207
+ # --------------------------------------------------------------------------- #
208
+
209
+
210
+ @dataclass
211
+ class SingleContextAgent:
212
+ """One ReAct-style loop, sequential sub-tasks (ir_09 §7 — the cheap default).
213
+
214
+ The loop is fixed; every *decision* is an injected role. Promotion to a
215
+ multi-agent orchestrator (ir_09 §7) swaps this class while keeping the same
216
+ role contracts. The **back-edge** is the single line ``current =
217
+ judged.refinement`` in :meth:`_run_task` — that is what makes this an agent.
218
+ """
219
+
220
+ sources: Mapping[str, Retriever]
221
+ planner: Planner = single_subtask_planner
222
+ formulator: Formulator = identity_formulator
223
+ evaluator: Evaluator = passthrough_evaluator
224
+ reranker: Reranker = score_reranker
225
+ citer: Citer = identity_citer
226
+ budget: Budget = field(default_factory=Budget)
227
+
228
+ def __call__(self, query: str | Query) -> list[Result]:
229
+ """Run the agent for *query*; returns the final ranked, cited results."""
230
+ q = query if isinstance(query, Query) else Query(text=query)
231
+ accumulated: list[Result] = []
232
+ for task in self.planner(q, self.sources):
233
+ accumulated.extend(self._run_task(task))
234
+ ranked = self.reranker(accumulated)
235
+ return list(self.citer(ranked))
236
+
237
+ def _run_task(self, task: SubTask) -> list[Result]:
238
+ found: list[Result] = []
239
+ current = task
240
+ for _ in range(max(1, self.budget.max_rounds)):
241
+ for source in current.sources[: self.budget.max_sources_per_task]:
242
+ retriever = self.sources.get(source)
243
+ if retriever is None:
244
+ continue
245
+ for llq in self.formulator(current, source):
246
+ found.extend(retriever(llq.query, **dict(llq.params)))
247
+ judged = self.evaluator(current, found[: self.budget.max_results_per_task])
248
+ found = list(judged.relevant)
249
+ if judged.sufficient or judged.refinement is None:
250
+ break
251
+ current = judged.refinement # the back-edge: re-query
252
+ return found
253
+
254
+
255
+ def make_search_agent(
256
+ sources: Mapping[str, Retriever],
257
+ *,
258
+ planner: Planner | None = None,
259
+ formulator: Formulator | None = None,
260
+ evaluator: Evaluator | None = None,
261
+ reranker: Reranker | None = None,
262
+ citer: Citer | None = None,
263
+ budget: Budget | None = None,
264
+ ) -> SingleContextAgent:
265
+ """Build a :class:`SingleContextAgent` over *sources* with smart defaults.
266
+
267
+ ``sources`` is a ``Mapping[name, Retriever]`` — e.g.
268
+ ``{"skills": ir.as_retriever("skills")}``. Every role defaults to its no-LLM
269
+ thin-slice implementation, so ``make_search_agent(sources)("query")`` just
270
+ works; inject an LLM ``formulator`` / ``evaluator`` to turn on rewriting and
271
+ the back-edge.
272
+ """
273
+ return SingleContextAgent(
274
+ sources=sources,
275
+ planner=planner or single_subtask_planner,
276
+ formulator=formulator or identity_formulator,
277
+ evaluator=evaluator or passthrough_evaluator,
278
+ reranker=reranker or score_reranker,
279
+ citer=citer or identity_citer,
280
+ budget=budget or Budget(),
281
+ )
282
+
283
+
284
+ def ir_sources(*names: str, **search_defaults: Any) -> dict[str, Retriever]:
285
+ """A source registry ``{name: Retriever}`` backed by named ``ir`` corpora.
286
+
287
+ Each name is bound to ``ir.as_retriever(name, **search_defaults)``. A thin
288
+ convenience; once ir ships ``registry.retrievers()`` (a lazy view), prefer
289
+ that. ``search_defaults`` (e.g. ``mode="hybrid"``) apply to every source.
290
+ """
291
+ import ir
292
+
293
+ return {name: ir.as_retriever(name, **search_defaults) for name in names}
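
One consequence of `_run_task` as written above: `found` keeps growing across rounds, so when a refinement re-queries the same source, the same hit can come back more than once and reach the reranker twice. A reranker that collapses duplicates before sorting is one possible way to handle this. The sketch below assumes `artifact_id` is a sufficient identity key (which may not hold if one artifact yields several distinct chunks) and uses a stand-in `Hit` class instead of `ir.SearchHit` so that it runs without `ir`; `dedup_score_reranker` is a hypothetical name.

```python
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Stand-in for ir.SearchHit with only the two fields this sketch reads."""

    artifact_id: str
    score: float


def dedup_score_reranker(results: Sequence[Hit]) -> list[Hit]:
    """Keep the best-scoring hit per artifact_id, then sort by descending score."""
    best: dict[str, Hit] = {}
    for hit in results:
        kept = best.get(hit.artifact_id)
        if kept is None or hit.score > kept.score:
            best[hit.artifact_id] = hit
    return sorted(best.values(), key=lambda h: h.score, reverse=True)


# "a" arrives twice (e.g. from two rounds); only its best-scoring copy survives.
ranked = dedup_score_reranker([Hit("a", 0.9), Hit("b", 0.5), Hit("a", 0.7)])
```

Because it has the same `results -> results` shape as the `Reranker` Protocol, a function like this could be passed as `reranker=` without touching the loop.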
@@ -0,0 +1,158 @@
1
+ """Tests for the raglab Composable Search Agent foundation (ir_09).
2
+
3
+ Hermetic: the control loop is exercised with a fake in-memory retriever (no
4
+ model, no network); one end-to-end test wires a real ``ir`` corpus via
5
+ ``ir.as_retriever`` with the light (numpy-only) embedder and an in-memory store.
6
+ """
7
+
8
+ import raglab
9
+ from ir import SearchHit
10
+ from raglab import Budget, Judgement, LowLevelQuery, Query, SubTask, make_search_agent
11
+
12
+
13
+ def _hits(*specs):
14
+ """``(artifact_id, score)`` pairs -> ir.SearchHits (no corpus needed)."""
15
+ return [SearchHit(aid, "k", score, f"text {aid}", {}) for aid, score in specs]
16
+
17
+
18
+ def _fake_retriever(hits):
19
+ """A Retriever that records its calls and returns canned hits."""
20
+ calls = []
21
+
22
+ def retrieve(query, **kw):
23
+ calls.append((query, kw))
24
+ return list(hits)
25
+
26
+ retrieve.calls = calls
27
+ return retrieve
28
+
29
+
30
+ # ----- value types & protocols --------------------------------------------- #
31
+
32
+
33
+ def test_result_is_ir_searchhit():
34
+ assert raglab.Result is SearchHit
35
+
36
+
37
+ def test_defaults_satisfy_role_protocols():
38
+ assert isinstance(raglab.single_subtask_planner, raglab.Planner)
39
+ assert isinstance(raglab.identity_formulator, raglab.Formulator)
40
+ assert isinstance(raglab.passthrough_evaluator, raglab.Evaluator)
41
+ assert isinstance(raglab.score_reranker, raglab.Reranker)
42
+ assert isinstance(raglab.identity_citer, raglab.Citer)
43
+
44
+
45
+ # ----- the thin-slice loop (no LLM) ----------------------------------------- #
46
+
47
+
48
+ def test_make_search_agent_thin_slice_runs():
49
+ sources = {"s": _fake_retriever(_hits(("a", 0.9), ("b", 0.5)))}
50
+ results = make_search_agent(sources)("anything")
51
+ assert [r.artifact_id for r in results] == ["a", "b"]
52
+
53
+
54
+ def test_query_string_or_object_equivalent():
55
+ sources = {"s": _fake_retriever(_hits(("a", 0.9)))}
56
+ agent = make_search_agent(sources)
57
+ assert agent("q") == agent(Query(text="q"))
58
+
59
+
60
+ def test_cross_source_merge_reranks_by_score():
61
+ sources = {
62
+ "s1": _fake_retriever(_hits(("a", 0.3))),
63
+ "s2": _fake_retriever(_hits(("b", 0.9))),
64
+ }
65
+ results = make_search_agent(sources)("q")
66
+ assert [r.artifact_id for r in results] == ["b", "a"] # by score desc
67
+
68
+
69
+ def test_passthrough_evaluator_does_not_loop():
70
+ retr = _fake_retriever(_hits(("a", 0.9)))
71
+ make_search_agent({"s": retr})("q")
72
+ assert len(retr.calls) == 1 # one round only
73
+
74
+
75
+ def test_formulator_fan_out_issues_each_query():
76
+ retr = _fake_retriever(_hits(("a", 0.9)))
77
+
78
+ def multi(task, source):
79
+ return [
80
+ LowLevelQuery(source, task.goal),
81
+ LowLevelQuery(source, task.goal + " alt"),
82
+ ]
83
+
84
+ make_search_agent({"s": retr}, formulator=multi)("q")
85
+ assert len(retr.calls) == 2
86
+
87
+
88
+ def test_unknown_source_is_skipped_not_an_error():
89
+ retr = _fake_retriever(_hits(("a", 0.9)))
90
+
91
+ def planner(query, sources):
92
+ return [SubTask(query.text, ("s", "missing"))]
93
+
94
+ results = make_search_agent({"s": retr}, planner=planner)("q")
95
+ assert [r.artifact_id for r in results] == ["a"]
96
+
97
+
98
+ # ----- the back-edge (the property that makes it an agent) ------------------ #
99
+
100
+
101
+ def test_back_edge_reformulates_until_sufficient():
102
+ retr = _fake_retriever(_hits(("a", 0.9)))
103
+ rounds = {"n": 0}
104
+
105
+ def refining_evaluator(task, results):
106
+ rounds["n"] += 1
107
+ if rounds["n"] < 2: # first round: not enough, re-query (back-edge)
108
+ return Judgement(
109
+ relevant=list(results),
110
+ sufficient=False,
111
+ refinement=SubTask(goal=task.goal + " more", sources=task.sources),
112
+ )
113
+ return Judgement(relevant=list(results), sufficient=True)
114
+
115
+ make_search_agent({"s": retr}, evaluator=refining_evaluator)("q")
116
+ assert rounds["n"] == 2 # looped once via the back-edge
117
+ assert len(retr.calls) == 2
118
+
119
+
120
+ def test_budget_bounds_a_never_sufficient_loop():
121
+ retr = _fake_retriever(_hits(("a", 0.9)))
122
+
123
+ def never_sufficient(task, results):
124
+ return Judgement(
125
+ relevant=list(results),
126
+ sufficient=False,
127
+ refinement=SubTask(task.goal, task.sources),
128
+ )
129
+
130
+ make_search_agent(
131
+ {"s": retr}, evaluator=never_sufficient, budget=Budget(max_rounds=3)
132
+ )("q")
133
+ assert len(retr.calls) == 3 # exactly max_rounds — the safety net holds
134
+
135
+
136
+ # ----- end-to-end over a REAL ir corpus (hermetic: light embedder) ---------- #
137
+
138
+
139
+ def test_agent_over_real_ir_corpus():
140
+ import ir
141
+ from ir.store import CorpusStore
142
+
143
+ docs = {
144
+ "deploy": "deploy the app to the server with systemd units",
145
+ "embed": "embed text with a model and cache the vectors",
146
+ "search": "vector similarity search with metadata filters",
147
+ }
148
+ corpus = ir.build(
149
+ ir.CorpusSource.from_mapping(docs, name="t", strategy=ir.WholeText()),
150
+ store=CorpusStore.memory(),
151
+ embedder="light",
152
+ )
153
+ agent = make_search_agent({"t": ir.as_retriever(corpus, k=3)})
154
+ results = agent("deploy the app to the server")
155
+ assert results
156
+ assert results[0].artifact_id == "deploy"
157
+ assert isinstance(results[0], SearchHit)
158
+ results[0].to_dict() # the substrate edge is serialization-clean
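
The back-edge tests above drive the loop with a counter-based evaluator. For comparison, a no-LLM evaluator that decides sufficiency from scores can be sketched as follows. The value types are local stand-ins that mirror the fields shown in the diff so that the block runs without `raglab` or `ir`; `make_threshold_evaluator`, its thresholds, and the refinement suffix are all illustrative assumptions, not part of the package.

```python
from __future__ import annotations

from collections.abc import Sequence
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Stand-in for ir.SearchHit (only the fields used here)."""

    artifact_id: str
    score: float


@dataclass(frozen=True)
class SubTask:
    goal: str
    sources: tuple[str, ...]


@dataclass(frozen=True)
class Judgement:
    relevant: Sequence[Hit]
    sufficient: bool
    refinement: SubTask | None = None


def make_threshold_evaluator(
    *, min_score: float = 0.5, min_hits: int = 1, suffix: str = " overview"
):
    """Build an evaluator that asks for another round when too few hits score well."""

    def evaluator(task: SubTask, results: Sequence[Hit]) -> Judgement:
        relevant = [h for h in results if h.score >= min_score]
        if len(relevant) >= min_hits:
            return Judgement(relevant=relevant, sufficient=True)
        # Not enough strong hits: emit a refinement, i.e. take the back-edge.
        refined = SubTask(goal=task.goal + suffix, sources=task.sources)
        return Judgement(relevant=relevant, sufficient=False, refinement=refined)

    return evaluator


evaluate = make_threshold_evaluator(min_score=0.5)
weak = evaluate(SubTask("q", ("s",)), [Hit("a", 0.2)])
strong = evaluate(SubTask("q", ("s",)), [Hit("a", 0.9)])
```

An evaluator like this never declares sufficiency on its own if the source has no strong hits, which is exactly the case the `Budget.max_rounds` bound is there to catch.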
@@ -1,7 +0,0 @@
1
- """A medley of tools to make RAG-based applications.
2
-
3
- This is a fresh start for the ``raglab`` name (version 0.2.0+). The earlier
4
- ``raglab`` backend (PyPI versions 0.0.x–0.1.x) now lives at
5
- `addaix/raglab_bak <https://github.com/addaix/raglab_bak>`_ and is published as
6
- ``raglab_bak``. Development of this new ``raglab`` is just beginning.
7
- """
File without changes