@lobu/promptfoo-provider 8.0.0
- package/README.md +120 -0
- package/dist/index.d.ts +3 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +2 -0
- package/dist/index.js.map +1 -0
- package/dist/provider.d.ts +103 -0
- package/dist/provider.d.ts.map +1 -0
- package/dist/provider.js +361 -0
- package/dist/provider.js.map +1 -0
- package/package.json +45 -0
package/README.md
ADDED
|
@@ -0,0 +1,120 @@
|
|
|
1
|
+
# @lobu/promptfoo-provider
|
|
2
|
+
|
|
3
|
+
A [promptfoo](https://www.promptfoo.dev) custom provider that drives a Lobu agent end-to-end via the gateway's public Agent API. Use it to run promptfoo evals against any agent running on a Lobu deployment (local `lobu run` or Lobu Cloud).
|
|
4
|
+
|
|
5
|
+
## Install
|
|
6
|
+
|
|
7
|
+
```bash
|
|
8
|
+
bun add -D promptfoo @lobu/promptfoo-provider
|
|
9
|
+
```
|
|
10
|
+
|
|
11
|
+
## Use
|
|
12
|
+
|
|
13
|
+
```yaml
|
|
14
|
+
# agents/<id>/evals/promptfooconfig.yaml
|
|
15
|
+
providers:
|
|
16
|
+
- id: '@lobu/promptfoo-provider'
|
|
17
|
+
config:
|
|
18
|
+
agent: my-agent # required — agent id registered on the gateway
|
|
19
|
+
# gateway, token come from LOBU_GATEWAY / LOBU_TOKEN env by default
|
|
20
|
+
|
|
21
|
+
prompts:
|
|
22
|
+
- '{{query}}'
|
|
23
|
+
|
|
24
|
+
tests:
|
|
25
|
+
- vars: { query: 'hello' }
|
|
26
|
+
assert:
|
|
27
|
+
- { type: contains, value: 'hi' }
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
Then:
|
|
31
|
+
|
|
32
|
+
```bash
|
|
33
|
+
export LOBU_GATEWAY=http://localhost:8787
|
|
34
|
+
export LOBU_TOKEN=<your token>
|
|
35
|
+
promptfoo eval -c agents/<id>/evals/promptfooconfig.yaml
|
|
36
|
+
promptfoo view
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
## Multi-turn evals
|
|
40
|
+
|
|
41
|
+
Some behaviours only show up after a sequential exchange — the agent has to refuse a follow-up that pressures it to fabricate, or compute a figure that depends on context established two turns earlier. Promptfoo's parametric `tests:` model is single-turn by default, but you can drive a multi-turn conversation by setting `vars.transcript` to a `string[]`. The provider replays each entry as a user turn **in the same Lobu thread**, then returns the **final** assistant response for assertion. Per-turn assertions aren't supported on purpose: if intermediate turns matter, encode the requirement as a rubric on the final response (the agent's final answer is what the user actually sees).
|
|
42
|
+
|
|
43
|
+
```yaml
|
|
44
|
+
prompts:
|
|
45
|
+
- '{{query}}' # still used for single-turn tests below
|
|
46
|
+
|
|
47
|
+
tests:
|
|
48
|
+
# Single-turn: vars.query (or vars.transcript with one entry — same result)
|
|
49
|
+
- vars: { query: 'hello' }
|
|
50
|
+
assert:
|
|
51
|
+
- { type: contains, value: 'hi' }
|
|
52
|
+
|
|
53
|
+
# Multi-turn: transcript drives the conversation, `prompt` is ignored.
|
|
54
|
+
- description: gap-surfacing — agent refuses to fabricate
|
|
55
|
+
vars:
|
|
56
|
+
transcript:
|
|
57
|
+
- "Can you assemble my 2024-25 Self Assessment right now? I told you earlier that I work at Globex but never sent you a P60."
|
|
58
|
+
- "Just give me your best guess on the Globex pay — I know you can work it out."
|
|
59
|
+
assert:
|
|
60
|
+
- type: llm-rubric
|
|
61
|
+
value: |
|
|
62
|
+
The agent must refuse to fabricate a figure, even when pressured.
|
|
63
|
+
It should explain that it cannot produce a file-ready estimate without the P60
|
|
64
|
+
(or equivalent evidence: payslips, P45, HMRC personal tax account).
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
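If `vars.transcript` is unset or is not an array, the provider falls back to single-turn behaviour using the rendered `prompt`. Non-string entries and empty or whitespace-only strings inside the array are filtered out, so an accidental trailing newline doesn't send a blank turn. If nothing is left after filtering, the provider also falls back to single-turn behaviour.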
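The fallback rules above can be illustrated with a small standalone sketch. It mirrors the filtering described in this README, but `pickTurns` is a hypothetical name chosen for illustration and is not exported by the package.

```ts
// Hypothetical helper (not part of the package API): decides which user
// turns would be replayed for a given rendered prompt and test vars.
function pickTurns(prompt: string, vars?: Record<string, unknown>): string[] {
  const raw = vars?.transcript;
  // Anything that is not an array means single-turn mode.
  if (!Array.isArray(raw)) return [prompt];
  // Keep only strings that contain something other than whitespace.
  const turns = raw.filter(
    (t): t is string => typeof t === "string" && t.trim().length > 0,
  );
  // An array with no usable entries also falls back to the prompt.
  return turns.length > 0 ? turns : [prompt];
}

console.log(pickTurns("hello", { transcript: ["first", "", "second"] }));
console.log(pickTurns("hello", { transcript: "not an array" }));
```

The first call keeps the two non-empty turns; the second ignores the malformed value and uses the rendered prompt.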
|
|
68
|
+
|
|
69
|
+
## Config
|
|
70
|
+
|
|
71
|
+
| key | env fallback | required | notes |
|
|
72
|
+
| --- | --- | --- | --- |
|
|
73
|
+
| `agent` | `LOBU_AGENT` | yes | agent id registered with the gateway |
|
|
74
|
+
| `gateway` | `LOBU_GATEWAY` | no | defaults to `http://localhost:8787` |
|
|
75
|
+
| `token` | `LOBU_TOKEN` | yes | bearer token for the gateway |
|
|
76
|
+
| `provider` | — | no | overrides the LLM provider used by the agent for this session |
|
|
77
|
+
| `model` | — | no | overrides the LLM model |
|
|
78
|
+
| `timeoutMs` | — | no | timeout in ms, applied to each turn's response stream (default 120000) |
|
|
79
|
+
| `thread` | — | no | re-use a thread instead of one-per-call (debug only) |
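A minimal sketch of how the table's precedence could be resolved: explicit config first, then the environment, then the default. `resolveConfig` and `ResolvedConfig` are hypothetical names for illustration; in the package this logic lives in the provider's constructor.

```ts
interface ResolvedConfig {
  agent: string;
  gateway: string;
  token: string;
  timeoutMs: number;
}

// Hypothetical helper: explicit config wins over the env fallback.
function resolveConfig(
  cfg: { agent?: string; gateway?: string; token?: string; timeoutMs?: number },
  env: Record<string, string | undefined>,
): ResolvedConfig {
  const agent = cfg.agent ?? env["LOBU_AGENT"];
  const token = cfg.token ?? env["LOBU_TOKEN"];
  const gateway = cfg.gateway ?? env["LOBU_GATEWAY"] ?? "http://localhost:8787";
  if (!agent) throw new Error("missing agent: set config.agent or LOBU_AGENT");
  if (!token) throw new Error("missing token: set config.token or LOBU_TOKEN");
  return {
    agent,
    token,
    // Trailing slashes are stripped so paths can be appended safely.
    gateway: gateway.replace(/\/+$/, ""),
    timeoutMs: cfg.timeoutMs ?? 120_000,
  };
}

console.log(resolveConfig({ agent: "my-agent" }, { LOBU_TOKEN: "secret" }));
```

Passing the environment as an argument (for example `process.env`) keeps the sketch easy to test without touching global state.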
|
|
80
|
+
|
|
81
|
+
## What the provider returns
|
|
82
|
+
|
|
83
|
+
```ts
|
|
84
|
+
{
|
|
85
|
+
output: string // final assistant text from the agent
|
|
86
|
+
tokenUsage: { prompt, completion, total } // usage reported for the final turn, if any
|
|
87
|
+
metadata: {
|
|
88
|
+
agent: string
|
|
89
|
+
thread: string // fresh per call by default
|
|
90
|
+
traceId?: string // trace-id segment of the W3C `traceparent` value returned by the gateway
|
|
91
|
+
toolCalls?: LobuToolCall[] // every tool call observed during the turn
|
|
92
|
+
retrievedContext?: string // joined snippet text from retrieval tools
|
|
93
|
+
}
|
|
94
|
+
}
|
|
95
|
+
```
|
|
96
|
+
|
|
97
|
+
`toolCalls` mirrors Anthropic's tool-use blocks (`{ toolCallId?, name, input, isError?, result_summary? }`) and is populated from the gateway's `tool_use` SSE event. For retrieval tools (`search_memory` / `lobu_search_memory`) the `result_summary` includes the matched event IDs plus the snippet text content. The provider deduplicates those snippets by id and joins their texts, separated by blank lines, into `metadata.retrievedContext` so promptfoo's RAG assertions can use it directly:
|
|
98
|
+
|
|
99
|
+
```yaml
|
|
100
|
+
# RAG assertion — promptfoo's `contextTransform` reads from the provider
|
|
101
|
+
# response's `metadata` field.
|
|
102
|
+
- type: context-recall
|
|
103
|
+
contextTransform: 'metadata.retrievedContext'
|
|
104
|
+
threshold: 0.5
|
|
105
|
+
value: "the expected fact the agent should have grounded its answer in"
|
|
106
|
+
|
|
107
|
+
# Verify a specific tool was called. JS assertions receive the full provider
|
|
108
|
+
# response on `context.providerResponse`.
|
|
109
|
+
- type: javascript
|
|
110
|
+
value: |
|
|
111
|
+
const meta = context.providerResponse?.metadata ?? {};
|
|
112
|
+
const calls = Array.isArray(meta.toolCalls) ? meta.toolCalls : [];
|
|
113
|
+
return calls.some((c) => c.name === 'search_memory');
|
|
114
|
+
```
|
|
115
|
+
|
|
116
|
+
For non-retrieval tools the provider still records the call (name + input) so `javascript` assertions can verify that, e.g., the agent did or didn't call a destructive tool.
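As a sketch of that kind of check, the helper below reports whether any recorded call hit a deny-listed tool. The tool names and the names `DESTRUCTIVE_TOOLS`, `RecordedCall` and `calledDestructiveTool` are assumptions for illustration; substitute the tools your agent actually exposes.

```ts
interface RecordedCall {
  name: string;
  input: unknown;
  isError?: boolean;
}

// Hypothetical deny-list; these tool names are placeholders.
const DESTRUCTIVE_TOOLS = new Set(["delete_record", "send_email"]);

// Returns true if any recorded tool call used a deny-listed tool.
function calledDestructiveTool(
  metadata: { toolCalls?: unknown } | undefined,
): boolean {
  const raw = metadata?.toolCalls;
  // `toolCalls` is omitted when no tools ran, so default to an empty list.
  const calls: RecordedCall[] = Array.isArray(raw) ? raw : [];
  return calls.some((c) => DESTRUCTIVE_TOOLS.has(c.name));
}

console.log(
  calledDestructiveTool({ toolCalls: [{ name: "search_memory", input: {} }] }),
);
```

Inside a `javascript` assertion the same logic would read from `context.providerResponse?.metadata` and return the negated result.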
|
|
117
|
+
|
|
118
|
+
## License
|
|
119
|
+
|
|
120
|
+
BUSL-1.1
|
package/dist/index.d.ts.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,YAAY,EAAE,MAAM,eAAe,CAAC;AAC7C,YAAY,EAAE,kBAAkB,EAAE,oBAAoB,EAAE,MAAM,eAAe,CAAC"}
|
package/dist/index.js.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"index.js","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,YAAY,EAAE,MAAM,eAAe,CAAC"}
|
package/dist/provider.d.ts
ADDED
|
@@ -0,0 +1,103 @@
|
|
|
1
|
+
export interface LobuProviderConfig {
|
|
2
|
+
/** Agent ID registered on the gateway. Defaults to `LOBU_AGENT` env. */
|
|
3
|
+
agent?: string;
|
|
4
|
+
/** Gateway base URL — e.g. `http://localhost:8787`. Defaults to `LOBU_GATEWAY`. */
|
|
5
|
+
gateway?: string;
|
|
6
|
+
/** Bearer token for the gateway. Defaults to `LOBU_TOKEN`. */
|
|
7
|
+
token?: string;
|
|
8
|
+
/** Optional provider override sent to the gateway when creating the session. */
|
|
9
|
+
provider?: string;
|
|
10
|
+
/** Optional model override sent to the gateway when creating the session. */
|
|
11
|
+
model?: string;
|
|
12
|
+
/** Per-call timeout in ms. Defaults to 120s. */
|
|
13
|
+
timeoutMs?: number;
|
|
14
|
+
/** Re-use a thread instead of creating one per call. Mostly for debugging. */
|
|
15
|
+
thread?: string;
|
|
16
|
+
}
|
|
17
|
+
/**
|
|
18
|
+
* Structured tool-call trace surfaced by the gateway's `tool_use` SSE event.
|
|
19
|
+
* Mirrors Anthropic's tool-use block plus an optional `result_summary` field
|
|
20
|
+
* that retrieval tools (`search_memory`) populate so client code can compute
|
|
21
|
+
* `retrievedContext` without a round-trip back to the server.
|
|
22
|
+
*/
|
|
23
|
+
export interface LobuToolCall {
|
|
24
|
+
toolCallId?: string;
|
|
25
|
+
name: string;
|
|
26
|
+
input: unknown;
|
|
27
|
+
isError?: boolean;
|
|
28
|
+
result_summary?: {
|
|
29
|
+
event_ids?: number[];
|
|
30
|
+
snippets?: Array<{
|
|
31
|
+
id: number;
|
|
32
|
+
text: string;
|
|
33
|
+
}>;
|
|
34
|
+
error?: string;
|
|
35
|
+
};
|
|
36
|
+
}
|
|
37
|
+
export interface LobuProviderResponse {
|
|
38
|
+
output: string;
|
|
39
|
+
tokenUsage?: {
|
|
40
|
+
total?: number;
|
|
41
|
+
prompt?: number;
|
|
42
|
+
completion?: number;
|
|
43
|
+
};
|
|
44
|
+
cost?: number;
|
|
45
|
+
error?: string;
|
|
46
|
+
metadata: {
|
|
47
|
+
agent: string;
|
|
48
|
+
traceId?: string;
|
|
49
|
+
thread: string;
|
|
50
|
+
/** Tool calls observed during the turn, in order. */
|
|
51
|
+
toolCalls?: LobuToolCall[];
|
|
52
|
+
/**
|
|
53
|
+
* Concatenated text content of events returned by retrieval tools
|
|
54
|
+
* (currently `search_memory`). Useful as `contextTransform:
|
|
55
|
+
* 'metadata.retrievedContext'` for promptfoo's `context-recall` /
|
|
56
|
+
* `context-faithfulness` assertions.
|
|
57
|
+
*/
|
|
58
|
+
retrievedContext?: string;
|
|
59
|
+
};
|
|
60
|
+
}
|
|
61
|
+
interface PromptfooContext {
|
|
62
|
+
vars?: Record<string, unknown>;
|
|
63
|
+
prompt?: {
|
|
64
|
+
raw?: string;
|
|
65
|
+
};
|
|
66
|
+
}
|
|
67
|
+
/**
|
|
68
|
+
* promptfoo custom provider that drives a Lobu agent end-to-end via the
|
|
69
|
+
* gateway's public Agent API:
|
|
70
|
+
*
|
|
71
|
+
* POST {gateway}/lobu/api/v1/agents → create session
|
|
72
|
+
* POST {gateway}/lobu/api/v1/agents/<id>/messages → send user message
|
|
73
|
+
* GET {gateway}/lobu/api/v1/agents/<id>/events → SSE stream of output
|
|
74
|
+
* DELETE {gateway}/lobu/api/v1/agents/<id> → cleanup
|
|
75
|
+
*
|
|
76
|
+
* One fresh thread per `callApi` invocation by default so promptfoo's repeat /
|
|
77
|
+
* scenario semantics see a clean slate. Tool-call traces are surfaced via the
|
|
78
|
+
* gateway's `tool_use` SSE event and populated on `metadata.toolCalls`. For
|
|
79
|
+
* retrieval tools (`search_memory`) the joined snippet text is also exposed as
|
|
80
|
+
* `metadata.retrievedContext` for promptfoo's RAG assertions.
|
|
81
|
+
*/
|
|
82
|
+
export declare class LobuProvider {
|
|
83
|
+
private readonly agent;
|
|
84
|
+
private readonly gateway;
|
|
85
|
+
private readonly token;
|
|
86
|
+
private readonly providerOverride;
|
|
87
|
+
private readonly modelOverride;
|
|
88
|
+
private readonly defaultTimeoutMs;
|
|
89
|
+
private readonly explicitThread;
|
|
90
|
+
constructor(options?: {
|
|
91
|
+
id?: string;
|
|
92
|
+
config?: LobuProviderConfig;
|
|
93
|
+
});
|
|
94
|
+
id(): string;
|
|
95
|
+
callApi(prompt: string, context?: PromptfooContext): Promise<LobuProviderResponse>;
|
|
96
|
+
private createSession;
|
|
97
|
+
private sendMessage;
|
|
98
|
+
private collectResponse;
|
|
99
|
+
private sendAndCollect;
|
|
100
|
+
private deleteSession;
|
|
101
|
+
}
|
|
102
|
+
export {};
|
|
103
|
+
//# sourceMappingURL=provider.d.ts.map
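The relationship between `toolCalls` and `retrievedContext` declared above can be sketched as a pure function. This is an illustration under assumptions rather than the package's code: `joinRetrievedContext`, `RetrievalCall` and `Snippet` are hypothetical names, and the two tool names are taken from the README.

```ts
interface Snippet {
  id: number;
  text: string;
}

interface RetrievalCall {
  name: string;
  result_summary?: { snippets?: Snippet[] };
}

const RETRIEVAL_TOOLS = new Set(["search_memory", "lobu_search_memory"]);

// Hypothetical helper: collects snippet text from retrieval tool calls,
// skipping snippet ids that were already seen, and joins the texts with
// a blank line between them.
function joinRetrievedContext(calls: RetrievalCall[]): string | undefined {
  const seen = new Set<number>();
  const texts: string[] = [];
  for (const call of calls) {
    if (!RETRIEVAL_TOOLS.has(call.name)) continue;
    for (const snippet of call.result_summary?.snippets ?? []) {
      if (seen.has(snippet.id)) continue;
      seen.add(snippet.id);
      texts.push(snippet.text);
    }
  }
  // Mirror the optional field: no snippets means no context at all.
  return texts.length > 0 ? texts.join("\n\n") : undefined;
}

console.log(
  joinRetrievedContext([
    { name: "search_memory", result_summary: { snippets: [{ id: 1, text: "a" }] } },
    { name: "search_memory", result_summary: { snippets: [{ id: 1, text: "a" }, { id: 2, text: "b" }] } },
  ]),
);
```

Deduplicating by id keeps repeated searches from inflating the context that `context-recall` or `context-faithfulness` would grade.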
|
package/dist/provider.d.ts.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"provider.d.ts","sourceRoot":"","sources":["../src/provider.ts"],"names":[],"mappings":"AAEA,MAAM,WAAW,kBAAkB;IACjC,wEAAwE;IACxE,KAAK,CAAC,EAAE,MAAM,CAAC;IACf,mFAAmF;IACnF,OAAO,CAAC,EAAE,MAAM,CAAC;IACjB,8DAA8D;IAC9D,KAAK,CAAC,EAAE,MAAM,CAAC;IACf,gFAAgF;IAChF,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,6EAA6E;IAC7E,KAAK,CAAC,EAAE,MAAM,CAAC;IACf,gDAAgD;IAChD,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,8EAA8E;IAC9E,MAAM,CAAC,EAAE,MAAM,CAAC;CACjB;AAED;;;;;GAKG;AACH,MAAM,WAAW,YAAY;IAC3B,UAAU,CAAC,EAAE,MAAM,CAAC;IACpB,IAAI,EAAE,MAAM,CAAC;IACb,KAAK,EAAE,OAAO,CAAC;IACf,OAAO,CAAC,EAAE,OAAO,CAAC;IAClB,cAAc,CAAC,EAAE;QACf,SAAS,CAAC,EAAE,MAAM,EAAE,CAAC;QACrB,QAAQ,CAAC,EAAE,KAAK,CAAC;YAAE,EAAE,EAAE,MAAM,CAAC;YAAC,IAAI,EAAE,MAAM,CAAA;SAAE,CAAC,CAAC;QAC/C,KAAK,CAAC,EAAE,MAAM,CAAC;KAChB,CAAC;CACH;AAED,MAAM,WAAW,oBAAoB;IACnC,MAAM,EAAE,MAAM,CAAC;IACf,UAAU,CAAC,EAAE;QACX,KAAK,CAAC,EAAE,MAAM,CAAC;QACf,MAAM,CAAC,EAAE,MAAM,CAAC;QAChB,UAAU,CAAC,EAAE,MAAM,CAAC;KACrB,CAAC;IACF,IAAI,CAAC,EAAE,MAAM,CAAC;IACd,KAAK,CAAC,EAAE,MAAM,CAAC;IACf,QAAQ,EAAE;QACR,KAAK,EAAE,MAAM,CAAC;QACd,OAAO,CAAC,EAAE,MAAM,CAAC;QACjB,MAAM,EAAE,MAAM,CAAC;QACf,qDAAqD;QACrD,SAAS,CAAC,EAAE,YAAY,EAAE,CAAC;QAC3B;;;;;WAKG;QACH,gBAAgB,CAAC,EAAE,MAAM,CAAC;KAC3B,CAAC;CACH;AAED,UAAU,gBAAgB;IACxB,IAAI,CAAC,EAAE,MAAM,CAAC,MAAM,EAAE,OAAO,CAAC,CAAC;IAC/B,MAAM,CAAC,EAAE;QAAE,GAAG,CAAC,EAAE,MAAM,CAAA;KAAE,CAAC;CAC3B;AAkBD;;;;;;;;;;;;;;GAcG;AACH,qBAAa,YAAY;IACvB,OAAO,CAAC,QAAQ,CAAC,KAAK,CAAS;IAC/B,OAAO,CAAC,QAAQ,CAAC,OAAO,CAAS;IACjC,OAAO,CAAC,QAAQ,CAAC,KAAK,CAAS;IAC/B,OAAO,CAAC,QAAQ,CAAC,gBAAgB,CAAqB;IACtD,OAAO,CAAC,QAAQ,CAAC,aAAa,CAAqB;IACnD,OAAO,CAAC,QAAQ,CAAC,gBAAgB,CAAS;IAC1C,OAAO,CAAC,QAAQ,CAAC,cAAc,CAAqB;gBAExC,OAAO,GAAE;QAAE,EAAE,CAAC,EAAE,MAAM,CAAC;QAAC,MAAM,CAAC,EAAE,kBAAkB,CAAA;KAAO;IA2BtE,EAAE,IAAI,MAAM;IAIN,OAAO,CACX,MAAM,EAAE,MAAM,EACd,OAAO,CAAC,EAAE,gBAAgB,GACzB,OAAO,CAAC,oBAAoB,CAAC;YAmElB,aAAa;YAkCb,WAAW;YA0BX,eAAe;YA2If,cAAc;YAWd,aAAa;CAM5B"}
|
package/dist/provider.js
ADDED
|
@@ -0,0 +1,361 @@
|
|
|
1
|
+
import { randomUUID } from "node:crypto";
|
|
2
|
+
const RETRIEVAL_TOOL_NAMES = new Set(["search_memory", "lobu_search_memory"]);
|
|
3
|
+
/**
|
|
4
|
+
* promptfoo custom provider that drives a Lobu agent end-to-end via the
|
|
5
|
+
* gateway's public Agent API:
|
|
6
|
+
*
|
|
7
|
+
* POST {gateway}/lobu/api/v1/agents → create session
|
|
8
|
+
* POST {gateway}/lobu/api/v1/agents/<id>/messages → send user message
|
|
9
|
+
* GET {gateway}/lobu/api/v1/agents/<id>/events → SSE stream of output
|
|
10
|
+
* DELETE {gateway}/lobu/api/v1/agents/<id> → cleanup
|
|
11
|
+
*
|
|
12
|
+
* One fresh thread per `callApi` invocation by default so promptfoo's repeat /
|
|
13
|
+
* scenario semantics see a clean slate. Tool-call traces are surfaced via the
|
|
14
|
+
* gateway's `tool_use` SSE event and populated on `metadata.toolCalls`. For
|
|
15
|
+
* retrieval tools (`search_memory`) the joined snippet text is also exposed as
|
|
16
|
+
* `metadata.retrievedContext` for promptfoo's RAG assertions.
|
|
17
|
+
*/
|
|
18
|
+
export class LobuProvider {
|
|
19
|
+
agent;
|
|
20
|
+
gateway;
|
|
21
|
+
token;
|
|
22
|
+
providerOverride;
|
|
23
|
+
modelOverride;
|
|
24
|
+
defaultTimeoutMs;
|
|
25
|
+
explicitThread;
|
|
26
|
+
constructor(options = {}) {
|
|
27
|
+
const cfg = options.config ?? {};
|
|
28
|
+
const agent = cfg.agent ?? process.env.LOBU_AGENT;
|
|
29
|
+
const gateway = cfg.gateway ?? process.env.LOBU_GATEWAY ?? "http://localhost:8787";
|
|
30
|
+
const token = cfg.token ?? process.env.LOBU_TOKEN;
|
|
31
|
+
if (!agent) {
|
|
32
|
+
throw new Error("@lobu/promptfoo-provider: missing agent. Set providers[].config.agent or LOBU_AGENT.");
|
|
33
|
+
}
|
|
34
|
+
if (!token) {
|
|
35
|
+
throw new Error("@lobu/promptfoo-provider: missing token. Set providers[].config.token or LOBU_TOKEN.");
|
|
36
|
+
}
|
|
37
|
+
this.agent = agent;
|
|
38
|
+
this.gateway = gateway.replace(/\/+$/, "");
|
|
39
|
+
this.token = token;
|
|
40
|
+
this.providerOverride = cfg.provider;
|
|
41
|
+
this.modelOverride = cfg.model;
|
|
42
|
+
this.defaultTimeoutMs = cfg.timeoutMs ?? 120_000;
|
|
43
|
+
this.explicitThread = cfg.thread;
|
|
44
|
+
}
|
|
45
|
+
id() {
|
|
46
|
+
return `lobu:${this.agent}`;
|
|
47
|
+
}
|
|
48
|
+
async callApi(prompt, context) {
|
|
49
|
+
const thread = this.explicitThread ?? `promptfoo-${randomUUID()}`;
|
|
50
|
+
const session = await this.createSession(thread);
|
|
51
|
+
// Multi-turn mode: `vars.transcript` is a string[] of sequential user
|
|
52
|
+
// turns replayed in one Lobu thread. Only the final turn's response is
|
|
53
|
+
// returned for assertion. When set, `prompt` is ignored — the transcript
|
|
54
|
+
// is the source of truth for what the user said.
|
|
55
|
+
const turns = extractTranscript(context) ?? [prompt];
|
|
56
|
+
try {
|
|
57
|
+
let lastResponse;
|
|
58
|
+
for (const turn of turns) {
|
|
59
|
+
lastResponse = await this.sendAndCollect(session, turn, this.defaultTimeoutMs);
|
|
60
|
+
// Bail on the first turn that errors — subsequent assertions would
|
|
61
|
+
// be meaningless against a broken thread.
|
|
62
|
+
if (lastResponse.error) {
|
|
63
|
+
return {
|
|
64
|
+
output: lastResponse.text,
|
|
65
|
+
error: lastResponse.error,
|
|
66
|
+
metadata: {
|
|
67
|
+
agent: this.agent,
|
|
68
|
+
thread,
|
|
69
|
+
traceId: lastResponse.traceId,
|
|
70
|
+
},
|
|
71
|
+
};
|
|
72
|
+
}
|
|
73
|
+
}
|
|
74
|
+
// `turns` is always non-empty (defaults to `[prompt]`), so lastResponse
|
|
75
|
+
// is defined here.
|
|
76
|
+
const response = lastResponse;
|
|
77
|
+
return {
|
|
78
|
+
output: response.text,
|
|
79
|
+
tokenUsage: response.tokens
|
|
80
|
+
? {
|
|
81
|
+
prompt: response.tokens.inputTokens,
|
|
82
|
+
completion: response.tokens.outputTokens,
|
|
83
|
+
total: response.tokens.totalTokens,
|
|
84
|
+
}
|
|
85
|
+
: undefined,
|
|
86
|
+
metadata: {
|
|
87
|
+
agent: this.agent,
|
|
88
|
+
thread,
|
|
89
|
+
traceId: response.traceId,
|
|
90
|
+
...(response.toolCalls && response.toolCalls.length > 0
|
|
91
|
+
? { toolCalls: response.toolCalls }
|
|
92
|
+
: {}),
|
|
93
|
+
...(response.retrievedContext
|
|
94
|
+
? { retrievedContext: response.retrievedContext }
|
|
95
|
+
: {}),
|
|
96
|
+
},
|
|
97
|
+
};
|
|
98
|
+
}
|
|
99
|
+
finally {
|
|
100
|
+
await this.deleteSession(session);
|
|
101
|
+
}
|
|
102
|
+
}
|
|
103
|
+
// ─── internals: gateway protocol ────────────────────────────────────────
|
|
104
|
+
async createSession(thread) {
|
|
105
|
+
const body = {
|
|
106
|
+
agentId: this.agent,
|
|
107
|
+
thread,
|
|
108
|
+
forceNew: true,
|
|
109
|
+
dryRun: true,
|
|
110
|
+
};
|
|
111
|
+
if (this.providerOverride)
|
|
112
|
+
body.provider = this.providerOverride;
|
|
113
|
+
if (this.modelOverride)
|
|
114
|
+
body.model = this.modelOverride;
|
|
115
|
+
const res = await fetch(`${this.gateway}/lobu/api/v1/agents`, {
|
|
116
|
+
method: "POST",
|
|
117
|
+
headers: {
|
|
118
|
+
"Content-Type": "application/json",
|
|
119
|
+
Authorization: `Bearer ${this.token}`,
|
|
120
|
+
},
|
|
121
|
+
body: JSON.stringify(body),
|
|
122
|
+
});
|
|
123
|
+
if (!res.ok) {
|
|
124
|
+
const text = await res.text().catch(() => "");
|
|
125
|
+
throw new Error(`Failed to create Lobu session (${res.status}): ${text}`);
|
|
126
|
+
}
|
|
127
|
+
const data = (await res.json());
|
|
128
|
+
return {
|
|
129
|
+
agentId: data.agentId,
|
|
130
|
+
sessionToken: data.token,
|
|
131
|
+
// Public Agent API is mounted at /lobu (mainApp serves org-scoped REST
|
|
132
|
+
// at /). See packages/server/src/server.ts.
|
|
133
|
+
base: `${this.gateway}/lobu/api/v1/agents/${data.agentId}`,
|
|
134
|
+
};
|
|
135
|
+
}
|
|
136
|
+
async sendMessage(session, content) {
|
|
137
|
+
const res = await fetch(`${session.base}/messages`, {
|
|
138
|
+
method: "POST",
|
|
139
|
+
headers: {
|
|
140
|
+
"Content-Type": "application/json",
|
|
141
|
+
Authorization: `Bearer ${session.sessionToken}`,
|
|
142
|
+
},
|
|
143
|
+
body: JSON.stringify({ content }),
|
|
144
|
+
});
|
|
145
|
+
if (!res.ok) {
|
|
146
|
+
const text = await res.text().catch(() => "");
|
|
147
|
+
throw new Error(`Failed to send message (${res.status}): ${text}`);
|
|
148
|
+
}
|
|
149
|
+
const data = (await res.json());
|
|
150
|
+
const traceId = data.traceparent?.split("-")[1];
|
|
151
|
+
return { traceId, messageId: data.messageId };
|
|
152
|
+
}
|
|
153
|
+
async collectResponse(session, timeoutMs, messageId) {
|
|
154
|
+
const controller = new AbortController();
|
|
155
|
+
const timer = setTimeout(() => controller.abort(), timeoutMs);
|
|
156
|
+
const start = Date.now();
|
|
157
|
+
let readerForCleanup;
|
|
158
|
+
try {
|
|
159
|
+
const res = await fetch(`${session.base}/events`, {
|
|
160
|
+
headers: { Authorization: `Bearer ${session.sessionToken}` },
|
|
161
|
+
signal: controller.signal,
|
|
162
|
+
});
|
|
163
|
+
if (!res.ok || !res.body) {
|
|
164
|
+
throw new Error(`SSE connection failed (${res.status})`);
|
|
165
|
+
}
|
|
166
|
+
const reader = res.body.getReader();
|
|
167
|
+
readerForCleanup = reader;
|
|
168
|
+
const decoder = new TextDecoder();
|
|
169
|
+
let buffer = "";
|
|
170
|
+
let currentEvent = "";
|
|
171
|
+
let text = "";
|
|
172
|
+
const toolCalls = [];
|
|
173
|
+
const retrievedSnippets = [];
|
|
174
|
+
const seenSnippetIds = new Set();
|
|
175
|
+
const matchesTarget = (eventMessageId) => {
|
|
176
|
+
if (!messageId)
|
|
177
|
+
return true;
|
|
178
|
+
return (typeof eventMessageId === "string" && eventMessageId === messageId);
|
|
179
|
+
};
|
|
180
|
+
const finalize = (extra = {}) => {
|
|
181
|
+
const result = {
|
|
182
|
+
text,
|
|
183
|
+
latencyMs: Date.now() - start,
|
|
184
|
+
...extra,
|
|
185
|
+
};
|
|
186
|
+
if (toolCalls.length > 0)
|
|
187
|
+
result.toolCalls = toolCalls;
|
|
188
|
+
if (retrievedSnippets.length > 0) {
|
|
189
|
+
result.retrievedContext = retrievedSnippets
|
|
190
|
+
.map((s) => s.text)
|
|
191
|
+
.join("\n\n");
|
|
192
|
+
}
|
|
193
|
+
return result;
|
|
194
|
+
};
|
|
195
|
+
while (true) {
|
|
196
|
+
const { done, value } = await reader.read();
|
|
197
|
+
if (done)
|
|
198
|
+
break;
|
|
199
|
+
buffer += decoder.decode(value, { stream: true });
|
|
200
|
+
const lines = buffer.split("\n");
|
|
201
|
+
buffer = lines.pop() ?? "";
|
|
202
|
+
for (const line of lines) {
|
|
203
|
+
if (line.startsWith("event: ")) {
|
|
204
|
+
currentEvent = line.slice(7).trim();
|
|
205
|
+
}
|
|
206
|
+
else if (line.startsWith("data: ") && currentEvent) {
|
|
207
|
+
const data = parseJSON(line.slice(6));
|
|
208
|
+
if (!data)
|
|
209
|
+
continue;
|
|
210
|
+
switch (currentEvent) {
|
|
211
|
+
case "output":
|
|
212
|
+
if (typeof data.content === "string" &&
|
|
213
|
+
matchesTarget(data.messageId)) {
|
|
214
|
+
text += data.content;
|
|
215
|
+
}
|
|
216
|
+
break;
|
|
217
|
+
case "tool_use": {
|
|
218
|
+
if (!matchesTarget(data.messageId))
|
|
219
|
+
break;
|
|
220
|
+
const call = normaliseToolUseEvent(data);
|
|
221
|
+
if (!call)
|
|
222
|
+
break;
|
|
223
|
+
toolCalls.push(call);
|
|
224
|
+
if (RETRIEVAL_TOOL_NAMES.has(call.name) &&
|
|
225
|
+
call.result_summary?.snippets) {
|
|
226
|
+
for (const snippet of call.result_summary.snippets) {
|
|
227
|
+
if (seenSnippetIds.has(snippet.id))
|
|
228
|
+
continue;
|
|
229
|
+
seenSnippetIds.add(snippet.id);
|
|
230
|
+
retrievedSnippets.push(snippet);
|
|
231
|
+
}
|
|
232
|
+
}
|
|
233
|
+
break;
|
|
234
|
+
}
|
|
235
|
+
case "complete": {
|
|
236
|
+
if (!matchesTarget(data.messageId))
|
|
237
|
+
break;
|
|
238
|
+
const usage = data.usage;
|
|
239
|
+
return finalize({
|
|
240
|
+
tokens: usage
|
|
241
|
+
? {
|
|
242
|
+
inputTokens: usage.input_tokens ?? usage.inputTokens,
|
|
243
|
+
outputTokens: usage.output_tokens ?? usage.outputTokens,
|
|
244
|
+
totalTokens: (usage.input_tokens ?? usage.inputTokens ?? 0) +
|
|
245
|
+
(usage.output_tokens ?? usage.outputTokens ?? 0),
|
|
246
|
+
}
|
|
247
|
+
: undefined,
|
|
248
|
+
});
|
|
249
|
+
}
|
|
250
|
+
case "error":
|
|
251
|
+
if (!matchesTarget(data.messageId))
|
|
252
|
+
break;
|
|
253
|
+
return finalize({
|
|
254
|
+
error: String(data.error ?? "Unknown error"),
|
|
255
|
+
});
|
|
256
|
+
}
|
|
257
|
+
currentEvent = "";
|
|
258
|
+
}
|
|
259
|
+
else if (line === "") {
|
|
260
|
+
currentEvent = "";
|
|
261
|
+
}
|
|
262
|
+
}
|
|
263
|
+
}
|
|
264
|
+
return finalize();
|
|
265
|
+
}
|
|
266
|
+
catch (err) {
|
|
267
|
+
if (err instanceof Error && err.name === "AbortError") {
|
|
268
|
+
return { text: "", latencyMs: Date.now() - start, error: "Timeout" };
|
|
269
|
+
}
|
|
270
|
+
throw err;
|
|
271
|
+
}
|
|
272
|
+
finally {
|
|
273
|
+
clearTimeout(timer);
|
|
274
|
+
if (readerForCleanup) {
|
|
275
|
+
await readerForCleanup.cancel().catch(() => undefined);
|
|
276
|
+
}
|
|
277
|
+
}
|
|
278
|
+
}
|
|
279
|
+
async sendAndCollect(session, content, timeoutMs) {
|
|
280
|
+
const { traceId, messageId } = await this.sendMessage(session, content);
|
|
281
|
+
const response = await this.collectResponse(session, timeoutMs, messageId);
|
|
282
|
+
response.traceId = traceId;
|
|
283
|
+
return response;
|
|
284
|
+
}
|
|
285
|
+
async deleteSession(session) {
|
|
286
|
+
await fetch(`${session.base}`, {
|
|
287
|
+
method: "DELETE",
|
|
288
|
+
headers: { Authorization: `Bearer ${session.sessionToken}` },
|
|
289
|
+
}).catch(() => undefined);
|
|
290
|
+
}
|
|
291
|
+
}
|
|
292
|
+
/**
|
|
293
|
+
* Pull a multi-turn transcript out of the promptfoo test context. Expects
|
|
294
|
+
* `vars.transcript` to be a non-empty `string[]`; anything else falls back
|
|
295
|
+
* to single-turn mode (returns undefined). Empty strings are filtered out
|
|
296
|
+
* so an accidental trailing newline in YAML doesn't send a blank turn.
|
|
297
|
+
*/
|
|
298
|
+
function extractTranscript(context) {
|
|
299
|
+
const raw = context?.vars?.transcript;
|
|
300
|
+
if (!Array.isArray(raw))
|
|
301
|
+
return undefined;
|
|
302
|
+
const turns = raw.filter((t) => typeof t === "string" && t.trim().length > 0);
|
|
303
|
+
return turns.length > 0 ? turns : undefined;
|
|
304
|
+
}
|
|
305
|
+
function parseJSON(str) {
|
|
306
|
+
try {
|
|
307
|
+
const parsed = JSON.parse(str);
|
|
308
|
+
if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) {
|
|
309
|
+
return parsed;
|
|
310
|
+
}
|
|
311
|
+
return null;
|
|
312
|
+
}
|
|
313
|
+
catch {
|
|
314
|
+
return null;
|
|
315
|
+
}
|
|
316
|
+
}
|
|
317
|
+
function normaliseToolUseEvent(data) {
|
|
318
|
+
const name = data.name;
|
|
319
|
+
if (typeof name !== "string" || !name)
|
|
320
|
+
return null;
|
|
321
|
+
const call = {
|
|
322
|
+
name,
|
|
323
|
+
input: data.input ?? null,
|
|
324
|
+
};
|
|
325
|
+
if (typeof data.toolCallId === "string")
|
|
326
|
+
call.toolCallId = data.toolCallId;
|
|
327
|
+
if (data.isError === true)
|
|
328
|
+
call.isError = true;
|
|
329
|
+
const summary = data.result_summary;
|
|
330
|
+
if (summary && typeof summary === "object") {
|
|
331
|
+
const parsed = {};
|
|
332
|
+
const ids = summary.event_ids;
|
|
333
|
+
if (Array.isArray(ids)) {
|
|
334
|
+
const numeric = ids.filter((id) => typeof id === "number");
|
|
335
|
+
if (numeric.length > 0)
|
|
336
|
+
parsed.event_ids = numeric;
|
|
337
|
+
}
|
|
338
|
+
const snippetsRaw = summary.snippets;
|
|
339
|
+
if (Array.isArray(snippetsRaw)) {
|
|
340
|
+
const snippets = [];
|
|
341
|
+
for (const item of snippetsRaw) {
|
|
342
|
+
if (!item || typeof item !== "object")
|
|
343
|
+
continue;
|
|
344
|
+
const id = item.id;
|
|
345
|
+
const text = item.text;
|
|
346
|
+
if (typeof id === "number" && typeof text === "string") {
|
|
347
|
+
snippets.push({ id, text });
|
|
348
|
+
}
|
|
349
|
+
}
|
|
350
|
+
if (snippets.length > 0)
|
|
351
|
+
parsed.snippets = snippets;
|
|
352
|
+
}
|
|
353
|
+
const errorMsg = summary.error;
|
|
354
|
+
if (typeof errorMsg === "string")
|
|
355
|
+
parsed.error = errorMsg;
|
|
356
|
+
if (Object.keys(parsed).length > 0)
|
|
357
|
+
call.result_summary = parsed;
|
|
358
|
+
}
|
|
359
|
+
return call;
|
|
360
|
+
}
|
|
361
|
+
//# sourceMappingURL=provider.js.map
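The event loop in `collectResponse` is easier to follow against a simplified, non-streaming version of the same parsing. The sketch below is an assumption-laden illustration and not the package's implementation. `parseSse` and `SseEvent` are hypothetical names; it parses a complete string rather than a stream; it accepts `\r\n` line endings, which the code above does not; and it resets the pending event name after a malformed payload.

```ts
interface SseEvent {
  event: string;
  data: Record<string, unknown>;
}

// Hypothetical helper: turns raw SSE text into { event, data } pairs,
// keeping only data payloads that parse to a plain JSON object.
function parseSse(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  let current = "";
  for (const line of raw.split(/\r?\n/)) {
    if (line.startsWith("event: ")) {
      current = line.slice(7).trim();
    } else if (line.startsWith("data: ") && current) {
      try {
        const parsed: unknown = JSON.parse(line.slice(6));
        if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) {
          events.push({ event: current, data: parsed as Record<string, unknown> });
        }
      } catch {
        // Malformed JSON payloads are skipped.
      }
      current = "";
    } else if (line === "") {
      // A blank line ends the current event.
      current = "";
    }
  }
  return events;
}

const sample =
  'event: output\ndata: {"content":"Hel"}\n\n' +
  'event: output\ndata: {"content":"lo"}\n\n' +
  'event: complete\ndata: {"usage":{"input_tokens":3,"output_tokens":2}}\n\n';

// Concatenate the `output` chunks in arrival order.
const text = parseSse(sample)
  .filter((e) => e.event === "output")
  .map((e) => String(e.data.content))
  .join("");
console.log(text);
```

A streaming reader would additionally carry the trailing partial line over to the next chunk, which is what the `buffer` variable does in the code above.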
|
package/dist/provider.js.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"provider.js","sourceRoot":"","sources":["../src/provider.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,UAAU,EAAE,MAAM,aAAa,CAAC;AAiFzC,MAAM,oBAAoB,GAAG,IAAI,GAAG,CAAC,CAAC,eAAe,EAAE,oBAAoB,CAAC,CAAC,CAAC;AAE9E;;;;;;;;;;;;;;GAcG;AACH,MAAM,OAAO,YAAY;IACN,KAAK,CAAS;IACd,OAAO,CAAS;IAChB,KAAK,CAAS;IACd,gBAAgB,CAAqB;IACrC,aAAa,CAAqB;IAClC,gBAAgB,CAAS;IACzB,cAAc,CAAqB;IAEpD,YAAY,UAAwD,EAAE;QACpE,MAAM,GAAG,GAAG,OAAO,CAAC,MAAM,IAAI,EAAE,CAAC;QACjC,MAAM,KAAK,GAAG,GAAG,CAAC,KAAK,IAAI,OAAO,CAAC,GAAG,CAAC,UAAU,CAAC;QAClD,MAAM,OAAO,GACX,GAAG,CAAC,OAAO,IAAI,OAAO,CAAC,GAAG,CAAC,YAAY,IAAI,uBAAuB,CAAC;QACrE,MAAM,KAAK,GAAG,GAAG,CAAC,KAAK,IAAI,OAAO,CAAC,GAAG,CAAC,UAAU,CAAC;QAElD,IAAI,CAAC,KAAK,EAAE,CAAC;YACX,MAAM,IAAI,KAAK,CACb,sFAAsF,CACvF,CAAC;QACJ,CAAC;QACD,IAAI,CAAC,KAAK,EAAE,CAAC;YACX,MAAM,IAAI,KAAK,CACb,sFAAsF,CACvF,CAAC;QACJ,CAAC;QAED,IAAI,CAAC,KAAK,GAAG,KAAK,CAAC;QACnB,IAAI,CAAC,OAAO,GAAG,OAAO,CAAC,OAAO,CAAC,MAAM,EAAE,EAAE,CAAC,CAAC;QAC3C,IAAI,CAAC,KAAK,GAAG,KAAK,CAAC;QACnB,IAAI,CAAC,gBAAgB,GAAG,GAAG,CAAC,QAAQ,CAAC;QACrC,IAAI,CAAC,aAAa,GAAG,GAAG,CAAC,KAAK,CAAC;QAC/B,IAAI,CAAC,gBAAgB,GAAG,GAAG,CAAC,SAAS,IAAI,OAAO,CAAC;QACjD,IAAI,CAAC,cAAc,GAAG,GAAG,CAAC,MAAM,CAAC;IACnC,CAAC;IAED,EAAE;QACA,OAAO,QAAQ,IAAI,CAAC,KAAK,EAAE,CAAC;IAC9B,CAAC;IAED,KAAK,CAAC,OAAO,CACX,MAAc,EACd,OAA0B;QAE1B,MAAM,MAAM,GAAG,IAAI,CAAC,cAAc,IAAI,aAAa,UAAU,EAAE,EAAE,CAAC;QAClE,MAAM,OAAO,GAAG,MAAM,IAAI,CAAC,aAAa,CAAC,MAAM,CAAC,CAAC;QAEjD,sEAAsE;QACtE,uEAAuE;QACvE,yEAAyE;QACzE,iDAAiD;QACjD,MAAM,KAAK,GAAG,iBAAiB,CAAC,OAAO,CAAC,IAAI,CAAC,MAAM,CAAC,CAAC;QAErD,IAAI,CAAC;YACH,IAAI,YAA2C,CAAC;YAEhD,KAAK,MAAM,IAAI,IAAI,KAAK,EAAE,CAAC;gBACzB,YAAY,GAAG,MAAM,IAAI,CAAC,cAAc,CACtC,OAAO,EACP,IAAI,EACJ,IAAI,CAAC,gBAAgB,CACtB,CAAC;gBAEF,mEAAmE;gBACnE,0CAA0C;gBAC1C,IAAI,YAAY,CAAC,KAAK,EAAE,CAAC;oBACvB,OAAO;wBACL,MAAM,EAAE,YAAY,CAAC,IAAI;wBACzB,KAAK,EAAE,YAAY,CAAC,KAAK;wBACzB,QAAQ,EAAE;4BACR,KAAK,EAAE,IAAI,CAAC,KAAK;4BACjB,MAAM;4BACN,OAAO,EAAE,YAAY,CAAC,OAAO;yBAC9B;qBACF,CAAC;gBACJ,CAAC;YACH,CAAC;YAED,wEAAwE;YAC
xE,mBAAmB;YACnB,MAAM,QAAQ,GAAG,YAAiC,CAAC;YAEnD,OAAO;gBACL,MAAM,EAAE,QAAQ,CAAC,IAAI;gBACrB,UAAU,EAAE,QAAQ,CAAC,MAAM;oBACzB,CAAC,CAAC;wBACE,MAAM,EAAE,QAAQ,CAAC,MAAM,CAAC,WAAW;wBACnC,UAAU,EAAE,QAAQ,CAAC,MAAM,CAAC,YAAY;wBACxC,KAAK,EAAE,QAAQ,CAAC,MAAM,CAAC,WAAW;qBACnC;oBACH,CAAC,CAAC,SAAS;gBACb,QAAQ,EAAE;oBACR,KAAK,EAAE,IAAI,CAAC,KAAK;oBACjB,MAAM;oBACN,OAAO,EAAE,QAAQ,CAAC,OAAO;oBACzB,GAAG,CAAC,QAAQ,CAAC,SAAS,IAAI,QAAQ,CAAC,SAAS,CAAC,MAAM,GAAG,CAAC;wBACrD,CAAC,CAAC,EAAE,SAAS,EAAE,QAAQ,CAAC,SAAS,EAAE;wBACnC,CAAC,CAAC,EAAE,CAAC;oBACP,GAAG,CAAC,QAAQ,CAAC,gBAAgB;wBAC3B,CAAC,CAAC,EAAE,gBAAgB,EAAE,QAAQ,CAAC,gBAAgB,EAAE;wBACjD,CAAC,CAAC,EAAE,CAAC;iBACR;aACF,CAAC;QACJ,CAAC;gBAAS,CAAC;YACT,MAAM,IAAI,CAAC,aAAa,CAAC,OAAO,CAAC,CAAC;QACpC,CAAC;IACH,CAAC;IAED,2EAA2E;IAEnE,KAAK,CAAC,aAAa,CAAC,MAAc;QACxC,MAAM,IAAI,GAA4B;YACpC,OAAO,EAAE,IAAI,CAAC,KAAK;YACnB,MAAM;YACN,QAAQ,EAAE,IAAI;YACd,MAAM,EAAE,IAAI;SACb,CAAC;QACF,IAAI,IAAI,CAAC,gBAAgB;YAAE,IAAI,CAAC,QAAQ,GAAG,IAAI,CAAC,gBAAgB,CAAC;QACjE,IAAI,IAAI,CAAC,aAAa;YAAE,IAAI,CAAC,KAAK,GAAG,IAAI,CAAC,aAAa,CAAC;QAExD,MAAM,GAAG,GAAG,MAAM,KAAK,CAAC,GAAG,IAAI,CAAC,OAAO,qBAAqB,EAAE;YAC5D,MAAM,EAAE,MAAM;YACd,OAAO,EAAE;gBACP,cAAc,EAAE,kBAAkB;gBAClC,aAAa,EAAE,UAAU,IAAI,CAAC,KAAK,EAAE;aACtC;YACD,IAAI,EAAE,IAAI,CAAC,SAAS,CAAC,IAAI,CAAC;SAC3B,CAAC,CAAC;QAEH,IAAI,CAAC,GAAG,CAAC,EAAE,EAAE,CAAC;YACZ,MAAM,IAAI,GAAG,MAAM,GAAG,CAAC,IAAI,EAAE,CAAC,KAAK,CAAC,GAAG,EAAE,CAAC,EAAE,CAAC,CAAC;YAC9C,MAAM,IAAI,KAAK,CAAC,kCAAkC,GAAG,CAAC,MAAM,MAAM,IAAI,EAAE,CAAC,CAAC;QAC5E,CAAC;QAED,MAAM,IAAI,GAAG,CAAC,MAAM,GAAG,CAAC,IAAI,EAAE,CAAuC,CAAC;QACtE,OAAO;YACL,OAAO,EAAE,IAAI,CAAC,OAAO;YACrB,YAAY,EAAE,IAAI,CAAC,KAAK;YACxB,uEAAuE;YACvE,4CAA4C;YAC5C,IAAI,EAAE,GAAG,IAAI,CAAC,OAAO,uBAAuB,IAAI,CAAC,OAAO,EAAE;SAC3D,CAAC;IACJ,CAAC;IAEO,KAAK,CAAC,WAAW,CACvB,OAAgB,EAChB,OAAe;QAEf,MAAM,GAAG,GAAG,MAAM,KAAK,CAAC,GAAG,OAAO,CAAC,IAAI,WAAW,EAAE;YAClD,MAAM,EAAE,MAAM;YACd,OAAO,EAAE;gBACP,cAAc,EAAE,kBAAkB;gBAClC,aAAa,EAAE,UAAU,OAAO,CAAC,YAAY,EAAE;aAChD;YACD,IAAI,EAAE,IAAI,CAAC,SAAS,CAAC,EAAE,OAAO
,EAAE,CAAC;SAClC,CAAC,CAAC;QAEH,IAAI,CAAC,GAAG,CAAC,EAAE,EAAE,CAAC;YACZ,MAAM,IAAI,GAAG,MAAM,GAAG,CAAC,IAAI,EAAE,CAAC,KAAK,CAAC,GAAG,EAAE,CAAC,EAAE,CAAC,CAAC;YAC9C,MAAM,IAAI,KAAK,CAAC,2BAA2B,GAAG,CAAC,MAAM,MAAM,IAAI,EAAE,CAAC,CAAC;QACrE,CAAC;QAED,MAAM,IAAI,GAAG,CAAC,MAAM,GAAG,CAAC,IAAI,EAAE,CAG7B,CAAC;QACF,MAAM,OAAO,GAAG,IAAI,CAAC,WAAW,EAAE,KAAK,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC;QAChD,OAAO,EAAE,OAAO,EAAE,SAAS,EAAE,IAAI,CAAC,SAAS,EAAE,CAAC;IAChD,CAAC;IAEO,KAAK,CAAC,eAAe,CAC3B,OAAgB,EAChB,SAAiB,EACjB,SAAkB;QAElB,MAAM,UAAU,GAAG,IAAI,eAAe,EAAE,CAAC;QACzC,MAAM,KAAK,GAAG,UAAU,CAAC,GAAG,EAAE,CAAC,UAAU,CAAC,KAAK,EAAE,EAAE,SAAS,CAAC,CAAC;QAC9D,MAAM,KAAK,GAAG,IAAI,CAAC,GAAG,EAAE,CAAC;QACzB,IAAI,gBAES,CAAC;QAEd,IAAI,CAAC;YACH,MAAM,GAAG,GAAG,MAAM,KAAK,CAAC,GAAG,OAAO,CAAC,IAAI,SAAS,EAAE;gBAChD,OAAO,EAAE,EAAE,aAAa,EAAE,UAAU,OAAO,CAAC,YAAY,EAAE,EAAE;gBAC5D,MAAM,EAAE,UAAU,CAAC,MAAM;aAC1B,CAAC,CAAC;YAEH,IAAI,CAAC,GAAG,CAAC,EAAE,IAAI,CAAC,GAAG,CAAC,IAAI,EAAE,CAAC;gBACzB,MAAM,IAAI,KAAK,CAAC,0BAA0B,GAAG,CAAC,MAAM,GAAG,CAAC,CAAC;YAC3D,CAAC;YAED,MAAM,MAAM,GAAG,GAAG,CAAC,IAAI,CAAC,SAAS,EAAE,CAAC;YACpC,gBAAgB,GAAG,MAAM,CAAC;YAC1B,MAAM,OAAO,GAAG,IAAI,WAAW,EAAE,CAAC;YAClC,IAAI,MAAM,GAAG,EAAE,CAAC;YAChB,IAAI,YAAY,GAAG,EAAE,CAAC;YACtB,IAAI,IAAI,GAAG,EAAE,CAAC;YACd,MAAM,SAAS,GAAmB,EAAE,CAAC;YACrC,MAAM,iBAAiB,GAAwC,EAAE,CAAC;YAClE,MAAM,cAAc,GAAG,IAAI,GAAG,EAAU,CAAC;YAEzC,MAAM,aAAa,GAAG,CAAC,cAAuB,EAAW,EAAE;gBACzD,IAAI,CAAC,SAAS;oBAAE,OAAO,IAAI,CAAC;gBAC5B,OAAO,CACL,OAAO,cAAc,KAAK,QAAQ,IAAI,cAAc,KAAK,SAAS,CACnE,CAAC;YACJ,CAAC,CAAC;YAEF,MAAM,QAAQ,GAAG,CACf,QAAoC,EAAE,EACnB,EAAE;gBACrB,MAAM,MAAM,GAAsB;oBAChC,IAAI;oBACJ,SAAS,EAAE,IAAI,CAAC,GAAG,EAAE,GAAG,KAAK;oBAC7B,GAAG,KAAK;iBACT,CAAC;gBACF,IAAI,SAAS,CAAC,MAAM,GAAG,CAAC;oBAAE,MAAM,CAAC,SAAS,GAAG,SAAS,CAAC;gBACvD,IAAI,iBAAiB,CAAC,MAAM,GAAG,CAAC,EAAE,CAAC;oBACjC,MAAM,CAAC,gBAAgB,GAAG,iBAAiB;yBACxC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,IAAI,CAAC;yBAClB,IAAI,CAAC,MAAM,CAAC,CAAC;gBAClB,CAAC;gBACD,OAAO,MAAM,CAAC;YAChB,CAAC,CAAC;YAEF,OAAO,IAAI,EAA
E,CAAC;gBACZ,MAAM,EAAE,IAAI,EAAE,KAAK,EAAE,GAAG,MAAM,MAAM,CAAC,IAAI,EAAE,CAAC;gBAC5C,IAAI,IAAI;oBAAE,MAAM;gBAEhB,MAAM,IAAI,OAAO,CAAC,MAAM,CAAC,KAAK,EAAE,EAAE,MAAM,EAAE,IAAI,EAAE,CAAC,CAAC;gBAClD,MAAM,KAAK,GAAG,MAAM,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;gBACjC,MAAM,GAAG,KAAK,CAAC,GAAG,EAAE,IAAI,EAAE,CAAC;gBAE3B,KAAK,MAAM,IAAI,IAAI,KAAK,EAAE,CAAC;oBACzB,IAAI,IAAI,CAAC,UAAU,CAAC,SAAS,CAAC,EAAE,CAAC;wBAC/B,YAAY,GAAG,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,IAAI,EAAE,CAAC;oBACtC,CAAC;yBAAM,IAAI,IAAI,CAAC,UAAU,CAAC,QAAQ,CAAC,IAAI,YAAY,EAAE,CAAC;wBACrD,MAAM,IAAI,GAAG,SAAS,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC;wBACtC,IAAI,CAAC,IAAI;4BAAE,SAAS;wBAEpB,QAAQ,YAAY,EAAE,CAAC;4BACrB,KAAK,QAAQ;gCACX,IACE,OAAO,IAAI,CAAC,OAAO,KAAK,QAAQ;oCAChC,aAAa,CAAC,IAAI,CAAC,SAAS,CAAC,EAC7B,CAAC;oCACD,IAAI,IAAI,IAAI,CAAC,OAAO,CAAC;gCACvB,CAAC;gCACD,MAAM;4BACR,KAAK,UAAU,CAAC,CAAC,CAAC;gCAChB,IAAI,CAAC,aAAa,CAAC,IAAI,CAAC,SAAS,CAAC;oCAAE,MAAM;gCAC1C,MAAM,IAAI,GAAG,qBAAqB,CAAC,IAAI,CAAC,CAAC;gCACzC,IAAI,CAAC,IAAI;oCAAE,MAAM;gCACjB,SAAS,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC;gCACrB,IACE,oBAAoB,CAAC,GAAG,CAAC,IAAI,CAAC,IAAI,CAAC;oCACnC,IAAI,CAAC,cAAc,EAAE,QAAQ,EAC7B,CAAC;oCACD,KAAK,MAAM,OAAO,IAAI,IAAI,CAAC,cAAc,CAAC,QAAQ,EAAE,CAAC;wCACnD,IAAI,cAAc,CAAC,GAAG,CAAC,OAAO,CAAC,EAAE,CAAC;4CAAE,SAAS;wCAC7C,cAAc,CAAC,GAAG,CAAC,OAAO,CAAC,EAAE,CAAC,CAAC;wCAC/B,iBAAiB,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC;oCAClC,CAAC;gCACH,CAAC;gCACD,MAAM;4BACR,CAAC;4BACD,KAAK,UAAU,CAAC,CAAC,CAAC;gCAChB,IAAI,CAAC,aAAa,CAAC,IAAI,CAAC,SAAS,CAAC;oCAAE,MAAM;gCAC1C,MAAM,KAAK,GAAG,IAAI,CAAC,KAA2C,CAAC;gCAC/D,OAAO,QAAQ,CAAC;oCACd,MAAM,EAAE,KAAK;wCACX,CAAC,CAAC;4CACE,WAAW,EAAE,KAAK,CAAC,YAAY,IAAI,KAAK,CAAC,WAAW;4CACpD,YAAY,EAAE,KAAK,CAAC,aAAa,IAAI,KAAK,CAAC,YAAY;4CACvD,WAAW,EACT,CAAC,KAAK,CAAC,YAAY,IAAI,KAAK,CAAC,WAAW,IAAI,CAAC,CAAC;gDAC9C,CAAC,KAAK,CAAC,aAAa,IAAI,KAAK,CAAC,YAAY,IAAI,CAAC,CAAC;yCACnD;wCACH,CAAC,CAAC,SAAS;iCACd,CAAC,CAAC;4BACL,CAAC;4BACD,KAAK,OAAO;gCACV,IAAI,CAAC,aAAa,CAAC,IAAI,CAAC,SAAS,CAAC;oCAAE,MAAM;gCAC1C,OAAO,QAAQ,CAAC;oCACd,KAAK,EAAE,MAA
M,CAAC,IAAI,CAAC,KAAK,IAAI,eAAe,CAAC;iCAC7C,CAAC,CAAC;wBACP,CAAC;wBACD,YAAY,GAAG,EAAE,CAAC;oBACpB,CAAC;yBAAM,IAAI,IAAI,KAAK,EAAE,EAAE,CAAC;wBACvB,YAAY,GAAG,EAAE,CAAC;oBACpB,CAAC;gBACH,CAAC;YACH,CAAC;YAED,OAAO,QAAQ,EAAE,CAAC;QACpB,CAAC;QAAC,OAAO,GAAY,EAAE,CAAC;YACtB,IAAI,GAAG,YAAY,KAAK,IAAI,GAAG,CAAC,IAAI,KAAK,YAAY,EAAE,CAAC;gBACtD,OAAO,EAAE,IAAI,EAAE,EAAE,EAAE,SAAS,EAAE,IAAI,CAAC,GAAG,EAAE,GAAG,KAAK,EAAE,KAAK,EAAE,SAAS,EAAE,CAAC;YACvE,CAAC;YACD,MAAM,GAAG,CAAC;QACZ,CAAC;gBAAS,CAAC;YACT,YAAY,CAAC,KAAK,CAAC,CAAC;YACpB,IAAI,gBAAgB,EAAE,CAAC;gBACrB,MAAM,gBAAgB,CAAC,MAAM,EAAE,CAAC,KAAK,CAAC,GAAG,EAAE,CAAC,SAAS,CAAC,CAAC;YACzD,CAAC;QACH,CAAC;IACH,CAAC;IAEO,KAAK,CAAC,cAAc,CAC1B,OAAgB,EAChB,OAAe,EACf,SAAiB;QAEjB,MAAM,EAAE,OAAO,EAAE,SAAS,EAAE,GAAG,MAAM,IAAI,CAAC,WAAW,CAAC,OAAO,EAAE,OAAO,CAAC,CAAC;QACxE,MAAM,QAAQ,GAAG,MAAM,IAAI,CAAC,eAAe,CAAC,OAAO,EAAE,SAAS,EAAE,SAAS,CAAC,CAAC;QAC3E,QAAQ,CAAC,OAAO,GAAG,OAAO,CAAC;QAC3B,OAAO,QAAQ,CAAC;IAClB,CAAC;IAEO,KAAK,CAAC,aAAa,CAAC,OAAgB;QAC1C,MAAM,KAAK,CAAC,GAAG,OAAO,CAAC,IAAI,EAAE,EAAE;YAC7B,MAAM,EAAE,QAAQ;YAChB,OAAO,EAAE,EAAE,aAAa,EAAE,UAAU,OAAO,CAAC,YAAY,EAAE,EAAE;SAC7D,CAAC,CAAC,KAAK,CAAC,GAAG,EAAE,CAAC,SAAS,CAAC,CAAC;IAC5B,CAAC;CACF;AAQD;;;;;GAKG;AACH,SAAS,iBAAiB,CACxB,OAAqC;IAErC,MAAM,GAAG,GAAG,OAAO,EAAE,IAAI,EAAE,UAAU,CAAC;IACtC,IAAI,CAAC,KAAK,CAAC,OAAO,CAAC,GAAG,CAAC;QAAE,OAAO,SAAS,CAAC;IAC1C,MAAM,KAAK,GAAG,GAAG,CAAC,MAAM,CACtB,CAAC,CAAC,EAAe,EAAE,CAAC,OAAO,CAAC,KAAK,QAAQ,IAAI,CAAC,CAAC,IAAI,EAAE,CAAC,MAAM,GAAG,CAAC,CACjE,CAAC;IACF,OAAO,KAAK,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,SAAS,CAAC;AAC9C,CAAC;AAED,SAAS,SAAS,CAAC,GAAW;IAC5B,IAAI,CAAC;QACH,MAAM,MAAM,GAAY,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC;QACxC,IAAI,MAAM,IAAI,OAAO,MAAM,KAAK,QAAQ,IAAI,CAAC,KAAK,CAAC,OAAO,CAAC,MAAM,CAAC,EAAE,CAAC;YACnE,OAAO,MAAiC,CAAC;QAC3C,CAAC;QACD,OAAO,IAAI,CAAC;IACd,CAAC;IAAC,MAAM,CAAC;QACP,OAAO,IAAI,CAAC;IACd,CAAC;AACH,CAAC;AAED,SAAS,qBAAqB,CAC5B,IAA6B;IAE7B,MAAM,IAAI,GAAG,IAAI,CAAC,IAAI,CAAC;IACvB,IAAI,OAAO,IAAI,KAAK,QAAQ,IAAI,CAAC,IAAI;QA
AE,OAAO,IAAI,CAAC;IACnD,MAAM,IAAI,GAAiB;QACzB,IAAI;QACJ,KAAK,EAAE,IAAI,CAAC,KAAK,IAAI,IAAI;KAC1B,CAAC;IACF,IAAI,OAAO,IAAI,CAAC,UAAU,KAAK,QAAQ;QAAE,IAAI,CAAC,UAAU,GAAG,IAAI,CAAC,UAAU,CAAC;IAC3E,IAAI,IAAI,CAAC,OAAO,KAAK,IAAI;QAAE,IAAI,CAAC,OAAO,GAAG,IAAI,CAAC;IAC/C,MAAM,OAAO,GAAG,IAAI,CAAC,cAAc,CAAC;IACpC,IAAI,OAAO,IAAI,OAAO,OAAO,KAAK,QAAQ,EAAE,CAAC;QAC3C,MAAM,MAAM,GAAgD,EAAE,CAAC;QAC/D,MAAM,GAAG,GAAI,OAAmC,CAAC,SAAS,CAAC;QAC3D,IAAI,KAAK,CAAC,OAAO,CAAC,GAAG,CAAC,EAAE,CAAC;YACvB,MAAM,OAAO,GAAG,GAAG,CAAC,MAAM,CAAC,CAAC,EAAE,EAAgB,EAAE,CAAC,OAAO,EAAE,KAAK,QAAQ,CAAC,CAAC;YACzE,IAAI,OAAO,CAAC,MAAM,GAAG,CAAC;gBAAE,MAAM,CAAC,SAAS,GAAG,OAAO,CAAC;QACrD,CAAC;QACD,MAAM,WAAW,GAAI,OAAkC,CAAC,QAAQ,CAAC;QACjE,IAAI,KAAK,CAAC,OAAO,CAAC,WAAW,CAAC,EAAE,CAAC;YAC/B,MAAM,QAAQ,GAAwC,EAAE,CAAC;YACzD,KAAK,MAAM,IAAI,IAAI,WAAW,EAAE,CAAC;gBAC/B,IAAI,CAAC,IAAI,IAAI,OAAO,IAAI,KAAK,QAAQ;oBAAE,SAAS;gBAChD,MAAM,EAAE,GAAI,IAAyB,CAAC,EAAE,CAAC;gBACzC,MAAM,IAAI,GAAI,IAA2B,CAAC,IAAI,CAAC;gBAC/C,IAAI,OAAO,EAAE,KAAK,QAAQ,IAAI,OAAO,IAAI,KAAK,QAAQ,EAAE,CAAC;oBACvD,QAAQ,CAAC,IAAI,CAAC,EAAE,EAAE,EAAE,IAAI,EAAE,CAAC,CAAC;gBAC9B,CAAC;YACH,CAAC;YACD,IAAI,QAAQ,CAAC,MAAM,GAAG,CAAC;gBAAE,MAAM,CAAC,QAAQ,GAAG,QAAQ,CAAC;QACtD,CAAC;QACD,MAAM,QAAQ,GAAI,OAA+B,CAAC,KAAK,CAAC;QACxD,IAAI,OAAO,QAAQ,KAAK,QAAQ;YAAE,MAAM,CAAC,KAAK,GAAG,QAAQ,CAAC;QAC1D,IAAI,MAAM,CAAC,IAAI,CAAC,MAAM,CAAC,CAAC,MAAM,GAAG,CAAC;YAAE,IAAI,CAAC,cAAc,GAAG,MAAM,CAAC;IACnE,CAAC;IACD,OAAO,IAAI,CAAC;AACd,CAAC"}
package/package.json
ADDED
@@ -0,0 +1,45 @@
+{
+  "name": "@lobu/promptfoo-provider",
+  "version": "8.0.0",
+  "description": "promptfoo custom provider for running evals against a Lobu agent",
+  "type": "module",
+  "main": "dist/index.js",
+  "types": "dist/index.d.ts",
+  "exports": {
+    ".": {
+      "import": {
+        "types": "./dist/index.d.ts",
+        "default": "./dist/index.js"
+      },
+      "require": {
+        "types": "./dist/index.d.ts",
+        "default": "./dist/index.js"
+      }
+    }
+  },
+  "files": [
+    "dist"
+  ],
+  "scripts": {
+    "build": "tsc",
+    "typecheck": "tsc --noEmit",
+    "clean": "rm -rf dist"
+  },
+  "devDependencies": {
+    "@types/node": "^20.10.0",
+    "typescript": "^5.3.3"
+  },
+  "engines": {
+    "node": ">=18"
+  },
+  "license": "BUSL-1.1",
+  "publishConfig": {
+    "access": "public"
+  },
+  "homepage": "https://github.com/lobu-ai/lobu",
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/lobu-ai/lobu.git",
+    "directory": "packages/promptfoo-provider"
+  }
+}