@artemiskit/adapter-deepagents 0.2.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,133 @@
1
+ # @artemiskit/adapter-deepagents
2
+
3
+ ## 0.2.0
4
+
5
+ ### Minor Changes
6
+
7
+ - ## v0.3.0 - SDK, Guardian Mode & OWASP Compliance
8
+
9
+ This release delivers the full programmatic SDK, runtime protection with Guardian Mode, OWASP LLM Top 10 2025 attack vectors, and agentic framework adapters.
10
+
11
+ ### Programmatic SDK (`@artemiskit/sdk`)
12
+
13
+ The new SDK package provides a complete programmatic API for LLM evaluation:
14
+
15
+ - **ArtemisKit class** with `run()`, `redteam()`, and `stress()` methods
16
+ - **Jest integration** with custom matchers (`toPassAllCases`, `toHaveSuccessRate`, etc.)
17
+ - **Vitest integration** with identical matchers
18
+ - **Event handling** for real-time progress updates
19
+ - **13 custom matchers** for run, red team, and stress test assertions
20
+
21
+ ```typescript
22
+ import { ArtemisKit } from "@artemiskit/sdk";
23
+ import { jestMatchers } from "@artemiskit/sdk/jest";
24
+
25
+ expect.extend(jestMatchers);
26
+
27
+ const kit = new ArtemisKit({ provider: "openai", model: "gpt-4o" });
28
+ const results = await kit.run({ scenario: "./tests.yaml" });
29
+ expect(results).toPassAllCases();
30
+ ```
31
+
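To make a matcher such as `toHaveSuccessRate` more concrete, here is a minimal sketch of how it could be written against Jest's custom-matcher contract (a function returning `pass` and `message`). The `RunResultLike` shape, and the reading of the argument as a fraction between 0 and 1, are assumptions for illustration; the changelog does not show the SDK's real result type or matcher signature.

```typescript
// Hypothetical result shape; the SDK's real run result type is not shown in this changelog.
interface RunResultLike {
  cases: { name: string; passed: boolean }[];
}

// Follows Jest's custom-matcher contract: return { pass, message }.
// minRate is assumed to be a fraction between 0 and 1.
function toHaveSuccessRate(received: RunResultLike, minRate: number) {
  const total = received.cases.length;
  const passed = received.cases.filter((c) => c.passed).length;
  // An empty run is treated as a 0 success rate rather than dividing by zero.
  const rate = total === 0 ? 0 : passed / total;
  return {
    pass: rate >= minRate,
    message: () =>
      `expected success rate of at least ${minRate}, received ${rate.toFixed(2)} (${passed}/${total})`,
  };
}

// Direct call for illustration; under Jest this would be registered via expect.extend.
const outcome = toHaveSuccessRate(
  { cases: [{ name: "a", passed: true }, { name: "b", passed: false }] },
  0.75,
);
console.log(outcome.pass, outcome.message());
```

Registering an object of such functions with `expect.extend` is what makes `expect(results).toHaveSuccessRate(...)` available in a test file.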
32
+ ### Guardian Mode (Runtime Protection)
33
+
34
+ New Guardian Mode provides runtime protection for AI/LLM applications:
35
+
36
+ - **Three operating modes**: `testing`, `guardian`, `hybrid`
37
+ - **Prompt injection detection** and blocking
38
+ - **PII detection & redaction** (email, SSN, phone, API keys)
39
+ - **Action validation** for agent tool/function calls
40
+ - **Intent classification** with risk assessment
41
+ - **Circuit breaker** for automatic blocking on repeated violations
42
+ - **Rate limiting** and **cost limiting**
43
+ - **Custom policies** via TypeScript or YAML
44
+
45
+ ```typescript
46
+ import { createGuardian } from "@artemiskit/sdk/guardian";
47
+
48
+ const guardian = createGuardian({ mode: "guardian", blockOnFailure: true });
49
+ const protectedClient = guardian.protect(myLLMClient);
50
+ ```
51
+
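The PII redaction bullet above can be pictured as pattern-based substitution. The following is a rough sketch under that assumption; `PII_PATTERNS` and `redactPII` are hypothetical names, the two regular expressions are illustrative only, and none of this is taken from Guardian Mode's actual implementation.

```typescript
// Illustrative patterns only; real-world PII detection needs much broader coverage.
const PII_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g }, // US-style 123-45-6789
];

// Replace every match of every pattern with a bracketed label.
function redactPII(text: string): string {
  return PII_PATTERNS.reduce(
    (current, { label, pattern }) => current.replace(pattern, `[${label}]`),
    text,
  );
}

console.log(redactPII("Contact jane@example.com, SSN 123-45-6789"));
// prints: Contact [EMAIL], SSN [SSN]
```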
52
+ ### OWASP LLM Top 10 2025 Attack Vectors
53
+
54
+ New red team mutations aligned with OWASP LLM Top 10 2025:
55
+
56
+ | Mutation | OWASP | Description |
57
+ | -------------------- | ----- | ------------------------------ |
58
+ | `bad-likert-judge` | LLM01 | Exploit evaluation capability |
59
+ | `crescendo` | LLM01 | Multi-turn gradual escalation |
60
+ | `deceptive-delight` | LLM01 | Positive framing bypass |
61
+ | `system-extraction` | LLM07 | System prompt leakage |
62
+ | `output-injection` | LLM05 | XSS, SQLi in output |
63
+ | `excessive-agency` | LLM06 | Unauthorized action claims |
64
+ | `hallucination-trap` | LLM09 | Confident fabrication triggers |
65
+
66
+ ```bash
67
+ akit redteam scenario.yaml --owasp LLM01,LLM05
68
+ akit redteam scenario.yaml --owasp-full
69
+ ```
70
+
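The table above pairs each mutation with one OWASP category, so a category filter like the one passed to `--owasp` can be thought of as a lookup over that table. This is a minimal sketch assuming such a lookup; `MUTATION_CATEGORIES` and `selectMutations` are hypothetical names and do not describe how the CLI is actually implemented.

```typescript
// Hypothetical lookup copied from the table above; not ArtemisKit's internal data structure.
const MUTATION_CATEGORIES: Record<string, string> = {
  "bad-likert-judge": "LLM01",
  "crescendo": "LLM01",
  "deceptive-delight": "LLM01",
  "system-extraction": "LLM07",
  "output-injection": "LLM05",
  "excessive-agency": "LLM06",
  "hallucination-trap": "LLM09",
};

// Parse a comma-separated category list and return the mutations in any listed category.
function selectMutations(owaspArg: string): string[] {
  const wanted = new Set(
    owaspArg
      .split(",")
      .map((c) => c.trim().toUpperCase())
      .filter((c) => c.length > 0),
  );
  return Object.entries(MUTATION_CATEGORIES)
    .filter(([, category]) => wanted.has(category))
    .map(([mutation]) => mutation);
}

console.log(selectMutations("LLM01,LLM05"));
```

With the table as written, `LLM01,LLM05` selects the three LLM01 mutations plus `output-injection`.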
71
+ ### Agentic Framework Adapters
72
+
73
+ New adapters for testing agentic AI systems:
74
+
75
+ **LangChain Adapter** (`@artemiskit/adapter-langchain`)
76
+
77
+ - Test chains, agents, and runnables
78
+ - Capture intermediate steps and tool usage
79
+ - Support for LCEL, ReAct agents, RAG chains
80
+
81
+ **DeepAgents Adapter** (`@artemiskit/adapter-deepagents`)
82
+
83
+ - Test multi-agent systems and workflows
84
+ - Capture agent traces and inter-agent messages
85
+ - Support for sequential, parallel, and hierarchical workflows
86
+
87
+ ```typescript
88
+ import { createLangChainAdapter } from "@artemiskit/adapter-langchain";
89
+ import { createDeepAgentsAdapter } from "@artemiskit/adapter-deepagents";
90
+
91
+ const adapter = createLangChainAdapter(myChain, {
92
+ captureIntermediateSteps: true,
93
+ });
94
+ const result = await adapter.generate({ prompt: "Test query" });
95
+ ```
96
+
97
+ ### Supabase Storage Enhancements
98
+
99
+ Enhanced cloud storage capabilities:
100
+
101
+ - **Analytics tables** for metrics tracking
102
+ - **Case results table** for granular analysis
103
+ - **Baseline management** for regression detection
104
+ - **Trend analysis** queries
105
+
106
+ ### Bug Fixes
107
+
108
+ - **adapter-openai**: Use `max_completion_tokens` for newer OpenAI models (o1, o3, gpt-4.5)
109
+ - **redteam**: Resolve TypeScript and flaky test issues in OWASP mutations
110
+ - **adapters**: Fix TypeScript build errors for agentic adapters
111
+ - **core**: Add `langchain` and `deepagents` to ProviderType union
112
+
113
+ ### Examples
114
+
115
+ New comprehensive examples organized by feature:
116
+
117
+ - `examples/guardian/` - Guardian Mode examples (testing, guardian, hybrid modes)
118
+ - `examples/sdk/` - SDK usage examples (Jest, Vitest, events)
119
+ - `examples/adapters/` - Agentic adapter examples
120
+ - `examples/owasp/` - OWASP LLM Top 10 test scenarios
121
+
122
+ ### Documentation
123
+
124
+ - Complete SDK documentation with API reference
125
+ - Guardian Mode guide with all three modes explained
126
+ - Agentic adapters documentation (LangChain, DeepAgents)
127
+ - Test matchers reference for Jest/Vitest
128
+ - OWASP LLM Top 10 testing scenarios
129
+
130
+ ### Patch Changes
131
+
132
+ - Updated dependencies
133
+ - @artemiskit/core@0.3.0
package/README.md ADDED
@@ -0,0 +1,198 @@
1
+ # @artemiskit/adapter-deepagents
2
+
3
+ DeepAgents.js adapter for ArtemisKit - Test and evaluate DeepAgents multi-agent systems.
4
+
5
+ ## Installation
6
+
7
+ ```bash
8
+ bun add @artemiskit/adapter-deepagents
9
+ # or
10
+ npm install @artemiskit/adapter-deepagents
11
+ ```
12
+
13
+ ## Quick Start
14
+
15
+ ### Testing a Multi-Agent Team
16
+
17
+ ```typescript
18
+ import { createDeepAgentsAdapter } from '@artemiskit/adapter-deepagents';
19
+ import { createTeam, Agent } from 'deepagents';
20
+
21
+ // Create your DeepAgents team
22
+ const researcher = new Agent({
23
+ name: 'researcher',
24
+ role: 'Research specialist',
25
+ tools: [webSearchTool, documentReader],
26
+ });
27
+
28
+ const writer = new Agent({
29
+ name: 'writer',
30
+ role: 'Content writer',
31
+ });
32
+
33
+ const team = createTeam({
34
+ agents: [researcher, writer],
35
+ workflow: 'sequential',
36
+ });
37
+
38
+ // Wrap with ArtemisKit adapter
39
+ const adapter = createDeepAgentsAdapter(team, {
40
+ name: 'content-team',
41
+ captureTraces: true,
42
+ captureMessages: true,
43
+ });
44
+
45
+ // Use in ArtemisKit tests
46
+ const result = await adapter.generate({
47
+ prompt: 'Write an article about AI testing best practices',
48
+ });
49
+
50
+ console.log(result.text); // Final article content
51
+
52
+ // Access execution metadata
53
+ const metadata = result.raw.metadata;
54
+ console.log(metadata.agentsInvolved); // ['researcher', 'writer']
55
+ console.log(metadata.totalToolCalls); // Number of tool invocations
56
+ console.log(metadata.totalMessages); // Messages exchanged between agents
57
+ ```
58
+
59
+ ### Testing a Hierarchical Agent System
60
+
61
+ ```typescript
62
+ import { createDeepAgentsAdapter } from '@artemiskit/adapter-deepagents';
63
+ import { createHierarchy, Coordinator, Worker } from 'deepagents';
64
+
65
+ // Create hierarchical system
66
+ const coordinator = new Coordinator({
67
+ name: 'manager',
68
+ strategy: 'delegate-and-synthesize',
69
+ });
70
+
71
+ const workers = [
72
+ new Worker({ name: 'analyst', specialty: 'data analysis' }),
73
+ new Worker({ name: 'visualizer', specialty: 'chart creation' }),
74
+ ];
75
+
76
+ const hierarchy = createHierarchy({ coordinator, workers });
77
+
78
+ const adapter = createDeepAgentsAdapter(hierarchy, {
79
+ name: 'analysis-team',
80
+ executionTimeout: 120000, // 2 minutes
81
+ });
82
+
83
+ const result = await adapter.generate({
84
+ prompt: 'Analyze sales data and create visualizations',
85
+ });
86
+ ```
87
+
88
+ ### Testing with Execution Tracing
89
+
90
+ ```typescript
91
+ const adapter = createDeepAgentsAdapter(mySystem, {
92
+ name: 'traced-system',
93
+ captureTraces: true,
94
+ captureMessages: true,
95
+ });
96
+
97
+ const result = await adapter.generate({ prompt: 'Complex task' });
98
+
99
+ // Access detailed execution information
100
+ const { metadata } = result.raw;
101
+
102
+ // See which agents were involved
103
+ console.log('Agents:', metadata.agentsInvolved);
104
+
105
+ // See tools used
106
+ console.log('Tools:', metadata.toolsUsed);
107
+
108
+ // See full trace of execution
109
+ for (const trace of metadata.traces) {
110
+ console.log(`${trace.agent}: ${trace.action}`, trace.output);
111
+ }
112
+
113
+ // See inter-agent communication
114
+ for (const msg of metadata.messages) {
115
+ console.log(`${msg.from} -> ${msg.to}: ${msg.content}`);
116
+ }
117
+ ```
118
+
119
+ ## Configuration Options
120
+
121
+ | Option | Type | Default | Description |
122
+ | ------------------- | --------- | ---------- | ---------------------------------------- |
123
+ | `name` | `string` | - | Identifier for the agent system |
124
+ | `captureTraces` | `boolean` | `true` | Capture agent execution traces |
125
+ | `captureMessages` | `boolean` | `true` | Capture inter-agent messages |
126
+ | `executionTimeout` | `number` | `300000` | Max execution time (ms) |
127
+ | `inputTransformer` | `string` | - | Custom input transformer function |
128
+ | `outputTransformer` | `string` | - | Custom output transformer function |
129
+
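An option like `executionTimeout` is commonly enforced by racing the work against a timer. The sketch below shows that general pattern; `withTimeout` is a hypothetical helper, and the README does not say how the adapter implements its timeout internally.

```typescript
// Hypothetical helper: reject if `work` does not settle within `timeoutMs` milliseconds.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Execution timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
  });
  // Whichever settles first wins; the timer is cleared either way so it cannot leak.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Example: a task that resolves well inside its budget.
withTimeout(Promise.resolve("done"), 1000).then((value) => console.log(value));
```

Note that racing only stops the caller from waiting; the underlying work keeps running unless it is cancelled separately.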
130
+ ## Supported System Types
131
+
132
+ The adapter supports DeepAgents systems that implement one of these methods:
133
+
134
+ - `invoke(input, config)` - Primary execution method
135
+ - `run(input, config)` - Alternative execution method
136
+ - `execute(input, config)` - Legacy execution method
137
+
138
+ All methods should return a `DeepAgentsOutput` object.
139
+
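The three method names above suggest a simple resolution step: pick the first one the system implements. The sketch below assumes the preference order `invoke`, then `run`, then `execute`, which is inferred from the "primary / alternative / legacy" wording rather than confirmed. `SystemLike` and `resolveExecutor` are hypothetical stand-ins for the package's own `DeepAgentsSystem` type and private logic.

```typescript
// Hypothetical minimal shapes; the package's real DeepAgentsSystem type may differ.
type ExecFn = (input: unknown, config?: unknown) => Promise<unknown>;

interface SystemLike {
  invoke?: ExecFn;
  run?: ExecFn;
  execute?: ExecFn;
}

// Return the first available execution method, bound to the system so `this` still works.
function resolveExecutor(system: SystemLike): ExecFn {
  for (const name of ["invoke", "run", "execute"] as const) {
    const fn = system[name];
    if (typeof fn === "function") return fn.bind(system);
  }
  throw new Error("System must implement invoke(), run(), or execute()");
}

// A system exposing only the legacy method is still usable.
const legacySystem: SystemLike = {
  execute: async (input) => ({ text: `handled: ${String(input)}` }),
};
resolveExecutor(legacySystem)("hello").then((out) => console.log(out));
```

Failing fast when no method exists keeps misconfigured systems from surfacing as confusing errors at generation time.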
140
+ ## Streaming Support
141
+
142
+ If your system supports streaming via `stream()`, the adapter will use it:
143
+
144
+ ```typescript
145
+ for await (const chunk of adapter.stream({ prompt: 'Long task' }, console.log)) {
146
+ // Process streaming events
147
+ }
148
+ ```
149
+
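The `stream()` call above takes both a callback and returns an async iterable. One way such a dual interface can work is an async generator that notifies the callback and then yields each chunk; the sketch below assumes that behavior. `relayChunks` and `fakeSystemStream` are hypothetical names, not part of the adapter.

```typescript
// Hypothetical relay: call onChunk for each chunk, then yield it to the consumer.
async function* relayChunks(
  source: AsyncIterable<string>,
  onChunk: (chunk: string) => void,
): AsyncGenerator<string> {
  for await (const chunk of source) {
    onChunk(chunk);
    yield chunk;
  }
}

// Stand-in for a system's stream(); yields three fixed chunks.
async function* fakeSystemStream(): AsyncGenerator<string> {
  yield "Plan";
  yield "ning ";
  yield "done";
}

// Consume the relay, recording what the callback saw and what the loop received.
async function collect(): Promise<{ seen: string[]; text: string }> {
  const seen: string[] = [];
  let text = "";
  for await (const chunk of relayChunks(fakeSystemStream(), (c) => seen.push(c))) {
    text += chunk;
  }
  return { seen, text };
}

collect().then((result) => console.log(result.text, result.seen.length));
```

Under this assumption every chunk is delivered twice, once to the callback and once to the loop, so a caller would normally act on only one of the two paths.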
150
+ ## Execution Metadata
151
+
152
+ The adapter captures rich metadata about multi-agent execution:
153
+
154
+ ```typescript
155
+ interface DeepAgentsExecutionMetadata {
156
+ name?: string; // System name
157
+ agentsInvolved: string[]; // All participating agents
158
+ totalAgentCalls: number; // Total agent invocations
159
+ totalMessages: number; // Messages exchanged
160
+ totalToolCalls: number; // Tool invocations across agents
161
+ toolsUsed: string[]; // Unique tools used
162
+ traces?: DeepAgentsTrace[]; // Full execution traces
163
+ messages?: DeepAgentsMessage[]; // Full message log
164
+ executionTimeMs?: number; // Total execution time
165
+ }
166
+ ```
167
+
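Several of the fields above are aggregates that could be derived from the trace and message logs. The sketch below shows one way to compute them; the `TraceLike` and `MessageLike` shapes are assumptions (in particular the optional `tool` field is hypothetical), since the real `DeepAgentsTrace` and `DeepAgentsMessage` definitions are not shown here.

```typescript
// Hypothetical shapes; `agent` and `action` mirror the README usage, `tool` is assumed.
interface TraceLike {
  agent: string;
  action: string;
  tool?: string;
}

interface MessageLike {
  from: string;
  to: string;
  content: string;
}

// Derive the aggregate fields from raw traces and messages.
function summarizeExecution(traces: TraceLike[], messages: MessageLike[]) {
  const toolNames = traces
    .map((t) => t.tool)
    .filter((tool): tool is string => tool !== undefined);
  return {
    agentsInvolved: Array.from(new Set(traces.map((t) => t.agent))),
    totalMessages: messages.length,
    totalToolCalls: toolNames.length,
    toolsUsed: Array.from(new Set(toolNames)),
  };
}

console.log(
  summarizeExecution(
    [
      { agent: "researcher", action: "tool_call", tool: "webSearch" },
      { agent: "researcher", action: "tool_call", tool: "webSearch" },
      { agent: "writer", action: "respond" },
    ],
    [{ from: "researcher", to: "writer", content: "notes" }],
  ),
);
```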
168
+ ## ArtemisKit Integration
169
+
170
+ Use with ArtemisKit scenarios:
171
+
172
+ ```yaml
173
+ # scenario.yaml
174
+ name: multi-agent-test
175
+ provider: deepagents
176
+ scenarios:
177
+ - name: Research Task
178
+ input: 'Research and summarize recent AI developments'
179
+ expected:
180
+ contains: ['AI', 'development']
181
+ metadata:
182
+ minAgents: 2
183
+ ```
184
+
185
+ ```typescript
186
+ // Register adapter in your test setup
187
+ import { adapterRegistry } from '@artemiskit/core';
188
+ import { DeepAgentsAdapter } from '@artemiskit/adapter-deepagents';
189
+
190
+ adapterRegistry.register('deepagents', async (config) => {
191
+ // Your system setup
192
+ return new DeepAgentsAdapter(config, myTeam);
193
+ });
194
+ ```
195
+
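To illustrate what the `expected` block in the scenario might mean, here is a rough sketch that treats `contains` as "every listed substring must appear, case-sensitively" and `minAgents` as a lower bound on participating agents. Both readings are assumptions, and `ExpectationLike`, `OutcomeLike`, and `checkExpectations` are hypothetical names; ArtemisKit's actual evaluation rules are not documented in this README.

```typescript
// Hypothetical shapes mirroring the YAML keys above.
interface ExpectationLike {
  contains?: string[];
  minAgents?: number;
}

interface OutcomeLike {
  text: string;
  agentsInvolved: string[];
}

// Return a list of human-readable failures; an empty list means the case passed.
function checkExpectations(outcome: OutcomeLike, expected: ExpectationLike): string[] {
  const failures: string[] = [];
  for (const needle of expected.contains ?? []) {
    if (!outcome.text.includes(needle)) failures.push(`missing substring "${needle}"`);
  }
  if (expected.minAgents !== undefined && outcome.agentsInvolved.length < expected.minAgents) {
    failures.push(
      `expected at least ${expected.minAgents} agents, saw ${outcome.agentsInvolved.length}`,
    );
  }
  return failures;
}

console.log(
  checkExpectations(
    { text: "Recent AI developments in brief", agentsInvolved: ["researcher", "writer"] },
    { contains: ["AI", "development"], minAgents: 2 },
  ),
);
```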
196
+ ## License
197
+
198
+ Apache-2.0
@@ -0,0 +1,81 @@
1
+ /**
2
+ * DeepAgents Adapter
3
+ * Wraps DeepAgents multi-agent systems for ArtemisKit testing
4
+ */
5
+ import type { AdapterConfig, GenerateOptions, GenerateResult, ModelCapabilities, ModelClient } from '@artemiskit/core';
6
+ import type { DeepAgentsAdapterConfig, DeepAgentsSystem } from './types';
7
+ /**
8
+ * Adapter for testing DeepAgents multi-agent systems with ArtemisKit
9
+ *
10
+ * @example
11
+ * ```typescript
12
+ * import { DeepAgentsAdapter } from '@artemiskit/adapter-deepagents';
13
+ * import { createTeam } from 'deepagents';
14
+ *
15
+ * // Create a DeepAgents team
16
+ * const team = createTeam({
17
+ * agents: [researcher, writer, editor],
18
+ * workflow: 'sequential',
19
+ * });
20
+ *
21
+ * // Wrap with ArtemisKit adapter
22
+ * const adapter = new DeepAgentsAdapter(
23
+ * { provider: 'deepagents', name: 'content-team' },
24
+ * team
25
+ * );
26
+ *
27
+ * // Use in ArtemisKit tests
28
+ * const result = await adapter.generate({
29
+ * prompt: 'Write an article about AI testing',
30
+ * });
31
+ * ```
32
+ */
33
+ export declare class DeepAgentsAdapter implements ModelClient {
34
+ private system;
35
+ private config;
36
+ readonly provider = "deepagents";
37
+ private traces;
38
+ private messages;
39
+ constructor(config: AdapterConfig, system: DeepAgentsSystem);
40
+ /**
41
+ * Validate that the system has a usable execution method
42
+ */
43
+ private validateSystem;
44
+ generate(options: GenerateOptions): Promise<GenerateResult>;
45
+ /**
46
+ * Execute the DeepAgents system using available method
47
+ */
48
+ private executeSystem;
49
+ stream(options: GenerateOptions, onChunk: (chunk: string) => void): AsyncIterable<string>;
50
+ capabilities(): Promise<ModelCapabilities>;
51
+ close(): Promise<void>;
52
+ /**
53
+ * Prepare input for the DeepAgents system
54
+ */
55
+ private prepareInput;
56
+ /**
57
+ * Extract text output from DeepAgents response
58
+ */
59
+ private extractOutput;
60
+ /**
61
+ * Create callbacks for execution tracking
62
+ */
63
+ private createCallbacks;
64
+ /**
65
+ * Extract execution metadata
66
+ */
67
+ private extractMetadata;
68
+ }
69
+ /**
70
+ * Factory function to create a DeepAgents adapter
71
+ *
72
+ * @example
73
+ * ```typescript
74
+ * const adapter = createDeepAgentsAdapter(myTeam, {
75
+ * name: 'research-team',
76
+ * captureTraces: true,
77
+ * });
78
+ * ```
79
+ */
80
+ export declare function createDeepAgentsAdapter(system: DeepAgentsSystem, options?: Partial<DeepAgentsAdapterConfig>): DeepAgentsAdapter;
81
+ //# sourceMappingURL=client.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"client.d.ts","sourceRoot":"","sources":["../src/client.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAEH,OAAO,KAAK,EACV,aAAa,EACb,eAAe,EACf,cAAc,EACd,iBAAiB,EACjB,WAAW,EACZ,MAAM,kBAAkB,CAAC;AAE1B,OAAO,KAAK,EACV,uBAAuB,EAKvB,gBAAgB,EAEjB,MAAM,SAAS,CAAC;AAEjB;;;;;;;;;;;;;;;;;;;;;;;;;GAyBG;AACH,qBAAa,iBAAkB,YAAW,WAAW;IACnD,OAAO,CAAC,MAAM,CAAmB;IACjC,OAAO,CAAC,MAAM,CAA0B;IACxC,QAAQ,CAAC,QAAQ,gBAAgB;IAGjC,OAAO,CAAC,MAAM,CAAyB;IACvC,OAAO,CAAC,QAAQ,CAA2B;gBAE/B,MAAM,EAAE,aAAa,EAAE,MAAM,EAAE,gBAAgB;IAM3D;;OAEG;IACH,OAAO,CAAC,cAAc;IAWhB,QAAQ,CAAC,OAAO,EAAE,eAAe,GAAG,OAAO,CAAC,cAAc,CAAC;IAkDjE;;OAEG;YACW,aAAa;IAkBpB,MAAM,CAAC,OAAO,EAAE,eAAe,EAAE,OAAO,EAAE,CAAC,KAAK,EAAE,MAAM,KAAK,IAAI,GAAG,aAAa,CAAC,MAAM,CAAC;IA0C1F,YAAY,IAAI,OAAO,CAAC,iBAAiB,CAAC;IAW1C,KAAK,IAAI,OAAO,CAAC,IAAI,CAAC;IAI5B;;OAEG;IACH,OAAO,CAAC,YAAY;IAkCpB;;OAEG;IACH,OAAO,CAAC,aAAa;IAyBrB;;OAEG;IACH,OAAO,CAAC,eAAe;IA+CvB;;OAEG;IACH,OAAO,CAAC,eAAe;CA4CxB;AAED;;;;;;;;;;GAUG;AACH,wBAAgB,uBAAuB,CACrC,MAAM,EAAE,gBAAgB,EACxB,OAAO,CAAC,EAAE,OAAO,CAAC,uBAAuB,CAAC,GACzC,iBAAiB,CAOnB"}
@@ -0,0 +1,21 @@
1
+ /**
2
+ * @artemiskit/adapter-deepagents
3
+ *
4
+ * DeepAgents.js adapter for ArtemisKit LLM evaluation toolkit.
5
+ * Enables testing of DeepAgents multi-agent systems.
6
+ *
7
+ * @example
8
+ * ```typescript
9
+ * import { createDeepAgentsAdapter } from '@artemiskit/adapter-deepagents';
10
+ * import { createTeam } from 'deepagents';
11
+ *
12
+ * const team = createTeam({ agents: [researcher, writer] });
13
+ * const adapter = createDeepAgentsAdapter(team, { name: 'content-team' });
14
+ *
15
+ * // Use with ArtemisKit
16
+ * const result = await adapter.generate({ prompt: 'Create content' });
17
+ * ```
18
+ */
19
+ export { DeepAgentsAdapter, createDeepAgentsAdapter } from './client';
20
+ export type { DeepAgentsAdapterConfig, DeepAgentsSystem, DeepAgentsInput, DeepAgentsOutput, DeepAgentsConfig, DeepAgentsTrace, DeepAgentsMessage, DeepAgentsStreamEvent, DeepAgentsCallbacks, DeepAgentsExecutionMetadata, } from './types';
21
+ //# sourceMappingURL=index.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA;;;;;;;;;;;;;;;;;GAiBG;AAEH,OAAO,EAAE,iBAAiB,EAAE,uBAAuB,EAAE,MAAM,UAAU,CAAC;AACtE,YAAY,EACV,uBAAuB,EACvB,gBAAgB,EAChB,eAAe,EACf,gBAAgB,EAChB,gBAAgB,EAChB,eAAe,EACf,iBAAiB,EACjB,qBAAqB,EACrB,mBAAmB,EACnB,2BAA2B,GAC5B,MAAM,SAAS,CAAC"}