@artemiskit/adapter-deepagents 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15)

package/CHANGELOG.md ADDED

@@ -0,0 +1,133 @@
+# @artemiskit/adapter-deepagents
+## 0.2.0
+### Minor Changes
+- ## v0.3.0 - SDK, Guardian Mode & OWASP Compliance
+  This major release delivers the full programmatic SDK, runtime protection with Guardian Mode, OWASP LLM Top 10 2025 attack vectors, and agentic framework adapters.
+  ### Programmatic SDK (`@artemiskit/sdk`)
+  The new SDK package provides a complete programmatic API for LLM evaluation:
+  - **ArtemisKit class** with `run()`, `redteam()`, and `stress()` methods
+  - **Jest integration** with custom matchers (`toPassAllCases`, `toHaveSuccessRate`, etc.)
+  - **Vitest integration** with identical matchers
+  - **Event handling** for real-time progress updates
+  - **13 custom matchers** for run, red team, and stress test assertions
+  ```typescript
+  import { ArtemisKit } from "@artemiskit/sdk";
+  import { jestMatchers } from "@artemiskit/sdk/jest";
+  expect.extend(jestMatchers);
+  const kit = new ArtemisKit({ provider: "openai", model: "gpt-4o" });
+  const results = await kit.run({ scenario: "./tests.yaml" });
+  expect(results).toPassAllCases();
+  ```
+  ### Guardian Mode (Runtime Protection)
+  New Guardian Mode provides runtime protection for AI/LLM applications:
+  - **Three operating modes**: `testing`, `guardian`, `hybrid`
+  - **Prompt injection detection** and blocking
+  - **PII detection & redaction** (email, SSN, phone, API keys)
+  - **Action validation** for agent tool/function calls
+  - **Intent classification** with risk assessment
+  - **Circuit breaker** for automatic blocking on repeated violations
+  - **Rate limiting** and **cost limiting**
+  - **Custom policies** via TypeScript or YAML
+  ```typescript
+  import { createGuardian } from "@artemiskit/sdk/guardian";
+  const guardian = createGuardian({ mode: "guardian", blockOnFailure: true });
+  const protectedClient = guardian.protect(myLLMClient);
+  ```
+  ### OWASP LLM Top 10 2025 Attack Vectors
+  New red team mutations aligned with OWASP LLM Top 10 2025:
+  | Mutation             | OWASP | Description                    |
+  | -------------------- | ----- | ------------------------------ |
+  | `bad-likert-judge`   | LLM01 | Exploit evaluation capability  |
+  | `crescendo`          | LLM01 | Multi-turn gradual escalation  |
+  | `deceptive-delight`  | LLM01 | Positive framing bypass        |
+  | `system-extraction`  | LLM07 | System prompt leakage          |
+  | `output-injection`   | LLM05 | XSS, SQLi in output            |
+  | `excessive-agency`   | LLM06 | Unauthorized action claims     |
+  | `hallucination-trap` | LLM09 | Confident fabrication triggers |
+  ```bash
+  akit redteam scenario.yaml --owasp LLM01,LLM05
+  akit redteam scenario.yaml --owasp-full
+  ```
+  ### Agentic Framework Adapters
+  New adapters for testing agentic AI systems:
+  **LangChain Adapter** (`@artemiskit/adapter-langchain`)
+  - Test chains, agents, and runnables
+  - Capture intermediate steps and tool usage
+  - Support for LCEL, ReAct agents, RAG chains
+  **DeepAgents Adapter** (`@artemiskit/adapter-deepagents`)
+  - Test multi-agent systems and workflows
+  - Capture agent traces and inter-agent messages
+  - Support for sequential, parallel, and hierarchical workflows
+  ```typescript
+  import { createLangChainAdapter } from "@artemiskit/adapter-langchain";
+  import { createDeepAgentsAdapter } from "@artemiskit/adapter-deepagents";
+  const adapter = createLangChainAdapter(myChain, {
+    captureIntermediateSteps: true,
+  });
+  const result = await adapter.generate({ prompt: "Test query" });
+  ```
+  ### Supabase Storage Enhancements
+  Enhanced cloud storage capabilities:
+  - **Analytics tables** for metrics tracking
+  - **Case results table** for granular analysis
+  - **Baseline management** for regression detection
+  - **Trend analysis** queries
+  ### Bug Fixes
+  - **adapter-openai**: Use `max_completion_tokens` for newer OpenAI models (o1, o3, gpt-4.5)
+  - **redteam**: Resolve TypeScript and flaky test issues in OWASP mutations
+  - **adapters**: Fix TypeScript build errors for agentic adapters
+  - **core**: Add `langchain` and `deepagents` to ProviderType union
+  ### Examples
+  New comprehensive examples organized by feature:
+  - `examples/guardian/` - Guardian Mode examples (testing, guardian, hybrid modes)
+  - `examples/sdk/` - SDK usage examples (Jest, Vitest, events)
+  - `examples/adapters/` - Agentic adapter examples
+  - `examples/owasp/` - OWASP LLM Top 10 test scenarios
+  ### Documentation
+  - Complete SDK documentation with API reference
+  - Guardian Mode guide with all three modes explained
+  - Agentic adapters documentation (LangChain, DeepAgents)
+  - Test matchers reference for Jest/Vitest
+  - OWASP LLM Top 10 testing scenarios
+### Patch Changes
+- Updated dependencies
+  - @artemiskit/core@0.3.0

package/README.md ADDED

@@ -0,0 +1,198 @@
+# @artemiskit/adapter-deepagents
+DeepAgents.js adapter for ArtemisKit - Test and evaluate DeepAgents multi-agent systems.
+## Installation
+```bash
+bun add @artemiskit/adapter-deepagents
+# or
+npm install @artemiskit/adapter-deepagents
+```
+## Quick Start
+### Testing a Multi-Agent Team
+```typescript
+import { createDeepAgentsAdapter } from '@artemiskit/adapter-deepagents';
+import { createTeam, Agent } from 'deepagents';
+// Create your DeepAgents team
+const researcher = new Agent({
+  name: 'researcher',
+  role: 'Research specialist',
+  tools: [webSearchTool, documentReader],
+});
+const writer = new Agent({
+  name: 'writer',
+  role: 'Content writer',
+});
+const team = createTeam({
+  agents: [researcher, writer],
+  workflow: 'sequential',
+});
+// Wrap with ArtemisKit adapter
+const adapter = createDeepAgentsAdapter(team, {
+  name: 'content-team',
+  captureTraces: true,
+  captureMessages: true,
+});
+// Use in ArtemisKit tests
+const result = await adapter.generate({
+  prompt: 'Write an article about AI testing best practices',
+});
+console.log(result.text); // Final article content
+// Access execution metadata
+const metadata = result.raw.metadata;
+console.log(metadata.agentsInvolved); // ['researcher', 'writer']
+console.log(metadata.totalToolCalls); // Number of tool invocations
+console.log(metadata.totalMessages); // Messages exchanged between agents
+```
+### Testing a Hierarchical Agent System
+```typescript
+import { createDeepAgentsAdapter } from '@artemiskit/adapter-deepagents';
+import { createHierarchy, Coordinator, Worker } from 'deepagents';
+// Create hierarchical system
+const coordinator = new Coordinator({
+  name: 'manager',
+  strategy: 'delegate-and-synthesize',
+});
+const workers = [
+  new Worker({ name: 'analyst', specialty: 'data analysis' }),
+  new Worker({ name: 'visualizer', specialty: 'chart creation' }),
+];
+const hierarchy = createHierarchy({ coordinator, workers });
+const adapter = createDeepAgentsAdapter(hierarchy, {
+  name: 'analysis-team',
+  executionTimeout: 120000, // 2 minutes
+});
+const result = await adapter.generate({
+  prompt: 'Analyze sales data and create visualizations',
+});
+```
+### Testing with Execution Tracing
+```typescript
+const adapter = createDeepAgentsAdapter(mySystem, {
+  name: 'traced-system',
+  captureTraces: true,
+  captureMessages: true,
+});
+const result = await adapter.generate({ prompt: 'Complex task' });
+// Access detailed execution information
+const { metadata } = result.raw;
+// See which agents were involved
+console.log('Agents:', metadata.agentsInvolved);
+// See tools used
+console.log('Tools:', metadata.toolsUsed);
+// See full trace of execution
+for (const trace of metadata.traces) {
+  console.log(`${trace.agent}: ${trace.action}`, trace.output);
+}
+// See inter-agent communication
+for (const msg of metadata.messages) {
+  console.log(`${msg.from} -> ${msg.to}: ${msg.content}`);
+}
+```
+## Configuration Options
+| Option              | Type      | Default    | Description                              |
+| ------------------- | --------- | ---------- | ---------------------------------------- |
+| `name`              | `string`  | -          | Identifier for the agent system          |
+| `captureTraces`     | `boolean` | `true`     | Capture agent execution traces           |
+| `captureMessages`   | `boolean` | `true`     | Capture inter-agent messages             |
+| `executionTimeout`  | `number`  | `300000`   | Max execution time (ms)                  |
+| `inputTransformer`  | `string`  | -          | Custom input transformer function        |
+| `outputTransformer` | `string`  | -          | Custom output transformer function       |
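
Reviewer sketch (not part of the package): the defaults in the table above could be applied to caller-supplied options roughly as follows. `AdapterOptionsSketch`, `DEFAULT_OPTIONS`, and `resolveOptions` are hypothetical names introduced for illustration. The package's real `DeepAgentsAdapterConfig` type is not shown in this diff and may differ.

```typescript
// Hypothetical option shape covering only the documented defaults.
interface AdapterOptionsSketch {
  name?: string;
  captureTraces: boolean;
  captureMessages: boolean;
  executionTimeout: number; // milliseconds
}

// Defaults copied from the Configuration Options table.
const DEFAULT_OPTIONS: Omit<AdapterOptionsSketch, "name"> = {
  captureTraces: true,
  captureMessages: true,
  executionTimeout: 300000,
};

// Caller-supplied values win over the defaults.
function resolveOptions(
  partial: Partial<AdapterOptionsSketch> = {},
): AdapterOptionsSketch {
  return { ...DEFAULT_OPTIONS, ...partial };
}

const resolved = resolveOptions({ name: "analysis-team", executionTimeout: 120000 });
```

Here `resolved` keeps the two-minute timeout from the hierarchical example while both capture flags fall back to `true`.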
+## Supported System Types
+The adapter supports DeepAgents systems that implement one of these methods:
+- `invoke(input, config)` - Primary execution method
+- `run(input, config)` - Alternative execution method
+- `execute(input, config)` - Legacy execution method
+All methods should return a `DeepAgentsOutput` object.
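
Reviewer sketch (not part of the package): one way an adapter could choose among the three methods listed above. The lookup order `invoke`, then `run`, then `execute` is an assumption inferred from the "primary / alternative / legacy" wording. `SketchSystem`, `SketchOutput`, and `pickExecutionMethod` are hypothetical names, and the real `DeepAgentsSystem` and `DeepAgentsOutput` types are not reproduced here.

```typescript
// Hypothetical local shapes standing in for the package's own types.
type SketchOutput = { output: string };
type SketchRunner = (input: unknown, config?: unknown) => Promise<SketchOutput>;

interface SketchSystem {
  invoke?: SketchRunner;
  run?: SketchRunner;
  execute?: SketchRunner;
}

type MethodName = "invoke" | "run" | "execute";

// Return the first method the system implements (assumed order), or throw.
function pickExecutionMethod(system: SketchSystem): MethodName {
  if (typeof system.invoke === "function") return "invoke";
  if (typeof system.run === "function") return "run";
  if (typeof system.execute === "function") return "execute";
  throw new Error("System must implement invoke(), run(), or execute()");
}

// A system exposing only the legacy method.
const legacySystem: SketchSystem = {
  execute: async () => ({ output: "done" }),
};

// A system exposing two methods; the assumed order prefers invoke().
const modernSystem: SketchSystem = {
  invoke: async () => ({ output: "via invoke" }),
  run: async () => ({ output: "via run" }),
};
```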
+## Streaming Support
+If your system supports streaming via `stream()`, the adapter will use it:
+```typescript
+for await (const chunk of adapter.stream({ prompt: 'Long task' }, console.log)) {
+  // Process streaming events
+}
+```
+## Execution Metadata
+The adapter captures rich metadata about multi-agent execution:
+```typescript
+interface DeepAgentsExecutionMetadata {
+  name?: string;                  // System name
+  agentsInvolved: string[];       // All participating agents
+  totalAgentCalls: number;        // Total agent invocations
+  totalMessages: number;          // Messages exchanged
+  totalToolCalls: number;         // Tool invocations across agents
+  toolsUsed: string[];            // Unique tools used
+  traces?: DeepAgentsTrace[];     // Full execution traces
+  messages?: DeepAgentsMessage[]; // Full message log
+  executionTimeMs?: number;       // Total execution time
+}
+```
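
Reviewer sketch (not part of the package): the metadata fields above can back simple assertions in a test, for example checking that at least a given number of distinct agents took part. `MetadataSketch` copies only the scalar and string-array fields from the interface above. `meetsMinAgents` and `sampleMeta` are hypothetical, and the sample values are made up for illustration.

```typescript
// Subset of DeepAgentsExecutionMetadata, redeclared so the sketch is self-contained.
interface MetadataSketch {
  name?: string;
  agentsInvolved: string[];
  totalAgentCalls: number;
  totalMessages: number;
  totalToolCalls: number;
  toolsUsed: string[];
  executionTimeMs?: number;
}

// True when at least `minAgents` distinct agents participated.
function meetsMinAgents(meta: MetadataSketch, minAgents: number): boolean {
  return new Set(meta.agentsInvolved).size >= minAgents;
}

// Illustrative values only, not captured from a real run.
const sampleMeta: MetadataSketch = {
  name: "content-team",
  agentsInvolved: ["researcher", "writer"],
  totalAgentCalls: 2,
  totalMessages: 3,
  totalToolCalls: 1,
  toolsUsed: ["webSearchTool"],
};
```

In a real test, the value passed in would come from `result.raw.metadata` as shown in the Quick Start.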
+## ArtemisKit Integration
+Use with ArtemisKit scenarios:
+```yaml
+# scenario.yaml
+name: multi-agent-test
+provider: deepagents
+scenarios:
+  - name: Research Task
+    input: 'Research and summarize recent AI developments'
+    expected:
+      contains: ['AI', 'development']
+    metadata:
+      minAgents: 2
+```
+```typescript
+// Register adapter in your test setup
+import { adapterRegistry } from '@artemiskit/core';
+import { DeepAgentsAdapter } from '@artemiskit/adapter-deepagents';
+adapterRegistry.register('deepagents', async (config) => {
+  // Your system setup
+  return new DeepAgentsAdapter(config, myTeam);
+});
+```
+## License
+Apache-2.0

package/dist/client.d.ts ADDED

@@ -0,0 +1,81 @@
+/**
+ * DeepAgents Adapter
+ * Wraps DeepAgents multi-agent systems for ArtemisKit testing
+ */
+import type { AdapterConfig, GenerateOptions, GenerateResult, ModelCapabilities, ModelClient } from '@artemiskit/core';
+import type { DeepAgentsAdapterConfig, DeepAgentsSystem } from './types';
+/**
+ * Adapter for testing DeepAgents multi-agent systems with ArtemisKit
+ *
+ * @example
+ * ```typescript
+ * import { DeepAgentsAdapter } from '@artemiskit/adapter-deepagents';
+ * import { createTeam } from 'deepagents';
+ *
+ * // Create a DeepAgents team
+ * const team = createTeam({
+ *   agents: [researcher, writer, editor],
+ *   workflow: 'sequential',
+ * });
+ *
+ * // Wrap with ArtemisKit adapter
+ * const adapter = new DeepAgentsAdapter(
+ *   { provider: 'deepagents', name: 'content-team' },
+ *   team
+ * );
+ *
+ * // Use in ArtemisKit tests
+ * const result = await adapter.generate({
+ *   prompt: 'Write an article about AI testing',
+ * });
+ * ```
+ */
+export declare class DeepAgentsAdapter implements ModelClient {
+    private system;
+    private config;
+    readonly provider = "deepagents";
+    private traces;
+    private messages;
+    constructor(config: AdapterConfig, system: DeepAgentsSystem);
+    /**
+     * Validate that the system has a usable execution method
+     */
+    private validateSystem;
+    generate(options: GenerateOptions): Promise<GenerateResult>;
+    /**
+     * Execute the DeepAgents system using available method
+     */
+    private executeSystem;
+    stream(options: GenerateOptions, onChunk: (chunk: string) => void): AsyncIterable<string>;
+    capabilities(): Promise<ModelCapabilities>;
+    close(): Promise<void>;
+    /**
+     * Prepare input for the DeepAgents system
+     */
+    private prepareInput;
+    /**
+     * Extract text output from DeepAgents response
+     */
+    private extractOutput;
+    /**
+     * Create callbacks for execution tracking
+     */
+    private createCallbacks;
+    /**
+     * Extract execution metadata
+     */
+    private extractMetadata;
+}
+/**
+ * Factory function to create a DeepAgents adapter
+ *
+ * @example
+ * ```typescript
+ * const adapter = createDeepAgentsAdapter(myTeam, {
+ *   name: 'research-team',
+ *   captureTraces: true,
+ * });
+ * ```
+ */
+export declare function createDeepAgentsAdapter(system: DeepAgentsSystem, options?: Partial<DeepAgentsAdapterConfig>): DeepAgentsAdapter;
+//# sourceMappingURL=client.d.ts.map

package/dist/client.d.ts.map ADDED

@@ -0,0 +1 @@

+ {"version":3,"file":"client.d.ts","sourceRoot":"","sources":["../src/client.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAEH,OAAO,KAAK,EACV,aAAa,EACb,eAAe,EACf,cAAc,EACd,iBAAiB,EACjB,WAAW,EACZ,MAAM,kBAAkB,CAAC;AAE1B,OAAO,KAAK,EACV,uBAAuB,EAKvB,gBAAgB,EAEjB,MAAM,SAAS,CAAC;AAEjB;;;;;;;;;;;;;;;;;;;;;;;;;GAyBG;AACH,qBAAa,iBAAkB,YAAW,WAAW;IACnD,OAAO,CAAC,MAAM,CAAmB;IACjC,OAAO,CAAC,MAAM,CAA0B;IACxC,QAAQ,CAAC,QAAQ,gBAAgB;IAGjC,OAAO,CAAC,MAAM,CAAyB;IACvC,OAAO,CAAC,QAAQ,CAA2B;gBAE/B,MAAM,EAAE,aAAa,EAAE,MAAM,EAAE,gBAAgB;IAM3D;;OAEG;IACH,OAAO,CAAC,cAAc;IAWhB,QAAQ,CAAC,OAAO,EAAE,eAAe,GAAG,OAAO,CAAC,cAAc,CAAC;IAkDjE;;OAEG;YACW,aAAa;IAkBpB,MAAM,CAAC,OAAO,EAAE,eAAe,EAAE,OAAO,EAAE,CAAC,KAAK,EAAE,MAAM,KAAK,IAAI,GAAG,aAAa,CAAC,MAAM,CAAC;IA0C1F,YAAY,IAAI,OAAO,CAAC,iBAAiB,CAAC;IAW1C,KAAK,IAAI,OAAO,CAAC,IAAI,CAAC;IAI5B;;OAEG;IACH,OAAO,CAAC,YAAY;IAkCpB;;OAEG;IACH,OAAO,CAAC,aAAa;IAyBrB;;OAEG;IACH,OAAO,CAAC,eAAe;IA+CvB;;OAEG;IACH,OAAO,CAAC,eAAe;CA4CxB;AAED;;;;;;;;;;GAUG;AACH,wBAAgB,uBAAuB,CACrC,MAAM,EAAE,gBAAgB,EACxB,OAAO,CAAC,EAAE,OAAO,CAAC,uBAAuB,CAAC,GACzC,iBAAiB,CAOnB"}

package/dist/index.d.ts ADDED

@@ -0,0 +1,21 @@
+/**
+ * @artemiskit/adapter-deepagents
+ *
+ * DeepAgents.js adapter for ArtemisKit LLM evaluation toolkit.
+ * Enables testing of DeepAgents multi-agent systems.
+ *
+ * @example
+ * ```typescript
+ * import { createDeepAgentsAdapter } from '@artemiskit/adapter-deepagents';
+ * import { createTeam } from 'deepagents';
+ *
+ * const team = createTeam({ agents: [researcher, writer] });
+ * const adapter = createDeepAgentsAdapter(team, { name: 'content-team' });
+ *
+ * // Use with ArtemisKit
+ * const result = await adapter.generate({ prompt: 'Create content' });
+ * ```
+ */
+export { DeepAgentsAdapter, createDeepAgentsAdapter } from './client';
+export type { DeepAgentsAdapterConfig, DeepAgentsSystem, DeepAgentsInput, DeepAgentsOutput, DeepAgentsConfig, DeepAgentsTrace, DeepAgentsMessage, DeepAgentsStreamEvent, DeepAgentsCallbacks, DeepAgentsExecutionMetadata, } from './types';
+//# sourceMappingURL=index.d.ts.map

package/dist/index.d.ts.map ADDED

@@ -0,0 +1 @@

+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA;;;;;;;;;;;;;;;;;GAiBG;AAEH,OAAO,EAAE,iBAAiB,EAAE,uBAAuB,EAAE,MAAM,UAAU,CAAC;AACtE,YAAY,EACV,uBAAuB,EACvB,gBAAgB,EAChB,eAAe,EACf,gBAAgB,EAChB,gBAAgB,EAChB,eAAe,EACf,iBAAiB,EACjB,qBAAqB,EACrB,mBAAmB,EACnB,2BAA2B,GAC5B,MAAM,SAAS,CAAC"}