@artemiskit/redteam 0.2.3 → 0.3.0

Files changed (34)
  1. package/CHANGELOG.md +139 -0
  2. package/adapters/openai/dist/index.js +5612 -0
  3. package/dist/index.d.ts +1 -0
  4. package/dist/index.d.ts.map +1 -1
  5. package/dist/index.js +1184 -2
  6. package/dist/mutations/bad-likert-judge.d.ts +41 -0
  7. package/dist/mutations/bad-likert-judge.d.ts.map +1 -0
  8. package/dist/mutations/crescendo.d.ts +50 -0
  9. package/dist/mutations/crescendo.d.ts.map +1 -0
  10. package/dist/mutations/deceptive-delight.d.ts +49 -0
  11. package/dist/mutations/deceptive-delight.d.ts.map +1 -0
  12. package/dist/mutations/excessive-agency.d.ts +45 -0
  13. package/dist/mutations/excessive-agency.d.ts.map +1 -0
  14. package/dist/mutations/hallucination-trap.d.ts +51 -0
  15. package/dist/mutations/hallucination-trap.d.ts.map +1 -0
  16. package/dist/mutations/index.d.ts +86 -0
  17. package/dist/mutations/index.d.ts.map +1 -1
  18. package/dist/mutations/output-injection.d.ts +45 -0
  19. package/dist/mutations/output-injection.d.ts.map +1 -0
  20. package/dist/mutations/system-extraction.d.ts +44 -0
  21. package/dist/mutations/system-extraction.d.ts.map +1 -0
  22. package/dist/severity.d.ts.map +1 -1
  23. package/package.json +2 -2
  24. package/src/index.ts +24 -0
  25. package/src/mutations/bad-likert-judge.ts +143 -0
  26. package/src/mutations/crescendo.ts +295 -0
  27. package/src/mutations/deceptive-delight.ts +179 -0
  28. package/src/mutations/excessive-agency.ts +179 -0
  29. package/src/mutations/hallucination-trap.ts +236 -0
  30. package/src/mutations/index.ts +152 -0
  31. package/src/mutations/output-injection.ts +237 -0
  32. package/src/mutations/owasp.test.ts +438 -0
  33. package/src/mutations/system-extraction.ts +180 -0
  34. package/src/severity.ts +86 -0
package/CHANGELOG.md CHANGED
@@ -1,5 +1,144 @@
  # @artemiskit/redteam
 
+ ## 0.3.0
+
+ ### Minor Changes
+
+ - ## v0.3.0 - SDK, Guardian Mode & OWASP Compliance
+
+ This major release delivers the full programmatic SDK, runtime protection with Guardian Mode, OWASP LLM Top 10 2025 attack vectors, and agentic framework adapters.
+
+ ### Programmatic SDK (`@artemiskit/sdk`)
+
+ The new SDK package provides a complete programmatic API for LLM evaluation:
+
+ - **ArtemisKit class** with `run()`, `redteam()`, and `stress()` methods
+ - **Jest integration** with custom matchers (`toPassAllCases`, `toHaveSuccessRate`, etc.)
+ - **Vitest integration** with identical matchers
+ - **Event handling** for real-time progress updates
+ - **13 custom matchers** for run, red team, and stress test assertions
+
+ ```typescript
+ import { ArtemisKit } from "@artemiskit/sdk";
+ import { jestMatchers } from "@artemiskit/sdk/jest";
+
+ expect.extend(jestMatchers);
+
+ const kit = new ArtemisKit({ provider: "openai", model: "gpt-4o" });
+ const results = await kit.run({ scenario: "./tests.yaml" });
+ expect(results).toPassAllCases();
+ ```
+
+ ### Guardian Mode (Runtime Protection)
+
+ New Guardian Mode provides runtime protection for AI/LLM applications:
+
+ - **Three operating modes**: `testing`, `guardian`, `hybrid`
+ - **Prompt injection detection** and blocking
+ - **PII detection & redaction** (email, SSN, phone, API keys)
+ - **Action validation** for agent tool/function calls
+ - **Intent classification** with risk assessment
+ - **Circuit breaker** for automatic blocking on repeated violations
+ - **Rate limiting** and **cost limiting**
+ - **Custom policies** via TypeScript or YAML
+
+ ```typescript
+ import { createGuardian } from "@artemiskit/sdk/guardian";
+
+ const guardian = createGuardian({ mode: "guardian", blockOnFailure: true });
+ const protectedClient = guardian.protect(myLLMClient);
+ ```
+
+ ### OWASP LLM Top 10 2025 Attack Vectors
+
+ New red team mutations aligned with OWASP LLM Top 10 2025:
+
+ | Mutation | OWASP | Description |
+ | -------------------- | ----- | ------------------------------ |
+ | `bad-likert-judge` | LLM01 | Exploit evaluation capability |
+ | `crescendo` | LLM01 | Multi-turn gradual escalation |
+ | `deceptive-delight` | LLM01 | Positive framing bypass |
+ | `system-extraction` | LLM07 | System prompt leakage |
+ | `output-injection` | LLM05 | XSS, SQLi in output |
+ | `excessive-agency` | LLM06 | Unauthorized action claims |
+ | `hallucination-trap` | LLM09 | Confident fabrication triggers |
+
+ ```bash
+ akit redteam scenario.yaml --owasp LLM01,LLM05
+ akit redteam scenario.yaml --owasp-full
+ ```
+
+ ### Agentic Framework Adapters
+
+ New adapters for testing agentic AI systems:
+
+ **LangChain Adapter** (`@artemiskit/adapter-langchain`)
+
+ - Test chains, agents, and runnables
+ - Capture intermediate steps and tool usage
+ - Support for LCEL, ReAct agents, RAG chains
+
+ **DeepAgents Adapter** (`@artemiskit/adapter-deepagents`)
+
+ - Test multi-agent systems and workflows
+ - Capture agent traces and inter-agent messages
+ - Support for sequential, parallel, and hierarchical workflows
+
+ ```typescript
+ import { createLangChainAdapter } from "@artemiskit/adapter-langchain";
+ import { createDeepAgentsAdapter } from "@artemiskit/adapter-deepagents";
+
+ const adapter = createLangChainAdapter(myChain, {
+   captureIntermediateSteps: true,
+ });
+ const result = await adapter.generate({ prompt: "Test query" });
+ ```
+
+ ### Supabase Storage Enhancements
+
+ Enhanced cloud storage capabilities:
+
+ - **Analytics tables** for metrics tracking
+ - **Case results table** for granular analysis
+ - **Baseline management** for regression detection
+ - **Trend analysis** queries
+
+ ### Bug Fixes
+
+ - **adapter-openai**: Use `max_completion_tokens` for newer OpenAI models (o1, o3, gpt-4.5)
+ - **redteam**: Resolve TypeScript and flaky test issues in OWASP mutations
+ - **adapters**: Fix TypeScript build errors for agentic adapters
+ - **core**: Add `langchain` and `deepagents` to ProviderType union
+
+ ### Examples
+
+ New comprehensive examples organized by feature:
+
+ - `examples/guardian/` - Guardian Mode examples (testing, guardian, hybrid modes)
+ - `examples/sdk/` - SDK usage examples (Jest, Vitest, events)
+ - `examples/adapters/` - Agentic adapter examples
+ - `examples/owasp/` - OWASP LLM Top 10 test scenarios
+
+ ### Documentation
+
+ - Complete SDK documentation with API reference
+ - Guardian Mode guide with all three modes explained
+ - Agentic adapters documentation (LangChain, DeepAgents)
+ - Test matchers reference for Jest/Vitest
+ - OWASP LLM Top 10 testing scenarios
+
+ ### Patch Changes
+
+ - Updated dependencies
+ - @artemiskit/core@0.3.0
+
+ ## 0.2.4
+
+ ### Patch Changes
+
+ - Updated dependencies [16604a6]
+ - @artemiskit/core@0.2.4
+
  ## 0.2.3
 
  ### Patch Changes