@artemiskit/redteam 0.1.2 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3) hide show
  1. package/CHANGELOG.md +8 -0
  2. package/README.md +105 -0
  3. package/package.json +1 -1
package/CHANGELOG.md CHANGED
@@ -1,5 +1,13 @@
1
1
  # @artemiskit/redteam
2
2
 
3
+ ## 0.1.3
4
+
5
+ ### Patch Changes
6
+
7
+ - 11ac4a7: Updated Package Documentations
8
+ - Updated dependencies [11ac4a7]
9
+ - @artemiskit/core@0.1.3
10
+
3
11
  ## 0.1.2
4
12
 
5
13
  ### Patch Changes
package/README.md ADDED
@@ -0,0 +1,105 @@
1
+ # @artemiskit/redteam
2
+
3
+ Red team adversarial security testing for ArtemisKit LLM evaluation toolkit.
4
+
5
+ ## Installation
6
+
7
+ ```bash
8
+ npm install @artemiskit/redteam
9
+ # or
10
+ bun add @artemiskit/redteam
11
+ ```
12
+
13
+ ## Overview
14
+
15
+ This package provides adversarial testing capabilities to identify vulnerabilities in LLM-powered applications:
16
+
17
+ - **Prompt Injection** - Test resistance to instruction override attacks
18
+ - **Jailbreak Attempts** - Test guardrail bypass techniques
19
+ - **Data Extraction** - Probe for system prompt and training data leakage
20
+ - **Hallucination Triggers** - Test factual accuracy under adversarial prompts
21
+ - **PII Disclosure** - Test for unauthorized personal data exposure
22
+
23
+ ## Usage
24
+
25
+ Most users should use the [`@artemiskit/cli`](https://www.npmjs.com/package/@artemiskit/cli) for red team testing:
26
+
27
+ ```bash
28
+ artemiskit redteam my-scenario.yaml --count 5
29
+ ```
30
+
31
+ For programmatic usage:
32
+
33
+ ```typescript
34
+ import { RedTeamGenerator } from '@artemiskit/redteam';
35
+
36
+ const generator = new RedTeamGenerator();
37
+
38
+ // Generate mutated versions of a prompt
39
+ const mutatedPrompts = generator.generate(basePrompt, 10);
40
+
41
+ // Each result contains:
42
+ // - original: the original prompt
43
+ // - mutated: the mutated prompt
44
+ // - mutations: array of mutation names applied
45
+ // - severity: 'low' | 'medium' | 'high' | 'critical'
46
+
47
+ // Apply a specific mutation
48
+ const mutated = generator.applyMutation(prompt, 'role-spoof');
49
+
50
+ // List available mutations
51
+ const mutations = generator.listMutations();
52
+ ```
53
+
54
+ ## How It Works
55
+
56
+ The red team module applies mutations to prompts to test LLM robustness:
57
+
58
+ 1. Takes a base prompt from your scenario
59
+ 2. Applies one or more mutations to create adversarial variants
60
+ 3. Sends mutated prompts to the LLM
61
+ 4. Analyzes responses for unsafe behavior
62
+ 5. Reports vulnerabilities with severity ratings
63
+
64
+ ## Mutations
65
+
66
+ The package includes mutation strategies to generate attack variants:
67
+
68
+ ```typescript
69
+ import {
70
+ CotInjectionMutation,
71
+ InstructionFlipMutation,
72
+ RoleSpoofMutation,
73
+ TypoMutation
74
+ } from '@artemiskit/redteam';
75
+
76
+ const mutation = new CotInjectionMutation();
77
+ const mutated = mutation.mutate(originalPrompt);
78
+ ```
79
+
80
+ Available mutations:
81
+ - `TypoMutation` - Introduces typos to evade filters
82
+ - `RoleSpoofMutation` - Role impersonation attacks
83
+ - `InstructionFlipMutation` - Reverses or contradicts instructions
84
+ - `CotInjectionMutation` - Chain-of-thought injection attacks
85
+
86
+ ## Severity Ratings
87
+
88
+ Results are categorized by severity:
89
+
90
+ | Severity | Description |
91
+ |----------|-------------|
92
+ | `critical` | Complete guardrail bypass |
93
+ | `high` | Significant information disclosure |
94
+ | `medium` | Partial bypass or concerning behavior |
95
+ | `low` | Minor issues or edge cases |
96
+
97
+ ## Related Packages
98
+
99
+ - [`@artemiskit/cli`](https://www.npmjs.com/package/@artemiskit/cli) - Command-line interface
100
+ - [`@artemiskit/core`](https://www.npmjs.com/package/@artemiskit/core) - Core runtime and evaluators
101
+ - [`@artemiskit/reports`](https://www.npmjs.com/package/@artemiskit/reports) - HTML report generation
102
+
103
+ ## License
104
+
105
+ Apache-2.0
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@artemiskit/redteam",
3
- "version": "0.1.2",
3
+ "version": "0.1.3",
4
4
  "description": "Red-team adversarial security testing for ArtemisKit LLM evaluation toolkit",
5
5
  "type": "module",
6
6
  "license": "Apache-2.0",