@artemiskit/redteam 0.1.2 → 0.1.4
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +22 -0
- package/README.md +105 -0
- package/package.json +3 -2
package/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,27 @@
|
|
|
1
1
|
# @artemiskit/redteam
|
|
2
2
|
|
|
3
|
+
## 0.1.4
|
|
4
|
+
|
|
5
|
+
### Patch Changes
|
|
6
|
+
|
|
7
|
+
- 367eb3b: fix: resolve npm install error caused by workspace:\* protocol
|
|
8
|
+
|
|
9
|
+
Fixed an issue where `npm i -g @artemiskit/cli` would fail with
|
|
10
|
+
"Unsupported URL Type workspace:_" error. The publish workflow now
|
|
11
|
+
automatically replaces workspace:_ dependencies with actual version
|
|
12
|
+
numbers before publishing to npm.
|
|
13
|
+
|
|
14
|
+
- Updated dependencies [367eb3b]
|
|
15
|
+
- @artemiskit/core@0.1.4
|
|
16
|
+
|
|
17
|
+
## 0.1.3
|
|
18
|
+
|
|
19
|
+
### Patch Changes
|
|
20
|
+
|
|
21
|
+
- 11ac4a7: Updated Package Documentations
|
|
22
|
+
- Updated dependencies [11ac4a7]
|
|
23
|
+
- @artemiskit/core@0.1.3
|
|
24
|
+
|
|
3
25
|
## 0.1.2
|
|
4
26
|
|
|
5
27
|
### Patch Changes
|
package/README.md
ADDED
|
@@ -0,0 +1,105 @@
|
|
|
1
|
+
# @artemiskit/redteam
|
|
2
|
+
|
|
3
|
+
Red team adversarial security testing for ArtemisKit LLM evaluation toolkit.
|
|
4
|
+
|
|
5
|
+
## Installation
|
|
6
|
+
|
|
7
|
+
```bash
|
|
8
|
+
npm install @artemiskit/redteam
|
|
9
|
+
# or
|
|
10
|
+
bun add @artemiskit/redteam
|
|
11
|
+
```
|
|
12
|
+
|
|
13
|
+
## Overview
|
|
14
|
+
|
|
15
|
+
This package provides adversarial testing capabilities to identify vulnerabilities in LLM-powered applications:
|
|
16
|
+
|
|
17
|
+
- **Prompt Injection** - Test resistance to instruction override attacks
|
|
18
|
+
- **Jailbreak Attempts** - Test guardrail bypass techniques
|
|
19
|
+
- **Data Extraction** - Probe for system prompt and training data leakage
|
|
20
|
+
- **Hallucination Triggers** - Test factual accuracy under adversarial prompts
|
|
21
|
+
- **PII Disclosure** - Test for unauthorized personal data exposure
|
|
22
|
+
|
|
23
|
+
## Usage
|
|
24
|
+
|
|
25
|
+
Most users should use the [`@artemiskit/cli`](https://www.npmjs.com/package/@artemiskit/cli) for red team testing:
|
|
26
|
+
|
|
27
|
+
```bash
|
|
28
|
+
artemiskit redteam my-scenario.yaml --count 5
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
For programmatic usage:
|
|
32
|
+
|
|
33
|
+
```typescript
|
|
34
|
+
import { RedTeamGenerator } from '@artemiskit/redteam';
|
|
35
|
+
|
|
36
|
+
const generator = new RedTeamGenerator();
|
|
37
|
+
|
|
38
|
+
// Generate mutated versions of a prompt
|
|
39
|
+
const mutatedPrompts = generator.generate(basePrompt, 10);
|
|
40
|
+
|
|
41
|
+
// Each result contains:
|
|
42
|
+
// - original: the original prompt
|
|
43
|
+
// - mutated: the mutated prompt
|
|
44
|
+
// - mutations: array of mutation names applied
|
|
45
|
+
// - severity: 'low' | 'medium' | 'high' | 'critical'
|
|
46
|
+
|
|
47
|
+
// Apply a specific mutation
|
|
48
|
+
const mutated = generator.applyMutation(prompt, 'role-spoof');
|
|
49
|
+
|
|
50
|
+
// List available mutations
|
|
51
|
+
const mutations = generator.listMutations();
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
## How It Works
|
|
55
|
+
|
|
56
|
+
The red team module applies mutations to prompts to test LLM robustness:
|
|
57
|
+
|
|
58
|
+
1. Takes a base prompt from your scenario
|
|
59
|
+
2. Applies one or more mutations to create adversarial variants
|
|
60
|
+
3. Sends mutated prompts to the LLM
|
|
61
|
+
4. Analyzes responses for unsafe behavior
|
|
62
|
+
5. Reports vulnerabilities with severity ratings
|
|
63
|
+
|
|
64
|
+
## Mutations
|
|
65
|
+
|
|
66
|
+
The package includes mutation strategies to generate attack variants:
|
|
67
|
+
|
|
68
|
+
```typescript
|
|
69
|
+
import {
|
|
70
|
+
CotInjectionMutation,
|
|
71
|
+
InstructionFlipMutation,
|
|
72
|
+
RoleSpoofMutation,
|
|
73
|
+
TypoMutation
|
|
74
|
+
} from '@artemiskit/redteam';
|
|
75
|
+
|
|
76
|
+
const mutation = new CotInjectionMutation();
|
|
77
|
+
const mutated = mutation.mutate(originalPrompt);
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
Available mutations:
|
|
81
|
+
- `TypoMutation` - Introduces typos to evade filters
|
|
82
|
+
- `RoleSpoofMutation` - Role impersonation attacks
|
|
83
|
+
- `InstructionFlipMutation` - Reverses or contradicts instructions
|
|
84
|
+
- `CotInjectionMutation` - Chain-of-thought injection attacks
|
|
85
|
+
|
|
86
|
+
## Severity Ratings
|
|
87
|
+
|
|
88
|
+
Results are categorized by severity:
|
|
89
|
+
|
|
90
|
+
| Severity | Description |
|
|
91
|
+
|----------|-------------|
|
|
92
|
+
| `critical` | Complete guardrail bypass |
|
|
93
|
+
| `high` | Significant information disclosure |
|
|
94
|
+
| `medium` | Partial bypass or concerning behavior |
|
|
95
|
+
| `low` | Minor issues or edge cases |
|
|
96
|
+
|
|
97
|
+
## Related Packages
|
|
98
|
+
|
|
99
|
+
- [`@artemiskit/cli`](https://www.npmjs.com/package/@artemiskit/cli) - Command-line interface
|
|
100
|
+
- [`@artemiskit/core`](https://www.npmjs.com/package/@artemiskit/core) - Core runtime and evaluators
|
|
101
|
+
- [`@artemiskit/reports`](https://www.npmjs.com/package/@artemiskit/reports) - HTML report generation
|
|
102
|
+
|
|
103
|
+
## License
|
|
104
|
+
|
|
105
|
+
Apache-2.0
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@artemiskit/redteam",
|
|
3
|
-
"version": "0.1.
|
|
3
|
+
"version": "0.1.4",
|
|
4
4
|
"description": "Red-team adversarial security testing for ArtemisKit LLM evaluation toolkit",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"license": "Apache-2.0",
|
|
@@ -13,6 +13,7 @@
|
|
|
13
13
|
"bugs": {
|
|
14
14
|
"url": "https://github.com/code-sensei/artemiskit/issues"
|
|
15
15
|
},
|
|
16
|
+
"homepage": "https://artemiskit.vercel.app",
|
|
16
17
|
"keywords": [
|
|
17
18
|
"llm",
|
|
18
19
|
"testing",
|
|
@@ -38,7 +39,7 @@
|
|
|
38
39
|
"test": "bun test"
|
|
39
40
|
},
|
|
40
41
|
"dependencies": {
|
|
41
|
-
"@artemiskit/core": "
|
|
42
|
+
"@artemiskit/core": "0.1.4"
|
|
42
43
|
},
|
|
43
44
|
"devDependencies": {
|
|
44
45
|
"@types/bun": "^1.1.0",
|