@roleplay-sh/cli 0.1.1 → 0.1.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.env.example +14 -0
- package/CHANGELOG.md +33 -7
- package/CONTRIBUTING.md +1 -1
- package/README.md +13 -1
- package/RELEASE.md +10 -3
- package/SECURITY.md +1 -3
- package/dist/cli.js +3438 -2682
- package/dist/cli.js.map +1 -1
- package/dist/index.d.ts +85 -43
- package/dist/index.js +462 -60
- package/dist/index.js.map +1 -1
- package/package.json +1 -1
package/.env.example
CHANGED
@@ -10,3 +10,17 @@ ROLEPLAY_AGENT_NAME=
 # Built-in social-engineering-core target. Set exactly one for CI.
 ROLEPLAY_TARGET_URL=http://localhost:3000/agent
 ROLEPLAY_TARGET_COMMAND=
+
+# Optional LLM provider settings for adaptive attacker turns and semantic judging.
+# Provider choices: mock, openai, anthropic, google, openai-compatible.
+ROLEPLAY_LLM_PROVIDER=mock
+ROLEPLAY_LLM_MODEL=
+ROLEPLAY_ATTACKER_PROVIDER=
+ROLEPLAY_ATTACKER_MODEL=
+ROLEPLAY_JUDGE_PROVIDER=
+ROLEPLAY_JUDGE_MODEL=
+ROLEPLAY_OPENAI_API_KEY=
+ROLEPLAY_ANTHROPIC_API_KEY=
+ROLEPLAY_GOOGLE_API_KEY=
+ROLEPLAY_LLM_API_KEY=
+ROLEPLAY_LLM_BASE_URL=
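The new `.env.example` entries split provider settings into a shared pair (`ROLEPLAY_LLM_PROVIDER`, `ROLEPLAY_LLM_MODEL`) and per-role pairs for the attacker and the judge. The diff does not show how the CLI combines them, so the following is only a sketch of one plausible resolution order. The helper names (`resolveRoleSettings`, `nonEmpty`) are hypothetical, and the fallback chain (role-specific value, then shared value, then `mock`) is an assumption rather than documented behavior.

```typescript
// Hypothetical helper, NOT part of @roleplay-sh/cli. The variable names come
// from .env.example; the fallback order below is an assumption.
type Env = Record<string, string | undefined>;
type Role = "attacker" | "judge";

interface RoleSettings {
  provider: string;
  model: string | undefined;
}

// Treat unset and empty-string values the same, because .env.example ships
// most of these variables as empty placeholders.
function nonEmpty(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

function resolveRoleSettings(env: Env, role: Role): RoleSettings {
  const prefix = role === "attacker" ? "ROLEPLAY_ATTACKER" : "ROLEPLAY_JUDGE";
  return {
    // Assumed order: role-specific value, then shared value, then "mock".
    provider:
      nonEmpty(env[`${prefix}_PROVIDER`]) ??
      nonEmpty(env["ROLEPLAY_LLM_PROVIDER"]) ??
      "mock",
    // Assumed order: role-specific model, then shared model, else undefined.
    model:
      nonEmpty(env[`${prefix}_MODEL`]) ?? nonEmpty(env["ROLEPLAY_LLM_MODEL"]),
  };
}
```

In a Node.js script this could be called as `resolveRoleSettings(process.env, "judge")`. Taking the environment as a parameter keeps the sketch testable without touching real variables.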
package/CHANGELOG.md
CHANGED
@@ -4,6 +4,35 @@ All notable changes to roleplay.sh will be documented in this file.
 
 This project follows semantic versioning after the public `0.1.0` release.
 
+## 0.1.3 - 2026-06-06
+
+### Added
+
+- Adaptive LLM attacker providers for OpenAI, Anthropic, Google Gemini, and OpenAI-compatible APIs.
+- LLM transcript judging against scenario success and failure criteria.
+- `--provider`, `--attacker-provider`, `--judge-provider`, model, and OpenAI-compatible base URL flags.
+- Scenario YAML support for attacker and judge provider settings.
+
+### Changed
+
+- Real HTTP and CLI targets default to LLM provider mode for `social-engineering-core`.
+- Mock mode remains available as an explicit deterministic smoke-test path with `--target mock --provider mock`.
+
+## 0.1.2 - 2026-06-03
+
+### Changed
+
+- Corrected packaged documentation to match the public launch scope.
+
+## 0.1.1 - 2026-06-03
+
+### Added
+
+- Dedicated public CLI package for local attack-pack execution.
+- Built-in `social-engineering-core` attack pack.
+- Local reports and replayable transcripts.
+- Sanitized Team Cloud upload support.
+
 ## 0.1.0 - 2026-05-17
 
 ### Added
@@ -11,21 +40,18 @@ This project follows semantic versioning after the public `0.1.0` release.
 - Initial `roleplay` CLI.
 - Scenario YAML validation with Zod.
 - HTTP, CLI, and mock target adapters.
--
--
+- Local deterministic roleplayed-user provider.
+- Local deterministic judge implementation.
 - Local run storage under `.roleplay/runs`.
 - JSON and Markdown report generation.
-- `init`, `
+- `init`, `run`, `report`, `replay`, `list`, `upload`, `doctor`, and `mcp` commands.
 - Example agents and scenarios.
 - Vitest test suite, linting, strict TypeScript, tsup build, CI, and npm publish workflow.
 - Package smoke test that verifies tarball contents and installed CLI behavior.
 - Failed-run artifact persistence for target/provider/judge errors.
 - Safer CLI target execution defaults and explicit `shell: true` opt-in.
-- Red-team target validation and optional `--save` for generated scenarios.
 - HTTP target diagnostics for text responses, missing fields, and timeouts.
 
 ### Notes
 
--
-- Mock provider and mock judge are the stable path for first local usage.
-- OpenAI mode requires `OPENAI_API_KEY` and should be treated as experimental until more live usage is collected.
+- Local attack-pack execution is the supported path for first usage.
package/CONTRIBUTING.md
CHANGED
@@ -11,7 +11,7 @@ pnpm test
 pnpm build
 ```
 
-Use
+Use local attack-pack execution for tests and examples. External model-provider behavior is now part of the public CLI; keep provider additions explicit, tested, and documented.
 
 ## Pull requests
 
package/README.md
CHANGED
@@ -20,7 +20,7 @@ npx @roleplay-sh/cli --help
 
 ```bash
 roleplay init
-roleplay run social-engineering-core --target mock --fail-on critical
+roleplay run social-engineering-core --target mock --provider mock --fail-on critical
 roleplay report latest
 roleplay replay latest
 ```
@@ -32,6 +32,7 @@ HTTP target:
 ```bash
 roleplay run social-engineering-core \
 --target http://localhost:3000/agent \
+--provider openai \
 --fail-on critical
 ```
 
@@ -40,10 +41,19 @@ CLI target:
 ```bash
 roleplay run social-engineering-core \
 --target-command "node ./agent.js" \
+--provider openai \
 --fail-on critical \
 --yes
 ```
 
+Set the provider API key before running a real attack pack:
+
+```bash
+export ROLEPLAY_OPENAI_API_KEY="your-openai-key"
+```
+
+Supported providers are `openai`, `anthropic`, `google`, and `openai-compatible`. Use `--attacker-provider` and `--judge-provider` when you want different providers for adaptive attacker turns and transcript judging. Use `--target mock --provider mock` for deterministic local smoke tests.
+
 ## Upload Sanitized Findings To Team Cloud
 
 Create a project and API key in Team Cloud at `https://app.roleplay.sh`, then run:
@@ -75,6 +85,8 @@ Sanitized upload is the default. Full transcripts, raw scenario YAML, and local
 run: pnpm dlx @roleplay-sh/cli run social-engineering-core --fail-on critical
 env:
 ROLEPLAY_TARGET_URL: ${{ secrets.ROLEPLAY_TARGET_URL }}
+ROLEPLAY_LLM_PROVIDER: openai
+ROLEPLAY_OPENAI_API_KEY: ${{ secrets.ROLEPLAY_OPENAI_API_KEY }}
 
 - name: Upload sanitized findings
 if: always()
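The README changes tell users to set a provider API key before running a real attack pack, and the CI snippet pairs `ROLEPLAY_LLM_PROVIDER: openai` with `ROLEPLAY_OPENAI_API_KEY`. A pre-flight check along those lines might look like the sketch below. It is illustrative only: `missingVars` and `API_KEY_VARS` are hypothetical names, and the pairing of `openai-compatible` with `ROLEPLAY_LLM_API_KEY` and `ROLEPLAY_LLM_BASE_URL` is an assumption drawn from the variable names in `.env.example`, not from the package source.

```typescript
// Hypothetical pre-flight check, NOT taken from @roleplay-sh/cli.
type Provider = "mock" | "openai" | "anthropic" | "google" | "openai-compatible";

// Which environment variable is assumed to hold the key for each provider.
const API_KEY_VARS: Record<Provider, string | null> = {
  mock: null, // deterministic smoke-test path; assumed to need no key
  openai: "ROLEPLAY_OPENAI_API_KEY",
  anthropic: "ROLEPLAY_ANTHROPIC_API_KEY",
  google: "ROLEPLAY_GOOGLE_API_KEY",
  // Assumption: the generic key serves the OpenAI-compatible provider.
  "openai-compatible": "ROLEPLAY_LLM_API_KEY",
};

// Return the names of required variables that are unset or blank.
function missingVars(
  provider: Provider,
  env: Record<string, string | undefined>,
): string[] {
  const required: string[] = [];
  const keyVar = API_KEY_VARS[provider];
  if (keyVar !== null) required.push(keyVar);
  // Assumption: an OpenAI-compatible endpoint also needs a base URL.
  if (provider === "openai-compatible") required.push("ROLEPLAY_LLM_BASE_URL");
  return required.filter((name) => !env[name]?.trim());
}
```

A CI wrapper script could call `missingVars("openai", process.env)` and fail fast with the returned names, which gives a clearer error than a provider request that fails midway through a run.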
package/RELEASE.md
CHANGED
@@ -29,8 +29,8 @@ The publish workflow uses GitHub OIDC and intentionally does not require an npm
 Create a GitHub release or push a version tag:
 
 ```bash
-git tag v0.1.
-git push origin v0.1.
+git tag v0.1.3
+git push origin v0.1.3
 ```
 
 The publish workflow runs checks and then publishes with:
@@ -46,11 +46,18 @@ npm view @roleplay-sh/cli version
 npm install -g @roleplay-sh/cli
 roleplay --help
 roleplay init
-roleplay run social-engineering-core --target mock --fail-on critical
+roleplay run social-engineering-core --target mock --provider mock --fail-on critical
 roleplay report latest
 roleplay replay latest
 ```
 
+For real LLM-backed verification:
+
+```bash
+export ROLEPLAY_OPENAI_API_KEY=<openai-key>
+roleplay run social-engineering-core --target http://localhost:3000/agent --provider openai --max-turns 1 --fail-on critical
+```
+
 For Team Cloud upload verification, create a project API key at `https://app.roleplay.sh` and run:
 
 ```bash
package/SECURITY.md
CHANGED
@@ -12,9 +12,7 @@ Do not include real API keys, customer data, private prompts, transcripts, or pr
 
 ## Data Handling
 
-roleplay.sh stores runs locally under `.roleplay/runs`. Scenario files, hidden context, transcripts, and reports may contain sensitive information.
-
-When using OpenAI providers or judges, scenario data and transcripts are sent to the external provider. Use `--provider mock --judge mock` for local-only testing.
+roleplay.sh stores runs locally under `.roleplay/runs`. Scenario files, hidden context, transcripts, and reports may contain sensitive information. Full transcripts stay local unless you explicitly upload them to Team Cloud with full-transcript mode enabled in both the project policy and the CLI command.
 
 ## CLI Target Execution
 