@roleplay-sh/cli 0.1.2 → 0.1.3
- package/.env.example +14 -0
- package/CHANGELOG.md +14 -0
- package/CONTRIBUTING.md +1 -1
- package/README.md +13 -1
- package/RELEASE.md +10 -3
- package/dist/cli.js +3438 -2682
- package/dist/cli.js.map +1 -1
- package/dist/index.d.ts +85 -43
- package/dist/index.js +462 -60
- package/dist/index.js.map +1 -1
- package/package.json +1 -1
package/.env.example
CHANGED
@@ -10,3 +10,17 @@ ROLEPLAY_AGENT_NAME=
 # Built-in social-engineering-core target. Set exactly one for CI.
 ROLEPLAY_TARGET_URL=http://localhost:3000/agent
 ROLEPLAY_TARGET_COMMAND=
+
+# Optional LLM provider settings for adaptive attacker turns and semantic judging.
+# Provider choices: mock, openai, anthropic, google, openai-compatible.
+ROLEPLAY_LLM_PROVIDER=mock
+ROLEPLAY_LLM_MODEL=
+ROLEPLAY_ATTACKER_PROVIDER=
+ROLEPLAY_ATTACKER_MODEL=
+ROLEPLAY_JUDGE_PROVIDER=
+ROLEPLAY_JUDGE_MODEL=
+ROLEPLAY_OPENAI_API_KEY=
+ROLEPLAY_ANTHROPIC_API_KEY=
+ROLEPLAY_GOOGLE_API_KEY=
+ROLEPLAY_LLM_API_KEY=
+ROLEPLAY_LLM_BASE_URL=
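The new `.env.example` entries pair a shared `ROLEPLAY_LLM_*` setting with attacker-specific and judge-specific ones. The diff does not show how the CLI combines them, so the following is only a minimal sketch under one plausible assumption: a non-empty role-specific variable wins, otherwise the shared variable applies. The names `resolveRoleSettings`, `RoleSettings`, and `nonEmpty` are hypothetical and are not part of `@roleplay-sh/cli`.

```typescript
// Hypothetical sketch: NOT the CLI's actual resolution logic.
type Role = "attacker" | "judge";
type Env = Record<string, string | undefined>;

interface RoleSettings {
  provider: string | undefined; // undefined means "let the CLI pick its default"
  model: string | undefined;
}

// Treat empty assignments such as `ROLEPLAY_ATTACKER_PROVIDER=` as unset.
function nonEmpty(value: string | undefined): string | undefined {
  if (value === undefined) return undefined;
  const trimmed = value.trim();
  return trimmed === "" ? undefined : trimmed;
}

// Assumption: role-specific variable first, then the shared ROLEPLAY_LLM_* one.
function resolveRoleSettings(env: Env, role: Role): RoleSettings {
  const prefix = role === "attacker" ? "ROLEPLAY_ATTACKER" : "ROLEPLAY_JUDGE";
  return {
    provider:
      nonEmpty(env[`${prefix}_PROVIDER`]) ?? nonEmpty(env["ROLEPLAY_LLM_PROVIDER"]),
    model: nonEmpty(env[`${prefix}_MODEL`]) ?? nonEmpty(env["ROLEPLAY_LLM_MODEL"]),
  };
}

const exampleEnv: Env = {
  ROLEPLAY_LLM_PROVIDER: "openai",
  ROLEPLAY_ATTACKER_PROVIDER: "", // left blank, as in .env.example
  ROLEPLAY_JUDGE_PROVIDER: "anthropic",
};

console.log(resolveRoleSettings(exampleEnv, "attacker").provider); // openai
console.log(resolveRoleSettings(exampleEnv, "judge").provider); // anthropic
```

Passing the environment as a plain object rather than reading `process.env` directly keeps the sketch easy to test. Whether the real CLI treats blank values as unset is not confirmed by this diff.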
package/CHANGELOG.md
CHANGED
@@ -4,6 +4,20 @@ All notable changes to roleplay.sh will be documented in this file.
 
 This project follows semantic versioning after the public `0.1.0` release.
 
+## 0.1.3 - 2026-06-06
+
+### Added
+
+- Adaptive LLM attacker providers for OpenAI, Anthropic, Google Gemini, and OpenAI-compatible APIs.
+- LLM transcript judging against scenario success and failure criteria.
+- `--provider`, `--attacker-provider`, `--judge-provider`, model, and OpenAI-compatible base URL flags.
+- Scenario YAML support for attacker and judge provider settings.
+
+### Changed
+
+- Real HTTP and CLI targets default to LLM provider mode for `social-engineering-core`.
+- Mock mode remains available as an explicit deterministic smoke-test path with `--target mock --provider mock`.
+
 ## 0.1.2 - 2026-06-03
 
 ### Changed
package/CONTRIBUTING.md
CHANGED
@@ -11,7 +11,7 @@ pnpm test
 pnpm build
 ```
 
-Use local attack-pack execution for tests and examples.
+Use local attack-pack execution for tests and examples. External model-provider behavior is now part of the public CLI; keep provider additions explicit, tested, and documented.
 
 ## Pull requests
 
package/README.md
CHANGED
@@ -20,7 +20,7 @@ npx @roleplay-sh/cli --help
 
 ```bash
 roleplay init
-roleplay run social-engineering-core --target mock --fail-on critical
+roleplay run social-engineering-core --target mock --provider mock --fail-on critical
 roleplay report latest
 roleplay replay latest
 ```
@@ -32,6 +32,7 @@ HTTP target:
 ```bash
 roleplay run social-engineering-core \
 --target http://localhost:3000/agent \
+--provider openai \
 --fail-on critical
 ```
 
@@ -40,10 +41,19 @@ CLI target:
 ```bash
 roleplay run social-engineering-core \
 --target-command "node ./agent.js" \
+--provider openai \
 --fail-on critical \
 --yes
 ```
 
+Set the provider API key before running a real attack pack:
+
+```bash
+export ROLEPLAY_OPENAI_API_KEY="your-openai-key"
+```
+
+Supported providers are `openai`, `anthropic`, `google`, and `openai-compatible`. Use `--attacker-provider` and `--judge-provider` when you want different providers for adaptive attacker turns and transcript judging. Use `--target mock --provider mock` for deterministic local smoke tests.
+
 ## Upload Sanitized Findings To Team Cloud
 
 Create a project and API key in Team Cloud at `https://app.roleplay.sh`, then run:
@@ -75,6 +85,8 @@ Sanitized upload is the default. Full transcripts, raw scenario YAML, and local
 run: pnpm dlx @roleplay-sh/cli run social-engineering-core --fail-on critical
 env:
 ROLEPLAY_TARGET_URL: ${{ secrets.ROLEPLAY_TARGET_URL }}
+ROLEPLAY_LLM_PROVIDER: openai
+ROLEPLAY_OPENAI_API_KEY: ${{ secrets.ROLEPLAY_OPENAI_API_KEY }}
 
 - name: Upload sanitized findings
 if: always()
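The README changes ask for a provider API key to be set before a real run, both locally and in the CI `env:` block. A preflight check along the following lines could catch a missing key before the attack pack starts. This is a sketch, not the CLI's own validation. Only the `openai` to `ROLEPLAY_OPENAI_API_KEY` pairing is shown in the README; the other pairings are inferred from the variable names in `.env.example`, and the helper name `missingVars` and the `REQUIRED_VARS` table are hypothetical.

```typescript
// Hypothetical preflight helper; not part of @roleplay-sh/cli.
type Env = Record<string, string | undefined>;

// Assumed provider-to-variable mapping. Only the openai entry is confirmed by
// the README; the rest is inferred from the names in .env.example.
const REQUIRED_VARS: Record<string, string[]> = {
  mock: [],
  openai: ["ROLEPLAY_OPENAI_API_KEY"],
  anthropic: ["ROLEPLAY_ANTHROPIC_API_KEY"],
  google: ["ROLEPLAY_GOOGLE_API_KEY"],
  "openai-compatible": ["ROLEPLAY_LLM_API_KEY", "ROLEPLAY_LLM_BASE_URL"],
};

// Returns the names of required variables that are unset or blank.
function missingVars(provider: string, env: Env): string[] {
  const required = REQUIRED_VARS[provider];
  if (required === undefined) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

console.log(missingVars("openai", {}).join(", ")); // ROLEPLAY_OPENAI_API_KEY
console.log(missingVars("mock", {}).length); // 0
```

In a CI job, a non-empty result would be a reason to fail early with a clear message rather than let the run fail partway through a scenario.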
package/RELEASE.md
CHANGED
@@ -29,8 +29,8 @@ The publish workflow uses GitHub OIDC and intentionally does not require an npm
 Create a GitHub release or push a version tag:
 
 ```bash
-git tag v0.1.
-git push origin v0.1.
+git tag v0.1.3
+git push origin v0.1.3
 ```
 
 The publish workflow runs checks and then publishes with:
@@ -46,11 +46,18 @@ npm view @roleplay-sh/cli version
 npm install -g @roleplay-sh/cli
 roleplay --help
 roleplay init
-roleplay run social-engineering-core --target mock --fail-on critical
+roleplay run social-engineering-core --target mock --provider mock --fail-on critical
 roleplay report latest
 roleplay replay latest
 ```
 
+For real LLM-backed verification:
+
+```bash
+export ROLEPLAY_OPENAI_API_KEY=<openai-key>
+roleplay run social-engineering-core --target http://localhost:3000/agent --provider openai --max-turns 1 --fail-on critical
+```
+
 For Team Cloud upload verification, create a project API key at `https://app.roleplay.sh` and run:
 
 ```bash