@roleplay-sh/cli 0.1.2 → 0.1.3

package/.env.example CHANGED
@@ -10,3 +10,17 @@ ROLEPLAY_AGENT_NAME=
  # Built-in social-engineering-core target. Set exactly one for CI.
  ROLEPLAY_TARGET_URL=http://localhost:3000/agent
  ROLEPLAY_TARGET_COMMAND=
+
+ # Optional LLM provider settings for adaptive attacker turns and semantic judging.
+ # Provider choices: mock, openai, anthropic, google, openai-compatible.
+ ROLEPLAY_LLM_PROVIDER=mock
+ ROLEPLAY_LLM_MODEL=
+ ROLEPLAY_ATTACKER_PROVIDER=
+ ROLEPLAY_ATTACKER_MODEL=
+ ROLEPLAY_JUDGE_PROVIDER=
+ ROLEPLAY_JUDGE_MODEL=
+ ROLEPLAY_OPENAI_API_KEY=
+ ROLEPLAY_ANTHROPIC_API_KEY=
+ ROLEPLAY_GOOGLE_API_KEY=
+ ROLEPLAY_LLM_API_KEY=
+ ROLEPLAY_LLM_BASE_URL=
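
The `.env.example` comment asks for exactly one target in CI, and the new variables give each provider its own key. A minimal shell sketch of a pre-flight check along those lines follows. The helper names `check_target` and `key_var_for` are hypothetical, and the provider-to-variable mapping (in particular `openai-compatible` pairing with `ROLEPLAY_LLM_API_KEY`) is an assumption inferred from the variable names, not confirmed CLI behavior.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; not part of @roleplay-sh/cli.

# Succeeds only when exactly one of the two target variables is non-empty.
check_target() {
  if [ -n "${ROLEPLAY_TARGET_URL:-}" ] && [ -n "${ROLEPLAY_TARGET_COMMAND:-}" ]; then
    echo "set only one of ROLEPLAY_TARGET_URL and ROLEPLAY_TARGET_COMMAND" >&2
    return 1
  fi
  if [ -z "${ROLEPLAY_TARGET_URL:-}" ] && [ -z "${ROLEPLAY_TARGET_COMMAND:-}" ]; then
    echo "set one of ROLEPLAY_TARGET_URL or ROLEPLAY_TARGET_COMMAND" >&2
    return 1
  fi
  return 0
}

# Prints the name of the API key variable assumed to belong to a provider.
# The mapping is inferred from the names in .env.example (an assumption).
key_var_for() {
  case "$1" in
    openai)            echo ROLEPLAY_OPENAI_API_KEY ;;
    anthropic)         echo ROLEPLAY_ANTHROPIC_API_KEY ;;
    google)            echo ROLEPLAY_GOOGLE_API_KEY ;;
    openai-compatible) echo ROLEPLAY_LLM_API_KEY ;;
    mock)              echo "" ;;  # mock is assumed to need no key
    *)                 echo "unknown provider: $1" >&2; return 1 ;;
  esac
}
```

A CI step could source this file, call `check_target`, and then confirm that the variable named by `key_var_for "$ROLEPLAY_LLM_PROVIDER"` is non-empty before invoking `roleplay run`.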
package/CHANGELOG.md CHANGED
@@ -4,6 +4,20 @@ All notable changes to roleplay.sh will be documented in this file.
 
  This project follows semantic versioning after the public `0.1.0` release.
 
+ ## 0.1.3 - 2026-06-06
+
+ ### Added
+
+ - Adaptive LLM attacker providers for OpenAI, Anthropic, Google Gemini, and OpenAI-compatible APIs.
+ - LLM transcript judging against scenario success and failure criteria.
+ - `--provider`, `--attacker-provider`, `--judge-provider`, model, and OpenAI-compatible base URL flags.
+ - Scenario YAML support for attacker and judge provider settings.
+
+ ### Changed
+
+ - Real HTTP and CLI targets default to LLM provider mode for `social-engineering-core`.
+ - Mock mode remains available as an explicit deterministic smoke-test path with `--target mock --provider mock`.
+
  ## 0.1.2 - 2026-06-03
 
  ### Changed
package/CONTRIBUTING.md CHANGED
@@ -11,7 +11,7 @@ pnpm test
  pnpm build
  ```
 
- Use local attack-pack execution for tests and examples. Do not add external model-provider behavior to the public CLI without an explicit product decision.
+ Use local attack-pack execution for tests and examples. External model-provider behavior is now part of the public CLI; keep provider additions explicit, tested, and documented.
 
  ## Pull requests
 
package/README.md CHANGED
@@ -20,7 +20,7 @@ npx @roleplay-sh/cli --help
 
  ```bash
  roleplay init
- roleplay run social-engineering-core --target mock --fail-on critical
+ roleplay run social-engineering-core --target mock --provider mock --fail-on critical
  roleplay report latest
  roleplay replay latest
  ```
@@ -32,6 +32,7 @@ HTTP target:
  ```bash
  roleplay run social-engineering-core \
  --target http://localhost:3000/agent \
+ --provider openai \
  --fail-on critical
  ```
 
@@ -40,10 +41,19 @@ CLI target:
  ```bash
  roleplay run social-engineering-core \
  --target-command "node ./agent.js" \
+ --provider openai \
  --fail-on critical \
  --yes
  ```
 
+ Set the provider API key before running a real attack pack:
+
+ ```bash
+ export ROLEPLAY_OPENAI_API_KEY="your-openai-key"
+ ```
+
+ Supported providers are `openai`, `anthropic`, `google`, and `openai-compatible`. Use `--attacker-provider` and `--judge-provider` when you want different providers for adaptive attacker turns and transcript judging. Use `--target mock --provider mock` for deterministic local smoke tests.
+
  ## Upload Sanitized Findings To Team Cloud
 
  Create a project and API key in Team Cloud at `https://app.roleplay.sh`, then run:
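
The README text added in this hunk describes a general provider with optional per-role overrides for the attacker and the judge. One way to picture that resolution is the shell sketch below. The precedence shown (role-specific variable first, then `ROLEPLAY_LLM_PROVIDER`) is an assumption made for illustration; the diff does not state how the CLI orders flags, environment variables, and scenario YAML, and `resolve_provider` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper; not part of @roleplay-sh/cli.
# Prints the provider for a role ("attacker" or "judge").
# Assumed precedence: role-specific variable, then ROLEPLAY_LLM_PROVIDER.
# POSIX sh has no local variables, so `specific` and `provider` are globals.
resolve_provider() {
  case "$1" in
    attacker) specific="${ROLEPLAY_ATTACKER_PROVIDER:-}" ;;
    judge)    specific="${ROLEPLAY_JUDGE_PROVIDER:-}" ;;
    *)        echo "unknown role: $1" >&2; return 1 ;;
  esac
  # Fall back to the general provider when no override is set.
  provider="${specific:-${ROLEPLAY_LLM_PROVIDER:-}}"
  if [ -z "$provider" ]; then
    echo "no provider configured for $1" >&2
    return 1
  fi
  echo "$provider"
}
```

With `ROLEPLAY_LLM_PROVIDER=openai` and `ROLEPLAY_JUDGE_PROVIDER=anthropic` set, `resolve_provider attacker` prints `openai` and `resolve_provider judge` prints `anthropic`. Failing when nothing is configured, rather than silently picking a default, keeps the sketch neutral about what the CLI itself defaults to.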
@@ -75,6 +85,8 @@ Sanitized upload is the default. Full transcripts, raw scenario YAML, and local
  run: pnpm dlx @roleplay-sh/cli run social-engineering-core --fail-on critical
  env:
  ROLEPLAY_TARGET_URL: ${{ secrets.ROLEPLAY_TARGET_URL }}
+ ROLEPLAY_LLM_PROVIDER: openai
+ ROLEPLAY_OPENAI_API_KEY: ${{ secrets.ROLEPLAY_OPENAI_API_KEY }}
 
  - name: Upload sanitized findings
  if: always()
package/RELEASE.md CHANGED
@@ -29,8 +29,8 @@ The publish workflow uses GitHub OIDC and intentionally does not require an npm
  Create a GitHub release or push a version tag:
 
  ```bash
- git tag v0.1.1
- git push origin v0.1.1
+ git tag v0.1.3
+ git push origin v0.1.3
  ```
 
  The publish workflow runs checks and then publishes with:
@@ -46,11 +46,18 @@ npm view @roleplay-sh/cli version
  npm install -g @roleplay-sh/cli
  roleplay --help
  roleplay init
- roleplay run social-engineering-core --target mock --fail-on critical
+ roleplay run social-engineering-core --target mock --provider mock --fail-on critical
  roleplay report latest
  roleplay replay latest
  ```
 
+ For real LLM-backed verification:
+
+ ```bash
+ export ROLEPLAY_OPENAI_API_KEY=<openai-key>
+ roleplay run social-engineering-core --target http://localhost:3000/agent --provider openai --max-turns 1 --fail-on critical
+ ```
+
  For Team Cloud upload verification, create a project API key at `https://app.roleplay.sh` and run:
 
  ```bash