npm - @roleplay-sh/cli - Versions diffs - 0.1.2 → 0.1.3 - Mend

@roleplay-sh/cli 0.1.2 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/.env.example CHANGED

@@ -10,3 +10,17 @@ ROLEPLAY_AGENT_NAME=
 # Built-in social-engineering-core target. Set exactly one for CI.
 ROLEPLAY_TARGET_URL=http://localhost:3000/agent
 ROLEPLAY_TARGET_COMMAND=
+# Optional LLM provider settings for adaptive attacker turns and semantic judging.
+# Provider choices: mock, openai, anthropic, google, openai-compatible.
+ROLEPLAY_LLM_PROVIDER=mock
+ROLEPLAY_LLM_MODEL=
+ROLEPLAY_ATTACKER_PROVIDER=
+ROLEPLAY_ATTACKER_MODEL=
+ROLEPLAY_JUDGE_PROVIDER=
+ROLEPLAY_JUDGE_MODEL=
+ROLEPLAY_OPENAI_API_KEY=
+ROLEPLAY_ANTHROPIC_API_KEY=
+ROLEPLAY_GOOGLE_API_KEY=
+ROLEPLAY_LLM_API_KEY=
+ROLEPLAY_LLM_BASE_URL=
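
The provider-specific key variables added above suggest a simple preflight check before a run. The following is a minimal sketch and is not part of the package: the function names `required_key_var` and `check_provider_key` are hypothetical, and mapping `openai-compatible` to `ROLEPLAY_LLM_API_KEY` is an assumption inferred from the variable names, not confirmed behavior.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helper; not shipped with @roleplay-sh/cli.

# Print the name of the API key variable a provider is assumed to need.
# Prints nothing for "mock", which needs no key.
required_key_var() {
  case "$1" in
    openai)            echo "ROLEPLAY_OPENAI_API_KEY" ;;
    anthropic)         echo "ROLEPLAY_ANTHROPIC_API_KEY" ;;
    google)            echo "ROLEPLAY_GOOGLE_API_KEY" ;;
    openai-compatible) echo "ROLEPLAY_LLM_API_KEY" ;;  # assumption
    mock)              ;;
    *)                 echo "unknown provider: $1" >&2; return 2 ;;
  esac
}

# Return 0 if the key for the given provider is set (or none is needed),
# 1 if the key is missing, 2 if the provider name is not recognized.
check_provider_key() {
  local var
  var="$(required_key_var "$1")" || return 2
  if [ -z "$var" ]; then
    return 0
  fi
  # ${!var} is bash indirect expansion: the value of the variable named by $var.
  if [ -z "${!var:-}" ]; then
    echo "missing $var for provider $1" >&2
    return 1
  fi
}
```

A CI step could call `check_provider_key "${ROLEPLAY_LLM_PROVIDER:-mock}"` before invoking `roleplay run`, so that a missing secret fails fast with a readable message.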

package/CHANGELOG.md CHANGED

@@ -4,6 +4,20 @@ All notable changes to roleplay.sh will be documented in this file.
 This project follows semantic versioning after the public `0.1.0` release.
+## 0.1.3 - 2026-06-06
+### Added
+- Adaptive LLM attacker providers for OpenAI, Anthropic, Google Gemini, and OpenAI-compatible APIs.
+- LLM transcript judging against scenario success and failure criteria.
+- `--provider`, `--attacker-provider`, `--judge-provider`, model, and OpenAI-compatible base URL flags.
+- Scenario YAML support for attacker and judge provider settings.
+### Changed
+- Real HTTP and CLI targets default to LLM provider mode for `social-engineering-core`.
+- Mock mode remains available as an explicit deterministic smoke-test path with `--target mock --provider mock`.
 ## 0.1.2 - 2026-06-03
 ### Changed
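
The 0.1.3 changelog entry lists separate `--attacker-provider` and `--judge-provider` flags. As an illustration only, the sketch below assembles a `roleplay run` command line that uses a different provider for each role and prints it instead of executing it. The helper name `build_run_args` is hypothetical, and passing provider names as flag values in the same way as `--provider openai` is an assumption based on the README examples in this diff.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a roleplay command with split providers.
# Usage: build_run_args <target-url> <attacker-provider> <judge-provider>
build_run_args() {
  local target="$1" attacker="$2" judge="$3"
  local args=(
    run social-engineering-core
    --target "$target"
    --attacker-provider "$attacker"   # assumed to accept a provider name
    --judge-provider "$judge"         # assumed to accept a provider name
    --fail-on critical
  )
  # Print one shell-quoted word per argument so the line can be reviewed
  # or pasted; nothing is executed here.
  printf 'roleplay'
  printf ' %q' "${args[@]}"
  printf '\n'
}

build_run_args "http://localhost:3000/agent" anthropic openai
```

Printing the command first keeps the example free of network calls and API keys; the printed line can then be run by hand once the relevant keys are exported.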

package/CONTRIBUTING.md CHANGED

@@ -11,7 +11,7 @@ pnpm test
 pnpm build
 ```
-Use local attack-pack execution for tests and examples. Do not add external model-provider behavior to the public CLI without an explicit product decision.
+Use local attack-pack execution for tests and examples. External model-provider behavior is now part of the public CLI; keep provider additions explicit, tested, and documented.
 ## Pull requests

package/README.md CHANGED

@@ -20,7 +20,7 @@ npx @roleplay-sh/cli --help
 ```bash
 roleplay init
-roleplay run social-engineering-core --target mock --fail-on critical
+roleplay run social-engineering-core --target mock --provider mock --fail-on critical
 roleplay report latest
 roleplay replay latest
 ```
@@ -32,6 +32,7 @@ HTTP target:
 ```bash
 roleplay run social-engineering-core \
   --target http://localhost:3000/agent \
+  --provider openai \
   --fail-on critical
 ```
@@ -40,10 +41,19 @@ CLI target:
 ```bash
 roleplay run social-engineering-core \
   --target-command "node ./agent.js" \
+  --provider openai \
   --fail-on critical \
   --yes
 ```
+Set the provider API key before running a real attack pack:
+```bash
+export ROLEPLAY_OPENAI_API_KEY="your-openai-key"
+```
+Supported providers are `openai`, `anthropic`, `google`, and `openai-compatible`. Use `--attacker-provider` and `--judge-provider` when you want different providers for adaptive attacker turns and transcript judging. Use `--target mock --provider mock` for deterministic local smoke tests.
 ## Upload Sanitized Findings To Team Cloud
 Create a project and API key in Team Cloud at `https://app.roleplay.sh`, then run:
@@ -75,6 +85,8 @@ Sanitized upload is the default. Full transcripts, raw scenario YAML, and local
   run: pnpm dlx @roleplay-sh/cli run social-engineering-core --fail-on critical
   env:
     ROLEPLAY_TARGET_URL: ${{ secrets.ROLEPLAY_TARGET_URL }}
+    ROLEPLAY_LLM_PROVIDER: openai
+    ROLEPLAY_OPENAI_API_KEY: ${{ secrets.ROLEPLAY_OPENAI_API_KEY }}
 - name: Upload sanitized findings
   if: always()

package/RELEASE.md CHANGED

@@ -29,8 +29,8 @@ The publish workflow uses GitHub OIDC and intentionally does not require an npm
 Create a GitHub release or push a version tag:
 ```bash
-git tag v0.1.1
-git push origin v0.1.1
+git tag v0.1.3
+git push origin v0.1.3
 ```
 The publish workflow runs checks and then publishes with:
@@ -46,11 +46,18 @@ npm view @roleplay-sh/cli version
 npm install -g @roleplay-sh/cli
 roleplay --help
 roleplay init
-roleplay run social-engineering-core --target mock --fail-on critical
+roleplay run social-engineering-core --target mock --provider mock --fail-on critical
 roleplay report latest
 roleplay replay latest
 ```
+For real LLM-backed verification:
+```bash
+export ROLEPLAY_OPENAI_API_KEY=<openai-key>
+roleplay run social-engineering-core --target http://localhost:3000/agent --provider openai --max-turns 1 --fail-on critical
+```
 For Team Cloud upload verification, create a project API key at `https://app.roleplay.sh` and run:
 ```bash