@roleplay-sh/cli 0.1.1 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

package/.env.example CHANGED

@@ -10,3 +10,17 @@ ROLEPLAY_AGENT_NAME=
 # Built-in social-engineering-core target. Set exactly one for CI.
 ROLEPLAY_TARGET_URL=http://localhost:3000/agent
 ROLEPLAY_TARGET_COMMAND=
+# Optional LLM provider settings for adaptive attacker turns and semantic judging.
+# Provider choices: mock, openai, anthropic, google, openai-compatible.
+ROLEPLAY_LLM_PROVIDER=mock
+ROLEPLAY_LLM_MODEL=
+ROLEPLAY_ATTACKER_PROVIDER=
+ROLEPLAY_ATTACKER_MODEL=
+ROLEPLAY_JUDGE_PROVIDER=
+ROLEPLAY_JUDGE_MODEL=
+ROLEPLAY_OPENAI_API_KEY=
+ROLEPLAY_ANTHROPIC_API_KEY=
+ROLEPLAY_GOOGLE_API_KEY=
+ROLEPLAY_LLM_API_KEY=
+ROLEPLAY_LLM_BASE_URL=
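
Reviewer sketch: the new variables suggest that attacker and judge settings can override a shared `ROLEPLAY_LLM_*` default. The diff does not show the actual resolution logic, so the fallback order below is an assumption, and `resolveRoleSettings` and `readSetting` are hypothetical names used only for illustration.

```typescript
// Hypothetical helper. The fallback order (role-specific value first, then the
// shared ROLEPLAY_LLM_* value) is an assumption, not taken from the package source.
type Env = Record<string, string | undefined>;
type Role = "attacker" | "judge";

interface RoleSettings {
  provider: string | undefined;
  model: string | undefined;
}

// Treat empty strings (as they appear in .env.example) as unset.
function readSetting(env: Env, key: string): string | undefined {
  const value = env[key]?.trim();
  return value ? value : undefined;
}

function resolveRoleSettings(env: Env, role: Role): RoleSettings {
  const prefix = role === "attacker" ? "ROLEPLAY_ATTACKER" : "ROLEPLAY_JUDGE";
  return {
    provider:
      readSetting(env, `${prefix}_PROVIDER`) ?? readSetting(env, "ROLEPLAY_LLM_PROVIDER"),
    model: readSetting(env, `${prefix}_MODEL`) ?? readSetting(env, "ROLEPLAY_LLM_MODEL"),
  };
}
```

Passing the environment in as a plain object (for example `process.env`) keeps the helper easy to test without touching real configuration.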

package/CHANGELOG.md CHANGED

@@ -4,6 +4,35 @@ All notable changes to roleplay.sh will be documented in this file.
 This project follows semantic versioning after the public `0.1.0` release.
+## 0.1.3 - 2026-06-06
+### Added
+- Adaptive LLM attacker providers for OpenAI, Anthropic, Google Gemini, and OpenAI-compatible APIs.
+- LLM transcript judging against scenario success and failure criteria.
+- `--provider`, `--attacker-provider`, `--judge-provider`, model, and OpenAI-compatible base URL flags.
+- Scenario YAML support for attacker and judge provider settings.
+### Changed
+- Real HTTP and CLI targets default to LLM provider mode for `social-engineering-core`.
+- Mock mode remains available as an explicit deterministic smoke-test path with `--target mock --provider mock`.
+## 0.1.2 - 2026-06-03
+### Changed
+- Corrected packaged documentation to match the public launch scope.
+## 0.1.1 - 2026-06-03
+### Added
+- Dedicated public CLI package for local attack-pack execution.
+- Built-in `social-engineering-core` attack pack.
+- Local reports and replayable transcripts.
+- Sanitized Team Cloud upload support.
 ## 0.1.0 - 2026-05-17
 ### Added
@@ -11,21 +40,18 @@ This project follows semantic versioning after the public `0.1.0` release.
 - Initial `roleplay` CLI.
 - Scenario YAML validation with Zod.
 - HTTP, CLI, and mock target adapters.
-- Mock and OpenAI roleplayed-user providers.
-- Mock and OpenAI judge implementations.
+- Local deterministic roleplayed-user provider.
+- Local deterministic judge implementation.
 - Local run storage under `.roleplay/runs`.
 - JSON and Markdown report generation.
-- `init`, `scenario:create`, `run`, `report`, `replay`, `list`, `doctor`, `redteam`, and experimental `mcp` commands.
+- `init`, `run`, `report`, `replay`, `list`, `upload`, `doctor`, and `mcp` commands.
 - Example agents and scenarios.
 - Vitest test suite, linting, strict TypeScript, tsup build, CI, and npm publish workflow.
 - Package smoke test that verifies tarball contents and installed CLI behavior.
 - Failed-run artifact persistence for target/provider/judge errors.
 - Safer CLI target execution defaults and explicit `shell: true` opt-in.
-- Red-team target validation and optional `--save` for generated scenarios.
 - HTTP target diagnostics for text responses, missing fields, and timeouts.
 ### Notes
-- MCP support is a roadmap stub in this release.
-- Mock provider and mock judge are the stable path for first local usage.
-- OpenAI mode requires `OPENAI_API_KEY` and should be treated as experimental until more live usage is collected.
+- Local attack-pack execution is the supported path for first usage.

package/CONTRIBUTING.md CHANGED

@@ -11,7 +11,7 @@ pnpm test
 pnpm build
 ```
-Use mock providers for tests and examples unless you are intentionally testing OpenAI integration.
+Use local attack-pack execution for tests and examples. External model-provider behavior is now part of the public CLI; keep provider additions explicit, tested, and documented.
 ## Pull requests

package/README.md CHANGED

@@ -20,7 +20,7 @@ npx @roleplay-sh/cli --help
 ```bash
 roleplay init
-roleplay run social-engineering-core --target mock --fail-on critical
+roleplay run social-engineering-core --target mock --provider mock --fail-on critical
 roleplay report latest
 roleplay replay latest
 ```
@@ -32,6 +32,7 @@ HTTP target:
 ```bash
 roleplay run social-engineering-core \
   --target http://localhost:3000/agent \
+  --provider openai \
   --fail-on critical
 ```
@@ -40,10 +41,19 @@ CLI target:
 ```bash
 roleplay run social-engineering-core \
   --target-command "node ./agent.js" \
+  --provider openai \
   --fail-on critical \
   --yes
 ```
+Set the provider API key before running a real attack pack:
+```bash
+export ROLEPLAY_OPENAI_API_KEY="your-openai-key"
+```
+Supported providers are `openai`, `anthropic`, `google`, and `openai-compatible`. Use `--attacker-provider` and `--judge-provider` when you want different providers for adaptive attacker turns and transcript judging. Use `--target mock --provider mock` for deterministic local smoke tests.
 ## Upload Sanitized Findings To Team Cloud
 Create a project and API key in Team Cloud at `https://app.roleplay.sh`, then run:
@@ -75,6 +85,8 @@ Sanitized upload is the default. Full transcripts, raw scenario YAML, and local
   run: pnpm dlx @roleplay-sh/cli run social-engineering-core --fail-on critical
   env:
     ROLEPLAY_TARGET_URL: ${{ secrets.ROLEPLAY_TARGET_URL }}
+    ROLEPLAY_LLM_PROVIDER: openai
+    ROLEPLAY_OPENAI_API_KEY: ${{ secrets.ROLEPLAY_OPENAI_API_KEY }}
 - name: Upload sanitized findings
   if: always()
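
Reviewer sketch: since real targets now need a provider API key, a CI job could check for it before invoking the CLI. The variable names come from `.env.example` in this diff; pairing `openai-compatible` with `ROLEPLAY_LLM_API_KEY` and `ROLEPLAY_LLM_BASE_URL` is an assumption, and `missingProviderEnv` is a hypothetical helper, not part of the package.

```typescript
// Env var names are taken from .env.example. Which variables each provider
// actually requires is an assumption, most notably for "openai-compatible".
const REQUIRED_ENV = new Map<string, string[]>([
  ["mock", []],
  ["openai", ["ROLEPLAY_OPENAI_API_KEY"]],
  ["anthropic", ["ROLEPLAY_ANTHROPIC_API_KEY"]],
  ["google", ["ROLEPLAY_GOOGLE_API_KEY"]],
  ["openai-compatible", ["ROLEPLAY_LLM_API_KEY", "ROLEPLAY_LLM_BASE_URL"]],
]);

// Returns the names of required variables that are unset or blank.
function missingProviderEnv(
  provider: string,
  env: Record<string, string | undefined>,
): string[] {
  const required = REQUIRED_ENV.get(provider);
  if (required === undefined) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  return required.filter((name) => !env[name]?.trim());
}
```

A pipeline step might call `missingProviderEnv("openai", process.env)` and fail early with the returned names, rather than discovering a missing key mid-run.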

package/RELEASE.md CHANGED

@@ -29,8 +29,8 @@ The publish workflow uses GitHub OIDC and intentionally does not require an npm
 Create a GitHub release or push a version tag:
 ```bash
-git tag v0.1.1
-git push origin v0.1.1
+git tag v0.1.3
+git push origin v0.1.3
 ```
 The publish workflow runs checks and then publishes with:
@@ -46,11 +46,18 @@ npm view @roleplay-sh/cli version
 npm install -g @roleplay-sh/cli
 roleplay --help
 roleplay init
-roleplay run social-engineering-core --target mock --fail-on critical
+roleplay run social-engineering-core --target mock --provider mock --fail-on critical
 roleplay report latest
 roleplay replay latest
 ```
+For real LLM-backed verification:
+```bash
+export ROLEPLAY_OPENAI_API_KEY=<openai-key>
+roleplay run social-engineering-core --target http://localhost:3000/agent --provider openai --max-turns 1 --fail-on critical
+```
 For Team Cloud upload verification, create a project API key at `https://app.roleplay.sh` and run:
 ```bash

package/SECURITY.md CHANGED

@@ -12,9 +12,7 @@ Do not include real API keys, customer data, private prompts, transcripts, or pr
 ## Data Handling
-roleplay.sh stores runs locally under `.roleplay/runs`. Scenario files, hidden context, transcripts, and reports may contain sensitive information.
-When using OpenAI providers or judges, scenario data and transcripts are sent to the external provider. Use `--provider mock --judge mock` for local-only testing.
+roleplay.sh stores runs locally under `.roleplay/runs`. Scenario files, hidden context, transcripts, and reports may contain sensitive information. Full transcripts stay local unless you explicitly upload them to Team Cloud with full-transcript mode enabled in both the project policy and the CLI command.
 ## CLI Target Execution