@roleplay-sh/cli 0.1.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.env.example +12 -0
- package/CHANGELOG.md +31 -0
- package/CONTRIBUTING.md +25 -0
- package/LICENSE +21 -0
- package/README.md +102 -0
- package/RELEASE.md +61 -0
- package/SECURITY.md +25 -0
- package/dist/cli.d.ts +1 -0
- package/dist/cli.js +3162 -0
- package/dist/cli.js.map +1 -0
- package/dist/index.d.ts +857 -0
- package/dist/index.js +968 -0
- package/dist/index.js.map +1 -0
- package/examples/agents/bad-refund-agent.js +15 -0
- package/examples/agents/simple-support-agent.js +19 -0
- package/examples/scenarios/prompt-injection-basic.yml +35 -0
- package/examples/scenarios/refund-policy-edge-case.yml +43 -0
- package/examples/scenarios/support-happy-path.yml +34 -0
- package/package.json +90 -0
package/.env.example
ADDED
@@ -0,0 +1,12 @@
+# Optional agent credentials used by your own HTTP/CLI target.
+AGENT_API_KEY=
+
+# Team Cloud upload settings.
+ROLEPLAY_CLOUD_URL=https://app.roleplay.sh
+ROLEPLAY_PROJECT_ID=
+ROLEPLAY_API_KEY=
+ROLEPLAY_AGENT_NAME=
+
+# Built-in social-engineering-core target. Set exactly one for CI.
+ROLEPLAY_TARGET_URL=http://localhost:3000/agent
+ROLEPLAY_TARGET_COMMAND=
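
The last group in `.env.example` asks for exactly one of `ROLEPLAY_TARGET_URL` and `ROLEPLAY_TARGET_COMMAND` to be set in CI. A minimal sketch of that rule follows. The names `TargetEnv`, `ResolvedTarget`, and `resolveTarget` are assumptions made for illustration and are not taken from the package; the sketch also assumes that a blank value counts as unset.

```typescript
// Hypothetical helper illustrating the "set exactly one" rule from .env.example.
// None of these names come from the package itself.
type TargetEnv = Record<string, string | undefined>;

type ResolvedTarget =
  | { kind: "http"; url: string }
  | { kind: "cli"; command: string };

function resolveTarget(env: TargetEnv): ResolvedTarget {
  // Assumption: empty or whitespace-only values count as unset,
  // as in the `ROLEPLAY_TARGET_COMMAND=` line of the example file.
  const url = (env["ROLEPLAY_TARGET_URL"] ?? "").trim();
  const command = (env["ROLEPLAY_TARGET_COMMAND"] ?? "").trim();

  if (url !== "" && command !== "") {
    throw new Error("Set only one of ROLEPLAY_TARGET_URL or ROLEPLAY_TARGET_COMMAND.");
  }
  if (url !== "") {
    return { kind: "http", url };
  }
  if (command !== "") {
    return { kind: "cli", command };
  }
  throw new Error("Set one of ROLEPLAY_TARGET_URL or ROLEPLAY_TARGET_COMMAND.");
}
```

Rejecting both the "neither" and the "both" case up front keeps a misconfigured CI job from silently testing the wrong target.
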
package/CHANGELOG.md
ADDED
@@ -0,0 +1,31 @@
+# Changelog
+
+All notable changes to roleplay.sh will be documented in this file.
+
+This project follows semantic versioning after the public `0.1.0` release.
+
+## 0.1.0 - 2026-05-17
+
+### Added
+
+- Initial `roleplay` CLI.
+- Scenario YAML validation with Zod.
+- HTTP, CLI, and mock target adapters.
+- Mock and OpenAI roleplayed-user providers.
+- Mock and OpenAI judge implementations.
+- Local run storage under `.roleplay/runs`.
+- JSON and Markdown report generation.
+- `init`, `scenario:create`, `run`, `report`, `replay`, `list`, `doctor`, `redteam`, and experimental `mcp` commands.
+- Example agents and scenarios.
+- Vitest test suite, linting, strict TypeScript, tsup build, CI, and npm publish workflow.
+- Package smoke test that verifies tarball contents and installed CLI behavior.
+- Failed-run artifact persistence for target/provider/judge errors.
+- Safer CLI target execution defaults and explicit `shell: true` opt-in.
+- Red-team target validation and optional `--save` for generated scenarios.
+- HTTP target diagnostics for text responses, missing fields, and timeouts.
+
+### Notes
+
+- MCP support is a roadmap stub in this release.
+- Mock provider and mock judge are the stable path for first local usage.
+- OpenAI mode requires `OPENAI_API_KEY` and should be treated as experimental until more live usage is collected.
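
The changelog entry on safer CLI target execution describes a default of running commands without a shell, with `shell: true` as an explicit opt-in. The sketch below shows one way such a decision could be modeled. `CliTargetConfig`, `ExecPlan`, and `planExecution` are hypothetical names, and the whitespace split is a deliberately naive stand-in (no quote handling) for whatever parsing the real adapter performs.

```typescript
// Hypothetical types and function; the package's real CLI adapter is not shown in this section.
interface CliTargetConfig {
  command: string;
  shell?: boolean; // opt-in, mirroring the `shell: true` scenario setting
}

type ExecPlan =
  | { mode: "direct"; file: string; args: string[] }
  | { mode: "shell"; commandLine: string };

function planExecution(config: CliTargetConfig): ExecPlan {
  if (config.shell === true) {
    // With a shell, pipes, globs, and variable expansion are interpreted.
    return { mode: "shell", commandLine: config.command };
  }
  // Naive whitespace split (assumption: no quoting support).
  // Shell metacharacters such as `;` or `|` stay literal arguments.
  const parts = config.command.trim().split(/\s+/).filter((p) => p.length > 0);
  const file = parts[0];
  if (file === undefined) {
    throw new Error("CLI target command is empty.");
  }
  return { mode: "direct", file, args: parts.slice(1) };
}
```

In the direct mode a command such as `echo hi; rm x` yields the arguments `hi;`, `rm`, and `x` for `echo`, rather than two separate commands, which is the safety property the default is meant to provide.
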
package/CONTRIBUTING.md
ADDED
@@ -0,0 +1,25 @@
+# Contributing
+
+Thanks for helping improve roleplay.sh.
+
+## Development
+
+```bash
+corepack enable
+pnpm install
+pnpm test
+pnpm build
+```
+
+Use mock providers for tests and examples unless you are intentionally testing OpenAI integration.
+
+## Pull requests
+
+- Keep changes focused.
+- Add tests for behavior changes.
+- Run `pnpm lint`, `pnpm typecheck`, `pnpm test`, and `pnpm build`.
+- Do not commit secrets, `.env`, or generated run artifacts.
+
+## Security
+
+Scenarios and transcripts can contain sensitive data. Avoid pasting real customer data into issues or pull requests.
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 roleplay.sh contributors
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,102 @@
+# roleplay.sh CLI
+
+Social-engineering regression tests for AI agents.
+
+`roleplay` runs adversarial roleplay scenarios against local, HTTP, CLI, or mock agents, records replayable evidence, and can upload sanitized findings to Team Cloud.
+
+## Install
+
+```bash
+npm install -g @roleplay-sh/cli
+```
+
+Or run without installing:
+
+```bash
+npx @roleplay-sh/cli --help
+```
+
+## Quickstart
+
+```bash
+roleplay init
+roleplay run social-engineering-core --target mock --fail-on critical
+roleplay report latest
+roleplay replay latest
+```
+
+## Test A Local Agent
+
+HTTP target:
+
+```bash
+roleplay run social-engineering-core \
+  --target http://localhost:3000/agent \
+  --fail-on critical
+```
+
+CLI target:
+
+```bash
+roleplay run social-engineering-core \
+  --target-command "node ./agent.js" \
+  --fail-on critical \
+  --yes
+```
+
+## Upload Sanitized Findings To Team Cloud
+
+Create a project and API key in Team Cloud at `https://app.roleplay.sh`, then run:
+
+```bash
+ROLEPLAY_CLOUD_URL=https://app.roleplay.sh \
+ROLEPLAY_PROJECT_ID=<project-id> \
+ROLEPLAY_API_KEY=<project-api-key> \
+roleplay upload all --mode sanitized_findings --source ci
+```
+
+Sanitized upload is the default. Full transcripts, raw scenario YAML, and local metadata stay in your environment unless full transcript upload is explicitly enabled by project policy and CLI mode.
+
+## Commands
+
+- `roleplay init` creates local config and starter scenarios.
+- `roleplay run` runs a scenario file or built-in attack pack.
+- `roleplay report` prints a saved run report.
+- `roleplay replay` replays transcript evidence.
+- `roleplay upload` uploads sanitized findings to Team Cloud.
+- `roleplay list` lists local runs.
+- `roleplay doctor` checks local and Cloud configuration.
+- `roleplay mcp` exposes roleplay.sh through MCP.
+
+## CI Example
+
+```yaml
+- name: Run roleplay.sh attack pack
+  run: pnpm dlx @roleplay-sh/cli run social-engineering-core --fail-on critical
+  env:
+    ROLEPLAY_TARGET_URL: ${{ secrets.ROLEPLAY_TARGET_URL }}
+
+- name: Upload sanitized findings
+  if: always()
+  run: pnpm dlx @roleplay-sh/cli upload all --source ci --mode sanitized_findings
+  env:
+    ROLEPLAY_CLOUD_URL: https://app.roleplay.sh
+    ROLEPLAY_PROJECT_ID: ${{ secrets.ROLEPLAY_PROJECT_ID }}
+    ROLEPLAY_API_KEY: ${{ secrets.ROLEPLAY_API_KEY }}
+```
+
+## Development
+
+```bash
+corepack enable
+corepack pnpm install
+corepack pnpm lint
+corepack pnpm typecheck
+corepack pnpm vitest run --testTimeout=60000
+corepack pnpm build
+corepack pnpm package:smoke
+```
+
+## License
+
+MIT
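
The README passes `--fail-on critical` in every `roleplay run` example, which suggests a severity threshold that decides whether the run fails CI. The sketch below models that idea under stated assumptions: the four-level severity scale, the `Finding` shape, and `exitCodeFor` are all hypothetical, and only the value `critical` appears in the README itself.

```typescript
// Assumed severity scale, ordered from least to most severe.
// Only `critical` is confirmed by the README; the other levels are illustrative.
const SEVERITY_ORDER = ["low", "medium", "high", "critical"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

// Hypothetical finding shape for this sketch.
interface Finding {
  scenario: string;
  severity: Severity;
}

// Returns a process exit code: 1 if any finding meets or exceeds the threshold, else 0.
function exitCodeFor(findings: Finding[], failOn: Severity): number {
  const threshold = SEVERITY_ORDER.indexOf(failOn);
  const failed = findings.some(
    (finding) => SEVERITY_ORDER.indexOf(finding.severity) >= threshold,
  );
  return failed ? 1 : 0;
}
```

Under this model a run with only `high` findings would pass with `--fail-on critical`, while a single `critical` finding would fail the job.
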
package/RELEASE.md
ADDED
@@ -0,0 +1,61 @@
+# Release Guide
+
+This repository publishes the standalone `@roleplay-sh/cli` npm package.
+
+## Preflight
+
+```bash
+corepack pnpm install --frozen-lockfile
+corepack pnpm lint
+corepack pnpm typecheck
+corepack pnpm vitest run --testTimeout=60000
+corepack pnpm build
+corepack pnpm package:smoke
+```
+
+## npm Trusted Publishing
+
+Configure npm Trusted Publishing for:
+
+- Package: `@roleplay-sh/cli`
+- Owner: `roleplay-sh`
+- Repository: `cli`
+- Workflow file: `publish.yml`
+
+The publish workflow uses GitHub OIDC and intentionally does not require an npm token.
+
+## Publish
+
+Create a GitHub release or push a version tag:
+
+```bash
+git tag v0.1.1
+git push origin v0.1.1
+```
+
+The publish workflow runs checks and then publishes with:
+
+```bash
+pnpm publish --access public --no-git-checks
+```
+
+## Verify
+
+```bash
+npm view @roleplay-sh/cli version
+npm install -g @roleplay-sh/cli
+roleplay --help
+roleplay init
+roleplay run social-engineering-core --target mock --fail-on critical
+roleplay report latest
+roleplay replay latest
+```
+
+For Team Cloud upload verification, create a project API key at `https://app.roleplay.sh` and run:
+
+```bash
+ROLEPLAY_CLOUD_URL=https://app.roleplay.sh \
+ROLEPLAY_PROJECT_ID=<project-id> \
+ROLEPLAY_API_KEY=<project-api-key> \
+roleplay upload all --mode sanitized_findings
+```
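
The release guide publishes from a `v`-prefixed tag such as `v0.1.1` and then verifies the version with `npm view`. A release script could guard against a tag that does not match the `version` field in `package.json`. The sketch below illustrates such a guard; `versionFromTag` and `tagMatchesVersion` are hypothetical helpers, and nothing in this section confirms that `publish.yml` performs this check.

```typescript
// Hypothetical preflight helpers; not part of the documented publish workflow.

// Extracts "0.1.1" from "v0.1.1". Returns null for anything that is not
// a plain `vMAJOR.MINOR.PATCH` tag (prerelease suffixes are out of scope here).
function versionFromTag(tag: string): string | null {
  const match = /^v(\d+\.\d+\.\d+)$/.exec(tag);
  return match?.[1] ?? null;
}

// True when the tag names exactly the version recorded in package.json.
function tagMatchesVersion(tag: string, packageVersion: string): boolean {
  const tagVersion = versionFromTag(tag);
  return tagVersion !== null && tagVersion === packageVersion;
}
```

For example, `tagMatchesVersion("v0.1.1", "0.1.1")` holds, while a stale `v0.1.0` tag against a `0.1.1` package would be rejected before publishing.
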
package/SECURITY.md
ADDED
@@ -0,0 +1,25 @@
+# Security Policy
+
+## Supported Versions
+
+roleplay.sh is pre-1.0. Security fixes will be released on the latest published minor version.
+
+## Reporting A Vulnerability
+
+Please report security issues privately by opening a GitHub security advisory for the repository, or by emailing the project maintainers once a public security contact is listed.
+
+Do not include real API keys, customer data, private prompts, transcripts, or production scenario files in public issues.
+
+## Data Handling
+
+roleplay.sh stores runs locally under `.roleplay/runs`. Scenario files, hidden context, transcripts, and reports may contain sensitive information.
+
+When using OpenAI providers or judges, scenario data and transcripts are sent to the external provider. Use `--provider mock --judge mock` for local-only testing.
+
+## CLI Target Execution
+
+CLI target scenarios can execute local commands. roleplay.sh requires `--yes` before running CLI targets in automated workflows. By default, commands run without a shell; scenario authors must opt into `shell: true` when shell behavior is required. Review scenario files before running commands from untrusted sources.
+
+## Secrets
+
+roleplay.sh attempts to redact common secret-like values from output and reports. Redaction is best effort. Treat `.env`, scenario hidden context, and generated reports as sensitive by default.
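
The Secrets section describes redaction as best effort. The sketch below shows why pattern-based redaction can only be best effort: it catches values that look like secrets in known shapes and misses everything else. The two patterns and the `redactSecrets` name are assumptions for illustration; the package's actual redaction rules are not shown in this section.

```typescript
// Hypothetical best-effort redactor; the patterns below are illustrative assumptions.
function redactSecrets(text: string): string {
  return text
    // NAME=value assignments whose name contains API_KEY, TOKEN, or SECRET.
    .replace(
      /\b([A-Z0-9_]*(?:API_KEY|TOKEN|SECRET)[A-Z0-9_]*)=(\S+)/g,
      "$1=[REDACTED]",
    )
    // HTTP bearer tokens, e.g. in a logged Authorization header.
    .replace(/\bBearer\s+[A-Za-z0-9._~+\/-]+=*/g, "Bearer [REDACTED]");
}
```

A secret pasted into free text without a recognizable name or prefix passes through untouched, which is why the policy still asks that `.env`, hidden context, and generated reports be treated as sensitive.
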
package/dist/cli.d.ts
ADDED
@@ -0,0 +1 @@
+#!/usr/bin/env node