@altairalabs/promptarena 1.1.2 → 1.1.4
- package/README.md +44 -121
- package/package.json +3 -2
package/README.md
CHANGED
@@ -47,146 +47,69 @@ PromptKit Arena is a comprehensive testing framework for LLM-based applications.
 
 ## Quick Start
 
-
-
-```yaml
-# arena.yaml
-name: Customer Support Test
-prompts:
-  - name: support-agent
-    system_prompt: |
-      You are a helpful customer support agent.
-      Be professional and empathetic.
-
-conversations:
-  - name: refund-request
-    turns:
-      - role: user
-        content: "I'd like a refund for order #12345"
-      - role: assistant
-        expected_topics: ["refund", "order"]
-
-providers:
-  - type: openai
-    model: gpt-4
-    api_key: ${OPENAI_API_KEY}
-```
-
-2. Run the test:
+Get started in under 2 minutes:
 
 ```bash
-
-
-
-
-3. View the HTML report:
-
-```bash
-open out/report.html
-```
-
-## Features
-
-### Multi-Provider Testing
-
-Test the same prompts across different LLM providers:
-
-```yaml
-providers:
-  - type: openai
-    model: gpt-4
-  - type: anthropic
-    model: claude-3-5-sonnet-20241022
-  - type: google
-    model: gemini-1.5-pro
-```
+# Create a new project from a template
+npx @altairalabs/promptarena init my-test --quick
 
-
-
-Validate LLM responses automatically:
-
-```yaml
-turns:
-  - role: assistant
-    assertions:
-      - type: contains
-        value: "refund"
-      - type: tone
-        expected: professional
-      - type: length
-        min: 50
-        max: 500
-```
+# Navigate to your project
+cd my-test
 
-
+# Set your API key (or use mock provider for testing)
+export OPENAI_API_KEY=your-key-here
 
-
+# Run your first test
+npx @altairalabs/promptarena run
 
-
-
-enabled: true
-rounds: 5
-agents:
-  - role: customer
-    prompt: "Act as a frustrated customer"
-  - role: support
-    prompt: "Act as a patient support agent"
+# View the HTML report
+open out/report.html
 ```
 
-
-
-This npm package downloads pre-built Go binaries from [GitHub Releases](https://github.com/AltairaLabs/PromptKit/releases) during installation. The binaries are:
-
-1. Downloaded for your specific OS and architecture
-2. Extracted from the release archive
-3. Made executable (Unix-like systems)
-4. Invoked through a thin Node.js wrapper
-
-No Go toolchain is required on your machine.
-
-## Supported Platforms
-
-- macOS (Intel and Apple Silicon)
-- Linux (x86_64 and arm64)
-- Windows (x86_64 and arm64)
+That's it! The template includes pre-configured scenarios, assertions, and examples to get you started.
 
-
+### Browse Available Templates
 
-
-
-
+```bash
+# List all available templates
+npx @altairalabs/promptarena templates list
 
-
+# Create from a specific template
+npx @altairalabs/promptarena init my-project --template community/iot-maintenance-demo
 
-
+# Interactive mode (choose template, provider, etc.)
+npx @altairalabs/promptarena init
+```
 
-
+## Key Features
 
-
-
-
-
+- 🎯 **Multi-Provider Testing** - Compare OpenAI, Anthropic, Google, and Azure side-by-side
+- 🔄 **Self-Play Mode** - AI agents simulate realistic user conversations with personas
+- ✅ **Turn-Level Assertions** - Validate individual responses (content, tone, length, JSON)
+- 📊 **Conversation Assertions** - Check patterns across entire conversations
+- 🎭 **Template & Persona System** - Dynamic prompts with variables and reusable personas
+- 🛡️ **Guardrail Testing** - Ensure tools and responses follow safety constraints
+- 📈 **HTML Reports** - Beautiful, detailed reports with cost tracking and metrics
 
-
-# Download binary directly
-curl -L https://github.com/AltairaLabs/PromptKit/releases/download/v0.0.1/PromptKit_v0.0.1_Darwin_arm64.tar.gz -o promptarena.tar.gz
-tar -xzf promptarena.tar.gz promptarena
-chmod +x promptarena
-```
+## Learn More
 
-###
+### Assertion Types
 
-
+- **Turn-Level**: `content_includes`, `content_matches`, `json_schema`, `jsonpath`, `llm_judge`, `tone`, `length`
+- **Conversation-Level**: `llm_judge_conversation`, `tools_not_called_with_args`, `max_tool_calls`
 
-
-chmod +x node_modules/@altairalabs/promptarena/promptarena
-```
+See the [Assertions Guide](https://promptkit.altairalabs.ai/arena/tutorials/05-assertions/) for examples and best practices.
 
-
+### Documentation
 
-- **
-- **
-- **
--
+- **[Full Documentation](https://promptkit.altairalabs.ai/)** - Comprehensive guides and tutorials
+- **[Configuration Reference](https://promptkit.altairalabs.ai/arena/reference/config-schema/)** - Complete schema documentation
+- **[Examples](https://github.com/AltairaLabs/PromptKit/tree/main/examples)** - Working examples:
+  - [Assertions Test](https://github.com/AltairaLabs/PromptKit/tree/main/examples/assertions-test) - Turn and conversation-level assertions
+  - [Customer Support](https://github.com/AltairaLabs/PromptKit/tree/main/examples/customer-support) - Self-play with personas
+  - [Variables Demo](https://github.com/AltairaLabs/PromptKit/tree/main/examples/variables-demo) - Template rendering
+  - [LLM Judge](https://github.com/AltairaLabs/PromptKit/tree/main/examples/llm-judge) - AI-powered evaluation
+- **[Multi-Turn Tutorial](https://promptkit.altairalabs.ai/arena/tutorials/03-multi-turn/)** - Self-play patterns
 
 ## License
 
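The new README lists turn-level assertion types by name only. As a rough illustration of what checks such as `content_includes`, `content_matches`, and `length` could verify, here is a minimal sketch. It is not PromptKit's implementation: `checkAssertion` is a hypothetical helper, and the `value`, `min`, and `max` fields are assumptions borrowed from the YAML removed in this release, not the documented schema.

```javascript
// Hypothetical evaluator for illustration only; NOT PromptKit's implementation.
// Assertion type names come from the README. The `value`, `min`, and `max`
// fields are assumptions modelled on the YAML removed in this release.
function checkAssertion(assertion, reply) {
  switch (assertion.type) {
    case "content_includes":
      // Passes when the reply contains the given substring.
      return reply.includes(assertion.value);
    case "content_matches":
      // Passes when the reply matches the given regular expression source.
      return new RegExp(assertion.value).test(reply);
    case "length": {
      // Passes when the reply length (in characters) is within [min, max].
      const min = assertion.min ?? 0;
      const max = assertion.max ?? Infinity;
      return reply.length >= min && reply.length <= max;
    }
    default:
      throw new Error(`Unhandled assertion type: ${assertion.type}`);
  }
}

const reply = "I can help with the refund for order #12345.";
const results = [
  { type: "content_includes", value: "refund" },
  { type: "content_matches", value: "order #\\d+" },
  { type: "length", min: 10, max: 500 },
].map((a) => ({ type: a.type, passed: checkAssertion(a, reply) }));

console.log(results);
```

Types such as `llm_judge` and `json_schema` are left out because they need a model call or a schema validator. The linked Assertions Guide remains the authority on the real configuration format.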
@@ -199,4 +122,4 @@ Contributions welcome! See [CONTRIBUTING.md](https://github.com/AltairaLabs/Prom
 ## Support
 
 - [GitHub Issues](https://github.com/AltairaLabs/PromptKit/issues)
-- [Discussions](https://github.com/AltairaLabs/PromptKit/discussions)
+- [Discussions](https://github.com/AltairaLabs/PromptKit/discussions)
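The install notes removed from the README say the package downloads a pre-built binary for the current OS and architecture. A minimal sketch of how a Node.js wrapper might choose the release asset follows. `releaseAssetName` is a hypothetical helper, and the naming pattern is inferred from the single URL in the removed text (`PromptKit_v0.0.1_Darwin_arm64.tar.gz`). The names for other platforms, including the `.tar.gz` extension on Windows, are assumptions.

```javascript
// Hypothetical sketch, not the package's actual postinstall.js.
// Maps Node's process.platform / process.arch values to a release asset name.
// Only the Darwin/arm64 name is attested in the removed README text;
// all other combinations are assumptions.
const OS_NAMES = new Map([
  ["darwin", "Darwin"],
  ["linux", "Linux"],
  ["win32", "Windows"],
]);
const ARCH_NAMES = new Map([
  ["x64", "x86_64"],
  ["arm64", "arm64"],
]);

function releaseAssetName(version, platform = process.platform, arch = process.arch) {
  const os = OS_NAMES.get(platform);
  const cpu = ARCH_NAMES.get(arch);
  if (!os || !cpu) {
    throw new Error(`Unsupported platform: ${platform}/${arch}`);
  }
  return `PromptKit_v${version}_${os}_${cpu}.tar.gz`;
}

console.log(releaseAssetName("0.0.1", "darwin", "arm64"));
// prints "PromptKit_v0.0.1_Darwin_arm64.tar.gz"
```

Throwing on an unknown platform lets an install fail with a clear message instead of requesting an asset that does not exist.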
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@altairalabs/promptarena",
-  "version": "1.1.
+  "version": "1.1.4",
   "type": "module",
   "description": "PromptKit Arena - Multi-turn conversation simulation and testing tool for LLM applications",
   "bin": {
@@ -8,7 +8,8 @@
   },
   "scripts": {
     "postinstall": "node postinstall.js",
-    "test": "node bin/promptarena.js --version"
+    "test": "node bin/promptarena.js --version",
+    "test:init": "node test-init.js"
   },
   "keywords": [
     "llm",