@pickled-dev/cli 0.12.0 → 0.13.0
- package/README.md +34 -0
- package/dist/index.js +204 -187
- package/package.json +1 -1
package/README.md
CHANGED
@@ -110,6 +110,40 @@ Review fired traps before trusting this surface.
 | `Ungrounded` | No valid citations, or every citation is unknown. |
 | `Error` | The target failed before Pickled could score the response. |
 
+## Targets
+
+Pickled ships three target shapes today. Each target is a distinct surface that exercises the agent differently; results are comparable but not identical.
+
+### CLI targets
+
+- `claude-code` (Claude Agent SDK) - runs the model with tools and workspace context. Requires the Claude Code CLI install.
+- `codex-cli` (Codex CLI binary) - spawns the codex binary, pipes the prompt, parses the response.
+
+### API target
+
+- `anthropic` - calls the Anthropic Messages API directly via `@anthropic-ai/sdk`. No tools, no workspace, no agent orchestration. Useful when you want a controlled baseline that isolates "did the model understand the registered sources" from "did the agent's tools fix it for the model."
+
+API targets require:
+
+- `ANTHROPIC_API_KEY` in the environment
+- An explicit `model` field on the target config (no silent defaults; reproducibility depends on pinning)
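
The two requirements listed for API targets could be checked before any request is sent. The following is a minimal sketch, not the code shipped in `dist/index.js`: the `ApiTargetConfig` type and the `checkApiTargetRequirements` function are hypothetical names introduced here for illustration.

```typescript
// Hypothetical config shape; the real type in @pickled-dev/cli is not shown in this diff.
interface ApiTargetConfig {
  category: "api";
  provider: string;
  model?: string;
}

// Returns a list of human-readable problems; an empty list means both requirements are met.
function checkApiTargetRequirements(
  target: ApiTargetConfig,
  env: Record<string, string | undefined>,
): string[] {
  const problems: string[] = [];
  // Requirement 1: the API key must be present in the environment.
  if (!env["ANTHROPIC_API_KEY"]) {
    problems.push("ANTHROPIC_API_KEY is not set in the environment");
  }
  // Requirement 2: the model must be pinned explicitly, with no fallback default.
  if (typeof target.model !== "string" || target.model.trim() === "") {
    problems.push("target config must set an explicit `model`");
  }
  return problems;
}
```

In a Node process this would presumably be called with `process.env` as the second argument. Returning every problem at once, rather than throwing on the first, lets a user fix both in one pass.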
+
+Example config:
+
+```yaml
+targets:
+  anthropic_haiku:
+    category: api
+    provider: anthropic
+    model: claude-haiku-4-5
+    temperature: 0
+    maxTokens: 4096
+```
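
To make the example config concrete, here is a rough sketch of how its fields might map onto an Anthropic Messages API request body. The `AnthropicTarget` type and `toMessagesRequest` function are hypothetical; how Pickled actually assembles the prompt and messages is not visible in this diff. The only grounded facts are the config keys above and the Messages API field names `model`, `max_tokens`, `temperature`, and `messages`.

```typescript
// Hypothetical target shape, mirroring the YAML keys in the example config.
interface AnthropicTarget {
  model: string;
  temperature?: number;
  maxTokens: number;
}

// Subset of the request body fields accepted by the Anthropic Messages API.
interface MessagesRequest {
  model: string;
  max_tokens: number;
  temperature?: number;
  messages: { role: "user" | "assistant"; content: string }[];
}

// Translate camelCase config keys into the snake_case request body.
// Assumption: the whole prompt is sent as a single user message.
function toMessagesRequest(target: AnthropicTarget, prompt: string): MessagesRequest {
  const request: MessagesRequest = {
    model: target.model,
    max_tokens: target.maxTokens,
    messages: [{ role: "user", content: prompt }],
  };
  // Only forward temperature when the config sets it.
  if (target.temperature !== undefined) {
    request.temperature = target.temperature;
  }
  return request;
}
```

With `@anthropic-ai/sdk`, an object of this shape would typically be passed to `client.messages.create(...)`. No tool definitions appear in the request, which matches the "no tools, no workspace" description of this target.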
+
+API targets accept only `model`, `temperature`, `maxTokens`, and `threshold`. The loader rejects CLI-only fields (`allowedTools`, `mcpServers`, `permissionMode`, `maxTurns`, etc.) on an API target so silent no-ops cannot create false confidence.
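
The rejection rule in the paragraph above can be sketched as an allow-list check. This is an illustration under stated assumptions, not the loader from `dist/index.js`: `validateApiTarget` is a hypothetical name, and treating `category` and `provider` as permitted structural keys is an assumption based on the example config using them.

```typescript
// Option fields the README lists as accepted on API targets.
const API_OPTION_KEYS = ["model", "temperature", "maxTokens", "threshold"] as const;
// Assumption: these structural keys are also allowed, since the example config uses them.
const STRUCTURAL_KEYS = ["category", "provider"] as const;

// Throws if the raw target config carries any field outside the allow-list,
// so a CLI-only field such as `maxTurns` fails loudly instead of being ignored.
function validateApiTarget(name: string, raw: Record<string, unknown>): void {
  const allowed = new Set<string>([...API_OPTION_KEYS, ...STRUCTURAL_KEYS]);
  const rejected = Object.keys(raw).filter((key) => !allowed.has(key));
  if (rejected.length > 0) {
    throw new Error(
      `target "${name}": fields not valid on an API target: ${rejected.join(", ")}`,
    );
  }
}
```

An allow-list is used here rather than a deny-list of known CLI fields, so any field the API target would not act on is rejected, including ones the README only covers with "etc."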
+
+**Cost note:** API targets meter by input + output tokens, not by CLI session. Budget accordingly when running matrices with many sources or large scenario sets.
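
For budgeting, a back-of-the-envelope estimate along the lines of the cost note might look like the sketch below. Everything here is an assumption made for illustration: that one API call is made per source and scenario pair, that token counts per call are roughly uniform, and that prices are quoted per million tokens. The `CostInputs` type and `estimateMatrixCost` function are hypothetical, and no real prices are implied.

```typescript
// Hypothetical inputs; supply your own token estimates and current provider prices.
interface CostInputs {
  sources: number;
  scenarios: number;
  inputTokensPerCall: number;
  outputTokensPerCall: number;
  inputPricePerMTok: number; // price per million input tokens
  outputPricePerMTok: number; // price per million output tokens
}

// Assumption: the matrix makes one call per (source, scenario) pair.
function estimateMatrixCost(c: CostInputs): number {
  const calls = c.sources * c.scenarios;
  const inputCost = calls * c.inputTokensPerCall * c.inputPricePerMTok;
  const outputCost = calls * c.outputTokensPerCall * c.outputPricePerMTok;
  // Prices are per million tokens, so scale the total down once at the end.
  return (inputCost + outputCost) / 1e6;
}
```

Because the call count is a product, doubling either the number of sources or the number of scenarios doubles the estimate, which is why large matrices are worth pricing before they run.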
+
 ## CI
 
```yaml