@pickled-dev/cli 0.3.0 → 0.5.1

Files changed (3)
  1. package/README.md +52 -64
  2. package/dist/index.js +237 -216
  3. package/package.json +6 -4
package/README.md CHANGED
@@ -1,8 +1,8 @@
  # @pickled-dev/cli
 
- > Stay fresh in AI 🥒
+ > Test what agents actually understand about your product
 
- Test how well AI responds to questions about your developer tool. Define scenarios, run checks, and see your freshness score.
+ Pickled runs scenarios against real agent targets, checks citations against registered sources, and matches declared traps deterministically. No LLM grades another LLM.
 
  ## Installation
 
@@ -22,25 +22,26 @@ Creates a `pickled.yml` file:
 
  ```yaml
  tool:
-   name: "your-tool"
-   description: "What your tool does"
+   name: "your-product"
+   description: "What your product does"
 
- scenarios:
-   - name: "Installation"
-     prompt: "How do I install this tool?"
+ docs:
+   sources:
+     readme: ./README.md
 
+ scenarios:
    - name: "Getting started"
-     prompt: "How do I set up this tool for my project?"
+     prompt: "How do I install and set up this product?"
+     requiredSources: [readme]
 
-   - name: "Basic usage"
-     prompt: "Show me a basic example of using this tool"
+ threshold: 80
  ```
 
  ### 2. Edit your config
 
- Update `pickled.yml` with your actual tool info and scenarios developers might ask about.
+ Declare the sources agents should cite, the scenarios they should answer, and any stale patterns you want traps to catch.
 
- ### 3. Run check
+ ### 3. Run the check
 
  ```bash
  pickled check
@@ -52,75 +53,62 @@ pickled check
 
  Create a starter `pickled.yml` config file.
 
+ ### `pickled audit [path]`
+
+ Static scan of agent-context files. No LLM calls.
+
  ### `pickled check [path]`
 
- Run freshness checks and report results.
+ Run agent scenarios against registered sources.
 
- | Option | Description |
- | --------------------- | ---------------------- |
- | `--json` | Output as JSON |
- | `-o, --output <file>` | Save report to file |
- | `-v, --verbose` | Show detailed progress |
- | `-t, --threshold <n>` | Min score % to pass |
+ | Option | Description |
+ | --------------------- | ----------------------------------- |
+ | `--json` | Output as JSON |
+ | `-o, --output <file>` | Save JSON report to file |
+ | `-v, --verbose` | Show progress while scenarios run |
+ | `-t, --threshold <n>` | Minimum score percent needed to pass |
 
  ## Example Output
 
- ```
- 🥒 Freshness Check
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-
+ ```text
+ pickled check
+ -------------------------------------------------------
  Tool: zod
-
- [default] ✓ "Installation" - Well preserved (92%)
- [default] ✓ "Basic parsing" - Fresh (85%)
- [default] ⚠ "Error handling" - Going stale (65%)
- Missing: safeParse details
-
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
- Freshness Score: 81% 🥒🥒🥒🥒░
-
- 🥒 Looking fresh! Your docs are doing well.
+ Sources: [readme], [llms]
+ Scenarios: 1
+
+ Scenario: Error handling
+ ✗ Trap fired (0%)
+ trap: old_v2_api
+ reason: Deprecated in Zod 4; use z.treeifyError()
+ match: "ZodError.format()"
+ cited: [readme], [llms]
+
+ -------------------------------------------------------
+ Overall: 0 / 100 · threshold 80 · run fails
+ Review fired traps before trusting this surface.
  ```
 
- ## Freshness Scores
-
- | Score | Status | Meaning |
- |-------|--------|---------|
- | 90%+ | Well preserved | AI nails it |
- | 70-89% | Fresh | Good, minor gaps |
- | 50-69% | Going stale | Needs attention |
- | <50% | Gone sour | Major documentation gaps |
-
- ## Config Reference
+ ## Result Labels
 
- ```yaml
- tool:
-   name: "tool-name" # Required: your tool's name
-   description: "desc" # Required: what it does
-
- scenarios: # Required: scenarios to check
-   - name: "Scenario name" # Display name
-     prompt: "The question" # What to ask AI
-     target: target-name # Optional: specific target
-
- targets: # Optional: named targets
-   claude-sonnet:
-     category: cli
-     provider: claude-code
-     model: claude-sonnet-4-20250514
-
- threshold: 80 # Optional: min score % to pass
- ```
+ | Label | Meaning |
+ | ----- | ------- |
+ | `Well grounded` | Required sources cited. No unknown sources. High confidence. |
+ | `Grounded` | Required sources cited. No unknown sources. Lower confidence. |
+ | `Partially grounded` | Some required citations are missing, or unknown citations appeared. |
+ | `Trap fired` | A declared stale pattern matched. Score is forced to 0 for that scenario. |
+ | `Ungrounded` | No valid citations, or every citation is unknown. |
+ | `Error` | The target failed before Pickled could score the response. |
 
- ## CI/CD Integration
+ ## CI
 
  ```yaml
  # GitHub Actions
- - name: Check AI freshness
+ - name: Check agent legibility
    run: pickled check --threshold 80
  ```
 
- Fail the build if AI can't answer questions about your tool correctly.
+ Fail the run when the overall score falls below the threshold.
 
  ## Local Development