@bilalimamoglu/sift 0.3.0 → 0.3.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +127 -312
- package/dist/cli.js +1939 -194
- package/dist/index.d.ts +11 -2
- package/dist/index.js +1528 -68
- package/package.json +2 -2
package/README.md
CHANGED
@@ -1,179 +1,165 @@
 # sift
 
-<img src="assets/brand/sift-logo-
+<img src="assets/brand/sift-logo-minimal-monochrome.svg" alt="sift logo" width="120" />
 
-`sift`
+Most command output is long and noisy, but the thing you actually need to know is short: what failed, where, and what to do next. `sift` runs the command for you, captures the output, and gives you a short answer instead of a wall of text.
 
-
-- `standard` = map
-- `focused` or `rerun --remaining` = zoom
-- raw traceback = last resort
-
-It is a good fit when a human, agent, or CI job needs the answer faster than it needs the whole log.
+It works with test suites, build logs, `git diff`, `npm audit`, `terraform plan` — anything where the signal is buried in noise. It always tries the cheapest approach first and only escalates when needed. Exit codes are preserved.
 
-
-
-- typecheck failures
-- lint failures
-- build logs
-- `git diff`
-- `npm audit`
-- `terraform plan`
-
-Do not use it when:
-- the exact raw log is the main thing you need
+Skip it when:
+- you need the exact raw log
 - the command is interactive or TUI-based
-
+- the output is already short
 
 ## Install
 
-Requires Node.js
+Requires Node.js 24 or later.
 
 ```bash
 npm install -g @bilalimamoglu/sift
 ```
 
-##
+## Setup
 
-The
+The interactive setup writes a machine-wide config and walks you through provider selection:
 
 ```bash
 sift config setup
+sift doctor # verify it works
 ```
 
-
-
-```text
-~/.config/sift/config.yaml
-```
-
-After that, any terminal on the machine can use `sift`. A repo-local config can still override it later.
+Config is saved to `~/.config/sift/config.yaml`. A repo-local `sift.config.yaml` can override it later.
 
-If you prefer
+If you prefer environment variables instead:
 
 ```bash
+# OpenAI
 export SIFT_PROVIDER=openai
 export SIFT_BASE_URL=https://api.openai.com/v1
 export SIFT_MODEL=gpt-5-nano
 export OPENAI_API_KEY=your_openai_api_key
+
+# or OpenRouter
+export SIFT_PROVIDER=openrouter
+export OPENROUTER_API_KEY=your_openrouter_api_key
+
+# or any OpenAI-compatible endpoint (Together, Groq, self-hosted, etc.)
+export SIFT_PROVIDER=openai-compatible
+export SIFT_BASE_URL=https://your-endpoint/v1
+export SIFT_PROVIDER_API_KEY=your_api_key
 ```
 
-
+To switch between saved providers without editing files:
 
 ```bash
-sift
+sift config use openai
+sift config use openrouter
 ```
 
-##
+## Usage
 
-
-1. run the noisy command through `sift`
-2. read the short `standard` answer first
-3. only zoom in if `standard` clearly tells you more detail is still worth it
-
-Examples:
+Run a noisy command through `sift`, read the short answer, and only zoom in if it tells you to:
 
 ```bash
-sift exec "what changed?" -- git diff
 sift exec --preset test-status -- pytest -q
-sift
-sift rerun --remaining --detail focused
-sift rerun --remaining --detail verbose --show-raw
-sift watch "what changed between cycles?" < watcher-output.txt
-sift exec --watch "what changed between cycles?" -- node watcher.js
-sift exec --preset typecheck-summary -- npm run typecheck
-sift exec --preset lint-failures -- eslint .
+sift exec "what changed?" -- git diff
 sift exec --preset audit-critical -- npm audit
 sift exec --preset infra-risk -- terraform plan
-sift agent install codex --dry-run
 ```
 
-
-
-For most repos, this is the whole story:
+`sift exec` runs the child command, captures its output, reduces it, and preserves the original exit code.
 
-
-
-
-sift rerun --remaining --detail focused
-```
-
-Mental model:
-- `sift escalate` = same cached output, deeper render
-- `sift rerun` = rerun the cached full command at `standard` and prepend what resolved, remained, or changed
-- `sift rerun --remaining` = rerun only the remaining failing pytest node IDs for a zoomed-in view
-- `sift watch` / `sift exec --watch` = treat redraw-style output as cycles and summarize what changed
-- `Decision: stop and act` = trust the current diagnosis and go read or fix code
-- `Decision: zoom` = one deeper sift pass is justified before raw
-- `Decision: raw only if exact traceback is required` = raw is last resort, not the next default step
-
-If your project uses `pytest`, `vitest`, `jest`, `bun test`, or another test runner instead of `npm test`, use the same preset with that command.
+Useful flags:
+- `--dry-run`: show the reduced input and prompt without calling the provider
+- `--show-raw`: print the captured raw output to `stderr`
 
-
-1. runs the child command
-2. captures `stdout` and `stderr`
-3. keeps the useful signal
-4. returns a short answer or JSON
-5. preserves the child command exit code
+## Test debugging workflow
 
-
-- `--dry-run`: show the reduced input and prompt without calling the provider
-- `--show-raw`: print the captured raw input to `stderr`
+This is the most common use case and where `sift` adds the most value.
 
-
+Think of it like this:
+- `standard` = map
+- `focused` or `rerun --remaining` = zoom
+- raw traceback = last resort
 
-
+For most repos, the whole story is:
 
 ```bash
-sift exec --preset test-status -- <test command>
+sift exec --preset test-status -- <test command>   # get the map
+sift rerun                                         # after a fix, refresh the truth
+sift rerun --remaining --detail focused            # zoom into what's still failing
 ```
 
-
+`test-status` becomes test-aware because you chose the preset. It does **not** infer "this is a test command" from the runner name — use the same preset with `pytest`, `vitest`, `jest`, `bun test`, or any other runner.
 
-
-1. `sift exec --preset test-status -- <test command>`
-2. `sift rerun`
-3. `sift rerun --remaining --detail focused`
-4. `sift rerun --remaining --detail verbose`
-5. `sift rerun --remaining --detail verbose --show-raw`
-6. raw pytest only if exact traceback lines are still needed
+If `standard` already names the failure buckets, counts, and hints, stop there and read code. If it ends with `Decision: zoom`, do one deeper pass before falling back to raw traceback.
 
-
+### What `sift` returns for each failure family
 
-
+- `Shared blocker` — one setup problem affecting many tests
+- A named family such as import, timeout, network, migration, or assertion
+- `Anchor` — the first file, line window, or search term worth opening
+- `Fix` — the likely next move
+- `Decision` — whether to stop here or zoom one step deeper
+- `Next` — the smallest practical action
 
-
+### Detail levels
 
-
+- `standard` — short summary, no file list (default)
+- `focused` — groups failures by error type, shows a few representative tests
+- `verbose` — flat list of all visible failing tests with their normalized reason
 
-
+### Example output
 
-
+Single failure family:
+```text
+- Tests did not complete.
+- 114 errors occurred during collection.
+- Import/dependency blocker: repeated collection failures are caused by missing dependencies.
+- Anchor: path/to/failing_test.py
+- Fix: Install the missing dependencies and rerun the affected tests.
+- Decision: stop and act. Do not escalate unless you need exact traceback lines.
+- Next: Fix bucket 1 first, then rerun the full suite at standard.
+```
 
-
-
-
-
+Multiple failure families in one pass:
+```text
+- Tests did not pass.
+- 3 tests failed. 124 errors occurred.
+- Shared blocker: DB-isolated tests are missing a required test env var.
+  Anchor: search <TEST_ENV_VAR> in path/to/test_setup.py
+  Fix: Set the required test env var and rerun the suite.
+- Contract drift: snapshot expectations are out of sync with the current API or model state.
+  Anchor: search <route-or-entity> in path/to/freeze_test.py
+  Fix: Review the drift and regenerate the snapshots if the change is intentional.
+- Decision: stop and act.
+- Next: Fix bucket 1 first, then rerun the full suite at standard.
 ```
 
-
-- `remaining_summary` and `resolved_summary` keep the answer small
-- `read_targets` points to the first file or line worth reading
-- `read_targets.context_hint` can tell an agent to read only a small line window first
-- if `context_hint` only includes `search_hint`, search for that string before reading the whole file
-- `remaining_subset_available` tells you whether `sift rerun --remaining` can zoom safely
+### Recommended debugging order
 
-
+1. `sift exec --preset test-status -- <test command>` — get the map.
+2. If `standard` already shows root cause, `Anchor`, and `Fix`, trust it and act.
+3. `sift escalate` — deeper render of the same cached output, without rerunning.
+4. `sift rerun` — after a fix, refresh the full-suite truth at `standard`.
+5. `sift rerun --remaining --detail focused` — zoom into what is still failing.
+6. `sift rerun --remaining --detail verbose`
+7. `sift rerun --remaining --detail verbose --show-raw`
+8. Raw test command only if exact traceback lines are still needed.
 
-
-sift exec --preset test-status --goal diagnose --format json --include-test-ids -- pytest -q
-```
+`sift rerun --remaining` currently supports only cached `pytest` or `python -m pytest` runs. For other runners, rerun a narrowed command manually with `sift exec --preset test-status -- <narrowed command>`.
 
-
+### Quick glossary
+
+- `sift escalate` = same cached output, deeper render
+- `sift rerun` = rerun the cached command at `standard`, show what resolved or remained
+- `sift rerun --remaining` = rerun only the remaining failing test nodes
+- `Decision: stop and act` = trust the diagnosis and go fix code
+- `Decision: zoom` = one deeper sift pass is justified before raw
 
 ## Watch mode
 
-Use watch mode when
+Use watch mode when output redraws or repeats across cycles:
 
 ```bash
 sift watch "what changed between cycles?" < watcher-output.txt
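The new README describes `sift exec` as a run-capture-reduce wrapper that preserves the child's exit code. As an illustrative shell sketch of that capture-and-preserve pattern only — not sift's actual implementation; `run_and_capture` and `captured_len` are hypothetical names invented here:

```shell
# Run a child command, capture its combined output for later reduction,
# and return the child's own exit code unchanged.
run_and_capture() {
  local out
  out=$("$@" 2>&1)          # capture stdout+stderr; $? is the child's status
  local code=$?
  captured_len=${#out}       # stand-in for "hand the output to a summarizer"
  return $code
}

# A noisy child that fails with a specific exit code:
run_and_capture bash -c 'echo "lots of noisy build output"; exit 3'
child_code=$?
echo "child exit code preserved: $child_code"
```

The point of the pattern is that callers (shell scripts, CI steps) can keep branching on the exit code exactly as if the wrapper were not there.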
@@ -181,116 +167,22 @@ sift exec --watch "what changed between cycles?" -- node watcher.js
 sift exec --watch --preset test-status -- pytest -f
 ```
 
-`sift watch` keeps the current summary and change summary together:
 - cycle 1 = current state
 - later cycles = what changed, what resolved, what stayed, and the next best action
 - for `test-status`, resolved tests drop out and remaining failures stay in focus
 
-
-
-## `test-status` detail modes
-
-If you are running `npm test` and want `sift` to check the result, use `--preset test-status`.
-
-`test-status` becomes test-aware because you chose the preset. It does **not** infer "this is a test command" from `pytest`, `vitest`, `npm test`, or any other runner name.
-
-Available detail levels:
-
-- `standard`
-  - short default summary
-  - no file list
-- `focused`
-  - groups failures by error type
-  - shows a few representative failing tests or modules
-- `verbose`
-  - flat list of visible failing tests or modules and their normalized reason
-  - useful when Codex needs to know exactly what to fix first
-
-Examples:
-
-```bash
-sift exec --preset test-status -- npm test
-sift rerun
-sift rerun --remaining --detail focused
-sift rerun --remaining --detail verbose
-sift rerun --remaining --detail verbose --show-raw
-```
+## Diagnose JSON
 
-
+Start with text. Use JSON only when automation needs machine-readable output:
 
 ```bash
-sift exec --preset test-status -- pytest
-sift rerun
-sift rerun --remaining --detail focused
-sift rerun --remaining --detail verbose --show-raw
-```
-
-`sift rerun --remaining` currently supports only cached argv-mode `pytest ...` or `python -m pytest ...` runs. If the cached command is not subset-capable, run a narrowed pytest command manually with `sift exec --preset test-status -- <narrowed pytest command>`.
-
-Typical shapes:
-
-`standard`
-```text
-- Tests did not complete.
-- 114 errors occurred during collection.
-- Import/dependency blocker: repeated collection failures are caused by missing dependencies.
-- Anchor: path/to/failing_test.py
-- Fix: Install the missing dependencies and rerun the affected tests.
-- Decision: stop and act. Do not escalate unless you need exact traceback lines.
-- Next: Fix bucket 1 first, then rerun the full suite at standard.
-- Stop signal: diagnosis complete; raw not needed.
-```
-
-`standard` can also separate more than one failure family in a single pass:
-```text
-- Tests did not pass.
-- 3 tests failed. 124 errors occurred.
-- Shared blocker: DB-isolated tests are missing a required test env var.
-- Anchor: search <TEST_ENV_VAR> in path/to/test_setup.py
-- Fix: Set the required test env var and rerun the suite.
-- Contract drift: snapshot expectations are out of sync with the current API or model state.
-- Anchor: search <route-or-entity> in path/to/freeze_test.py
-- Fix: Review the drift and regenerate the snapshots if the change is intentional.
-- Decision: stop and act. Do not escalate unless you need exact traceback lines.
-- Next: Fix bucket 1 first, then rerun the full suite at standard. Secondary buckets are already visible behind it.
-- Stop signal: diagnosis complete; raw not needed.
-```
-
-`focused`
-```text
-- Tests did not complete.
-- 114 errors occurred during collection.
-- Import/dependency blocker: missing dependencies are blocking collection.
-- Missing modules include <module-a>, <module-b>.
-- path/to/test_a.py -> missing module: <module-a>
-- path/to/test_b.py -> missing module: <module-b>
-- Hint: Install the missing dependencies and rerun the affected tests.
-- Next: Fix bucket 1 first, then rerun the full suite at standard.
-- Stop signal: diagnosis complete; raw not needed.
+sift exec --preset test-status --goal diagnose --format json -- pytest -q
+sift rerun --goal diagnose --format json
 ```
 
-`
-```text
-- Tests did not complete.
-- 114 errors occurred during collection.
-- Import/dependency blocker: missing dependencies are blocking collection.
-- path/to/test_a.py -> missing module: <module-a>
-- path/to/test_b.py -> missing module: <module-b>
-- path/to/test_c.py -> missing module: <module-c>
-- Hint: Install the missing dependencies and rerun the affected tests.
-- Next: Fix bucket 1 first, then rerun the full suite at standard.
-- Stop signal: diagnosis complete; raw not needed.
-```
+The JSON is summary-first: `remaining_summary`, `resolved_summary`, `read_targets` with optional `context_hint`, and `remaining_subset_available` to tell you whether `sift rerun --remaining` can zoom safely.
 
-
-1. Use `standard` for the full suite first.
-2. Treat `standard` as the map. If it already shows bucket-level root cause, `Anchor`, and `Fix`, trust it and report or act from there directly.
-3. Use `sift escalate` only when you want a deeper render of the same cached output without rerunning the command.
-4. After fixing something, run `sift rerun` to refresh the full-suite truth at `standard`.
-5. Only then use `sift rerun --remaining --detail focused` as the zoom lens after the full-suite truth is refreshed.
-6. Then use `sift rerun --remaining --detail verbose`.
-7. Then use `sift rerun --remaining --detail verbose --show-raw`.
-8. Fall back to the raw pytest command only if you still need exact traceback lines for the remaining failing subset.
+Add `--include-test-ids` only when you need every raw failing test ID.
 
 ## Built-in presets
 
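The new "Diagnose JSON" section says automation should gate the zoom step on `remaining_subset_available`. A rough shell sketch of that gating, under loud assumptions: the JSON shape below is hypothetical (only the field name comes from the README), and the `grep` check is a crude stand-in for a real JSON parser:

```shell
# Hypothetical diagnose payload; the real sift output shape may differ.
json='{"remaining_subset_available": true, "remaining_summary": "2 tests still failing"}'

# Gate the zoom step: only use `sift rerun --remaining` when the cached run
# is subset-capable, otherwise fall back to a manually narrowed command.
if printf '%s' "$json" | grep -q '"remaining_subset_available": true'; then
  next_step="sift rerun --remaining --detail focused"
else
  next_step="sift exec --preset test-status -- <narrowed command>"
fi
echo "next: $next_step"
```

In real automation a proper JSON tool (e.g. `jq`) would replace the `grep`, but the branching logic is the same.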
@@ -303,8 +195,6 @@ Recommended debugging order for tests:
 - `build-failure`: explain the most likely build failure
 - `log-errors`: extract the most relevant error signals
 
-List or inspect them:
-
 ```bash
 sift presets list
 sift presets show test-status
@@ -312,27 +202,14 @@ sift presets show test-status
 
 ## Agent setup
 
-
-
-Repo scope is the default because it is safer:
+`sift` can install a managed instruction block so Codex or Claude Code uses `sift` by default for long command output:
 
 ```bash
-sift agent show codex
-sift agent show codex --raw
-sift agent install codex --dry-run
-sift agent install codex --dry-run --raw
 sift agent install codex
 sift agent install claude
 ```
 
-
-
-```bash
-sift agent install codex --scope global
-sift agent install claude --scope global
-```
-
-Useful commands:
+This writes a managed block to `AGENTS.md` or `CLAUDE.md` in the current repo. Use `--dry-run` to preview, or `--scope global` for machine-wide instructions.
 
 ```bash
 sift agent status
@@ -340,78 +217,33 @@ sift agent remove codex
 sift agent remove claude
 ```
 
-
+## CI usage
 
-
-- writes to `AGENTS.md` or `CLAUDE.md` by default in the current repo
-- uses marked managed blocks instead of rewriting the whole file
-- preserves your surrounding notes and instructions
-- can use global files when you explicitly choose `--scope global`
-- keeps previews short by default
-- shows the exact managed block or final dry-run content only with `--raw`
-
-What the managed block tells the agent:
-- start with `sift` for long non-interactive command output so the agent spends less context-window and token budget on raw logs
-- for tests, begin with the normal `test-status` summary
-- if `standard` already identifies the main buckets, stop there instead of escalating automatically
-- use `sift escalate` only for the same cached output when more detail is needed without rerunning the command
-- after a fix, refresh the truth with `sift rerun`
-- only then zoom into the remaining failing pytest subset with `sift rerun --remaining --detail focused`, then `verbose`, then `--show-raw`
-- fall back to the raw test command only when exact traceback lines are still needed
-
-## CI-friendly usage
-
-Some commands succeed technically but should still block CI. `--fail-on` handles that for the built-in semantic presets that have stable machine-readable output:
+Some commands succeed technically but should still block CI. `--fail-on` handles that:
 
 ```bash
 sift exec --preset audit-critical --fail-on -- npm audit
 sift exec --preset infra-risk --fail-on -- terraform plan
 ```
 
-Supported presets for `--fail-on`:
-- `audit-critical`
-- `infra-risk`
-
 ## Config
 
-Useful commands:
-
 ```bash
-sift config
-sift config
-sift config show
+sift config show            # masks secrets by default
+sift config show --show-secrets
 sift config validate
-sift doctor
 ```
 
-`sift config show` masks secrets by default. Use `--show-secrets` only when you explicitly need raw values.
-
 Config precedence:
 1. CLI flags
 2. environment variables
-3. repo-local `sift.config.yaml`
-4. machine-wide `~/.config/sift/config.yaml`
+3. repo-local `sift.config.yaml`
+4. machine-wide `~/.config/sift/config.yaml`
 5. built-in defaults
 
-
-
-To compare raw pytest output against the `test-status` reduction ladder on fixed fixtures, run:
-
-```bash
-npm run bench:test-status-ab
-npm run bench:test-status-live
-```
-
-This uses the real `o200k_base` tokenizer and reports both:
-- command-output budget as the primary benchmark
-- deterministic recipe-budget comparisons as supporting evidence only
-- live-session scorecards for captured mixed full-suite agent transcripts
-
-The benchmark is meant to show context-window and command-output reduction first. In normal debugging flows, `test-status` should usually stop at `standard`; `focused` and `verbose` are escalation tools, and raw pytest is the last resort when exact traceback evidence is still needed.
+If you pass `--config <path>`, that path is strict — missing paths are errors.
 
-
-
-Minimal config example:
+Minimal YAML config:
 
 ```yaml
 provider:
@@ -430,54 +262,37 @@ runtime:
   rawFallback: true
 ```
 
-## OpenAI vs OpenAI-compatible
-
-Use `provider: openai` for `api.openai.com`.
-
-Use `provider: openai-compatible` for third-party compatible gateways or self-hosted endpoints.
-
-For OpenAI:
-```bash
-export OPENAI_API_KEY=your_openai_api_key
-```
-
-For third-party compatible endpoints, use either the endpoint-native env var or:
-
-```bash
-export SIFT_PROVIDER_API_KEY=your_provider_api_key
-```
-
-Known compatible env fallbacks include:
-- `OPENROUTER_API_KEY`
-- `TOGETHER_API_KEY`
-- `GROQ_API_KEY`
-
 ## Safety and limits
 
 - redaction is optional and regex-based
-- retriable provider failures
-- `sift exec` detects
-- pipe mode does not preserve upstream
+- retriable provider failures (`429`, timeouts, `5xx`) are retried once
+- `sift exec` detects interactive prompts (`[y/N]`, `password:`) and skips reduction
+- pipe mode does not preserve upstream pipeline failures; use `set -o pipefail` if needed
 
 ## Releasing
 
 This repo uses a manual GitHub Actions release workflow with npm trusted publishing.
 
-Release flow:
 1. bump `package.json`
 2. merge to `main`
 3. run the `release` workflow manually
 
 The workflow runs typecheck, tests, coverage, build, packaging smoke checks, npm publish, tag creation, and GitHub Release creation.
 
-
+Release notes: if `release-notes/v<version>.md` or `release-notes/<version>.md` exists, the workflow uses it. Otherwise it falls back to GitHub generated notes.
 
-
+## Maintainer benchmark
+
+```bash
+npm run bench:test-status-ab
+npm run bench:test-status-live
+```
+
+Uses the `o200k_base` tokenizer and reports command-output budget as the primary benchmark, with deterministic recipe-budget comparisons and live-session scorecards as supporting evidence.
+
+## Brand assets
 
-
-- badge/app: teal, black, monochrome
-- icon-only: teal, black, monochrome
-- 24px icon: teal, black, monochrome
+Logo assets live in `assets/brand/`: badge/app, icon-only, and 24px icon variants in teal, black, and monochrome.
 
 ## License
 
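The new "Safety and limits" bullet about pipe mode and `set -o pipefail` is easy to verify directly. A minimal shell demonstration of why the hint matters (nothing here is sift-specific; it is plain shell pipeline semantics):

```shell
# By default, a pipeline's exit status is the LAST command's status,
# so an upstream failure is silently swallowed by a successful `cat`.
bash -c 'false | cat'
no_pipefail=$?

# With pipefail, any failing stage makes the whole pipeline fail.
bash -c 'set -o pipefail; false | cat'
with_pipefail=$?

echo "without pipefail: $no_pipefail, with pipefail: $with_pipefail"
```

This is why piping a command into a summarizer can hide the command's failure: without `pipefail`, the shell only reports the status of the final stage.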