@bilalimamoglu/sift 0.4.4 → 0.5.0
- package/README.md +103 -76
- package/dist/cli.js +11005 -7333
- package/dist/index.d.ts +62 -4
- package/dist/index.js +1009 -57
- package/package.json +14 -3
package/README.md
CHANGED
@@ -4,9 +4,9 @@
 
 # sift
 
-### Turn noisy command output into actionable
+### Turn noisy command output into a short, actionable first pass for your coding agent
 
-**
+**Local heuristics first. Group repeated failures into likely root causes and next steps before your agent reads the full log.**
 
 [](https://www.npmjs.com/package/@bilalimamoglu/sift)
 [](LICENSE)
@@ -21,7 +21,7 @@
 npm install -g @bilalimamoglu/sift
 ```
 
-<sub>
+<sub>Best today on noisy pytest, vitest, jest, `tsc`, ESLint, common build failures, `npm audit`, and `terraform plan` output.</sub>
 
 </div>
 
@@ -29,14 +29,16 @@ npm install -g @bilalimamoglu/sift
 
 ## Why Sift?
 
-When an agent hits noisy output, it
+When an agent hits noisy output, it can eventually make sense of the log wall, but it wastes time and tokens getting there.
 
-`sift`
+`sift` narrows that output locally first. It groups repeated failures, surfaces likely root causes, and points to the next useful step so your agent starts from signal instead of raw noise.
+
+It is not a generic repo summarizer, not a shell telemetry product, and not a benchmark dashboard. It is a local-first triage layer for noisy command output in coding-agent workflows.
 
 Turn 13,000 lines of test output into 2 root causes.
 
 <p align="center">
-<img src="assets/readme/test-status-demo.gif" alt="sift turning a pytest failure wall into a short
+<img src="assets/readme/test-status-demo.gif" alt="sift turning a pytest failure wall into a short, actionable first pass" width="960" />
 </p>
 
 With `sift`, the same run becomes:
@@ -53,13 +55,56 @@ With `sift`, the same run becomes:
 - Decision: stop and act.
 ```
 
-In
+In one large `test-status` benchmark fixture, `sift` compressed 198,026 raw output tokens to 129. That is scoped proof for a noisy test-debugging case, not a promise that every preset behaves the same way.
+
+---
+
+## Quick Start
+
+### 1. Install
+
+```bash
+npm install -g @bilalimamoglu/sift
+```
+
+Requires Node.js 20+.
+
+### 2. Try the main workflow
+
+If you are new, start here and ignore hook beta and native surfaces for now:
+
+```bash
+sift exec --preset test-status -- pytest -q
+```
+
+Other common entry points:
+
+```bash
+sift exec --preset test-status -- npx vitest run
+sift exec --preset test-status -- npx jest
+sift exec "what changed?" -- git diff
+```
+
+### 3. Zoom only if needed
+
+Think of the workflow like this:
+
+- `standard` = map
+- `focused` = zoom
+- raw traceback = last resort
+
+```bash
+sift rerun
+sift rerun --remaining --detail focused
+```
+
+If `standard` already gives you the likely root cause, anchor, and fix, stop there and act.
 
 ---
 
 ## Benchmark Results
 
-The output reduction above measures a single command's raw output. The table below measures
+The output reduction above measures a single command's raw output. The table below measures one replayed end-to-end debug loop: how many tokens, tool calls, and seconds the agent spent to reach the same outcome in that benchmarked scenario.
 
 Real debug loop on a 640-test Python backend with 124 repeated setup errors, 3 contract failures, and 511 passing tests:
 
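Review note: the Quick Start flow added in this hunk can be sketched as a small helper that only builds the documented command lines. The function names `first_pass_command` and `zoom_commands` are hypothetical; the `sift` arguments themselves are copied from the README lines above, and nothing here executes `sift`.

```python
import shlex


def first_pass_command(test_command, preset="test-status"):
    """Build the `sift exec` invocation from step 2 for a noisy test command."""
    # Everything after `--` is the wrapped command, passed through unchanged.
    return ["sift", "exec", "--preset", preset, "--", *shlex.split(test_command)]


def zoom_commands():
    """Follow-up commands from step 3, in the order the README lists them."""
    return [
        ["sift", "rerun"],
        ["sift", "rerun", "--remaining", "--detail", "focused"],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so this works without sift installed.
    print(shlex.join(first_pass_command("pytest -q")))
    for command in zoom_commands():
        print(shlex.join(command))
```

If `sift` is installed, each list could be handed to `subprocess.run`; per the README text, the zoom commands are only worth running when the `standard` first pass does not already give a root cause.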
@@ -69,9 +114,9 @@ Real debug loop on a 640-test Python backend with 124 repeated setup errors, 3 c
 | Tool calls | 40.8 | 12 | 71% fewer |
 | Wall-clock time | 244s | 85s | 65% faster |
 | Commands | 15.5 | 6 | 61% fewer |
-|
+| Outcome | Same | Same | Same outcome |
 
-Same
+Same outcome, less agent thrash.
 
 Methodology and caveats: [BENCHMARK_NOTES.md](BENCHMARK_NOTES.md)
 
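Review note: the percentages in the table rows above can be checked with a few lines of arithmetic. This sketch assumes the first numeric column is the baseline run and the second is the run with `sift` (the table header row is not part of this hunk); `percent_reduction` is an illustrative helper, not something from the package.

```python
def percent_reduction(baseline, reduced):
    """Percentage by which `reduced` is smaller than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - reduced) / baseline * 100.0


# (metric, first value, second value) copied from the table rows in this hunk.
ROWS = [
    ("Tool calls", 40.8, 12),
    ("Wall-clock time (s)", 244, 85),
    ("Commands", 15.5, 6),
]

if __name__ == "__main__":
    for name, baseline, with_sift in ROWS:
        print(f"{name}: {percent_reduction(baseline, with_sift):.0f}% lower")
    # Single-fixture output compression quoted earlier in the README diff.
    print(f"Output tokens: {percent_reduction(198_026, 129):.2f}% lower")
```

The rounded results agree with the 71%, 65%, and 61% figures in the rows, and the 198,026 to 129 token fixture works out to a reduction of roughly 99.93%.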
@@ -83,7 +128,7 @@ Methodology and caveats: [BENCHMARK_NOTES.md](BENCHMARK_NOTES.md)
 
 1. **Capture output.** Run the noisy command or accept already-existing piped output.
 2. **Run local heuristics.** Detect known failure shapes first so common cases stay cheap and deterministic.
-3. **Return
+3. **Return a useful first pass.** When heuristics are confident, `sift` gives the agent grouped failures, likely root causes, and the next step.
 4. **Fall back only when needed.** If heuristics are not enough, `sift` uses a cheaper model instead of spending your main agent budget.
 
 Your agent spends tokens fixing, not reading.
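Review note: the four steps above describe a heuristics-first pipeline with an optional fallback. The following is a minimal sketch of that shape, not `sift`'s actual implementation: the signature normalisation, the `min_share` confidence threshold, and the `fallback` callable are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical normalisation: replace hex addresses and numbers so that
# repeated failures differing only in ids or ports share one signature.
_VOLATILE = re.compile(r"0x[0-9a-fA-F]+|\d+")


def signature(line):
    return _VOLATILE.sub("<n>", line.strip())


def group_failures(lines):
    """Step 2: bucket failure-looking lines by normalised signature."""
    buckets = defaultdict(list)
    for line in lines:
        if "Error" in line or "FAILED" in line:
            buckets[signature(line)].append(line)
    return buckets


def first_pass(lines, fallback=None, min_share=0.8):
    """Steps 3 and 4: answer locally when a few groups dominate, else fall back."""
    buckets = group_failures(lines)
    ranked = sorted(buckets.items(), key=lambda kv: len(kv[1]), reverse=True)
    total = sum(len(hits) for _, hits in ranked)
    top_two = sum(len(hits) for _, hits in ranked[:2])
    confident = total > 0 and top_two / total >= min_share
    if confident or fallback is None:
        return {"source": "heuristic",
                "groups": [(sig, len(hits)) for sig, hits in ranked]}
    # The fallback stands in for the cheaper model mentioned in step 4.
    return {"source": "fallback", "groups": fallback(lines)}


if __name__ == "__main__":
    # Synthetic lines, loosely shaped like the benchmark description.
    demo = [f"ERROR test_setup_{i} - ConnectionError: port 5432 refused" for i in range(124)]
    demo += [f"FAILED test_contract_{i} - AssertionError: schema mismatch" for i in range(3)]
    for sig, count in first_pass(demo)["groups"]:
        print(count, sig)
```

The design point is that the cheap deterministic path runs first, and the fallback is only invoked when grouping leaves too many unrelated signatures to call any of them a root cause.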
@@ -96,13 +141,13 @@ Your agent spends tokens fixing, not reading.
 <tr>
 <td width="33%" valign="top">
 
-### Test Failure
-Collapse repeated pytest, vitest, and jest failures into
+### Test Failure Guidance
+Collapse repeated pytest, vitest, and jest failures into grouped issues with likely root causes, anchors, and fix hints.
 
 </td>
 <td width="33%" valign="top">
 
-### Typecheck and Lint
+### Typecheck and Lint Guidance
 Group noisy `tsc` and ESLint output into the few issues that actually matter instead of dumping the whole log back into the model.
 
 </td>
@@ -129,7 +174,7 @@ Every built-in preset tries local parsing first. When the heuristic handles the
 <td width="33%" valign="top">
 
 ### Agent and Automation Friendly
-Use `sift` in Codex, Claude, CI, hooks, or shell scripts
+Use `sift` in Codex, Claude, CI, hooks, or shell scripts when you want downstream tooling to receive a short first pass instead of the raw log wall.
 
 </td>
 </tr>
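Review note: the feature cards above mention grouping noisy `tsc` output into a few issues. As a rough illustration of that idea (not the package's parser), the sketch below counts diagnostics per error code, assuming the plain non-pretty `tsc` line shape `file(line,col): error TSxxxx: message`. The names `TSC_LINE` and `group_by_error_code` are hypothetical.

```python
import re
from collections import Counter

# Matches plain tsc diagnostics such as:
#   src/app.ts(12,5): error TS2322: Type 'string' is not assignable to type 'number'.
TSC_LINE = re.compile(
    r"^(?P<file>.+?)\((?P<line>\d+),(?P<col>\d+)\): error (?P<code>TS\d+): (?P<message>.*)$"
)


def group_by_error_code(output):
    """Count tsc diagnostics per error code, ignoring lines that are not diagnostics."""
    counts = Counter()
    for raw in output.splitlines():
        match = TSC_LINE.match(raw.strip())
        if match:
            counts[match.group("code")] += 1
    return counts


if __name__ == "__main__":
    sample = "\n".join([
        "src/app.ts(12,5): error TS2322: Type 'string' is not assignable to type 'number'.",
        "src/api.ts(40,9): error TS2322: Type 'null' is not assignable to type 'User'.",
        "src/util.ts(3,17): error TS7006: Parameter 'x' implicitly has an 'any' type.",
        "Found 3 errors.",
    ])
    for code, count in group_by_error_code(sample).most_common():
        print(code, count)
```

Grouping by code first means an agent sees "TS2322 twice, TS7006 once" before it reads any individual diagnostic.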
@@ -137,90 +182,69 @@ Use `sift` in Codex, Claude, CI, hooks, or shell scripts so downstream tooling g
 
 ---
 
-##
-
-Most built-in presets run entirely on local heuristics with no API key needed. For presets that fall back to a model (`diff-summary`, `log-errors`, or when heuristics are not confident enough), sift supports OpenAI-compatible and OpenRouter-compatible endpoints.
-
-Set up the provider first, then install the managed instruction block for the agent you want to steer:
+## Presets
 
-
-
-
-
-
-
+| Preset | What it does | Needs provider? |
+|--------|--------------|:---------------:|
+| `test-status` | Groups pytest, vitest, and jest failures into root-cause buckets with anchors and fix suggestions. | No |
+| `typecheck-summary` | Parses `tsc` output and groups issues by error code. | No |
+| `lint-failures` | Parses ESLint output and groups failures by rule. | No |
+| `build-failure` | Extracts the first concrete build error from common toolchains. | Fallback only |
+| `contract-drift` | Detects explicit snapshot, golden, OpenAPI, manifest, or generated-artifact drift without broadening into generic repo analysis. | Fallback only |
+| `audit-critical` | Pulls high and critical `npm audit` findings. | No |
+| `infra-risk` | Detects destructive signals in `terraform plan`. | No |
+| `diff-summary` | Summarizes change sets and likely risks in diff output. | Yes |
+| `log-errors` | Extracts the strongest error signals from noisy logs. | Fallback only |
 
-
+When output already exists in a pipeline, use pipe mode instead of `exec`:
 
 ```bash
-sift
-sift
-sift agent remove codex
+pytest -q 2>&1 | sift preset test-status
+npm audit 2>&1 | sift preset audit-critical
 ```
 
-Command-first details live in [docs/cli-reference.md](docs/cli-reference.md).
-
 ---
 
-##
+## Setup and Agent Integration
 
-
+If you want deeper integration after the first successful `sift exec` run, start with:
 
 ```bash
-
+sift install
 ```
 
-
+Most built-in presets run entirely on local heuristics with no API key required. If you want deeper fallback for ambiguous cases, `sift` also supports OpenAI-compatible and OpenRouter-compatible endpoints.
 
-
+During install, pick the mode that matches reality:
+- `agent-escalation`: `sift` gives the first answer, then your agent keeps going
+- `provider-assisted`: `sift` itself can ask a cheap fallback model when needed
+- `local-only`: keep everything local
 
-
-
-
+Runtime-native files are small guidance surfaces, not a second execution system:
+- Codex: managed `AGENTS.md` block plus a generated `SKILL.md`
+- Claude: managed `CLAUDE.md` block plus a generated `.claude/commands/sift/` command pack
+- Cursor: optional `.cursor/skills/sift/SKILL.md` path when you want an explicit native Cursor skill
 
-
+Default rule:
+- use `sift exec` for the normal first pass
+- use `sift hook` only as an optional beta shortcut for a tiny known-command set
 
-
-sift exec --preset test-status -- npx vitest run
-sift exec --preset test-status -- npx jest
-sift exec "what changed?" -- git diff
-```
-
-### 3. Zoom only if needed
-
-Think of the workflow like this:
-
-- `standard` = map
-- `focused` = zoom
-- raw traceback = last resort
+Optional local evidence surfaces:
 
 ```bash
-sift
-sift
+sift gain
+sift discover
 ```
 
-
+- `gain` shows local, metadata-only first-pass history
+- `discover` stays quiet unless your own local history is strong enough to justify a concrete suggestion
 
-
+If you want the full install, ownership, and touched-files details, see [docs/cli-reference.md](docs/cli-reference.md). The short version: `sift` does **not** write shell rc files, PATH entries, git hooks, or arbitrary repo files during install.
 
-
-
-| Preset | What it does | Needs provider? |
-|--------|--------------|:---------------:|
-| `test-status` | Groups pytest, vitest, and jest failures into root-cause buckets with anchors and fix suggestions. | No |
-| `typecheck-summary` | Parses `tsc` output and groups issues by error code. | No |
-| `lint-failures` | Parses ESLint output and groups failures by rule. | No |
-| `build-failure` | Extracts the first concrete build error from common toolchains. | Fallback only |
-| `audit-critical` | Pulls high and critical `npm audit` findings. | No |
-| `infra-risk` | Detects destructive signals in `terraform plan`. | No |
-| `diff-summary` | Summarizes change sets and likely risks in diff output. | Yes |
-| `log-errors` | Extracts the strongest error signals from noisy logs. | Fallback only |
-
-When output already exists in a pipeline, use pipe mode instead of `exec`:
+If you want this repo's tracked pre-push verification hook to actually run on your machine, you still need to activate it once:
 
 ```bash
-
-npm audit 2>&1 | sift preset audit-critical
+npm run setup:hooks
 ```
 
 ---
@@ -250,6 +274,8 @@ sift exec --preset test-status --goal diagnose --format json -- pytest -q
 sift rerun --goal diagnose --format json
 ```
 
+Diagnose JSON is summary-first on purpose. If `read_targets.anchor_kind=traceback` and `read_targets.context_hint.kind=exact_window`, read that narrow range first. If the read target is lower-confidence or `search_only`, treat it as a representative hint rather than exact root-cause proof.
+
 ---
 
 ## Limitations
@@ -257,7 +283,8 @@ sift rerun --goal diagnose --format json
 - sift adds the most value when output is long, repetitive, and shaped by a small number of root causes. For short, obvious failures it may not save much.
 - The deepest local heuristic coverage is in test debugging (pytest, vitest, jest). Other presets have solid heuristics but less depth.
 - sift does not help with interactive or TUI-based commands.
--
+- sift is not a generic repo summarizer or broad mismatch detector. It works best when the command output itself carries strong failure or drift evidence.
+- When heuristics cannot explain the output confidently, sift either falls back to a provider or returns the strongest local first pass it can, depending on how you choose to use it.
 
 ---
 
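Review note: the diagnose JSON guidance added earlier in this diff reads as a small decision rule, sketched below. Only the field names `anchor_kind` and `context_hint.kind` and the values `traceback`, `exact_window`, and `search_only` come from the README text; the shape of a single read-target object, the function name `reading_strategy`, and the returned labels are assumptions.

```python
def reading_strategy(target):
    """Decide how much to trust one read target from diagnose JSON.

    `target` is assumed to be a dict for a single read target; the real
    payload layout is not shown in this diff.
    """
    hint_kind = (target.get("context_hint") or {}).get("kind")
    if target.get("anchor_kind") == "traceback" and hint_kind == "exact_window":
        # Narrow, high-confidence range: read it before anything else.
        return "read_exact_window"
    # Lower-confidence or search-only targets are representative hints,
    # not proof of the root cause.
    return "treat_as_hint"


if __name__ == "__main__":
    exact = {"anchor_kind": "traceback", "context_hint": {"kind": "exact_window"}}
    loose = {"anchor_kind": "traceback", "context_hint": {"kind": "search_only"}}
    print(reading_strategy(exact))
    print(reading_strategy(loose))
```

Defaulting to the cautious branch for anything unrecognised keeps an agent from over-trusting a target whose confidence it cannot determine.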
@@ -279,7 +306,7 @@ MIT
 
 <div align="center">
 
-
+Local-first output guidance for coding agents.
 
 [Report Bug](https://github.com/bilalimamoglu/sift/issues) | [Request Feature](https://github.com/bilalimamoglu/sift/issues)
 