@bilalimamoglu/sift 0.4.4 → 0.5.0

Files changed (5)
  1. package/README.md +103 -76
  2. package/dist/cli.js +11005 -7333
  3. package/dist/index.d.ts +62 -4
  4. package/dist/index.js +1009 -57
  5. package/package.json +14 -3
package/README.md CHANGED
@@ -4,9 +4,9 @@
 
  # sift
 
- ### Turn noisy command output into actionable diagnoses for your coding agent
+ ### Turn noisy command output into a short, actionable first pass for your coding agent
 
- **Benchmark-backed test triage - Heuristic-first reductions - Agent-ready terminal workflows**
+ **Local heuristics first. Group repeated failures into likely root causes and next steps before your agent reads the full log.**
 
  [![npm version](https://img.shields.io/npm/v/@bilalimamoglu/sift)](https://www.npmjs.com/package/@bilalimamoglu/sift)
  [![license](https://img.shields.io/github/license/bilalimamoglu/sift)](LICENSE)
@@ -21,7 +21,7 @@
  npm install -g @bilalimamoglu/sift
  ```
 
- <sub>Works with pytest, vitest, jest, tsc, ESLint, webpack, Cargo, terraform, npm audit, and more.</sub>
+ <sub>Best today on noisy pytest, vitest, jest, `tsc`, ESLint, common build failures, `npm audit`, and `terraform plan` output.</sub>
 
  </div>
 
@@ -29,14 +29,16 @@ npm install -g @bilalimamoglu/sift
 
  ## Why Sift?
 
- When an agent hits noisy output, it burns budget reading logs instead of fixing the problem.
+ When an agent hits noisy output, it can eventually make sense of the log wall, but it wastes time and tokens getting there.
 
- `sift` sits in front of that output and reduces it into a small, actionable first pass. Your agent reads the diagnosis, not the wall of text.
+ `sift` narrows that output locally first. It groups repeated failures, surfaces likely root causes, and points to the next useful step so your agent starts from signal instead of raw noise.
+
+ It is not a generic repo summarizer, not a shell telemetry product, and not a benchmark dashboard. It is a local-first triage layer for noisy command output in coding-agent workflows.
 
  Turn 13,000 lines of test output into 2 root causes.
 
  <p align="center">
- <img src="assets/readme/test-status-demo.gif" alt="sift turning a pytest failure wall into a short diagnosis" width="960" />
+ <img src="assets/readme/test-status-demo.gif" alt="sift turning a pytest failure wall into a short, actionable first pass" width="960" />
  </p>
 
  With `sift`, the same run becomes:
@@ -53,13 +55,56 @@ With `sift`, the same run becomes:
  - Decision: stop and act.
  ```
 
- In the largest benchmark fixture, sift compressed 198,026 raw output tokens to 129. That is what the agent reads instead of the full log.
+ In one large `test-status` benchmark fixture, `sift` compressed 198,026 raw output tokens to 129. That is scoped proof for a noisy test-debugging case, not a promise that every preset behaves the same way.
+
+ ---
+
+ ## Quick Start
+
+ ### 1. Install
+
+ ```bash
+ npm install -g @bilalimamoglu/sift
+ ```
+
+ Requires Node.js 20+.
+
+ ### 2. Try the main workflow
+
+ If you are new, start here and ignore hook beta and native surfaces for now:
+
+ ```bash
+ sift exec --preset test-status -- pytest -q
+ ```
+
+ Other common entry points:
+
+ ```bash
+ sift exec --preset test-status -- npx vitest run
+ sift exec --preset test-status -- npx jest
+ sift exec "what changed?" -- git diff
+ ```
+
+ ### 3. Zoom only if needed
+
+ Think of the workflow like this:
+
+ - `standard` = map
+ - `focused` = zoom
+ - raw traceback = last resort
+
+ ```bash
+ sift rerun
+ sift rerun --remaining --detail focused
+ ```
+
+ If `standard` already gives you the likely root cause, anchor, and fix, stop there and act.
 
  ---
 
  ## Benchmark Results
 
- The output reduction above measures a single command's raw output. The table below measures the full end-to-end debug session: how many tokens, tool calls, and seconds the agent spends to reach the same diagnosis.
+ The output reduction above measures a single command's raw output. The table below measures one replayed end-to-end debug loop: how many tokens, tool calls, and seconds the agent spent to reach the same outcome in that benchmarked scenario.
 
  Real debug loop on a 640-test Python backend with 124 repeated setup errors, 3 contract failures, and 511 passing tests:
 
@@ -69,9 +114,9 @@ Real debug loop on a 640-test Python backend with 124 repeated setup errors, 3 c
  | Tool calls | 40.8 | 12 | 71% fewer |
  | Wall-clock time | 244s | 85s | 65% faster |
  | Commands | 15.5 | 6 | 61% fewer |
- | Diagnosis | Same | Same | Same outcome |
+ | Outcome | Same | Same | Same outcome |
 
- Same diagnosis, less agent thrash.
+ Same outcome, less agent thrash.
 
  Methodology and caveats: [BENCHMARK_NOTES.md](BENCHMARK_NOTES.md)
 
@@ -83,7 +128,7 @@ Methodology and caveats: [BENCHMARK_NOTES.md](BENCHMARK_NOTES.md)
 
  1. **Capture output.** Run the noisy command or accept already-existing piped output.
  2. **Run local heuristics.** Detect known failure shapes first so common cases stay cheap and deterministic.
- 3. **Return the diagnosis.** When heuristics are confident, `sift` gives the agent the root cause, anchor, and next step.
+ 3. **Return a useful first pass.** When heuristics are confident, `sift` gives the agent grouped failures, likely root causes, and the next step.
  4. **Fall back only when needed.** If heuristics are not enough, `sift` uses a cheaper model instead of spending your main agent budget.
 
  Your agent spends tokens fixing, not reading.
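
Reviewer note: the four steps in the hunk above describe a heuristics-first pipeline with a fallback signal. The sketch below is one hypothetical way to picture steps 2 through 4; it is not taken from `sift`'s source. The names `signatureOf`, `groupFailures`, and `firstPass`, the `ERROR`/`FAILED` markers, and the `maxGroups` threshold are all assumptions made for illustration.

```typescript
// Hypothetical sketch only: none of these names or thresholds come from sift.
interface FailureGroup {
  signature: string; // normalized failure line shared by the group
  count: number;     // how many raw lines collapsed into this group
  example: string;   // first raw line seen for the group
}

interface FirstPass {
  groups: FailureGroup[];
  needsFallback: boolean;
}

// Replace volatile details (hex ids, numbers, quoted strings) so that
// repeated failures collapse into the same signature.
function signatureOf(line: string): string {
  return line
    .replace(/0x[0-9a-f]+/gi, "<hex>")
    .replace(/\d+/g, "<n>")
    .replace(/'[^']*'|"[^"]*"/g, "<str>")
    .trim();
}

// Group failure-looking lines by signature, most frequent first.
function groupFailures(lines: string[]): FailureGroup[] {
  const groups = new Map<string, FailureGroup>();
  for (const line of lines) {
    if (!/\b(ERROR|FAILED)\b/.test(line)) continue;
    const signature = signatureOf(line);
    const existing = groups.get(signature);
    if (existing) {
      existing.count += 1;
    } else {
      groups.set(signature, { signature, count: 1, example: line });
    }
  }
  return Array.from(groups.values()).sort((a, b) => b.count - a.count);
}

// Treat the result as confident when a small number of groups explains
// the failures; otherwise signal that a fallback may be needed.
function firstPass(lines: string[], maxGroups: number = 5): FirstPass {
  const groups = groupFailures(lines);
  const confident = groups.length > 0 && groups.length <= maxGroups;
  return { groups, needsFallback: !confident };
}

const sample = [
  "ERROR tests/test_a.py::test_1 - ConnectionError: db on port 5432",
  "ERROR tests/test_a.py::test_2 - ConnectionError: db on port 5432",
  "FAILED tests/test_b.py::test_contract - AssertionError: expected 'v2'",
  "PASSED tests/test_c.py::test_ok",
];
console.log(firstPass(sample));
```

The point of the sketch is the shape of the decision, not the heuristics themselves: cheap deterministic grouping runs first, and the fallback flag is only raised when grouping cannot narrow the output.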
@@ -96,13 +141,13 @@ Your agent spends tokens fixing, not reading.
  <tr>
  <td width="33%" valign="top">
 
- ### Test Failure Triage
- Collapse repeated pytest, vitest, and jest failures into a short diagnosis with root-cause buckets, anchors, and fix hints.
+ ### Test Failure Guidance
+ Collapse repeated pytest, vitest, and jest failures into grouped issues with likely root causes, anchors, and fix hints.
 
  </td>
  <td width="33%" valign="top">
 
- ### Typecheck and Lint Reduction
+ ### Typecheck and Lint Guidance
  Group noisy `tsc` and ESLint output into the few issues that actually matter instead of dumping the whole log back into the model.
 
  </td>
@@ -129,7 +174,7 @@ Every built-in preset tries local parsing first. When the heuristic handles the
  <td width="33%" valign="top">
 
  ### Agent and Automation Friendly
- Use `sift` in Codex, Claude, CI, hooks, or shell scripts so downstream tooling gets short, structured answers instead of raw noise.
+ Use `sift` in Codex, Claude, CI, hooks, or shell scripts when you want downstream tooling to receive a short first pass instead of the raw log wall.
 
  </td>
  </tr>
@@ -137,90 +182,69 @@ Use `sift` in Codex, Claude, CI, hooks, or shell scripts so downstream tooling g
 
  ---
 
- ## Setup and Agent Integration
-
- Most built-in presets run entirely on local heuristics with no API key needed. For presets that fall back to a model (`diff-summary`, `log-errors`, or when heuristics are not confident enough), sift supports OpenAI-compatible and OpenRouter-compatible endpoints.
-
- Set up the provider first, then install the managed instruction block for the agent you want to steer:
+ ## Presets
 
- ```bash
- sift config setup
- sift doctor
- sift agent install codex
- sift agent install claude
- ```
+ | Preset | What it does | Needs provider? |
+ |--------|--------------|:---------------:|
+ | `test-status` | Groups pytest, vitest, and jest failures into root-cause buckets with anchors and fix suggestions. | No |
+ | `typecheck-summary` | Parses `tsc` output and groups issues by error code. | No |
+ | `lint-failures` | Parses ESLint output and groups failures by rule. | No |
+ | `build-failure` | Extracts the first concrete build error from common toolchains. | Fallback only |
+ | `contract-drift` | Detects explicit snapshot, golden, OpenAPI, manifest, or generated-artifact drift without broadening into generic repo analysis. | Fallback only |
+ | `audit-critical` | Pulls high and critical `npm audit` findings. | No |
+ | `infra-risk` | Detects destructive signals in `terraform plan`. | No |
+ | `diff-summary` | Summarizes change sets and likely risks in diff output. | Yes |
+ | `log-errors` | Extracts the strongest error signals from noisy logs. | Fallback only |
 
- You can also preview, inspect, or remove those blocks:
+ When output already exists in a pipeline, use pipe mode instead of `exec`:
 
  ```bash
- sift agent show codex
- sift agent status
- sift agent remove codex
+ pytest -q 2>&1 | sift preset test-status
+ npm audit 2>&1 | sift preset audit-critical
  ```
 
- Command-first details live in [docs/cli-reference.md](docs/cli-reference.md).
-
  ---
 
- ## Quick Start
+ ## Setup and Agent Integration
 
- ### 1. Install
+ If you want deeper integration after the first successful `sift exec` run, start with:
 
  ```bash
- npm install -g @bilalimamoglu/sift
+ sift install
  ```
 
- Requires Node.js 20+.
+ Most built-in presets run entirely on local heuristics with no API key required. If you want deeper fallback for ambiguous cases, `sift` also supports OpenAI-compatible and OpenRouter-compatible endpoints.
 
- ### 2. Run Sift in front of a noisy command
+ During install, pick the mode that matches reality:
+ - `agent-escalation`: `sift` gives the first answer, then your agent keeps going
+ - `provider-assisted`: `sift` itself can ask a cheap fallback model when needed
+ - `local-only`: keep everything local
 
- ```bash
- sift exec --preset test-status -- pytest -q
- ```
+ Runtime-native files are small guidance surfaces, not a second execution system:
+ - Codex: managed `AGENTS.md` block plus a generated `SKILL.md`
+ - Claude: managed `CLAUDE.md` block plus a generated `.claude/commands/sift/` command pack
+ - Cursor: optional `.cursor/skills/sift/SKILL.md` path when you want an explicit native Cursor skill
 
- Other common entry points:
+ Default rule:
+ - use `sift exec` for the normal first pass
+ - use `sift hook` only as an optional beta shortcut for a tiny known-command set
 
- ```bash
- sift exec --preset test-status -- npx vitest run
- sift exec --preset test-status -- npx jest
- sift exec "what changed?" -- git diff
- ```
-
- ### 3. Zoom only if needed
-
- Think of the workflow like this:
-
- - `standard` = map
- - `focused` = zoom
- - raw traceback = last resort
+ Optional local evidence surfaces:
 
  ```bash
- sift rerun
- sift rerun --remaining --detail focused
+ sift gain
+ sift discover
  ```
 
- If `standard` already gives you the root cause, anchor, and fix, stop there and act.
+ - `gain` shows local, metadata-only first-pass history
+ - `discover` stays quiet unless your own local history is strong enough to justify a concrete suggestion
 
- ---
+ If you want the full install, ownership, and touched-files details, see [docs/cli-reference.md](docs/cli-reference.md). The short version: `sift` does **not** write shell rc files, PATH entries, git hooks, or arbitrary repo files during install.
 
- ## Presets
-
- | Preset | What it does | Needs provider? |
- |--------|--------------|:---------------:|
- | `test-status` | Groups pytest, vitest, and jest failures into root-cause buckets with anchors and fix suggestions. | No |
- | `typecheck-summary` | Parses `tsc` output and groups issues by error code. | No |
- | `lint-failures` | Parses ESLint output and groups failures by rule. | No |
- | `build-failure` | Extracts the first concrete build error from common toolchains. | Fallback only |
- | `audit-critical` | Pulls high and critical `npm audit` findings. | No |
- | `infra-risk` | Detects destructive signals in `terraform plan`. | No |
- | `diff-summary` | Summarizes change sets and likely risks in diff output. | Yes |
- | `log-errors` | Extracts the strongest error signals from noisy logs. | Fallback only |
-
- When output already exists in a pipeline, use pipe mode instead of `exec`:
+ If you want this repo's tracked pre-push verification hook to actually run on your machine, you still need to activate it once:
 
  ```bash
- pytest -q 2>&1 | sift preset test-status
- npm audit 2>&1 | sift preset audit-critical
+ npm run setup:hooks
  ```
 
  ---
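
Reviewer note: the Presets table added in the hunk above can be read as a small lookup from preset name to provider requirement. The sketch below transcribes that table and pairs it with the `sift exec --preset <name> -- <command>` form shown in the README. The identifiers `PRESET_PROVIDER_NEED`, `localOnlyPresets`, and `siftExecArgs` are hypothetical helpers for a wrapper script, not part of the package's API.

```typescript
// Hypothetical helper for a wrapper script; only the preset names, the
// "Needs provider?" values, and the CLI argument order come from the README.
type ProviderNeed = "no" | "fallback-only" | "yes";

const PRESET_PROVIDER_NEED: Record<string, ProviderNeed> = {
  "test-status": "no",
  "typecheck-summary": "no",
  "lint-failures": "no",
  "build-failure": "fallback-only",
  "contract-drift": "fallback-only",
  "audit-critical": "no",
  "infra-risk": "no",
  "diff-summary": "yes",
  "log-errors": "fallback-only",
};

// Presets the table lists as needing no provider at all.
function localOnlyPresets(): string[] {
  return Object.keys(PRESET_PROVIDER_NEED).filter(
    (name) => PRESET_PROVIDER_NEED[name] === "no",
  );
}

// Build the argument list for `sift exec --preset <name> -- <command...>`.
// Rejects preset names that are not in the table above.
function siftExecArgs(preset: string, command: string[]): string[] {
  if (!Object.prototype.hasOwnProperty.call(PRESET_PROVIDER_NEED, preset)) {
    throw new Error(`unknown preset: ${preset}`);
  }
  return ["exec", "--preset", preset, "--"].concat(command);
}

console.log(localOnlyPresets().join(", "));
console.log(["sift"].concat(siftExecArgs("test-status", ["pytest", "-q"])).join(" "));
```

A wrapper like this could refuse to run `diff-summary` in a `local-only` setup before invoking the CLI, though whether that check is useful depends on how `sift` itself reports a missing provider.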
@@ -250,6 +274,8 @@ sift exec --preset test-status --goal diagnose --format json -- pytest -q
  sift rerun --goal diagnose --format json
  ```
 
+ Diagnose JSON is summary-first on purpose. If `read_targets.anchor_kind=traceback` and `read_targets.context_hint.kind=exact_window`, read that narrow range first. If the read target is lower-confidence or `search_only`, treat it as a representative hint rather than exact root-cause proof.
+
  ---
 
  ## Limitations
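
Reviewer note: the read-target guidance added in the hunk above amounts to a two-way decision for a consumer of the diagnose JSON. The sketch below shows one way an agent wrapper might encode it. Only the field names `anchor_kind` and `context_hint.kind` and the values `traceback`, `exact_window`, and `search_only` appear in the README; the `ReadTarget` interface, the assumption that each target is a flat object, and the `planRead` function are illustrative guesses about the JSON shape.

```typescript
// Assumed shape: the real diagnose JSON may nest or name things differently.
interface ReadTarget {
  anchor_kind?: string;
  context_hint?: { kind?: string };
}

type ReadPlan = "read_exact_range_first" | "treat_as_hint";

// Follow the README's rule: a traceback anchor with an exact window is worth
// reading first; anything else (including `search_only` or missing fields)
// is only a representative hint.
function planRead(target: ReadTarget): ReadPlan {
  const isExact =
    target.anchor_kind === "traceback" &&
    target.context_hint !== undefined &&
    target.context_hint.kind === "exact_window";
  return isExact ? "read_exact_range_first" : "treat_as_hint";
}

console.log(planRead({ anchor_kind: "traceback", context_hint: { kind: "exact_window" } }));
console.log(planRead({ context_hint: { kind: "search_only" } }));
```

Defaulting to `treat_as_hint` when fields are missing keeps the consumer conservative, which matches the README's framing of lower-confidence targets.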
@@ -257,7 +283,8 @@ sift rerun --goal diagnose --format json
  - sift adds the most value when output is long, repetitive, and shaped by a small number of root causes. For short, obvious failures it may not save much.
  - The deepest local heuristic coverage is in test debugging (pytest, vitest, jest). Other presets have solid heuristics but less depth.
  - sift does not help with interactive or TUI-based commands.
- - When heuristics cannot explain the output confidently, sift falls back to a provider. If no provider is configured, it returns what the heuristics could extract and signals that raw output may still be needed.
+ - sift is not a generic repo summarizer or broad mismatch detector. It works best when the command output itself carries strong failure or drift evidence.
+ - When heuristics cannot explain the output confidently, sift either falls back to a provider or returns the strongest local first pass it can, depending on how you choose to use it.
 
  ---
 
@@ -279,7 +306,7 @@ MIT
 
  <div align="center">
 
- Built for agent-first terminal workflows.
+ Local-first output guidance for coding agents.
 
  [Report Bug](https://github.com/bilalimamoglu/sift/issues) | [Request Feature](https://github.com/bilalimamoglu/sift/issues)