npm - @bilalimamoglu/sift - Versions diffs - 0.4.4 → 0.5.0 - Mend

@bilalimamoglu/sift 0.4.4 → 0.5.0

Files changed (5)

package/README.md CHANGED

@@ -4,9 +4,9 @@
 # sift
-### Turn noisy command output into actionable diagnoses for your coding agent
+### Turn noisy command output into a short, actionable first pass for your coding agent
-**Benchmark-backed test triage - Heuristic-first reductions - Agent-ready terminal workflows**
+**Local heuristics first. Group repeated failures into likely root causes and next steps before your agent reads the full log.**
 [![npm version](https://img.shields.io/npm/v/@bilalimamoglu/sift)](https://www.npmjs.com/package/@bilalimamoglu/sift)
 [![license](https://img.shields.io/github/license/bilalimamoglu/sift)](LICENSE)
@@ -21,7 +21,7 @@
 npm install -g @bilalimamoglu/sift
 ```
-<sub>Works with pytest, vitest, jest, tsc, ESLint, webpack, Cargo, terraform, npm audit, and more.</sub>
+<sub>Best today on noisy pytest, vitest, jest, `tsc`, ESLint, common build failures, `npm audit`, and `terraform plan` output.</sub>
 </div>
@@ -29,14 +29,16 @@ npm install -g @bilalimamoglu/sift
 ## Why Sift?
-When an agent hits noisy output, it burns budget reading logs instead of fixing the problem.
+When an agent hits noisy output, it can eventually make sense of the log wall, but it wastes time and tokens getting there.
-`sift` sits in front of that output and reduces it into a small, actionable first pass. Your agent reads the diagnosis, not the wall of text.
+`sift` narrows that output locally first. It groups repeated failures, surfaces likely root causes, and points to the next useful step so your agent starts from signal instead of raw noise.
+It is not a generic repo summarizer, not a shell telemetry product, and not a benchmark dashboard. It is a local-first triage layer for noisy command output in coding-agent workflows.
 Turn 13,000 lines of test output into 2 root causes.
 <p align="center">
-  <img src="assets/readme/test-status-demo.gif" alt="sift turning a pytest failure wall into a short diagnosis" width="960" />
+  <img src="assets/readme/test-status-demo.gif" alt="sift turning a pytest failure wall into a short, actionable first pass" width="960" />
 </p>
 With `sift`, the same run becomes:
@@ -53,13 +55,56 @@ With `sift`, the same run becomes:
 - Decision: stop and act.
 ```
-In the largest benchmark fixture, sift compressed 198,026 raw output tokens to 129. That is what the agent reads instead of the full log.
+In one large `test-status` benchmark fixture, `sift` compressed 198,026 raw output tokens to 129. That is scoped proof for a noisy test-debugging case, not a promise that every preset behaves the same way.
+---
+## Quick Start
+### 1. Install
+```bash
+npm install -g @bilalimamoglu/sift
+```
+Requires Node.js 20+.
+### 2. Try the main workflow
+If you are new, start here and ignore hook beta and native surfaces for now:
+```bash
+sift exec --preset test-status -- pytest -q
+```
+Other common entry points:
+```bash
+sift exec --preset test-status -- npx vitest run
+sift exec --preset test-status -- npx jest
+sift exec "what changed?" -- git diff
+```
+### 3. Zoom only if needed
+Think of the workflow like this:
+- `standard` = map
+- `focused` = zoom
+- raw traceback = last resort
+```bash
+sift rerun
+sift rerun --remaining --detail focused
+```
+If `standard` already gives you the likely root cause, anchor, and fix, stop there and act.
 ---
 ## Benchmark Results
-The output reduction above measures a single command's raw output. The table below measures the full end-to-end debug session: how many tokens, tool calls, and seconds the agent spends to reach the same diagnosis.
+The output reduction above measures a single command's raw output. The table below measures one replayed end-to-end debug loop: how many tokens, tool calls, and seconds the agent spent to reach the same outcome in that benchmarked scenario.
 Real debug loop on a 640-test Python backend with 124 repeated setup errors, 3 contract failures, and 511 passing tests:
@@ -69,9 +114,9 @@ Real debug loop on a 640-test Python backend with 124 repeated setup errors, 3 c
 | Tool calls | 40.8 | 12 | 71% fewer |
 | Wall-clock time | 244s | 85s | 65% faster |
 | Commands | 15.5 | 6 | 61% fewer |
-| Diagnosis | Same | Same | Same outcome |
+| Outcome | Same | Same | Same outcome |
-Same diagnosis, less agent thrash.
+Same outcome, less agent thrash.
 Methodology and caveats: [BENCHMARK_NOTES.md](BENCHMARK_NOTES.md)
@@ -83,7 +128,7 @@ Methodology and caveats: [BENCHMARK_NOTES.md](BENCHMARK_NOTES.md)
 1. **Capture output.** Run the noisy command or accept already-existing piped output.
 2. **Run local heuristics.** Detect known failure shapes first so common cases stay cheap and deterministic.
-3. **Return the diagnosis.** When heuristics are confident, `sift` gives the agent the root cause, anchor, and next step.
+3. **Return a useful first pass.** When heuristics are confident, `sift` gives the agent grouped failures, likely root causes, and the next step.
 4. **Fall back only when needed.** If heuristics are not enough, `sift` uses a cheaper model instead of spending your main agent budget.
 Your agent spends tokens fixing, not reading.
@@ -96,13 +141,13 @@ Your agent spends tokens fixing, not reading.
 <tr>
 <td width="33%" valign="top">
-### Test Failure Triage
-Collapse repeated pytest, vitest, and jest failures into a short diagnosis with root-cause buckets, anchors, and fix hints.
+### Test Failure Guidance
+Collapse repeated pytest, vitest, and jest failures into grouped issues with likely root causes, anchors, and fix hints.
 </td>
 <td width="33%" valign="top">
-### Typecheck and Lint Reduction
+### Typecheck and Lint Guidance
 Group noisy `tsc` and ESLint output into the few issues that actually matter instead of dumping the whole log back into the model.
 </td>
@@ -129,7 +174,7 @@ Every built-in preset tries local parsing first. When the heuristic handles the
 <td width="33%" valign="top">
 ### Agent and Automation Friendly
-Use `sift` in Codex, Claude, CI, hooks, or shell scripts so downstream tooling gets short, structured answers instead of raw noise.
+Use `sift` in Codex, Claude, CI, hooks, or shell scripts when you want downstream tooling to receive a short first pass instead of the raw log wall.
 </td>
 </tr>
@@ -137,90 +182,69 @@ Use `sift` in Codex, Claude, CI, hooks, or shell scripts so downstream tooling g
 ---
-## Setup and Agent Integration
-Most built-in presets run entirely on local heuristics with no API key needed. For presets that fall back to a model (`diff-summary`, `log-errors`, or when heuristics are not confident enough), sift supports OpenAI-compatible and OpenRouter-compatible endpoints.
-Set up the provider first, then install the managed instruction block for the agent you want to steer:
+## Presets
-```bash
-sift config setup
-sift doctor
-sift agent install codex
-sift agent install claude
-```
+| Preset | What it does | Needs provider? |
+|--------|--------------|:---------------:|
+| `test-status` | Groups pytest, vitest, and jest failures into root-cause buckets with anchors and fix suggestions. | No |
+| `typecheck-summary` | Parses `tsc` output and groups issues by error code. | No |
+| `lint-failures` | Parses ESLint output and groups failures by rule. | No |
+| `build-failure` | Extracts the first concrete build error from common toolchains. | Fallback only |
+| `contract-drift` | Detects explicit snapshot, golden, OpenAPI, manifest, or generated-artifact drift without broadening into generic repo analysis. | Fallback only |
+| `audit-critical` | Pulls high and critical `npm audit` findings. | No |
+| `infra-risk` | Detects destructive signals in `terraform plan`. | No |
+| `diff-summary` | Summarizes change sets and likely risks in diff output. | Yes |
+| `log-errors` | Extracts the strongest error signals from noisy logs. | Fallback only |
-You can also preview, inspect, or remove those blocks:
+When output already exists in a pipeline, use pipe mode instead of `exec`:
 ```bash
-sift agent show codex
-sift agent status
-sift agent remove codex
+pytest -q 2>&1 | sift preset test-status
+npm audit 2>&1 | sift preset audit-critical
 ```
-Command-first details live in [docs/cli-reference.md](docs/cli-reference.md).
 ---
-## Quick Start
+## Setup and Agent Integration
-### 1. Install
+If you want deeper integration after the first successful `sift exec` run, start with:
 ```bash
-npm install -g @bilalimamoglu/sift
+sift install
 ```
-Requires Node.js 20+.
+Most built-in presets run entirely on local heuristics with no API key required. If you want deeper fallback for ambiguous cases, `sift` also supports OpenAI-compatible and OpenRouter-compatible endpoints.
-### 2. Run Sift in front of a noisy command
+During install, pick the mode that matches reality:
+- `agent-escalation`: `sift` gives the first answer, then your agent keeps going
+- `provider-assisted`: `sift` itself can ask a cheap fallback model when needed
+- `local-only`: keep everything local
-```bash
-sift exec --preset test-status -- pytest -q
-```
+Runtime-native files are small guidance surfaces, not a second execution system:
+- Codex: managed `AGENTS.md` block plus a generated `SKILL.md`
+- Claude: managed `CLAUDE.md` block plus a generated `.claude/commands/sift/` command pack
+- Cursor: optional `.cursor/skills/sift/SKILL.md` path when you want an explicit native Cursor skill
-Other common entry points:
+Default rule:
+- use `sift exec` for the normal first pass
+- use `sift hook` only as an optional beta shortcut for a tiny known-command set
-```bash
-sift exec --preset test-status -- npx vitest run
-sift exec --preset test-status -- npx jest
-sift exec "what changed?" -- git diff
-```
-### 3. Zoom only if needed
-Think of the workflow like this:
-- `standard` = map
-- `focused` = zoom
-- raw traceback = last resort
+Optional local evidence surfaces:
 ```bash
-sift rerun
-sift rerun --remaining --detail focused
+sift gain
+sift discover
 ```
-If `standard` already gives you the root cause, anchor, and fix, stop there and act.
+- `gain` shows local, metadata-only first-pass history
+- `discover` stays quiet unless your own local history is strong enough to justify a concrete suggestion
----
+If you want the full install, ownership, and touched-files details, see [docs/cli-reference.md](docs/cli-reference.md). The short version: `sift` does **not** write shell rc files, PATH entries, git hooks, or arbitrary repo files during install.
-## Presets
-| Preset | What it does | Needs provider? |
-|--------|--------------|:---------------:|
-| `test-status` | Groups pytest, vitest, and jest failures into root-cause buckets with anchors and fix suggestions. | No |
-| `typecheck-summary` | Parses `tsc` output and groups issues by error code. | No |
-| `lint-failures` | Parses ESLint output and groups failures by rule. | No |
-| `build-failure` | Extracts the first concrete build error from common toolchains. | Fallback only |
-| `audit-critical` | Pulls high and critical `npm audit` findings. | No |
-| `infra-risk` | Detects destructive signals in `terraform plan`. | No |
-| `diff-summary` | Summarizes change sets and likely risks in diff output. | Yes |
-| `log-errors` | Extracts the strongest error signals from noisy logs. | Fallback only |
-When output already exists in a pipeline, use pipe mode instead of `exec`:
+If you want this repo's tracked pre-push verification hook to actually run on your machine, you still need to activate it once:
 ```bash
-pytest -q 2>&1 | sift preset test-status
-npm audit 2>&1 | sift preset audit-critical
+npm run setup:hooks
 ```
 ---
@@ -250,6 +274,8 @@ sift exec --preset test-status --goal diagnose --format json -- pytest -q
 sift rerun --goal diagnose --format json
 ```
+Diagnose JSON is summary-first on purpose. If `read_targets.anchor_kind=traceback` and `read_targets.context_hint.kind=exact_window`, read that narrow range first. If the read target is lower-confidence or `search_only`, treat it as a representative hint rather than exact root-cause proof.
 ---
 ## Limitations
@@ -257,7 +283,8 @@ sift rerun --goal diagnose --format json
 - sift adds the most value when output is long, repetitive, and shaped by a small number of root causes. For short, obvious failures it may not save much.
 - The deepest local heuristic coverage is in test debugging (pytest, vitest, jest). Other presets have solid heuristics but less depth.
 - sift does not help with interactive or TUI-based commands.
-- When heuristics cannot explain the output confidently, sift falls back to a provider. If no provider is configured, it returns what the heuristics could extract and signals that raw output may still be needed.
+- sift is not a generic repo summarizer or broad mismatch detector. It works best when the command output itself carries strong failure or drift evidence.
+- When heuristics cannot explain the output confidently, sift either falls back to a provider or returns the strongest local first pass it can, depending on how you choose to use it.
 ---
@@ -279,7 +306,7 @@ MIT
 <div align="center">
-Built for agent-first terminal workflows.
+Local-first output guidance for coding agents.
 [Report Bug](https://github.com/bilalimamoglu/sift/issues) | [Request Feature](https://github.com/bilalimamoglu/sift/issues)