@delfini/drift-engine 0.1.0
- package/README.md +172 -0
- package/dist/diff-filter.d.ts +33 -0
- package/dist/diff-filter.d.ts.map +1 -0
- package/dist/diff-filter.js +579 -0
- package/dist/doc-scope.d.ts +119 -0
- package/dist/doc-scope.d.ts.map +1 -0
- package/dist/doc-scope.js +260 -0
- package/dist/index.d.ts +11 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +46 -0
- package/dist/prompt-budget.d.ts +2 -0
- package/dist/prompt-budget.d.ts.map +1 -0
- package/dist/prompt-budget.js +16 -0
- package/dist/prompt-builder.d.ts +21 -0
- package/dist/prompt-builder.d.ts.map +1 -0
- package/dist/prompt-builder.js +267 -0
- package/dist/reconcile.d.ts +17 -0
- package/dist/reconcile.d.ts.map +1 -0
- package/dist/reconcile.js +290 -0
- package/dist/relevance.d.ts +73 -0
- package/dist/relevance.d.ts.map +1 -0
- package/dist/relevance.js +266 -0
- package/dist/schema.d.ts +293 -0
- package/dist/schema.d.ts.map +1 -0
- package/dist/schema.js +50 -0
- package/dist/types.d.ts +81 -0
- package/dist/types.d.ts.map +1 -0
- package/dist/types.js +6 -0
- package/package.json +39 -0
- package/src/prompt.md +360 -0
package/README.md
ADDED
@@ -0,0 +1,172 @@
# @delfini/drift-engine

Pure-logic drift analysis core shared by `@delfini/action` (CI surface, `apps/action`) and `@delfini/cli` (Skill surface, `packages/cli`). No I/O, no LLM client, no credentials, no `fetch`. Runtime deps: `zod` + `picomatch` (both pure CPU: no I/O, no network, no env).

The package exists so that both surfaces consume the same prompt assembly, schema, and reconciliation logic; algorithm parity between the Action and the Skill therefore holds **by construction** (FR139, NFR44). A finding surfaced locally by the Skill is the same finding the Action will surface on the eventual PR.

## Public API

```ts
import {
  buildPrompt,
  validateAndReconcile,
  estimatePromptTokens,
  analysisSchema,
  type AnalysisInput,
  type AnalysisResult,
  type DocFile,
  type Contradiction,
  type Addition,
  type ClarifyingQuestion,
  type PRMetadata,
  type Severity,
} from '@delfini/drift-engine'
```

Internal helpers (`dedupeOverlappingContradictions`, `filterActionableContradictions`, `reconcileLineNumbers`, `reconcileAdditiveAnchors`, `ContradictionSchema`, `AdditionSchema`, `locateQuote`, `locateAnchorHeading`, etc.) are **not** re-exported. Callers compose only through the surface above.

## Relevance gating (opt-in)

`buildPrompt(input, template, options?)` accepts an optional third argument:

```ts
interface BuildPromptOptions {
  relevanceThreshold?: number
}
```

When `relevanceThreshold` is a positive integer, docs whose relevance score is below the threshold are dropped before the prompt is rendered. Scoring breakdown:

| Tier | Signal | Score |
|---|---|---|
| 1 | The doc itself appears in the diff (`diff --git a/<doc> ...`) | +20 |
| 2 | A code-file path from the diff appears in the doc body | +10 per file |
| 3 | An identifier from the diff appears in the doc body | +3 per identifier, capped at +30 |
| 4 | A heading whose terms overlap with diff identifiers | +5 per heading |

Default behaviour (`options` omitted, or `relevanceThreshold` is `undefined`, `0`, or non-finite) is observably no-op; the prompt-snapshot gate (NFR44 gate A) enforces byte-equality with the legacy output.

**Recommended starting threshold:** `5`. Going by the table, this keeps any doc with the doc-itself-in-diff signal (+20), at least one Tier-2 hit (+10), one Tier-4 heading hit (+5), or two Tier-3 identifier hits (+6). A doc whose only signal is a single Tier-3 identifier (+3) is dropped.
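
The tiered scoring in the table can be sketched as a small pure function. This is a minimal illustration, not the package's implementation: `DocSignals`, `scoreDoc`, and `keepDoc` are hypothetical names, the tiers are assumed to be additive, only Tier 3 is assumed to be capped (as the table states), and any finite value above zero is assumed to activate the gate.

```ts
// Hypothetical per-doc signal counts; the real engine derives these from the diff and doc body.
interface DocSignals {
  docInDiff: boolean      // Tier 1: the doc itself appears in the diff
  codePathHits: number    // Tier 2: code-file paths from the diff found in the doc body
  identifierHits: number  // Tier 3: diff identifiers found in the doc body
  headingHits: number     // Tier 4: headings whose terms overlap diff identifiers
}

// Sum the tiers from the table. Assumption: tiers are additive and only Tier 3 is capped.
function scoreDoc(s: DocSignals): number {
  const tier1 = s.docInDiff ? 20 : 0
  const tier2 = 10 * s.codePathHits
  const tier3 = Math.min(3 * s.identifierHits, 30)
  const tier4 = 5 * s.headingHits
  return tier1 + tier2 + tier3 + tier4
}

// Gating is a no-op unless the threshold is a finite number above zero.
function keepDoc(s: DocSignals, relevanceThreshold?: number): boolean {
  if (
    relevanceThreshold === undefined ||
    !Number.isFinite(relevanceThreshold) ||
    relevanceThreshold <= 0
  ) {
    return true
  }
  return scoreDoc(s) >= relevanceThreshold
}

const oneIdentifier: DocSignals = { docInDiff: false, codePathHits: 0, identifierHits: 1, headingHits: 0 }
const oneCodePath: DocSignals = { docInDiff: false, codePathHits: 1, identifierHits: 0, headingHits: 0 }

console.log(scoreDoc(oneIdentifier), keepDoc(oneIdentifier, 5)) // 3 false
console.log(scoreDoc(oneCodePath), keepDoc(oneCodePath, 5))     // 10 true
console.log(keepDoc(oneIdentifier))                             // true (gating disabled)
```

With a threshold of `5`, a lone identifier match is not enough to keep a doc, while a single code-path reference is.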

## NFR44 release gates

The extraction is observably no-op for the Action's public behaviour. Two release gates guard that invariant on every PR:

| Gate | Lives at | Catches | Misses (caught by the other gate) |
|---|---|---|---|
| **A: snapshot parity** | `packages/drift-engine/__tests__/prompt-snapshot.test.ts` | Any change to `prompt.md` text, `prompt-builder.ts` rendering, `prefixDocLines`, or `renderDocsBlock` that perturbs the rendered prompt | Schema / reconcile / orchestrator regressions (none flow through `buildPrompt`) |
| **B: Action pipeline** | `apps/action/src/__tests__/pipeline.test.ts` | Any change to `reconcile.ts`, `schema.ts`, orchestrator wiring, or the round-trip from LLM JSON → rendered PR comment / check verdict / intake payload | Pure prompt-text drift that the LLM-mocked fixtures don't surface |

Both gates must be green for a drift-engine PR to merge. Gate A catches regressions **before** the LLM ever runs; gate B catches everything downstream of the LLM.

## When to update the snapshot

Update `__tests__/fixtures/canonical-prompt.snapshot.md` **only** when the prompt change is intentional. Examples that require an update:

- Editing the text of any section of `src/prompt.md`.
- Adding, removing, or renaming a placeholder in `src/prompt.md` plus the corresponding substitution in `src/prompt-builder.ts`.
- Changing how `prefixDocLines` formats line-number prefixes (e.g. `${n}: ` → `${n}| `).
- Changing how `renderDocsBlock` iterates documents.

Examples that should **not** require a snapshot update:

- Edits to `reconcile.ts`, `schema.ts`, `prompt-budget.ts`, or `types.ts`; none of these flow through `buildPrompt`.
- Edits to `src/index.ts` re-exports.
- Tooling / config changes (`package.json`, `tsconfig.json`, ESLint rules).

A fixture-only change (editing `canonical-input.json` without touching the prompt) means the snapshot no longer matches, so the test fails. Always update fixture + snapshot together.

## How to update the snapshot

The snapshot is regenerated by hand to keep human-reviewer eyes on every byte of drift:

```bash
cd packages/drift-engine

# One-shot regeneration via a temporary node script. Inline so nothing checked
# into the repo can ever auto-update the snapshot.
node --import tsx -e "
import { readFileSync, writeFileSync } from 'node:fs'
import { fileURLToPath } from 'node:url'
import { buildPrompt } from './src/prompt-builder.ts'

const input = JSON.parse(readFileSync('./__tests__/fixtures/canonical-input.json', 'utf8'))
const template = readFileSync('./src/prompt.md', 'utf8')
writeFileSync('./__tests__/fixtures/canonical-prompt.snapshot.md', buildPrompt(input, template))
"

# Verify the test now passes
pnpm test prompt-snapshot.test
```

In the PR description, include:

1. **Which prompt change drove the snapshot update**: name the section of `prompt.md` or the rendering helper that changed.
2. **Confirmation that NFR44 gate B is unaffected**: `pnpm --filter @delfini/action test` still passes (gate B usually does, because the rewrite is at the prompt-text level, not the reconcile-shape level).
3. **Explicit reviewer sign-off on the snapshot diff**: at least one reviewer confirms every byte of drift is intentional.

## What NOT to do

- **Do not `it.skip` the snapshot test** to make a PR go green. If the test fires, either the change is intentional (update the snapshot per above) or unintentional (revert the source change).
- **Do not auto-update the snapshot via tooling** without reviewer eyes on the diff. The manual `node --import tsx` workflow above is intentional friction.
- **Do not use Vitest's `toMatchSnapshot` / `toMatchInlineSnapshot`.** Both auto-create on first run and auto-update on `--update` (`-u`), defeating the human-review-required semantics. The test uses `readFileSync` + a literal-string comparison so the snapshot is a real, reviewable file in the PR diff.
- **Do not change `core.autocrlf` / `.gitattributes`** to "fix" a Windows checkout that's producing CRLF snapshots. The repo's `.gitattributes` pins `packages/drift-engine/src/prompt.md`, `__tests__/fixtures/*.snapshot.md`, and `__tests__/fixtures/*.json` to `eol=lf`; renormalise with `git add --renormalize <path>` if a checkout drifted.
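
The literal-string comparison mentioned in the list can be sketched as a pure helper. This is an illustrative sketch rather than the repo's actual test: `firstMismatch` and `Mismatch` are hypothetical names, and in a real test the two strings would come from `readFileSync` on the snapshot file and from `buildPrompt`.

```ts
// Hypothetical helper: compare a rendered prompt against a snapshot byte-for-byte
// and report where the two first diverge. Pure string logic, no file access.
interface Mismatch {
  line: number      // 1-based line number of the first difference
  expected: string
  actual: string
}

function firstMismatch(expected: string, actual: string): Mismatch | null {
  if (expected === actual) return null
  const exp = expected.split('\n')
  const act = actual.split('\n')
  const n = Math.max(exp.length, act.length)
  for (let i = 0; i < n; i++) {
    if (exp[i] !== act[i]) {
      return { line: i + 1, expected: exp[i] ?? '<missing>', actual: act[i] ?? '<missing>' }
    }
  }
  return null // not reached when the strings differ; kept so every path returns
}

const snapshot = 'line one\nline two\n'
const crlfRender = 'line one\r\nline two\r\n'

console.log(firstMismatch(snapshot, snapshot))         // null
console.log(firstMismatch(snapshot, crlfRender)?.line) // 1
```

The second call shows why line endings matter here: a CRLF checkout differs from an LF snapshot on the very first line, so a strict comparison fails even though the text looks identical.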

## NFR49(b) Parity-Gate Policy

PRD v6.7 added the token-efficient retrieval stage (FR150–FR153): section-granularity doc retrieval (`selectRelevantSections`), deterministic diff pre-filtering (`filterDiff`), ranked-fill prompt budget (`rankedFillSections` + `buildPromptWithDrops`), and the working-tree-at-branch-HEAD doc-read invariant. The stage changes `buildPrompt` output **when enabled**, so it would trip the NFR44 release gates without discipline. The policy that keeps the gates green during roll-out:

### Two allowed paths and one forbidden

- **Path A: opt-in / default-off (V1 default).** Every retrieval knob ships behind a `BuildPromptOptions` field with an explicit no-op fast-path (`relevanceThreshold` and `promptTokenBudget` are unset by default, and a value that is `undefined`, `<= 0`, or non-finite keeps everything and yields identical output). The default `buildPrompt(input, template)` call path produces **byte-identical** output vs. the pre-v6.7 baseline. All three NFR44 release gates (A drift-engine snapshot, B Action pipeline, C bundled-CLI parity) pass with **no re-snapshot**. This is the V1-expected path and the path the FR150–FR152 stories landed on.
- **Path B: deliberate, reviewed re-snapshot.** When a `prompt.md` wording change or a default-on activation is genuinely required, the PR re-snapshots `canonical-prompt.snapshot.md` (gate A) and the bundled-CLI parity fixture (gate C) in lockstep. The PR description names the change explicitly with a sentence such as: *"intentional `prompt.md` re-snapshot under NFR49(b) Path B: <short rationale>."* Reviewers read the snapshot diff as part of code review; sign-off is required.
- **Path C: silent regen (FORBIDDEN).** Running `pnpm test -u` to make a failing snapshot diff disappear without explicit PR acknowledgement is a process violation. If the snapshot is moving and you cannot justify why, the source change is not yet ready.

### PR-description templates

Copy the right paragraph into the PR body:

**Path A:**

> **NFR49(b) parity statement (Path A: no prompt wording change / opt-in default-off).** The retrieval / filter / budget logic this PR adds is gated behind a `BuildPromptOptions` knob whose default (undefined / `<= 0` / non-finite) keeps every section and every diff hunk. `buildPrompt(input, template)` emits byte-identical output vs. the pre-PR baseline. All three NFR44 release gates (A drift-engine snapshot, B Action pipeline, C bundled-CLI parity) stay green with no re-snapshot.

**Path B:**

> **NFR49(b) parity statement (Path B: deliberate re-snapshot).** This PR sharpens `packages/drift-engine/src/prompt.md` (commit `<sha>`, lines `<range>`). The NFR44 gate A snapshot (`__tests__/prompt-snapshot.test.ts`'s `canonical-prompt.snapshot.md`) and the bundled-CLI parity gate C's expected output are deliberately re-snapped in lockstep. Rationale: `<one-sentence rationale>`. Reviewer sign-off on the snapshot diff: `<reviewer>`.

### Release-time recall

Per-commit recall is asserted via the **retention gate** in `__tests__/token-efficiency.test.ts` (LLM-free): for each labelled fixture in `__tests__/fixtures/token-efficiency/` and `__tests__/fixtures/residual-drift/`, the section identified by `expected.json`'s `groundTruthDocPath` + `groundTruthSection` MUST survive `selectRelevantSections` + `rankedFillSections`. A regression that drops it fails CI.

Behavioural LLM recall (does the model actually flag the drift in the assembled prompt?) is verified **at release** on `apps/action`'s NFR40 eval set, inherited by construction via the shared `buildPrompt`. **No LLM call runs in the drift-engine / CLI unit suite** (design-spec NG6: model-quality testing lives in the Action's eval harness). The `__tests__/token-efficiency.test.ts` suite is sub-second and dispatches no subagent.
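
The retention gate described in this section amounts to a membership check. The sketch below is an assumption-laden illustration: `KeptSection`, its `docPath` / `heading` fields, and `groundTruthSurvives` are hypothetical. Only `groundTruthDocPath` and `groundTruthSection` are named in this README, and the real output shape of `selectRelevantSections` / `rankedFillSections` may differ.

```ts
// Fields named in the README for each fixture's expected.json.
interface ExpectedFixture {
  groundTruthDocPath: string
  groundTruthSection: string
}

// Hypothetical shape for a section that survived retrieval and ranked fill.
interface KeptSection {
  docPath: string
  heading: string
}

// The gate passes only if the labelled ground-truth section is still present.
function groundTruthSurvives(kept: KeptSection[], expected: ExpectedFixture): boolean {
  return kept.some(
    (s) => s.docPath === expected.groundTruthDocPath && s.heading === expected.groundTruthSection,
  )
}

// Made-up fixture data for illustration only.
const expectedFixture: ExpectedFixture = {
  groundTruthDocPath: 'docs/api.md',
  groundTruthSection: '## Authentication',
}
const keptSections: KeptSection[] = [
  { docPath: 'docs/api.md', heading: '## Authentication' },
  { docPath: 'docs/setup.md', heading: '## Install' },
]

console.log(groundTruthSurvives(keptSections, expectedFixture))          // true
console.log(groundTruthSurvives(keptSections.slice(1), expectedFixture)) // false
```

Because the check is plain data comparison, it can run on every commit without any model call.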

### `scripts/measure-tokens.ts`: local iteration tool

For local before/after token measurement without re-running the whole vitest suite, use the `tsx` script:

```bash
# All fixtures in the corpus
node --import tsx packages/drift-engine/scripts/measure-tokens.ts

# A single fixture
node --import tsx packages/drift-engine/scripts/measure-tokens.ts \
  packages/drift-engine/__tests__/fixtures/token-efficiency/case-01-doc-heavy/analysis-input.json
```

Output is one line per case: `<case-slug>: tokens off=<N1> on=<N2> ratio=<r.rr> Δ=<Δ%>`. The script uses the same `estimatePromptTokens(buildPrompt(...))` math as the CI gate in `token-efficiency.test.ts`, so the live number and the CI number agree by construction. The script is not wired into CI, has no assertions, and is not surfaced as an npm script; it is purely a developer-iteration aid. Errors (missing fixture, malformed JSON) print to stderr and exit non-zero.

### Source pointers

- **PRD v6.7, NFR49** in `_bmad-output/planning-artifacts/prd.md` §"NFR49": normative target + parity-gate policy.
- **Architecture ADR (2026-06-03)** in `_bmad-output/planning-artifacts/architecture.md` §"Token-Efficient Drift-Analysis Retrieval": design rationale.
- **Story P3.7.5** in `_bmad-output/implementation-artifacts/skill-p3-7-5-token-efficiency-measurement-parity-gate.md`: the harness story that ratified this policy.

## Runtime constraints (lint-enforced)

The package's `eslint.config.js` rules forbid imports that would compromise its pure-logic posture:

- **No I/O**: no `fs`, `child_process`, `http`, `https`, `node:fs`, etc.
- **No LLM client**: no `@anthropic-ai/sdk`, `openai`, `@langchain/*`.
- **No env-var reads**: no `process.env`. Pure functions of explicit arguments.

Runtime deps: `zod` + `picomatch` (both pure CPU: no I/O, no network, no env). DevDeps: `vitest`, `@types/node`, `typescript`. Adding anything else is a regression.
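
One way such lint constraints could be expressed is with ESLint's core `no-restricted-imports` and `no-restricted-properties` rules. The package's real `eslint.config.js` is not shown here, so everything below is an assumption: the constant name `pureLogicConfig`, the `files` glob, the messages, and the exact module list (the `node:`-prefixed variants beyond `node:fs` are filled in as a guess at what "etc." covers).

```ts
// Hypothetical flat-config entries; in an eslint.config.js this array would be the default export.
const pureLogicConfig = [
  {
    files: ['src/**/*.ts'], // assumed source layout
    rules: {
      // Ban I/O modules and LLM client SDKs at import time.
      'no-restricted-imports': [
        'error',
        {
          paths: [
            'fs', 'child_process', 'http', 'https',
            'node:fs', 'node:child_process', 'node:http', 'node:https',
            '@anthropic-ai/sdk', 'openai',
          ],
          patterns: ['@langchain/*'],
        },
      ],
      // Ban environment reads so functions depend only on explicit arguments.
      'no-restricted-properties': [
        'error',
        { object: 'process', property: 'env', message: 'Pass configuration as explicit arguments.' },
      ],
    },
  },
]

console.log(Object.keys(pureLogicConfig[0].rules))
```

Enforcing purity at lint time means a stray `import fs from 'node:fs'` fails before review rather than surfacing as a runtime surprise in one of the two consuming surfaces.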
@@ -0,0 +1,33 @@
/** Why a path or hunk was dropped. */
export type DropReason = 'lockfile' | 'generated' | 'vendored' | 'fixture' | 'whitespace-only' | 'import-only';
export interface DroppedPath {
    path: string;
    reason: DropReason;
}
export interface DroppedHunk {
    path: string;
    hunkHeader: string;
    reason: DropReason;
}
export interface FilterDiffResult {
    /** The diff after dropping path-level and hunk-level noise. */
    keptDiff: string;
    /** Files dropped in their entirety. */
    droppedPaths: DroppedPath[];
    /** Individual hunks dropped from otherwise-kept files. */
    droppedHunks: DroppedHunk[];
}
/**
 * Filter a unified-diff string deterministically.
 *
 * The input shape is the same one `buildPrompt`'s `AnalysisInput.diff`
 * consumes: a sequence of `diff --git a/<path> b/<path>` blocks, each with a
 * `--- a/...` / `+++ b/...` preamble followed by one or more `@@ ... @@`
 * hunks. Content before the first `diff --git` header (rare in practice; git
 * does not emit any) is preserved as a leading "noise" segment so callers do
 * not lose surrounding context.
 *
 * Identical input → identical output (NFR46 reproducibility carries forward).
 */
export declare function filterDiff(diff: string): FilterDiffResult;
//# sourceMappingURL=diff-filter.d.ts.map
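
As a rough illustration of how a caller might consume a `FilterDiffResult`, the sketch below tallies drops per reason. The types are copied from the declarations above so the block is self-contained; `countDropsByReason` is a hypothetical helper, and the sample object is hand-written rather than real `filterDiff` output, so it makes no claim about how the engine classifies any particular file.

```ts
// Types copied from the declarations above so this sketch runs on its own.
type DropReason = 'lockfile' | 'generated' | 'vendored' | 'fixture' | 'whitespace-only' | 'import-only'
interface DroppedPath { path: string; reason: DropReason }
interface DroppedHunk { path: string; hunkHeader: string; reason: DropReason }
interface FilterDiffResult {
  keptDiff: string
  droppedPaths: DroppedPath[]
  droppedHunks: DroppedHunk[]
}

// Hypothetical helper: count dropped paths and dropped hunks per reason.
function countDropsByReason(result: FilterDiffResult): Map<DropReason, number> {
  const counts = new Map<DropReason, number>()
  for (const { reason } of [...result.droppedPaths, ...result.droppedHunks]) {
    counts.set(reason, (counts.get(reason) ?? 0) + 1)
  }
  return counts
}

// Hand-written stand-in for a `filterDiff(diff)` return value.
const sampleResult: FilterDiffResult = {
  keptDiff: 'diff --git a/src/app.ts b/src/app.ts\n',
  droppedPaths: [{ path: 'pnpm-lock.yaml', reason: 'lockfile' }],
  droppedHunks: [
    { path: 'src/app.ts', hunkHeader: '@@ -1,3 +1,3 @@', reason: 'import-only' },
    { path: 'src/util.ts', hunkHeader: '@@ -10,2 +10,2 @@', reason: 'import-only' },
  ],
}

const dropCounts = countDropsByReason(sampleResult)
console.log(dropCounts.get('lockfile'), dropCounts.get('import-only')) // 1 2
```

A summary like this could feed a debug log so reviewers can see what the pre-filter removed before the prompt was assembled.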
@@ -0,0 +1 @@
{"version":3,"file":"diff-filter.d.ts","sourceRoot":"","sources":["../src/diff-filter.ts"],"names":[],"mappings":"AAsBA,sCAAsC;AACtC,MAAM,MAAM,UAAU,GAClB,UAAU,GACV,WAAW,GACX,UAAU,GACV,SAAS,GACT,iBAAiB,GACjB,aAAa,CAAA;AAEjB,MAAM,WAAW,WAAW;IAC1B,IAAI,EAAE,MAAM,CAAA;IACZ,MAAM,EAAE,UAAU,CAAA;CACnB;AAED,MAAM,WAAW,WAAW;IAC1B,IAAI,EAAE,MAAM,CAAA;IACZ,UAAU,EAAE,MAAM,CAAA;IAClB,MAAM,EAAE,UAAU,CAAA;CACnB;AAED,MAAM,WAAW,gBAAgB;IAC/B,+DAA+D;IAC/D,QAAQ,EAAE,MAAM,CAAA;IAChB,uCAAuC;IACvC,YAAY,EAAE,WAAW,EAAE,CAAA;IAC3B,0DAA0D;IAC1D,YAAY,EAAE,WAAW,EAAE,CAAA;CAC5B;AAID;;;;;;;;;;;GAWG;AACH,wBAAgB,UAAU,CAAC,IAAI,EAAE,MAAM,GAAG,gBAAgB,CA0DzD"}