
mongo-query-normalizer 0.2.2 → 0.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md CHANGED

@@ -1,525 +1,191 @@
-# Mongo Query Normalizer
+# mongo-query-normalizer
-**English** | [中文](README.zh-CN.md)
-An **observable, level-based** normalizer for MongoDB query objects. It stabilizes query **shape** at the conservative default, and adds **`predicate`** and **`scope`** levels with **documented, test-backed contracts** (see [SPEC.md](SPEC.md) and [docs/normalization-matrix.md](docs/normalization-matrix.md) / [中文](docs/normalization-matrix.zh-CN.md)). It returns **predictable** output plus **metadata**—not a MongoDB planner optimizer.
-> **Default posture:** **`shape`** is the smallest, structural-only pass and the recommended default for the widest production use. **`predicate`** and **`scope`** apply additional conservative rewrites under explicit contracts; adopt them when you need those transforms and accept their modeled-operator scope (opaque operators stay preserved).
->
-> **As of `v0.2.0`:** predicate rewrites are intentionally narrowed to an explicitly validated surface (`eq.eq`, `eq.ne`, `eq.in`, `eq.range`, `range.range`). High-risk combinations (for example null-vs-missing, array-sensitive semantics, `$exists`/`$nin`, object-vs-dotted-path mixes, opaque mixes) remain conservative by design.
-> **Note:** `predicate.safetyPolicy.allowArraySensitiveRewrite` is **deprecated**. It no longer enables unsatisfiable deductions for `$eq`/`$in` non-membership (the normalizer must not emit `IMPOSSIBLE_SELECTOR` solely from `eq ∉ in` without schema).
+> A safe MongoDB query normalizer — **correctness over cleverness**
 ---
-## Why it exists
-- Query **shape** diverges across builders and hand-written filters.
-- Outputs can be **hard to compare**, log, or diff without a stable pass.
-- You need a **low-risk normalization layer** that defaults to conservative behavior.
-This library does **not** promise to make queries faster or to pick optimal indexes.
----
-## Features
-- **Level-based** normalization (`shape` → `predicate` → `scope`)
-- **Conservative default**: `shape` only out of the box (lowest-risk structural pass)
-- **Observable** `meta`: changed flags, applied/skipped rules, warnings, hashes, optional stats
-- **Stable / idempotent** output when rules apply (same options)
-- **Opaque fallback** for unsupported operators (passthrough, not semantically rewritten)
----
+## ✨ What it does
-## Install
+**Turn messy Mongo queries into clean, stable, and predictable ones — safely.**
-```bash
-npm install mongo-query-normalizer
-```
----
-## Quick start
-```ts
-import { normalizeQuery } from "mongo-query-normalizer";
-const result = normalizeQuery({
-    $and: [{ status: "open" }, { $and: [{ priority: { $gte: 1 } }] }],
-});
+```js
+// before
+{
+  $and: [
+    { status: "open" },
+    { status: { $in: ["open", "closed"] } }
+  ]
+}
-console.log(result.query);
-console.log(result.meta);
+// after
+{ status: "open" }
 ```
 ---
-## Complete usage guide
-### 1) Minimal usage (recommended default)
-```ts
-import { normalizeQuery } from "mongo-query-normalizer";
-const { query: normalizedQuery, meta } = normalizeQuery(inputQuery);
-```
-- Without `options`, default behavior is `level: "shape"`.
-- Best for low-risk structural stabilization: logging, cache-key normalization, query diff alignment.
-### 2) Pick a level explicitly
-```ts
-normalizeQuery(inputQuery, { level: "shape" }); // structural only (default)
-normalizeQuery(inputQuery, { level: "predicate" }); // modeled predicate cleanup
-normalizeQuery(inputQuery, { level: "scope" }); // scope propagation / conservative pruning
-```
-- `shape`: safest structural normalization.
-- `predicate`: dedupe / merge comparable predicates, and contradiction collapse only for **provably safe** modeled cases (no schema assumption; multikey/array fields stay conservative).
-- `scope`: adds inherited-constraint propagation and conservative branch decisions on top of `predicate`.
-### 3) Full `options` example
-```ts
-import { normalizeQuery } from "mongo-query-normalizer";
-const result = normalizeQuery(inputQuery, {
-    level: "scope",
-    rules: {
-        // shape-related
-        flattenLogical: true,
-        removeEmptyLogical: true,
-        collapseSingleChildLogical: true,
-        dedupeLogicalChildren: true,
-        // predicate-related
-        dedupeSameFieldPredicates: true,
-        mergeComparablePredicates: true,
-        collapseContradictions: true,
-        // ordering-related
-        sortLogicalChildren: true,
-        sortFieldPredicates: true,
-        // scope observe-only rule (no structural hoist)
-        detectCommonPredicatesInOr: true,
-    },
-    safety: {
-        maxNormalizeDepth: 32,
-        maxNodeGrowthRatio: 1.5,
-    },
-    observe: {
-        collectWarnings: true,
-        collectMetrics: false,
-        collectPredicateTraces: false,
-        collectScopeTraces: false,
-    },
-    predicate: {
-        safetyPolicy: {
-            // override only fields you care about
-        },
-    },
-    scope: {
-        safetyPolicy: {
-            // override only fields you care about
-        },
-    },
-});
-```
-### 4) Inspect resolved runtime options
+## ⚠️ Why this matters
-```ts
-import { resolveNormalizeOptions } from "mongo-query-normalizer";
+If you build dynamic queries, you will eventually get:
-const resolvedOptions = resolveNormalizeOptions({
-    level: "predicate",
-    observe: { collectMetrics: true },
-});
+* duplicated conditions
+* inconsistent query shapes
+* hard-to-debug filters
+* subtle semantic bugs
-console.log(resolvedOptions);
-```
+Most tools try to “optimize” queries.
-- Useful for debugging why a rule is enabled/disabled.
-- Useful for logging a startup-time normalization config snapshot.
+👉 This library does something different:
-### 5) Consume `query` and `meta`
+> **It only applies transformations that are provably safe.**
-```ts
-const { query: normalizedQuery, meta } = normalizeQuery(inputQuery, options);
+---
-if (meta.bailedOut) {
-    logger.warn({ reason: meta.bailoutReason }, "normalization bailed out");
-}
+## 🛡️ Safe by design
-if (meta.changed) {
-    logger.info(
-        {
-            level: meta.level,
-            beforeHash: meta.beforeHash,
-            afterHash: meta.afterHash,
-            appliedRules: meta.appliedRules,
-        },
-        "query normalized"
-    );
+```js
+// NOT simplified (correctly)
+{
+  $and: [
+    { uids: "1" },
+    { uids: "2" }
+  ]
 }
 ```
-- `query`: normalized query object.
-- `meta`: observability data (changed flag, rule traces, warnings, hashes, optional stats/traces).
-### 6) Typical integration patterns
+Why?
-```ts
-// A. Normalize centrally in data-access layer
-export function normalizeForFind(rawFilter) {
-    return normalizeQuery(rawFilter, { level: "shape" }).query;
-}
+Because MongoDB arrays can match both:
-// B. Use stronger convergence in offline paths
-export function normalizeForBatch(rawFilter) {
-    return normalizeQuery(rawFilter, { level: "predicate" }).query;
-}
+```js
+{ uids: ["1", "2"] }
 ```
-- Prefer `shape` for online request paths.
-- Enable `predicate` / `scope` when there is clear benefit plus test coverage.
-### 7) Errors and boundaries
-- Invalid `level` throws an error (for example, typos).
-- Unsupported or unknown operators are generally preserved as opaque; semantic merge behavior is not guaranteed for them.
-- The library target is stability and observability, not query planning optimization.
----
-## Default behavior
-- **Default `level` is `"shape"`** (see `resolveNormalizeOptions()`).
-- By default there is **no** predicate merge at `shape`. At **`scope`**, core work is inherited-constraint propagation and conservative branch decisions; **`detectCommonPredicatesInOr`** is an **optional, observe-only** rule (warnings / traces)—never a structural hoist.
-- The goal is **stability and observability**, not “smart optimization.”
----
-## Choosing a level
-- Use **`shape`** when you only need structural stabilization (flatten, dedupe children, ordering, etc.).
-- Use **`predicate`** when you need same-field dedupe, modeled comparable merges, and contradiction collapse on **modeled** operators; opaque subtrees stay preserved.
-- Use **`scope`** when you need inherited-constraint propagation, conservative pruning, and narrow coverage elimination as described in the spec and matrix. **`detectCommonPredicatesInOr`** (when enabled) is **observe-only** and does not rewrite structure.
-Authoritative behavior boundaries are in **[SPEC.md](SPEC.md)**, **[docs/normalization-matrix.md](docs/normalization-matrix.md)**, and contract tests under **`test/contracts/`**—not informal README prose alone.
----
-## Levels
-### `shape` (default)
-**Recommended default** for the lowest-risk path. Safe structural normalization only, for example:
-- flatten compound (`$and` / `$or`) nodes
-- remove empty compound nodes
-- collapse single-child compound nodes
-- dedupe compound children
-- canonical ordering
-### `predicate`
-On top of `shape`, conservative **predicate** cleanup on **modeled** operators:
-- dedupe same-field predicates
-- merge comparable predicates where modeled
-- collapse clear contradictions to an unsatisfiable filter
-- merge **direct** `$and` children that share the same field name before further predicate work (so contradictions like `{ $and: [{ a: 1 }, { a: 2 }] }` can be detected)
-### `scope`
-On top of `predicate`:
-- **Inherited constraint propagation** (phase-1 allowlist) and **conservative branch pruning**; **coverage elimination** only in narrow, tested cases when policy allows
-- Optional **`detectCommonPredicatesInOr`**: observe-only (warnings / traces); **no** structural rewrite
----
-## `meta` fields
-| Field | Meaning |
-|--------|---------|
-| `changed` | Structural/predicate output differs from input (hash-based) |
-| `level` | Resolved normalization level |
-| `appliedRules` / `skippedRules` | Rule tracing |
-| `warnings` | Non-fatal issues when `observe.collectWarnings` is enabled (rule notices, detection text, etc.) |
-| `bailedOut` | Safety stop; output reverts to pre-pass parse for that call |
-| `bailoutReason` | Why bailout happened, if any |
-| `beforeHash` / `afterHash` | Stable hashes for diffing |
-| `stats` | Optional before/after tree metrics (`observe.collectMetrics`) |
-| `predicateTraces` | When `observe.collectPredicateTraces`: per-field planner / skip / contradiction signals |
-| `scopeTrace` | When `observe.collectScopeTraces`: constraint extraction rejections + scope decision events |
----
-## Unsupported / opaque behavior
-Structures such as **`$nor`**, **`$regex`**, **`$not`**, **`$elemMatch`**, **`$expr`**, geo/text queries, and **unknown** operators are generally treated as **opaque**: they pass through or are preserved without full semantic rewriting. They are **not** guaranteed to participate in merge or contradiction logic.
 ---
-## Stability policy
-The **public contract** is:
+## ❌ What this is NOT
-- `normalizeQuery`
-- `resolveNormalizeOptions`
-- the exported **types** listed in the package entry
+* Not a query optimizer
+* Not an index advisor
+* Not a performance tool
-**Not** part of the public contract: internal AST, `parseQuery`, `compileQuery`, individual rules/passes, or utilities. They may change between versions.
+It will **never guess**:
----
-## Principles (explicit)
+* field cardinality
+* schema constraints
+* data distribution
-1. Default level is **`shape`**.
-2. **`predicate`** / **`scope`** may change structure while aiming for **semantic equivalence** on **modeled** operators.
-3. **Opaque** nodes are not rewritten semantically.
-4. Output should be **idempotent** under the same options when no bailout occurs.
-5. This library is **not** the MongoDB query planner or an optimizer.
+If unsure → **skip**
 ---
-## Example scenarios
-**Online main path** — use default (`shape`); this remains the most production-safe baseline in `v0.2.0`:
+## 🚀 Quick start
 ```ts
-normalizeQuery(query);
-```
-**Predicate or scope** — pass `level` explicitly; review [SPEC.md](SPEC.md) and contract tests for supported vs preserved patterns:
+import { normalizeQuery } from "mongo-query-normalizer";
-```ts
-normalizeQuery(query, { level: "predicate" });
+const { query } = normalizeQuery(inputQuery);
 ```
 ---
-## Public API
+## 🧠 Where it fits
-```ts
-normalizeQuery(query, options?) => { query, meta }
-resolveNormalizeOptions(options?) => ResolvedNormalizeOptions
+```text
+Query Builder / ORM
+        ↓
+   normalizeQuery   ← (this library)
+        ↓
+      MongoDB
 ```
-Types: `NormalizeLevel`, `NormalizeOptions`, `NormalizeRules`, `NormalizeSafety`, `NormalizeObserve`, `ResolvedNormalizeOptions`, `NormalizeResult`, `NormalizeStats`, `PredicateSafetyPolicy`, `ScopeSafetyPolicy`, trace-related types (see package exports).
----
-## Testing
-### Test layout
-This repository organizes tests by **API surface**, **normalization level**, and **cross-level contracts**, while preserving deeper semantic and regression suites.
-### Directory responsibilities
-#### `test/api/`
-Tests the public API and configuration surface.
-Put tests here when they verify:
-* `normalizeQuery` return shape and top-level behavior
-* `resolveNormalizeOptions`
-* package exports
-Do **not** put level-specific normalization behavior here.
----
-#### `test/levels/`
-Tests the behavior boundary of each `NormalizeLevel`.
-Current levels:
-* `shape`
-* `predicate`
-* `scope`
-Each level test file should focus on four things:
-1. positive capabilities of that level
-2. behavior explicitly not enabled at that level
-3. contrast with the adjacent level(s)
-4. a small number of representative contracts for that level
-Prefer asserting:
-* normalized query structure
-* observable cross-level differences
-* stable public metadata
-Avoid overfitting to:
-* exact warning text
-* exact internal rule IDs
-* fixed child ordering unless ordering itself is part of the contract
----
-#### `test/contracts/`
-Tests contracts that should hold across levels, or default behavior that is separate from any single level.
-Put tests here when they verify:
-* default level behavior
-* idempotency across all levels
-* output invariants across all levels
-* opaque subtree preservation across all levels
-* formal **`predicate` / `scope`** contracts (supported merges, opaque preservation, scope policy guards, rule toggles)—see `test/contracts/predicate-scope-stable-contract.test.js`
-Use `test/helpers/level-contract-runner.js` for all-level suites.
+You don’t replace your builder.
+You **sanitize its output**.
 ---
-#### `test/semantic/`
+## 🧩 When to use
-Tests semantic equivalence against execution behavior.
-These tests validate that normalization preserves meaning.
-This directory is intentionally separate from `levels/` and `contracts/`.
+* dynamic filters / search APIs
+* BI / reporting systems
+* user-generated queries
+* multi-team codebases with inconsistent query styles
+* logging / caching / diffing queries
 ---
-#### `test/property/`
-Tests property-based and metamorphic behavior.
-Use this directory for:
+## ⚙️ Levels
-* randomized semantic checks
-* metamorphic invariants
-* broad input-space validation
+| Level       | What it does                   | Safety    |
+| ----------- | ------------------------------ | --------- |
+| `shape`     | structural normalization       | 🟢 safest |
+| `predicate` | safe predicate simplification  | 🟡        |
+| `scope`     | limited constraint propagation | 🟡        |
-Do not use it as the primary place to express level boundaries.
+Default is `shape`.
 ---
-#### `test/regression/`
+## 📦 Output
-Tests known historical failures and hand-crafted regression cases.
-Add a regression test here when fixing a bug that should stay fixed.
----
-#### `test/performance/`
-Tests performance guards or complexity-sensitive behavior.
-These tests should stay focused on performance-related expectations, not general normalization structure.
+```ts
+{
+  query, // normalized query
+  meta   // debug / trace info
+}
+```
 ---
-### Helper files
-#### `test/helpers/level-runner.js`
-Shared helper for running a query at a specific level.
-#### `test/helpers/level-cases.js`
-Shared fixed inputs used across level tests.
-Prefer adding reusable representative cases here instead of duplicating inline fixtures.
+## 🎯 Design philosophy
-#### `test/helpers/level-contract-runner.js`
+> If a rewrite might be wrong, don’t do it.
-Shared `LEVELS` list and helpers for all-level contract suites.
+* no schema assumptions
+* no array guessing
+* no unsafe merges
+* deterministic output
+* idempotent results
 ---
-### Rules for adding new tests
-#### When adding a new normalization rule
-Ask first:
-* Is this a public API behavior?
-  * Add to `test/api/`
-* Is this enabled only at a specific level?
-  * Add to `test/levels/`
-* Should this hold for all levels?
+## 🔍 Example
-  * Add to `test/contracts/`
-* Is this about semantic preservation or randomized validation?
-  * Add to `test/semantic/` or `test/property/`
-* Is this a bug fix for a previously broken case?
-  * Add to `test/regression/`
----
-#### When adding a new level
-At minimum, update all of the following:
+```ts
+const result = normalizeQuery({
+  $and: [
+    { status: "open" },
+    { status: { $in: ["open", "closed"] } }
+  ]
+});
-1. add a new `test/levels/<level>-level.test.js`
-2. register the level in `test/helpers/level-contract-runner.js`
-3. ensure all-level contract suites cover it
-4. add at least one contrast case against the adjacent level
+console.log(result.query);
+// { status: "open" }
+```
 ---
-### Testing style guidance
-Prefer:
-* example-based tests for level boundaries
-* query-shape assertions
-* contrast tests between adjacent levels
-* shared fixtures for representative cases
+## 📚 Docs
-Avoid:
+* [`SPEC.md`](SPEC.md) — behavior spec
+* [`docs/normalization-matrix.md`](docs/normalization-matrix.md) — rule coverage by operator and level
+* [`docs/CANONICAL_FORM.md`](docs/CANONICAL_FORM.md) — canonical output shape and idempotency
+* [`CHANGELOG.md`](CHANGELOG.md) — release notes
+* [`test/REGRESSION.md`](test/REGRESSION.md) — reproducing property / semantic test failures
-* coupling level tests to unstable implementation details
-* repeating the same fixture with only superficial assertion changes
-* putting default-level behavior inside a specific level test
-* mixing exports/API tests with normalization behavior tests
+**中文：** [`README.zh-CN.md`](README.zh-CN.md) · [`SPEC.zh-CN.md`](SPEC.zh-CN.md) · [`docs/normalization-matrix.zh-CN.md`](docs/normalization-matrix.zh-CN.md) · [`CHANGELOG.zh-CN.md`](CHANGELOG.zh-CN.md)
 ---
-### Practical rule of thumb
+## 🧪 Testing
-* `api/` answers: **how the library is used**
-* `levels/` answers: **what each level does and does not do**
-* `contracts/` answers: **what must always remain true**
-* `semantic/property/regression/performance` answer: **whether the system remains correct, robust, and efficient**
+* semantic equivalence tests (real MongoDB)
+* property-based testing
+* regression suites
 ---
-### npm scripts and property-test tooling
-Randomized semantic tests use **`mongodb-memory-server`** + **`fast-check`** to compare **real** `find` results (same `sort` / `skip` / `limit`, projection `{ _id: 1 }`) before and after `normalizeQuery` on a **fixed document schema** and a **restricted operator set** (see `test/helpers/arbitraries.js`). They assert matching **`_id` order**, **idempotency** of the returned `query`, and (for opaque operators) **non-crash / stable second pass** only. **`FC_SEED` / `FC_RUNS` defaults are centralized in `test/helpers/fc-config.js`** (also re-exported from `arbitraries.js`).
-To **avoid downloading** a MongoDB binary, set one of **`MONGODB_BINARY`**, **`MONGOD_BINARY`**, or **`MONGOMS_SYSTEM_BINARY`** to your local `mongod` path before running semantic tests (see `test/helpers/mongo-fixture.js`).
-* **`npm run test`** — build, then `test:unit`, then `test:semantic`.
-* **`npm run test:api`** — `test/api/**/*.test.js` only.
-* **`npm run test:levels`** — `test/levels/**/*.test.js` and `test/contracts/*.test.js`.
-* **`npm run test:unit`** — all `test/**/*.test.js` except `test/semantic/**`, `test/regression/**`, and `test/property/**` (includes `test/api/**`, `test/levels/**`, `test/contracts/**`, `test/performance/**`, and other unit tests).
-* **`npm run test:semantic`** — semantic + regression + property folders (defaults when env unset: see `fc-config.js`).
-* **`npm run test:semantic:quick`** — lower **`FC_RUNS`** (script sets `45`) + **`FC_SEED=42`**, still runs `test/regression/**` and `test/property/**`.
-* **`npm run test:semantic:ci`** — CI-oriented env (`FC_RUNS=200`, `FC_SEED=42` in script).
-Override property-test parameters: **`FC_SEED`**, **`FC_RUNS`**, optional **`FC_QUICK=1`** (see `fc-config.js`). How to reproduce failures and when to add a fixed regression case: **`test/REGRESSION.md`**.
-Full-text, geo, heavy **`$expr`**, **`$where`**, aggregation, collation, etc. stay **out** of the main semantic equivalence generator; opaque contracts live in **`test/contracts/opaque-operators.all-levels.test.js`**.
----
+## ⭐ Philosophy
-## Contributor notes
+Most query tools try to be smart.
-- [SPEC.md](SPEC.md) — behavior-oriented specification.
-- [docs/CANONICAL_FORM.md](docs/CANONICAL_FORM.md) — idempotency and canonical shape notes.
+This one tries to be **correct**.
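
The two query examples in the new README (the `uids` pair that is left alone and the `status` pair that collapses to `{ status: "open" }`) both follow from how MongoDB equality treats array fields. The sketch below is not part of the package and does not call `normalizeQuery`. It is a small, simplified model of equality and `$in` matching, restricted to scalar string query values, and the helper names `matchesEq` and `matchesIn` are hypothetical. It only illustrates why one rewrite is safe without schema knowledge and the other is not.

```js
// Simplified model of MongoDB equality matching on a single field
// (assumption: scalar string query values only).
// A stored array matches when any element equals the query value;
// a stored scalar matches only when it equals the query value.
function matchesEq(stored, value) {
  return Array.isArray(stored) ? stored.includes(value) : stored === value;
}

// $in matches when at least one candidate matches under the same rule.
function matchesIn(stored, candidates) {
  return candidates.some((candidate) => matchesEq(stored, candidate));
}

// Sample stored values: scalars, arrays, an empty array, and a missing field.
const samples = ["open", "closed", "1", ["1", "2"], ["open"], ["open", "closed"], [], undefined];

// { $and: [{ uids: "1" }, { uids: "2" }] } is satisfiable by an array value,
// so treating it as a contradiction would change results.
const bothIds = samples.filter((v) => matchesEq(v, "1") && matchesEq(v, "2"));

// { $and: [{ status: "open" }, { status: { $in: ["open", "closed"] } }] }
// selects the same samples as { status: "open" }, arrays included.
const merged = samples.filter((v) => matchesEq(v, "open") && matchesIn(v, ["open", "closed"]));
const simple = samples.filter((v) => matchesEq(v, "open"));

console.log(JSON.stringify(bothIds)); // [["1","2"]]
console.log(JSON.stringify(merged) === JSON.stringify(simple)); // true
```

In this model the `status` rewrite holds for every sample because any value that matches `"open"` also matches an `$in` list containing `"open"`, whereas the `uids` pair can only be ruled out if the field is known to be scalar, which is schema information the README says the library never assumes.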

package/README.zh-CN.md CHANGED

@@ -1,515 +1,192 @@
-# Mongo Query Normalizer
+# mongo-query-normalizer
 [English](README.md) | **中文**
-一个面向 **MongoDB 查询对象** 的 **可观测、分层式** 规范化器。它以保守默认策略稳定查询 **shape**，并提供 **`predicate`** 与 **`scope`** 两个带有**文档化、测试兜底契约**的层级（见 [SPEC.zh-CN.md](SPEC.zh-CN.md) 与 [docs/normalization-matrix.zh-CN.md](docs/normalization-matrix.zh-CN.md)；英文对照见 [SPEC.md](SPEC.md) 与 [docs/normalization-matrix.md](docs/normalization-matrix.md)）。它返回**可预测**的输出与 **metadata**，而不是 MongoDB 查询规划器优化器。
-> **默认策略：** **`shape`** 仅做结构规范化，适合作为**覆盖面最广**的默认路径。 **`predicate`**、**`scope`** 在 **SPEC**、**normalization-matrix** 与 **契约测试** 中有明确边界；仅在需要对应能力且接受「已建模算子」范围时启用；**opaque** 算子保持透传。
->
-> **`v0.2.0` 起：** `predicate` 改写面有意收敛到显式验证能力（`eq.eq`、`eq.ne`、`eq.in`、`eq.range`、`range.range`）。高风险组合（如 `null`/缺失语义、数组敏感语义、`$exists`/`$nin`、整对象与点路径混用、opaque 混用）按设计保持保守处理。
-> **说明：** `predicate.safetyPolicy.allowArraySensitiveRewrite` 已**废弃**。它不再用于启用 `$eq`/`$in` 的“未命中即判死”（无 schema 时不得仅基于 `eq ∉ in` 就输出 `IMPOSSIBLE_SELECTOR`）。
----
-## 为什么需要它
-- 查询 **结构** 在不同写法下容易发散。
-- 没有稳定层时，**对比、日志、回放** 成本高。
-- 需要一层 **低风险** 的 query normalization，默认行为要保守。
-本库**不以**「自动让查询更快」或「替代 planner」作为卖点。
----
-## 核心特性
-- **按 level 分层**：`shape` → `predicate` → `scope`
-- **默认保守**：开箱仅 `shape`（风险最小的结构层）
-- **可观测的 `meta`**：变更、规则、告警、哈希、可选统计
-- **稳定 / 幂等**（相同 options、未熔断时）
-- **不透明（opaque）回退**：不支持的算子以透传为主，不做完整语义改写
----
-## 安装
-```bash
-npm install mongo-query-normalizer
-```
+> 安全的 MongoDB 查询规范化器 —— **正确优先于「聪明」**
 ---
-## 快速开始
+## ✨ 它能做什么
-```ts
-import { normalizeQuery } from "mongo-query-normalizer";
+**把杂乱的 Mongo 查询，安全地变成干净、稳定、可预期的形态。**
-const result = normalizeQuery({
-    $and: [{ status: "open" }, { $and: [{ priority: { $gte: 1 } }] }],
-});
+```js
+// 之前
+{
+  $and: [
+    { status: "open" },
+    { status: { $in: ["open", "closed"] } }
+  ]
+}
-console.log(result.query);
-console.log(result.meta);
+// 之后
+{ status: "open" }
 ```
 ---
-## 完整使用说明
-### 1) 最小可用（推荐默认）
-```ts
-import { normalizeQuery } from "mongo-query-normalizer";
-const { query: normalizedQuery, meta } = normalizeQuery(inputQuery);
-```
-- 不传 `options` 时，默认 `level: "shape"`。
-- 适合日志归一化、缓存 key 稳定化、查询 diff 对齐等“低风险结构规范化”场景。
-### 2) 显式选择 level
-```ts
-normalizeQuery(inputQuery, { level: "shape" }); // 仅结构层（默认）
-normalizeQuery(inputQuery, { level: "predicate" }); // 启用已建模谓词整理
-normalizeQuery(inputQuery, { level: "scope" }); // 启用 scope 传播/保守剪枝能力
-```
-- `shape`：结构稳定优先，风险最低。
-- `predicate`：在已建模算子范围内做去重、可比合并；矛盾折叠仅针对**可证明安全**的情形（默认不做 schema 假设，数组/多键字段保持保守）。
-- `scope`：在 `predicate` 之上增加继承约束传播与保守分支决策。
-### 3) `options` 全量示例
-```ts
-import { normalizeQuery } from "mongo-query-normalizer";
-const result = normalizeQuery(inputQuery, {
-    level: "scope",
-    rules: {
-        // shape 相关
-        flattenLogical: true,
-        removeEmptyLogical: true,
-        collapseSingleChildLogical: true,
-        dedupeLogicalChildren: true,
-        // predicate 相关
-        dedupeSameFieldPredicates: true,
-        mergeComparablePredicates: true,
-        collapseContradictions: true,
-        // 排序相关
-        sortLogicalChildren: true,
-        sortFieldPredicates: true,
-        // scope 观测规则（仅观测，不上提改写）
-        detectCommonPredicatesInOr: true,
-    },
-    safety: {
-        maxNormalizeDepth: 32,
-        maxNodeGrowthRatio: 1.5,
-    },
-    observe: {
-        collectWarnings: true,
-        collectMetrics: false,
-        collectPredicateTraces: false,
-        collectScopeTraces: false,
-    },
-    predicate: {
-        safetyPolicy: {
-            // 仅覆盖你关心的字段；其余使用默认值
-        },
-    },
-    scope: {
-        safetyPolicy: {
-            // 仅覆盖你关心的字段；其余使用默认值
-        },
-    },
-});
-```
-### 4) 用 `resolveNormalizeOptions` 查看最终生效配置
+## ⚠️ 为什么重要
-```ts
-import { resolveNormalizeOptions } from "mongo-query-normalizer";
+如果你在做动态查询，迟早会遇到：
-const resolvedOptions = resolveNormalizeOptions({
-    level: "predicate",
-    observe: { collectMetrics: true },
-});
+* 重复条件
+* 查询结构不一致
+* 难以调试的过滤器
+* 隐蔽的语义问题
-console.log(resolvedOptions);
-```
+多数工具会试图「优化」查询。
-- 适合排查“某个规则为何启用/未启用”。
-- 适合在服务启动时打印一次“规范化配置快照”。
+👉 本库做法不同：
-### 5) 处理返回值（`query` + `meta`）
+> **只应用可证明安全的变换。**
-```ts
-const { query: normalizedQuery, meta } = normalizeQuery(inputQuery, options);
+---
-if (meta.bailedOut) {
-    logger.warn({ reason: meta.bailoutReason }, "normalization bailed out");
-}
+## 🛡️ 设计上就安全
-if (meta.changed) {
-    logger.info(
-        {
-            level: meta.level,
-            beforeHash: meta.beforeHash,
-            afterHash: meta.afterHash,
-            appliedRules: meta.appliedRules,
-        },
-        "query normalized"
-    );
+```js
+// 不会简化（这是对的）
+{
+  $and: [
+    { uids: "1" },
+    { uids: "2" }
+  ]
 }
 ```
-- `query`：规范化后的查询对象。
-- `meta`：观测信息（是否变化、规则轨迹、告警、哈希、可选统计与 trace）。
+原因？
-### 6) 常见接入模式
-```ts
-// A. 在数据访问层统一规范化
-export function normalizeForFind(rawFilter) {
-    return normalizeQuery(rawFilter, { level: "shape" }).query;
-}
+因为 MongoDB 数组可以同时满足两者：
-// B. 需要更多收敛能力的离线路径（如批处理）
-export function normalizeForBatch(rawFilter) {
-    return normalizeQuery(rawFilter, { level: "predicate" }).query;
-}
+```js
+{ uids: ["1", "2"] }
 ```
-- 在线主路径优先 `shape`。
-- `predicate` / `scope` 建议在有明确收益与测试兜底时再启用。
-### 7) 错误与边界
-- `level` 非法会抛错（例如拼写错误）。
-- 不支持或未知算子通常按 opaque 保留，不保证参与语义合并。
-- 本库目标是“稳定与可观测”，不是查询优化器。
----
-## 默认行为说明
-- **默认 `level` 为 `shape`**（见 `resolveNormalizeOptions()`）。
-- `shape` 默认**不做**谓词级合并。**`scope`** 主路径是继承约束传播与保守分支决策；**`detectCommonPredicatesInOr`** 为**可选、仅观测**规则（告警/轨迹），**从不**做结构上提。
-- 默认目标是 **稳定与可观测**，不是「智能优化」。
----
-## 如何选择 level
-- 仅需结构稳定时，用 **`shape`**。
-- 需要同字段去重、可建模比较合并、矛盾折叠时，用 **`predicate`**（仅针对已建模算子）。
-- 需要继承约束传播、保守剪枝与狭窄覆盖消除时，用 **`scope`**（详见 [SPEC.zh-CN.md](SPEC.zh-CN.md) 与 [docs/normalization-matrix.zh-CN.md](docs/normalization-matrix.zh-CN.md)）。**`detectCommonPredicatesInOr`**（开启时）仅观测，不改写结构。
-**行为边界**以 **SPEC**、**normalization-matrix** 与 **`test/contracts/`** 为准，而非仅靠 README 叙述。
----
-## Level 说明
-### `shape`（默认）
-**推荐默认路径**（风险最小）：只做安全结构规范化，例如：
-- 展平复合（`$and` / `$or`）节点
-- 移除空复合节点
-- 折叠单子复合节点
-- 复合子节点去重
-- canonical ordering
-### `predicate`
-在 `shape` 之上对**已建模**算子做**保守**谓词整理：
-- 同字段谓词去重
-- 可建模的比较类谓词合并
-- 明确矛盾收敛为不可满足过滤器
-- 在 `normalizePredicate` 中，**`$and` 下同名 field 的直接子 `FieldNode` 会先合并**，以便检出诸如 `{ $and: [{ a: 1 }, { a: 2 }] }` 的矛盾
-### `scope`
-在 `predicate` 之上：
-- **继承约束传播**（phase-1 白名单）、**保守分支剪枝**；**覆盖消除**仅在狭窄、已测试场景且策略允许时进行
-- 可选 **`detectCommonPredicatesInOr`**：仅观测（告警/轨迹）；**不改写**查询结构
----
-## `meta` 说明
-| 字段 | 含义 |
-|------|------|
-| `changed` | 输出相对输入是否变化（基于哈希） |
-| `level` | 实际使用的规范化层级 |
-| `appliedRules` / `skippedRules` | 规则应用轨迹 |
-| `warnings` | `observe.collectWarnings` 为真时的非致命告警（规则说明、检测文案等） |
-| `bailedOut` | 是否触发安全熔断 |
-| `bailoutReason` | 熔断原因 |
-| `beforeHash` / `afterHash` | 前后稳定哈希 |
-| `stats` | 可选的前后树统计（`observe.collectMetrics`） |
-| `predicateTraces` | `observe.collectPredicateTraces` 为真时：每字段 planner / 跳过 / 矛盾等轨迹 |
-| `scopeTrace` | `observe.collectScopeTraces` 为真时：约束抽取拒绝原因与 scope 决策事件 |
----
-## 不支持 / opaque 行为
-以下结构通常**只透传或不参与完整语义改写**，例如：
-`$nor`、`$regex`、`$not`、`$elemMatch`、`$expr`、geo / text、未知算子等。
 ---
-## 稳定性策略
-**对外承诺**仅包括：
+## ❌ 这不是什么
-- `normalizeQuery`
-- `resolveNormalizeOptions`
-- 入口导出的 **类型**
+* 不是查询优化器
+* 不是索引顾问
+* 不是性能工具
-**不属于**对外契约：内部 AST、`parseQuery`、`compileQuery`、各 pass/rule、工具函数等，版本间可能变化。
+**绝不会猜测**：
----
-## 必须明确的原则
+* 字段基数
+* schema 约束
+* 数据分布
-1. 默认是 **`shape`**。
-2. **`predicate` / `scope`** 可能改变查询结构，但在已建模算子上追求 **语义等价**。
-3. **opaque** 节点不会被语义重写。
-4. 在未熔断时，输出应对相同 options 保持 **幂等**。
-5. 本库 **不是** MongoDB 的 planner optimizer。
+不确定 → **跳过**
 ---
-## 示例场景
-**在线主路径** —— 使用默认（`shape`）；在 `v0.2.0` 中仍是最稳妥的生产基线：
+## 🚀 快速开始
 ```ts
-normalizeQuery(query);
-```
-**Predicate 或 Scope** —— 显式传 `level`；请结合 [SPEC.zh-CN.md](SPEC.zh-CN.md) 与契约测试理解“可改写”与“保留”边界：
+import { normalizeQuery } from "mongo-query-normalizer";
-```ts
-normalizeQuery(query, { level: "predicate" });
+const { query } = normalizeQuery(inputQuery);
 ```
 ---
-## 对外 API
+## 🧠 在架构中的位置
-```ts
-normalizeQuery(query, options?) => { query, meta }
-resolveNormalizeOptions(options?) => ResolvedNormalizeOptions
+```text
+Query Builder / ORM
+        ↓
+   normalizeQuery   ← （本库）
+        ↓
+      MongoDB
 ```
-类型：`NormalizeLevel`、`NormalizeOptions`、`NormalizeRules`、`NormalizeSafety`、`NormalizeObserve`、`ResolvedNormalizeOptions`、`NormalizeResult`、`NormalizeStats`、`PredicateSafetyPolicy`、`ScopeSafetyPolicy` 及轨迹相关类型（见包导出）。
----
-## 测试
-### 测试布局
-本仓库按 **对外 API**、**规范化 level** 与 **跨 level 契约** 组织测试，并保留更深的语义与回归套件。
-### 目录职责
-#### `test/api/`
-覆盖对外 API 与配置面。
-适合放在此处的验证包括：
-* `normalizeQuery` 的返回形态与顶层行为
-* `resolveNormalizeOptions`
-* 包导出
-**不要**把「某一 level 专属的规范化行为」放在这里。
----
-#### `test/levels/`
-覆盖每个 `NormalizeLevel` 的行为边界。
-当前 level：
-* `shape`
-* `predicate`
-* `scope`
-每个 level 的测试文件宜聚焦四件事：
-1. 该 level 的**正向能力**
-2. 该 level **明确未启用**的行为
-3. 与**相邻 level** 的对比
-4. 少量**代表性契约**
-断言上优先：
-* 规范化后的 **query 结构**
-* **跨 level 可观察的差异**
-* **稳定的对外 meta**（如 `meta.level` 等）
-尽量避免过度绑定：
-* warning **逐字全文**
-* 内部 **规则 ID 字符串**
-* **子句顺序**（除非顺序本身就是契约的一部分）
----
-#### `test/contracts/`
-覆盖「应对所有 level 成立」的契约，或与单一 level 无关的默认行为。
-适合放在此处的内容包括：
-* 默认 level 行为
-* 各 level 下的幂等
-* 各 level 下的输出不变式
-* 各 level 下的 opaque 子树保留
-* **`predicate` / `scope` 的正式契约**（支持合并、opaque 保留、scope 策略护栏、规则开关）——见 `test/contracts/predicate-scope-stable-contract.test.js`
-全 level 套件请配合 `test/helpers/level-contract-runner.js` 使用。
----
-#### `test/semantic/`
-对照真实执行行为做**语义等价**验证，确保规范化不改变含义。
-该目录有意与 `levels/`、`contracts/` 分开。
----
-#### `test/property/`
-基于属性的随机测试与变形（metamorphic）行为。
-适用于：
-* 随机语义检查
-* 变形不变式
-* 较宽输入空间上的校验
-**不要**把它当作表达「level 边界」的主战场。
+你不是要换掉构建器。
+你是要**净化它的输出**。
 ---
-#### `test/regression/`
+## 🧩 适用场景
-已知历史失败与手工回归用例。
-修复了一个不应再犯的 bug 时，把用例加在这里。
+* 动态筛选 / 搜索 API
+* BI / 报表系统
+* 用户生成的查询
+* 多团队、查询写法不一致的代码库
+* 日志 / 缓存 / 对查询做 diff
 ---
-#### `test/performance/`
+## ⚙️ Levels
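-Performance guardrails and complexity-related behavior.
+| Level       | What it does                   | Safety    |
+| ----------- | ------------------------------ | --------- |
+| `shape`     | Structural normalization       | 🟢 Safest |
+| `predicate` | Safe predicate simplification  | 🟡        |
+| `scope`     | Limited constraint propagation | 🟡        |
-Tests here should focus on performance expectations rather than general details of normalized structure.
+The default is `shape`.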
 ---
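-### Helper files
-#### `test/helpers/level-runner.js`
-Shared wrapper that runs `normalizeQuery` at a specified level.
-#### `test/helpers/level-cases.js`
-Fixed inputs shared by the cross-level tests; prefer adding reusable representative cases here rather than copying the same fixture into multiple files.
+## 📦 Output
-#### `test/helpers/level-contract-runner.js`
-Shared helpers such as `LEVELS` and `forEachLevel` for the all-level contract suites.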
+```ts
+{
+  query, // normalized query
+  meta   // debug / trace info
+}
+```
 ---
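-### Rules for adding tests
-#### When adding a normalization rule
+## 🎯 Design philosophy
-Ask first:
+> If a rewrite might be wrong, don't do it.
-* Is it public API behavior? → add it to `test/api/`
-* Is it enabled only at one level? → add it to `test/levels/`
-* Should it hold at every level? → add it to `test/contracts/`
-* Is it about semantic preservation or randomized validation? → add it to `test/semantic/` or `test/property/`
-* Is it a fix for a scenario that broke before? → add it to `test/regression/`
+* No schema assumptions
+* No guessing at array semantics
+* No unsafe merges
+* Deterministic output
+* Idempotent results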
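The idempotence property listed above can be checked mechanically: normalizing an already-normalized query should change nothing. The following is a minimal sketch of such a check. `toyNormalize` and `isIdempotent` are hypothetical names introduced here for illustration; `toyNormalize` is a stand-in that only flattens nested `$and` and is not the library's implementation.

```ts
type Query = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Query {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical stand-in normalizer (NOT mongo-query-normalizer's code):
// flattens clauses of the form { $and: [...] } into the parent $and.
function toyNormalize(query: Query): Query {
  const clauses = query.$and;
  if (!Array.isArray(clauses)) return query;
  const flat: unknown[] = [];
  for (const clause of clauses) {
    const isPureAnd =
      isPlainObject(clause) &&
      Array.isArray(clause.$and) &&
      Object.keys(clause).length === 1;
    if (isPureAnd) {
      // Recurse so that deeper nesting is flattened as well.
      flat.push(...(toyNormalize(clause as Query).$and as unknown[]));
    } else {
      flat.push(clause);
    }
  }
  return { ...query, $and: flat };
}

// Idempotence check: a second pass must produce the same serialized output.
// JSON.stringify is key-order sensitive, so this also catches unstable ordering.
function isIdempotent(normalize: (q: Query) => Query, input: Query): boolean {
  const once = normalize(input);
  const twice = normalize(once);
  return JSON.stringify(once) === JSON.stringify(twice);
}

const nested: Query = {
  $and: [{ a: 1 }, { $and: [{ b: 2 }, { $and: [{ c: 3 }] }] }],
};
console.log(isIdempotent(toyNormalize, nested));
```

Comparing serialized output is a simplification: `JSON.stringify` does not round-trip values such as `Date` or `RegExp`, so a real check would need a structural comparison that understands those types.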
 ---
-#### When adding a level
+## 🔍 Example
-At minimum:
+```ts
+const result = normalizeQuery({
+  $and: [
+    { status: "open" },
+    { status: { $in: ["open", "closed"] } }
+  ]
+});
-1. Add `test/levels/<level>-level.test.js`
-2. Register the level in `test/helpers/level-contract-runner.js`
-3. Make sure the all-level contract suites run against it
-4. Add at least one **contrast** case against an adjacent level
+console.log(result.query);
+// { status: "open" }
+```
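To see why the rewrite in the example is safe, here is a minimal sketch of an equality-versus-`$in` merge for a single field. The function name `tryMergeEqIn` and its signature are assumptions made for illustration; this is not the library's code. It merges only when the equality value is a member of the `$in` list, and otherwise returns `null` to leave the query untouched, consistent with the README's note that a non-member must not be treated as unsatisfiable without schema knowledge.

```ts
type Clause = { [field: string]: unknown };

// Restrict the sketch to finite scalars; null, arrays, and objects are skipped
// because their matching semantics are more subtle.
function isScalar(v: unknown): v is string | number | boolean {
  return (
    typeof v === "string" ||
    typeof v === "boolean" ||
    (typeof v === "number" && Number.isFinite(v))
  );
}

// Hypothetical helper: returns the merged clause, or null when no safe merge applies.
function tryMergeEqIn(field: string, eqClause: Clause, inClause: Clause): Clause | null {
  // Both clauses must constrain only this one field, or merging would drop conditions.
  if (Object.keys(eqClause).join() !== field) return null;
  if (Object.keys(inClause).join() !== field) return null;

  const eqValue = eqClause[field];
  const cond = inClause[field];
  if (!isScalar(eqValue)) return null;
  if (typeof cond !== "object" || cond === null) return null;

  const operators = Object.keys(cond);
  if (operators.length !== 1 || operators[0] !== "$in") return null;

  const members = (cond as { $in: unknown }).$in;
  if (!Array.isArray(members) || !members.every(isScalar)) return null;

  // A document matching { field: v } also matches { field: { $in: [..., v, ...] } },
  // so the $in clause is redundant. If v is not a member, do nothing.
  return members.includes(eqValue) ? { [field]: eqValue } : null;
}

console.log(
  tryMergeEqIn("status", { status: "open" }, { status: { $in: ["open", "closed"] } })
);
console.log(
  tryMergeEqIn("status", { status: "archived" }, { status: { $in: ["open", "closed"] } })
);
```

Returning `null` for the non-member case is deliberate: if `status` is an array field, a document such as `{ status: ["archived", "open"] }` satisfies both clauses, so the conjunction is not necessarily empty.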
 ---
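-### Testing style guidance
+## 📚 Documentation
-Do:
-* express level boundaries with **example-based** cases
-* assert on **query shape**
-* write **adjacent-level contrasts**
-* **share** representative fixtures
-Avoid:
-* tying level tests to volatile implementation details
-* repeating the same fixture with only superficial changes to the assertions
-* putting "default level" contracts into a specific level's file
-* mixing export/API tests and normalization-behavior tests in the same file
+* [`SPEC.zh-CN.md`](SPEC.zh-CN.md): behavior specification ([English](SPEC.md))
+* [`docs/normalization-matrix.zh-CN.md`](docs/normalization-matrix.zh-CN.md): rule coverage ([English](docs/normalization-matrix.md))
+* [`docs/CANONICAL_FORM.md`](docs/CANONICAL_FORM.md): canonical form and idempotence (currently English only)
+* [`CHANGELOG.zh-CN.md`](CHANGELOG.zh-CN.md): changelog ([English](CHANGELOG.md))
+* [`test/REGRESSION.md`](test/REGRESSION.md): reproducing property / semantic test failures (currently English only)
+* [`README.md`](README.md): English README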
 ---
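-### Quick reference
+## 🧪 Testing
-* `api/`: **how the library is used**
-* `levels/`: **what each level does and does not do**
-* `contracts/`: **what must always hold**
-* `semantic` / `property` / `regression` / `performance`: **whether correctness, robustness, and efficiency still hold**
+* Semantic equivalence tests (real MongoDB)
+* Property-based tests
+* Regression suite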
 ---
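-### npm scripts and the property-testing toolchain
-The randomized semantic tests use **`mongodb-memory-server`** and **`fast-check`**. Under a fixed document schema and a restricted operator set, they compare real `find` results before and after normalization (same `sort` / `skip` / `limit`, projection `{ _id: 1 }`) and assert that the **`_id` order matches** and that the returned **`query` is idempotent**; for opaque operators they only require **no crash and a stable second normalize**. Generators live in `test/helpers/arbitraries.js`; **the `FC_SEED` / `FC_RUNS` defaults are managed centrally in `test/helpers/fc-config.js`** (and re-exported from `arbitraries.js`).
-To **avoid downloading** the MongoDB binary, set **`MONGODB_BINARY`**, **`MONGOD_BINARY`**, or **`MONGOMS_SYSTEM_BINARY`** to a local `mongod` before running the semantic tests (see `test/helpers/mongo-fixture.js`).
-* **`npm run test`**: builds first, then runs `test:unit`, then `test:semantic`.
-* **`npm run test:api`**: only `test/api/**/*.test.js`.
-* **`npm run test:levels`**: `test/levels/**/*.test.js` and `test/contracts/*.test.js`.
-* **`npm run test:unit`**: `test/**/*.test.js` excluding `test/semantic/**`, `test/regression/**`, and `test/property/**` (includes unit-side cases such as `test/api/**`, `test/levels/**`, `test/contracts/**`, and `test/performance/**`).
-* **`npm run test:semantic`**: semantic + regression + property (for the defaults used when the environment variables are unset, see `fc-config.js`).
-* **`npm run test:semantic:quick`**: lowers **`FC_RUNS` (45 in the script)** and sets **`FC_SEED=42`**; still includes `test/regression/**` and `test/property/**`.
-* **`npm run test:semantic:ci`**: CI-oriented (`FC_RUNS=200` and `FC_SEED=42` in the script).
-Property parameters can be overridden with **`FC_SEED`**, **`FC_RUNS`**, and optionally **`FC_QUICK=1`** (see `fc-config.js`). **How to reproduce a property failure and when to promote it to a fixed case**: see [`test/REGRESSION.md`](test/REGRESSION.md).
-The main randomized semantic-equivalence tests **do not cover** full-text search, geo queries, complex `$expr`, `$where`, aggregation, collation, and similar features; for the opaque-operator contract, see **`test/contracts/opaque-operators.all-levels.test.js`**.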
----
+## ⭐ Philosophy
-## Further reading
+Most query tools aim to be "clever".
-- [SPEC.zh-CN.md](SPEC.zh-CN.md)
-- [docs/CANONICAL_FORM.md](docs/CANONICAL_FORM.md)
+This library aims to be **correct**.

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "mongo-query-normalizer",
-  "version": "0.2.2",
+  "version": "0.2.3",
   "description": "Observable, level-based normalizer for MongoDB query objects. Defaults to conservative shape stabilization; optional predicate and scope levels with documented contracts. Predictable output and metadata—not planner optimization.",
   "main": "dist/index.js",
   "types": "dist/index.d.ts",