iriq 0.2.0 → 0.30.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: e629988f23137ecb0c0f4737e246f65a949fa2843f4bd16d244566ca76dc37ed
-   data.tar.gz: c36cae38205a6a6f63a8a38e40849c922bbb6cf7046e9599aaaefc37fb443303
+   metadata.gz: 598c04e3c1777787ae9e5d1be98e2bc68d441e2020ebe7743bdf8075b20fdaec
+   data.tar.gz: 396ad6b0b0acffb76b7bc2b4e31792b02ac65749c6ac769fd70a47ce5d806496
  SHA512:
-   metadata.gz: 75d0329756d16dd7b9c8e5ca7b8ba447aa51c4d3c34a1ec31549bb6e6725338bfbdb2ee4badb18c52342c8ae0ece5f005e9288c617fa5e140d7cfff400274561
-   data.tar.gz: cb670e665a2e67feeb5f1bcad157a27e15da1aa61efc53e4ae66388a67ee498ef7caa1605be85d27e087fda44c399a9f93b6701242d1a22dec2653b667918c6d
+   metadata.gz: 16637ff46f4648a2cfc14404ed074d3cedc6fc08cabf0c46a6fa7a39553b8b78020907fde1d9024005cec3a4f2cdb4cb3e999802ec866a942c20c15be6c7af34
+   data.tar.gz: 458aa6deba73a571a07801bb3df445a4d08500a55c3d1d8c50b11df0811dbc785270fd1ef908ebbb58c6f53dfb1ecfc7013fe137e216e88bca81c9bc56d21fa4
data/CHANGELOG.md CHANGED
@@ -1,3 +1,81 @@
+ ### 0.30.2 (2026-06-23)
+ - Piped stdin and `--file` now **stream** the per-IRI sections (`-n`/`-p`/`-c`/`-e`) line by line, flushing each IRI as it's processed — `tail -f access.log | iriq -n` is live and memory stays bounded on huge inputs. Output is byte-identical to before; the aggregate views (deduped URL list, clusters, `--stats`) still read the whole input. Ruby, Go, and Rust.
+
+ ### 0.30.1 (2026-06-21)
+ - Batch sections (`--normalize` etc.) are now corpus-informed when `--corpus` is supplied, matching single-input behavior.
+ - Added a CLI end-to-end test suite (sections, formats, batch/cluster, subcommands) and a `make check` Rust gate + pre-push hook.
+
+ ### 0.30.0 (2026-06-21)
+ - Rust consolidated into a single crate (library + `iriq` binary) with SQLite always on by default — no separate sqlite build.
+ - Go moved into the `go/` subdirectory; import path is now `github.com/dpep/iriq/go`.
+
+ ### 0.11.0 (2026-05-27)
+ - New classifier types: `:color` (hex form `#fff`/`#ffffff`/`#ffffff80`), `:coordinate` (`lat,lng` pair with plausible-range validation), `:country` (ISO 3166-1 alpha-2, allowlisted), `:base64` (≥16 chars with `+`/`/`/`=` to disambiguate from `:opaque_id`).
+ - `SegmentClassifier.color_kind(value)` / `ColorKind(value)` returns `:hex` for hex-shaped colors — placeholder for future named / rgb / hsl support, mirrors the file_kind pattern.
+ - Param-name hint map extended: `color`/`bg`/`fg`/`background`/`foreground` → `:color`, `coords`/`coordinates`/`geo`/`location`/`position`/`latlng` → `:coordinate`, `country`/`country_code`/`nation` → `:country`.
+ - `-J` is now a short alias for `--ndjson` (combinable: `iriq -nJ < file`).
+ - New CLI `-e/--explain` flag — annotated normalization trace. For each path segment / query param, shows the value, type, output (placeholder or canonical value), and notes for every non-obvious transformation (hint suppression for semantic types, currency upcase, IP umbrella collapse, canonical date, param-name lift). JSON via `-e -j` returns the same structure.
+ - Library API: `Iriq::Trace.for(input)` (Ruby) / `iriq.Trace(input)` (Go) returns the same trace data structure.
+ - Classifier perf: each regex test is now gated on a cheap composition check (`String#include?` / `IndexByte` / `size`) so a literal like `"users"` skips ~20 regex matches instead of walking the full chain. Measured: Ruby normalize +12%, extract +27%; Go CLI wall time -25%.
+
+ ### 0.10.0 (2026-05-27)
+ - New classifier type `:file` — `name.ext` shape where `ext` is in a curated allowlist spanning image / document / data / text / web / audio / video / archive / code kinds. `image.png` and `report.pdf` classify as `:file` instead of falling through to `:opaque_id`. The per-extension kind (`:image`, `:document`, etc.) is surfaced via `SegmentClassifier.file_kind(value)` / `FileKindOf(value)` for verbose displays.
+ - `Cluster#param_summary` adds `:kind_distribution` for `:file`-typed params — buckets observed values by kind. Best-effort: only reflects values within the tracking cap.
+ - New phone format: NANP-style `555-666-7777`, `555.666.7777`, `(555) 666-7777`. Leading area-code + exchange digits constrained to 2-9 so dotted version strings / digit blobs don't shadow. The `+` E.164 form still covers international.
+ - Param-name hints — when a value's type is generic (`:literal`, `:opaque_id`, `:slug`), the param name can supply the type. `?phone=unknown` becomes `{phone}` and `?email=tbd` becomes `{email}`. Hint map covers phone/email/locale/currency/url/jwt/mime variations. Specific value types (e.g. `?phone=12345` → `:integer`) still win.
+
+ ### 0.9.0 (2026-05-27)
+ - Semantic types (`:version`, `:locale`, `:currency`, `:date`, `:boolean`, `:timestamp`, etc.) now surface as `{type}` placeholders instead of being run through the noun-singularize hint. `/api/v1/status` renders `/api/{version}/status` rather than the misleading `/api/{api_id}/status`. Only ID-shaped types (`:integer`, `:uuid`, `:hash`, `:opaque_id`, `:slug`) keep the `{noun_id}` form.
+ - `--normalize` collapses `:ipv4` and `:ipv6` to `{ip}` in placeholder form (previously rendered as `{ipv4}` / `{ipv6}`). The classifier still tracks the specific family; cluster summary keeps the distinct types.
+ - `--normalize` canonicalizes currency segments and params to ISO 4217 upper case — `/pricing/usd` → `/pricing/USD`, `?currency=eur` → `?currency=EUR`. Mirrors the existing date canonicalization (`canonical_currencies: true` flag on `PathShape`).
+ - `LOCALE_RE` tightened: the region/script portion now caps at 2-4 alphanumeric chars and the language portion is validated against the ISO 639-1 allowlist — `by-locale` no longer wrongly classifies as `:locale`.
+ - New classifier types: `:phone` (E.164 — `+` then 7-15 digits with optional separators), `:jwt` (three base64url segments separated by dots), `:mime` (RFC 2046 top-level type + subtype, e.g. `image/png`, `application/vnd.api+json`).
+ - New corpus-promoted type `:http_status` — integer positions whose observed range falls inside 100..599 with ≥2 distinct values and ≥5 samples get promoted. Same range-analysis pattern as `:year`.
+ - Scheme-less URL detection: query values like `?redirect=foo.com/path` classify as `:url`. Requires a dotted host with a TLD-like ≥2-letter suffix followed by a slash, so `image.png` stays as `:opaque_id`.
+ - `Cluster#param_summary` adds two new fields:
+   - `:value_distribution` — fractions per tracked value, for `:boolean` and `:enum` positions (e.g. `{ "true" => 0.97, "false" => 0.03 }`). Same data already in `value_counts`, surfaced as ratios.
+   - `:subtype_distribution` — int-vs-float split for `:number` positions (e.g. `{ integer: 0.4, float: 0.6 }`).
+ - `:boolean` now wins over `:enum` when the dominant type is boolean — a position of pure `true`/`false` stays `:boolean` rather than being demoted to a 2-value enum.
+
+ ### 0.8.0 (2026-05-27)
+ - **Breaking**: `:numeric` umbrella renamed to `:number` (Ruby) / `TypeNumeric` → `TypeNumber` (Go). Same semantics.
+ - New classifier types: `:boolean` (`true`/`false`, any case), `:version` (`v1`, `v2.0.1`, `v1.2.3-beta` — requires the `v` prefix), `:locale` (BCP 47-ish full forms like `en-US`/`fr_CA`, plus bare 2-letter language codes from an inline ISO 639-1 allowlist of ~55 entries — `en`, `fr`, `ja`, etc.), `:currency` (3-letter codes from an inline ISO 4217 allowlist of ~35 entries).
+ - `:year` is now corpus-only: an `:integer` position whose observed min/max land in 1900..2100 with ≥2 distinct values gets promoted. A single 4-digit integer in isolation classifies as `:integer` — only range analysis across observations is reliable.
+ - `PositionStats` now tracks `numeric_min` / `numeric_max` / `numeric_sum` / `numeric_count` for `:integer`/`:float` observations. `Cluster#param_summary` surfaces `min` / `max` / `avg` on any param with numeric observations.
+ - Shape-y variable types (`:version`, `:locale`, `:currency`, `:boolean`) now respect the stable-literal rule: a single dominant value at a position (`v1` only across many observations) stays as the literal `v1` instead of being placeholdered as `{version}`. High cardinality at the same position falls back to `{version}` / `{locale}` / etc. as expected.
+ - 0/1 booleans still classify as `:integer` individually; the existing `:enum` umbrella catches `?flag=0` / `?flag=1` patterns when they cluster.
+
+ ### 0.7.0 (2026-05-27)
+ - **Breaking**: `:integer_id` classifier type renamed to `:integer` (Ruby) / `TypeIntegerID` → `TypeInteger` (Go). The "ID" semantics live in the hints layer (which still produces `{user_id}` placeholders); the classifier now reflects pure shape. Update any direct `.classify(...) == :integer_id` checks, dump-file consumers, and persisted corpora — the type symbol changed in `type_counts` and raw shape strings (e.g. `/users/{integer_id}` → `/users/{integer}`).
+ - New `:enum` umbrella (corpus-only): when a param has a small bounded set of repeated values (default ≥20 observations, ≤10 distinct, each ≥2 occurrences, ≥95% coverage), `Cluster#param_type` returns `:enum` and `param_summary` includes the value list under `:values`. Normalize output keeps the `{enum}` placeholder — values aren't inlined.
+ - `iriq --host=full|registrable|reg|none` CLI flag plumbs `Corpus#host_strategy` from the command line. `reg` is a short alias for `registrable`.
+
+ ### 0.6.0 (2026-05-27)
+ - New classifier types: `:ipv4`, `:ipv6`, `:url`, `:email` (Ruby) / `TypeIPv4`, `TypeIPv6`, `TypeURL`, `TypeEmail` (Go). Slotted before the generic `:opaque_id` / `:literal` catch-alls so URL params like `?redirect=https://foo.com/...`, `?email=alice@example.com`, `?ip=192.168.1.1`, `?gateway=fe80::1` get distinct types instead of falling through.
+ - IPv4 validates octets ≤ 255 — out-of-range dotted-quads fall back to `:opaque_id`.
+ - IPv6 accepts the full eight-group form and any compressed form containing `::`. IPv4-mapped variants (`::ffff:192.0.2.1`) are not recognized.
+
+ ### 0.5.0 (2026-05-27)
+ - Float values now classify as `:float` instead of falling through to `:opaque_id` (Ruby `:float` / Go `TypeFloat`). Regex requires digits on both sides of the decimal — `3.14`, `-2.5`, `1.0` match; `.5`, `1.`, `1e10` do not.
+ - New `:numeric` umbrella (corpus-only): when a cluster sees both `:integer_id` and `:float` observations at the same param with neither subtype hitting the 80% confidence threshold, the param surfaces as `:numeric` in `param_summary` and renders as `{numeric}` in `Corpus#normalize` output. The classifier itself never returns `:numeric` directly — individual values are always specifically int or float.
+ - `Corpus.new(host_strategy: ...)` knob controls how host is keyed into clusters: `:full` (default, unchanged), `:registrable` (strip subdomains, so `api.foo.com` and `app.foo.com` cluster as `foo.com`), `:none` (ignore host, group all observations by shape alone). `:registrable` uses an inline allowlist of ~70 common multi-label TLDs (`co.uk`, `com.au`, `co.jp`, etc.) — niche multi-label suffixes like `.priv.no` will be over-stripped.
+
+ ### 0.4.0 (2026-05-27)
+ - Query-param clustering: each `Cluster` now tracks per-param presence, value cardinality, and type via `param_stats`. Surfaced on `cluster.to_h[:params]` (and the JSON cluster view), persisted in both JSON and SQLite backends.
+ - `Corpus#normalize` (Ruby) / `Corpus.NormalizeIdentifier` (Go) now include query params, rendered with corpus-informed types when available (falls back to mechanical classification otherwise).
+ - New `corpus.params_for(url)` / `Corpus.ParamsFor(url)` — returns the inferred params for the cluster `url` would fall into. Useful for "what params might this URL accept?" tooling.
+ - Date detection expanded to include `YYYY/MM/DD` and `YYYYMMDD` (with year/month/day sanity bounds) alongside the existing `YYYY-MM-DD`.
+ - `SegmentClassifier.canonical_date(value)` / `CanonicalDate(value)` returns the ISO form for any recognized date.
+ - `--normalize` output canonicalizes recognized date values to `YYYY-MM-DD` (path segments and query params). Cluster keys still use `{date}` placeholders so dated routes still group together.
+ - `PositionStats::DEFAULT_MAX_VALUES` is now the value cap for `cluster.param_stats[name]` too.
+
+ ### 0.3.0 (2026-05-25)
+ - Go: SQLite backend is now opt-in via `-tags sqlite`. Default `go install` and the `iriq` Homebrew formula ship a slim binary (~30% smaller) with JSON corpora only. SQLite users compile with `-tags sqlite` or install `dpep/tools/iriq-sqlite`.
+ - Makefile: `release` / `release-sqlite` targets strip debug symbols and use `-trimpath` for reproducible builds.
+ - CLI: `iriq --help` reports the active build (slim vs sqlite).
+ - Slim build returns a friendly error when a `.db` corpus path is opened, pointing at the iriq-sqlite formula.
+ - `PositionStats::DEFAULT_MAX_VALUES` / `DefaultMaxValuesPerPosition` raised from 1000 → 5000. Existing corpora keep whatever cap they were created with (the cap is persisted in the dump / SQLite meta table); only freshly-constructed corpora pick up the new default.
+
  ### 0.2.0 (2026-05-25)
  - Corpus storage backends: JSON (default) and SQLite, dispatched by file extension
  - Go: `iriq.OpenCorpus(path)`; Ruby: `Iriq::Corpus.open(path)`
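
The 0.8.0 and 0.9.0 entries above describe corpus-only promotion: an `:integer` position becomes `:year` when its observed min/max land in 1900..2100 with at least 2 distinct values, and `:http_status` when the observed range falls inside 100..599 with at least 2 distinct values and at least 5 samples. A minimal Ruby sketch of that rule follows. It assumes a plain array of observed integers, and `promote_integer_position` is a hypothetical helper name, not part of iriq's API; the real implementation may order or weight these checks differently.

```ruby
# Hypothetical sketch of the range-analysis promotion rule from the changelog.
# Not iriq's actual code: the method name and input shape are assumptions.
YEAR_RANGE        = (1900..2100)
HTTP_STATUS_RANGE = (100..599)
HTTP_MIN_SAMPLES  = 5

def promote_integer_position(values)
  return :integer if values.empty?

  min, max = values.minmax
  distinct = values.uniq.size

  # Both promotions need at least two distinct observed values.
  return :integer if distinct < 2

  if YEAR_RANGE.cover?(min) && YEAR_RANGE.cover?(max)
    :year
  elsif values.size >= HTTP_MIN_SAMPLES &&
        HTTP_STATUS_RANGE.cover?(min) && HTTP_STATUS_RANGE.cover?(max)
    :http_status
  else
    :integer
  end
end
```

Because the two ranges do not overlap, the order of the checks does not matter in this sketch. A lone value such as `[2024]` stays `:integer`, matching the changelog's note that only range analysis across observations is reliable.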
data/CLAUDE.md CHANGED
@@ -1,34 +1,69 @@
  # Iriq development conventions

- ## Repo layout — Ruby and Go intermixed at the root
-
- We chose to mix Ruby and Go at the repo root rather than nest the Go module
- under `/go/`. The signal is "both implementations are peers, not one-is-primary."
+ > **⚠️ Behavior changes touch ALL THREE runtimes.** Ruby is the reference; Go
+ > + Rust mirror it. Before committing any change to
+ > parser/normalizer/extractor/CLI/etc:
+ >
+ > 1. Update Ruby + specs.
+ > 2. `bundle exec ruby script/generate_fixtures.rb` (regenerate JSON parity fixtures).
+ > 3. Port the change to Go (the Go module lives in `go/`).
+ > 4. `go -C go test ./...` — fixture tests should still pass.
+ > 5. `make build && script/cli_parity.sh` — Ruby ↔ Go CLI parity should still pass.
+ > 6. Port the change to Rust under `rust/`.
+ > 7. `cd rust && cargo test --workspace` — Rust fixture tests should still pass
+ >    (SQLite is a default feature).
+ > 8. `cd rust && cargo build --release --bin iriq && cd .. && script/rust_parity.sh`
+ >    — Rust ↔ Go CLI parity (covers Ruby transitively).
+ > 9. Commit the regenerated fixtures alongside the code change.
+ >
+ > CI's parity + Rust jobs will fail if any step is skipped. The **Rust gate**
+ > (fmt + clippy + tests) is automated — run `make hooks` once to install the
+ > committed pre-push hook that runs `make check`. Full multi-runtime pre-push
+ > for a behavior change:
+ > `bundle exec rspec && go -C go test ./... && script/cli_parity.sh && make check && script/rust_parity.sh`.
+
+ ## Repo layout — Ruby at the root, Go and Rust in subdirs
+
+ The Ruby gem lives at the repo root (it's the reference implementation and the
+ published gem); the two mirror implementations are compartmentalized into
+ `go/` and `rust/`. Earlier the Go code was intermixed at the root; it now sits
+ in `go/`, symmetric with `rust/`, so the root reads as "Ruby + two ports."

  ```
  iriq/
-   lib/ exe/ spec/      ← Ruby gem (library, CLI, specs)
+   lib/ exe/ spec/      ← Ruby gem (library, CLI, specs) — the reference
+   completions/         ← shell-completion scripts shipped by the gem
    iriq.gemspec
    Gemfile

-   go.mod               ← module github.com/dpep/iriq
-   *.go                   Go package `iriq` at the root
-   cmd/iriq/            ← Go CLI binary
-   bin/                   built Go binary (gitignored)
-
-   script/              ← shared dev scripts (fixture gen, parity, benches)
-   spec/fixtures/         golden JSON shared by Ruby specs + Go tests
-   .github/workflows/     Ruby CI, Go CI, parity CI
+   go/                    Go module github.com/dpep/iriq/go
+     go.mod go.sum
+     *.go               ← Go package `iriq`
+     cmd/iriq/          ← Go CLI binary
+     completions/       ← Go's own embedded copy (go:embed can't reach ../)
+
+   rust/                  Cargo workspace
+     Cargo.toml           workspace root
+     iriq/              ← one crate: library + `iriq` CLI binary; inlines completions
+     REPORT.md          ← Go → Rust port spike notes + perf
+     target/            ← Rust build artifacts (gitignored)
+
+   bin/                 ← built Go binary (gitignored)
+   script/              ← shared dev scripts (fixture gen, parity, benches)
+   spec/fixtures/       ← golden JSON shared by Ruby specs + Go + Rust tests
+   .github/workflows/   ← Ruby CI, Go CI, Rust CI, parity CIs
  ```

- Trade-offs of this layout:
+ Notes on this layout:

- - Clean import path: `github.com/dpep/iriq` (no `/go/` artifact in consumers' code).
- - One version tag (`vX.Y.Z`) serves both runtimes — Ruby's gemspec and Go's
-   module use the same tag stream.
- - Root `ls` is busier (~15 `.go` files next to Ruby ones), accepted in exchange.
- - The gemspec explicitly excludes Go files so `gem build` doesn't ship them:
-   `git ls-files * ':!:spec' ':!:script' ':!:cmd' ':!:bin' ':!:*.go' ':!:go.mod' ':!:go.sum'`.
+ - Go's import path is now `github.com/dpep/iriq/go` (the `/go` suffix matches
+   the subdir). Consumers import `github.com/dpep/iriq/go`.
+ - One version tag (`vX.Y.Z`) serves all three runtimes — Ruby's gemspec, Go's
+   module, and Rust's `Cargo.toml` use the same tag stream.
+ - The gemspec ships only Ruby + `completions/`, excluding `go/` and `rust/`:
+   `git ls-files * ':!:spec' ':!:script' ':!:bin' ':!:rust' ':!:go'`.
+ - Completion scripts exist in three places (gem root `completions/`, `go/completions/`
+   for `go:embed`, and inlined in the Rust CLI) — keep them in sync like fixtures.

  ## Building

@@ -44,24 +79,40 @@ make uninstall # remove from $GOBIN
  make clean # remove ./bin/
  make test # go test ./...

- # Both via Homebrew
- brew install dpep/tools/iriq # uses the Ruby gem under the hood
+ # Rust — one crate (library + `iriq` binary), SQLite bundled by default
+ cd rust && cargo build --release --bin iriq # ./rust/target/release/iriq
+ cd rust && cargo install --path iriq # install into ~/.cargo/bin
+ cd rust && cargo test --workspace
+
+ # Via Homebrew (builds the Rust CLI from main)
+ brew install dpep/tools/iriq
+
+ # Via crates.io
+ cargo install iriq
  ```

- ## Keeping Ruby and Go in sync
+ ## Keeping the three runtimes in sync

- The Ruby gem is the **reference implementation**. Go mirrors its public API
- and behavior. Two layers of parity testing keep them aligned:
+ Ruby is the **reference implementation**. Go and Rust mirror its public API
+ and behavior. Three layers of parity testing keep them aligned:

  1. **Golden JSON fixtures** (`spec/fixtures/*.json`)
     Generated by `script/generate_fixtures.rb` from the Ruby implementation
-    over a curated set of inputs. Go's `fixtures_test.go` loads each file
-    and asserts the same outputs from the Go side.
+    over a curated set of inputs. Go's `fixtures_test.go` and Rust's
+    `rust/iriq/tests/fixtures.rs` both load each file and assert the same
+    outputs.

- 2. **CLI parity harness** (`script/cli_parity.sh`)
+ 2. **Ruby ↔ Go CLI parity harness** (`script/cli_parity.sh`)
     Runs the same input through `bundle exec exe/iriq` and the Go binary and
     diffs stdout. Lives in CI as the `Ruby ↔ Go parity` job.

+ 3. **Rust ↔ Go CLI parity harness** (`script/rust_parity.sh`)
+    Same idea — runs every Phase 1 + Phase 2 scenario (single-input,
+    pipe-mode, JSON corpus, SQLite corpus, --stats, --reinfer,
+    --propose-recognizers, --cross-host-shapes, --host=reg) through the
+    Go and Rust binaries and diffs stdout. Lives in CI as the
+    `Rust ↔ Go parity` job. Rust transitively inherits Ruby parity via Go.
+
  When changing behavior:

  1. Update the Ruby code + specs first.
@@ -69,24 +120,54 @@ When changing behavior:
  3. Port the change to Go.
  4. `go test ./...` (uses the updated fixtures).
  5. `script/cli_parity.sh` should pass.
- 6. Commit fixtures with the change — CI will fail if they're stale.
+ 6. Port the change to Rust under `rust/`.
+ 7. `cd rust && cargo test --workspace`.
+ 8. `cd rust && cargo build --release --bin iriq && cd .. && script/rust_parity.sh` should pass.
+ 9. Commit fixtures with the change — CI will fail if they're stale.

  ## Tests

  ```sh
- bundle exec rspec # Ruby suite (305+ examples)
- go test ./... # Go suite (native + fixture parity tests)
- script/cli_parity.sh # CLI parity (13+ scenarios)
+ bundle exec rspec # Ruby suite (305+ examples)
+ go test ./... # Go suite (native + fixture parity)
+ script/cli_parity.sh # Ruby ↔ Go CLI parity
+ cd rust && cargo test --workspace
+ cd rust && cargo fmt --check # formatting (CI-gated)
+ cd rust && cargo clippy --workspace --all-targets -- -D warnings
+ make check # the three Rust checks above, in one shot
+ script/rust_parity.sh # Rust ↔ Go CLI parity (~59 scenarios)
  ```

  ## Releases

- - One version tag covers both runtimes — bump `lib/iriq/version.rb` (and
-   optionally a matching constant on the Go side if we add one), tag `vX.Y.Z`,
-   push.
- - `gem push iriq-X.Y.Z.gem` to publish to RubyGems.
- - Update `Formula/iriq.rb` in the homebrew-tools tap to the new version.
- - Go consumers pick up the tag automatically via `go get @vX.Y.Z`.
+ Versioning is single-stream: one `vX.Y.Z` covers all three runtimes. Bump the
+ three version constants **together** — the `--version` parity checks fail
+ if they drift:
+
+ 1. `lib/iriq/version.rb` (`VERSION`), `go/version.go` (`Version`), and the two
+    `version = "X.Y.Z"` / `pub const VERSION` lines in `rust/iriq/Cargo.toml` and
+    `rust/iriq/src/lib.rs` — same string.
+ 2. `Gemfile.lock` — re-resolve so the pinned `iriq (X.Y.Z)` matches
+    (`bundle install`, or it regenerates on the next `bundle exec`). Commit it.
+ 3. Run `cd rust && cargo update -p iriq` to refresh `Cargo.lock`.
+ 4. Tag `vX.Y.Z` and push. Go consumers pick it up via
+    `go get github.com/dpep/iriq/go@vX.Y.Z`.
+ 5. `gem push iriq-X.Y.Z.gem` to publish to RubyGems.
+ 6. `cd rust && cargo publish -p iriq` to publish to crates.io (the crate ships
+    both the library and the `iriq` binary).
+
+ ### Keep Homebrew in sync — bump on EVERY version change
+
+ The tap (`~/code/lib/homebrew-tools`) ships a single `Formula/iriq.rb` that
+ builds the Rust CLI (`cargo install --path rust/iriq`) from `branch: "main"`.
+ SQLite is on by default (the `iriq` crate's `default` feature set), so there is
+ no longer a separate `iriq-sqlite` formula.
+
+ The formula pins a static `version "X.Y.Z"` label. Because the build tracks
+ `main` rather than a tagged tarball, `brew upgrade` only rebuilds when that
+ label changes. So on every bump here, update the `version` string in
+ `Formula/iriq.rb` to match `version.rb`, then commit + push the tap. Leaving it
+ stale means brew users never get the new code even though it's already on `main`.

  ## Corpus storage backends

@@ -106,10 +187,14 @@ SQLite file with JSON, etc.).

  The Ruby `sqlite3` gem is loaded lazily (only when a `.db` path is opened),
  keeping the iriq install footprint minimal for users that stick with JSON.
- On the Go side we use `modernc.org/sqlite` (pure Go — no cgo).
+ On the Go side we use `modernc.org/sqlite` (pure Go — no cgo). The Rust
+ side uses `rusqlite` with the `bundled` feature (statically links C SQLite,
+ ~3-4 MB binary cost). Schema v4 is shared across all three runtimes — a
+ `.db` written by any binary opens cleanly in any other.

- When adding a new backend, replicate the contract in both languages and
- add a parity scenario in `script/cli_parity.sh`'s `corpus_pair` section.
+ When adding a new backend, replicate the contract in all three languages
+ and add parity scenarios in `script/cli_parity.sh`'s `corpus_pair`
+ section + `script/rust_parity.sh`'s `corpus_pair`.

  ## What lives where in scripts

@@ -117,5 +202,7 @@ add a parity scenario in `script/cli_parity.sh`'s `corpus_pair` section.
  - `script/memory.rb` — Ruby-only memory profile.
  - `script/generate_fixtures.rb` — produces `spec/fixtures/*.json` for cross-runtime parity.
  - `script/cli_parity.sh` — Ruby ↔ Go CLI diff.
+ - `script/rust_parity.sh` — Rust ↔ Go CLI diff.
+ - `script/bench_three_way.sh` — Go vs Rust wall-clock comparison.
  - `script/bench_compare.sh` — Ruby vs Go CLI wall-time comparison.
  - `script/bench_storage.sh` — JSON vs SQLite backend timing (single-process, incremental, concurrent).
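
The release steps in the CLAUDE.md diff above require the same version string in `lib/iriq/version.rb`, `go/version.go`, `rust/iriq/Cargo.toml`, and `rust/iriq/src/lib.rs`. As a rough illustration of what a drift check could look like, here is a Ruby sketch. The regex, the helper names (`extract_version`, `version_drift`), and the sample file contents are all assumptions made for this example; the project's own `--version` parity checks are not shown in this diff and may work differently.

```ruby
# Hypothetical drift check; helper names, regex, and file contents are assumptions.
# Matches the first `version ... = "X"` assignment, case-insensitively, so it
# covers `VERSION = "..."`, `Version = "..."`, and `pub const VERSION: &str = "..."`.
VERSION_RE = /version[^=\n]*=\s*"([^"]+)"/i

def extract_version(text)
  match = VERSION_RE.match(text)
  match && match[1]
end

# Returns {} when every source agrees, otherwise a map of path => version found.
def version_drift(sources)
  versions = sources.transform_values { |text| extract_version(text) }
  versions.values.uniq.size == 1 ? {} : versions
end

# Illustrative in-memory contents; the real files may be laid out differently.
sources = {
  "lib/iriq/version.rb"  => %(module Iriq\n  VERSION = "0.30.2"\nend\n),
  "go/version.go"        => %(package iriq\n\nconst Version = "0.30.2"\n),
  "rust/iriq/Cargo.toml" => %([package]\nname = "iriq"\nversion = "0.30.2"\n),
  "rust/iriq/src/lib.rs" => %(pub const VERSION: &str = "0.30.2";\n)
}

drift = version_drift(sources)
puts drift.empty? ? "versions agree" : "drift: #{drift}"
```

In practice the hash values would come from `File.read` on each path. The sketch only takes the first match per file, so a `Cargo.toml` that lists a dependency `version` before the package's own would need a stricter pattern.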
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
    remote: .
    specs:
-     iriq (0.2.0)
+     iriq (0.30.2)

  GEM
    remote: https://rubygems.org/
@@ -54,7 +54,7 @@ GEM
        simplecov_json_formatter (~> 0.1)
      simplecov-html (0.13.2)
      simplecov_json_formatter (0.1.4)
-     sqlite3 (2.9.4)
+     sqlite3 (2.9.5)
        mini_portile2 (~> 2.8.0)
      stringio (3.2.0)
      tsort (0.2.0)
@@ -78,7 +78,7 @@ CHECKSUMS
    erb (6.0.4) sha256=38e3803694be357fe2bfe312487c74beaf9fb4e5beb3e22498952fe1645b95d9
    io-console (0.8.2) sha256=d6e3ae7a7cc7574f4b8893b4fca2162e57a825b223a177b7afa236c5ef9814cc
    irb (1.17.0) sha256=168c4ddb93d8a361a045c41d92b2952c7a118fa73f23fe14e55609eb7a863aae
-   iriq (0.2.0)
+   iriq (0.30.2)
    mini_portile2 (2.8.9) sha256=0cd7c7f824e010c072e33f68bc02d85a00aeb6fce05bb4819c03dfd3c140c289
    pp (0.6.3) sha256=2951d514450b93ccfeb1df7d021cae0da16e0a7f95ee1e2273719669d0ab9df6
    prettyprint (0.2.0) sha256=2bc9e15581a94742064a3cc8b0fb9d45aae3d03a1baa6ef80922627a0766f193
@@ -95,7 +95,7 @@ CHECKSUMS
    simplecov (0.22.0) sha256=fe2622c7834ff23b98066bb0a854284b2729a569ac659f82621fc22ef36213a5
    simplecov-html (0.13.2) sha256=bd0b8e54e7c2d7685927e8d6286466359b6f16b18cb0df47b508e8d73c777246
    simplecov_json_formatter (0.1.4) sha256=529418fbe8de1713ac2b2d612aa3daa56d316975d307244399fa4838c601b428
-   sqlite3 (2.9.4) sha256=6161c5b9c17886b289558e6c8082b28a22a814736d2433c9a67f4c6bfcde5c97
+   sqlite3 (2.9.5) sha256=04572973a3f943ad50a8adfffc8dd752a5f06e4c3db2026f71838fed8a982606
    stringio (3.2.0) sha256=c37cb2e58b4ffbd33fe5cd948c05934af997b36e0b6ca6fdf43afa234cf222e1
    tsort (0.2.0) sha256=9650a793f6859a43b6641671278f79cfead60ac714148aabe4e3f0060480089f

data/Makefile CHANGED
@@ -1,48 +1,105 @@
  # Iriq Go binary — build/install/clean/uninstall helpers.
  #
- # make - same as `make help`
- # make build - build into ./bin/iriq
- # make install - go install into $GOBIN (defaults to $GOPATH/bin)
- # make test - go test ./...
- # make clean - remove ./bin/
- # make uninstall - remove the binary from $GOBIN
+ # make - same as `make help`
+ # make build - dev build into ./bin/iriq (no SQLite, debug info)
+ # make build-sqlite - dev build with SQLite backend included
+ # make release - stripped + trimpath build (no SQLite)
+ # make release-sqlite - stripped + trimpath build with SQLite
+ # make install - go install into $GOBIN
+ # make test - go test ./... (both tag states)
+ # make clean - remove ./bin/
+ # make uninstall - remove the binary from $GOBIN
+ #
+ # The default build excludes the SQLite backend to keep the binary lean.
+ # Pass `-tags sqlite` (or use the *-sqlite targets) to compile it in. The
+ # CLI's `--version` output tells you which backends are baked in.
  #
  # Ruby gem build/install is handled by Bundler/RubyGems; see CLAUDE.md.

- GO ?= go
- BIN_DIR := bin
- BIN := $(BIN_DIR)/iriq
- PKG := ./cmd/iriq
+ GO ?= go
+ GO_DIR := go
+ BIN_DIR := bin
+ BIN := $(BIN_DIR)/iriq
+ # Absolute output path: builds run inside $(GO_DIR) via `go -C`, so a
+ # relative -o would land under go/. Keep the binary at the repo-root bin/.
+ ABS_BIN := $(CURDIR)/$(BIN)
+ PKG := ./cmd/iriq
+
+ # Rust crate lives under rust/; CI gates fmt + clippy + tests there.
+ CARGO ?= cargo
+ RUST_DIR := rust
+
+ # Release flags strip the symbol table (-s), debug info (-w), and bake
+ # reproducible paths (-trimpath). Drops binary size ~30% with no
+ # functional impact; stack-trace function names are gone but file:line
+ # resolution still works.
+ RELEASE_FLAGS := -ldflags "-s -w" -trimpath

  # Resolve $GOBIN, falling back to $GOPATH/bin (Go's default install location).
- GOBIN := $(shell $(GO) env GOBIN)
+ GOBIN := $(shell $(GO) env GOBIN)
  ifeq ($(GOBIN),)
- GOBIN := $(shell $(GO) env GOPATH)/bin
+ GOBIN := $(shell $(GO) env GOPATH)/bin
  endif
- INSTALLED := $(GOBIN)/iriq
+ INSTALLED := $(GOBIN)/iriq

  .DEFAULT_GOAL := help
- .PHONY: help build install test clean uninstall
+ .PHONY: help build build-sqlite release release-sqlite install test clean uninstall check fmt hooks

  help:
  	@echo "Iriq Go targets:"
- 	@echo " make build build into $(BIN)"
- 	@echo " make install go install into $(GOBIN)"
- 	@echo " make test run go test ./..."
- 	@echo " make clean remove $(BIN_DIR)/"
- 	@echo " make uninstall remove $(INSTALLED)"
+ 	@echo " make build slim dev build into $(BIN)"
+ 	@echo " make build-sqlite dev build with SQLite backend"
+ 	@echo " make release stripped slim build into $(BIN)"
+ 	@echo " make release-sqlite stripped build with SQLite backend"
+ 	@echo " make install go install into $(GOBIN)"
+ 	@echo " make test run go test ./... in both tag states"
+ 	@echo " make check Rust gate: cargo fmt --check + clippy + test (run before merging)"
+ 	@echo " make fmt cargo fmt the Rust crate"
+ 	@echo " make hooks enable the committed git hooks (pre-push runs 'make check')"
+ 	@echo " make clean remove $(BIN_DIR)/"
+ 	@echo " make uninstall remove $(INSTALLED)"

  build:
  	@mkdir -p $(BIN_DIR)
- 	$(GO) build -o $(BIN) $(PKG)
- 	@echo "built $(BIN)"
+ 	$(GO) -C $(GO_DIR) build -o $(ABS_BIN) $(PKG)
+ 	@echo "built $(BIN) (slim, debug)"
+
+ build-sqlite:
+ 	@mkdir -p $(BIN_DIR)
+ 	$(GO) -C $(GO_DIR) build -tags sqlite -o $(ABS_BIN) $(PKG)
+ 	@echo "built $(BIN) (sqlite, debug)"
+
+ release:
+ 	@mkdir -p $(BIN_DIR)
+ 	$(GO) -C $(GO_DIR) build $(RELEASE_FLAGS) -o $(ABS_BIN) $(PKG)
+ 	@echo "built $(BIN) (slim, stripped)"
+
+ release-sqlite:
+ 	@mkdir -p $(BIN_DIR)
+ 	$(GO) -C $(GO_DIR) build -tags sqlite $(RELEASE_FLAGS) -o $(ABS_BIN) $(PKG)
+ 	@echo "built $(BIN) (sqlite, stripped)"

  install:
- 	$(GO) install $(PKG)
+ 	$(GO) -C $(GO_DIR) install $(PKG)
  	@echo "installed $(INSTALLED)"

  test:
- 	$(GO) test ./...
+ 	$(GO) -C $(GO_DIR) test ./...
+ 	$(GO) -C $(GO_DIR) test -tags sqlite ./...
+
+ # The Rust gate — mirrors CI's Rust job. Run before merging/pushing (the
+ # pre-push hook runs this for you once `make hooks` is enabled).
+ check:
+ 	cd $(RUST_DIR) && $(CARGO) fmt --check
+ 	cd $(RUST_DIR) && $(CARGO) clippy --workspace --all-targets -- -D warnings
+ 	cd $(RUST_DIR) && $(CARGO) test --workspace
+
+ fmt:
+ 	cd $(RUST_DIR) && $(CARGO) fmt
+
+ hooks:
+ 	git config core.hooksPath .githooks
+ 	@echo "git hooks enabled (.githooks) — pre-push now runs 'make check'"

  clean:
  	rm -rf $(BIN_DIR)