@blamejs/blamejs-shop 0.0.52 → 0.0.54
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +4 -0
- package/SECURITY.md +5 -3
- package/lib/analytics.js +400 -0
- package/lib/email.js +264 -0
- package/lib/giftcards.js +410 -0
- package/lib/index.js +4 -0
- package/lib/inventory-receive.js +494 -0
- package/lib/newsletter.js +176 -12
- package/lib/payment.js +193 -13
- package/lib/reviews.js +412 -0
- package/lib/storefront.js +52 -20
- package/lib/tax.js +391 -3
- package/lib/vendor/MANIFEST.json +2 -2
- package/lib/vendor/blamejs/CHANGELOG.md +2 -0
- package/lib/vendor/blamejs/SECURITY.md +0 -1
- package/lib/vendor/blamejs/api-snapshot.json +2 -2
- package/lib/vendor/blamejs/package.json +1 -1
- package/lib/vendor/blamejs/release-notes/v0.12.4.json +19 -0
- package/lib/webhooks.js +293 -16
- package/lib/wishlist.js +269 -0
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -8,6 +8,10 @@ upgrading across more than a few patches at a time.
 
 ## v0.0.x
 
+- v0.0.54 (2026-05-22) — **Reviews, wishlist, inventory-receive, webhooks DLQ, payment idempotency, newsletter unsubscribe, email templates, VAT/GST, midnight theme, analytics events.** Ten new or extended primitives land together. `reviews`, `wishlist`, and `inventoryReceive` are brand-new with their own migrations and surface APIs. `webhooks` gains a dead-letter queue + exponential-backoff retry + sliding-window rate-limit + signed-incoming verification. `payment` gains idempotency-key tracking so a re-issued Stripe call replays the stored response instead of double-charging. `newsletter` gains an unsubscribe token flow + resubscribe path. `email` gains three new transactional templates (wishlist-discount, abandoned-cart, review-request). `tax` gains VAT/GST extraction, reverse-charge for EU B2B, and format-only VAT-ID validation for 29 country codes. `analytics` gains an event-stream surface with hashed identifiers + top-N aggregations + funnel math. `themes/midnight` ships as an alternate dark-mode theme that inherits the default theme's component CSS and overrides only the design tokens. **Added:** *`reviews` primitive — moderated customer reviews per product* — `bShop.reviews.create({ query?, cursorSecret? })` returns `{ submit, get, publish, reject, listForProduct, summaryForProduct, byCustomer, hashCustomerId, hashCustomerEmail }`. Email is stored hash-only via `b.crypto.namespaceHash` namespace `"reviews-customer"`; raw addresses never persist. Either `customer_id` or `customer_email` is required at submit; the hash collapses casing for dedup. Status pipeline: `pending → published → rejected`. `summaryForProduct` returns count + average + distribution histogram. Migration `0011_reviews.sql` (id, product_id, customer_id, customer_id_hash, rating CHECK 1..5, title cap 120, body cap 4000, verified_purchase, status CHECK enum, created_at, updated_at) with indexes on (product, status, created_at desc), (customer_hash, created_at), and (product, rating). 
· *`wishlist` primitive — saved-for-later with operator analytics* — `bShop.wishlist.create({ query?, cursorSecret? })` returns `{ add, remove, listForCustomer, isWishlisted, countForProduct, popularProducts }`. Idempotent dedup via `UNIQUE (customer_id, product_id, COALESCE(variant_id, ''))`. `popularProducts` returns the top-N most-wishlisted products for operator merchandising. HMAC-tagged pagination cursor on `created_at` desc + `id` desc. Migration `0012_wishlist.sql` (id, customer_id, product_id, variant_id nullable, notes cap 280, created_at). 38 layer-1 check assertions covering add/dedup, scoped remove, pagination, tamper-refused cursor, distinct-customer count. · *`inventoryReceive` primitive — bulk stock receipts with audit trail* — `bShop.inventoryReceive.create({ query?, catalog, cursorSecret? })` returns `{ draft, apply, reverse, get, byReference, list }`. Operators draft a `pending` receipt with N lines in one transaction, then `apply` walks each line + calls `catalog.inventory.restock(sku, qty)` — atomic, so a single restock failure aborts the whole apply and leaves the receipt pending for retry. `reverse` rolls back an applied receipt by issuing the inverse stock adjustment. Idempotent re-apply on `applied` status is a no-op. Migration `0018_inventory_receipts.sql` (receipts + receipt_lines tables, FK CASCADE, indexes on received_at + status + receipt_id + sku). · *`webhooks` — dead-letter queue + exponential backoff + sliding-window rate-limit + signed incoming verification* — Failed deliveries now retry on a `[60s, 5m, 30m, 4h, 24h]` exponential schedule. After 5 failures, the row moves to `webhook_dlq` for operator investigation; `replayFromDlq(dlq_id)` re-queues with `attempts=0`. Per-endpoint `rate_limit_per_minute` (default 60) is enforced via a sliding-window count over the deliveries table. 
New `verifyIncoming(payload, signature_header, secret, tolerance_seconds?)` matches the Stripe `t=<unix>,v1=<hmac>` shape via `b.webhook.verify`; `signOutgoing(payload, secret, ts?)` is the companion emitter so the framework can both produce and verify the standard signature shape. Migration `0017_webhook_dlq.sql` adds the DLQ table + the next_retry_at + last_attempted_at columns on deliveries. · *`payment` — idempotency-key tracking on mutating Stripe calls* — `createPaymentIntent`, `refund`, and the `subscriptions.*` mutating methods now accept an `idempotency_key` parameter. When the primitive is created with a `query` opt, the same key + canonicalised body hash returns the stored response without re-calling Stripe; the same key with a *different* body throws `TypeError("payment: idempotency_key collision (different inputs)")` — silently returning would let an attacker replay a different amount. Migration `0015_payment_idempotency.sql` stores (idempotency_key PK, operation, request_hash, response_status, response_body, expires_at) with a 24-hour TTL matching Stripe's. `cleanupExpired()` purges expired rows on operator schedule. · *`newsletter` — single-use unsubscribe token + resubscribe flow* — `issueUnsubscribeToken(signup_id)` mints a 24-byte base64url plaintext bearer (32 chars), stores only the `namespaceHash("newsletter-unsubscribe", plaintext)` with a one-year expiry; the plaintext is returned once, never re-issuable. `consumeUnsubscribeToken(plaintext)` is the single-use redeem — structured-result return distinguishes `not-found`, `already-consumed`, `expired`, `ok`. Uses `b.crypto.timingSafeEqual` to equalize hit-vs-miss CPU work. `resubscribe({ email })` clears `unsubscribed_at` for a previously-opted-out signup. Migration `0014_newsletter_unsubscribe_tokens.sql` (token_hash PK, signup_id, created_at, consumed_at, expires_at). 
· *`email` — three new transactional templates* — `sendWishlistDiscount({ customer_email, product_title, product_url, old_price, new_price, discount_pct, expires_at? })` — price-drop alert with the orange CTA back to the product. `sendAbandonedCartReminder({ customer_email, customer_name?, cart_url, lines, total, notes? })` — loops cart lines through the strict renderer once per row so every cell stays HTML-escaped. `sendReviewRequest({ customer_email, customer_name?, order_id, products, review_base_url })` — per-product review links built as `review_base_url + "/" + slug + "/review"`. All three emit both html + text via `b.mail.compose`, validate addresses via `b.guardEmail`, and refuse unknown / unused placeholders at the strict renderer. · *`tax` — VAT / GST extraction + reverse charge + VAT-ID format validation* — `calculateInclusive({ amount_minor, rate_bps, currency })` extracts VAT from a gross price; `calculateExclusive` adds VAT to a net price; both use banker's rounding to even and guarantee `net + tax === gross`. `applyReverseCharge({ ctx, seller_vat_id?, buyer_vat_id?, buyer_country })` returns `rate_bps: 0` when both VAT IDs format-validate, both countries are EU, and the buyer VAT-ID country matches. `validateVatId(vat_id, country_code)` regex-checks against 29 country formats (EU-27 + GB + CH, with GR aliasing to EL). `format({ rate_bps, locale?, reverse_charge? })` renders via `Intl.NumberFormat` with optional reverse-charge annotation. Operators wire VIES live-validation out of band — the primitive's JSDoc documents the boundary. · *`analytics` — event-stream surface with hashed identifiers + top-N + funnel* — `recordEvent({ event_type, session_id?, customer_id?, product_id?, search_q?, page_url?, user_agent_class?, payload?, occurred_at? })` hashes `session_id` and `customer_id` via `b.crypto.namespaceHash` before write — raw identifiers never persist. Refuses raw email or IP shapes on every string field. 
`topSearchTerms`, `topViewedProducts`, `funnel`, `sessionFlow(session_id, …)`, `dropAfter(ts)` cover the operator surfaces. Migration `0019_analytics_events.sql` (id, event_type CHECK enum, session_id_hash NOT NULL, customer_id_hash, payload_json bounded 4 KiB, product_id, search_q, page_url, user_agent_class CHECK enum, occurred_at) + 4 indexes covering (event_type, time), (session, time), (product, event_type, time), and (search_q, time). · *`themes/midnight` — alternate dark-mode theme via design-token override* — 2.59 KiB stylesheet at `themes/midnight/assets/css/main.css` that `@import`s the default theme's component CSS, then overrides ~26 design tokens (palette flipped, shadow alpha deepened, accent warmed to `#ff8a3a` for dark contrast). Two per-surface fixes for `.site-header` and `.site-footer` which hard-code colours outside the token system. Operators activate by setting `SHOP_THEME=midnight` in wrangler vars. The midnight theme demonstrates the inheritance pattern for operators authoring their own theme.
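The `tax` entry above says `calculateInclusive` extracts VAT from a gross price with round-half-to-even and guarantees `net + tax === gross`. A minimal sketch of one way to obtain both properties, assuming non-negative integer minor units and basis-point rates. `extractInclusive` and `divRoundHalfEven` are hypothetical names used for illustration and are not taken from the package's `lib/tax.js`:

```javascript
// Hypothetical sketch, not the shipped lib/tax.js implementation.
// Assumes amountMinor and rateBps are non-negative safe integers.

// Integer division a / b, rounded half-to-even (banker's rounding).
function divRoundHalfEven(a, b) {
  var q = a / b;               // BigInt division truncates toward zero
  var twiceRem = (a % b) * 2n; // compare 2 * remainder against the divisor
  if (twiceRem > b || (twiceRem === b && q % 2n === 1n)) {
    q += 1n;                   // round up past the half, or on a tie when q is odd
  }
  return q;
}

// Split a VAT-inclusive gross amount into net + tax.
// tax = gross * rate / (10000 + rate); net is derived by subtraction,
// so net + tax === gross holds by construction.
function extractInclusive(amountMinor, rateBps) {
  var gross = BigInt(amountMinor);
  var rate = BigInt(rateBps);
  var tax = divRoundHalfEven(gross * rate, 10000n + rate);
  return {
    net_minor: Number(gross - tax),
    tax_minor: Number(tax),
    gross_minor: amountMinor,
  };
}
```

With a 20% rate (2000 bps), a gross of 12000 minor units splits into 10000 net and 2000 tax. Deriving `net` by subtraction, rather than rounding it separately, is what keeps the two parts summing to the gross.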
+
+- v0.0.53 (2026-05-22) — **Docs refresh, sample catalog tripled, designed empty-cart card, worker hardening.** Four surfaces refresh in one release. `SECURITY.md` + `CONTRIBUTING.md` + `docs/deploy-cloudflare.md` cover the scoped package name + custom domain + the full 0001-0010 migration table. The sample catalog grows from 4 products to 12 (Operator Hoodie, Vault Stick, Signing Cable, Build Pass, Audit Log Kit, Self-Hosted Plan, Operator Mug, Sticker Pack — each with its own brand-coloured SVG hero). The empty-cart row gains a designed `cart-empty` card with the brand 🛒 icon, eyebrow + title + lede + dual CTAs (`Browse products` primary, `Find a specific product` ghost linking the header search). The Worker gets a `/robots.txt` edge fallback (so crawlers get a clean answer even during container cold-start), the warming-up page picks up `noindex`/`canonical`/`aria-live`/refresh tuned to `Sec-Fetch-Mode`, and a new `/_/version` probe surfaces deploy state. **Added:** *Sample catalog grows from 4 to 12 products with brand-coloured SVG heroes* — `scripts/sample-product-images/{operator-hoodie,vault-stick,signing-cable,build-pass,audit-log-kit,self-hosted-plan,operator-mug,sticker-pack}.svg` ship as the next round of reference imagery. Apparel SVGs use dark-ink gradients with accent-orange illustrations; hardware uses navy-to-blue gradients with grey/black device silhouettes + accent-orange chip highlights; digital uses purple-to-navy with credential-artifact motifs carrying a visible PQC / ML-DSA seal; bundles use deep-orange or green with stacked-object compositions. All 800×800 viewBox, system font stack, monospace SKU at bottom-right. `scripts/seed-sample-products.sql` + `scripts/seed-sample-product-media.sql` extend with the matching catalog + media rows (UUIDv7 ids continue the existing numbering 0005-000c). · *Designed empty-cart card* — `renderCart` emits a `.cart-empty` section instead of a bare `<td colspan>` row when there are no lines. 
The card carries the brand emoji in a dashed-border circle, `Cart` eyebrow, `Your cart is empty` headline, a lede that explains how the cart holds the add-time price, and dual CTAs — `Browse products →` (primary, → `/`) + `Find a specific product` (ghost, anchors `#site-search-q` for header-search focus without inline JS). Populated cart gains a `Shop / Cart` breadcrumb above the section head, matching the PDP's pattern. · *Worker `/robots.txt` edge fallback* — Even during container cold-start, the Worker now serves a minimal `User-agent: *\nAllow: /\nSitemap: https://blamejs.shop/sitemap.xml\n` directly at the edge with a 1h cache. Crawlers never see the warming page for robots probes. R2-uploaded `/assets/robots.txt` overrides still win for operators with a custom policy. · *Worker `/_/version` deploy probe* — Operator-friendly diagnostic endpoint returning `{ worker, container_image, time }`. No auth required (pure read-only probe); use it to verify a deploy reached the edge before sending traffic. · *`CONTRIBUTING.md`* — 210 lines covering: dev environment setup (clone → vendor-update → smoke), the release workflow (release-notes JSON → CHANGELOG rebuild → branch → PR → admin merge → tag → publish), code conventions (CommonJS, `var`, zero npm runtime deps, compose `b.*` primitives, vendored tree read-only, security defaults non-opt-in, PQC-first), how to run + write tests (smoke + layer-0/1/2, `waitUntil` over `setTimeout`), the explicit list of artifacts each publish produces, and a pointer to `SECURITY.md` for vuln reporting. **Changed:** *Warming-up page — `noindex`, canonical, ARIA-live, refresh tuned to navigation mode* — The cold-start fallback page picks up `<meta name="robots" content="noindex, nofollow">` + a canonical link so crawlers don't index the placeholder. 
The auto-refresh interval shifts based on `Sec-Fetch-Mode` — 5 seconds for `navigate` requests (real visitor in a browser tab), 8 seconds for non-navigation fetches (XHR/fetch/Stripe.js probes that shouldn't spam the container). A new `aria-live="polite"` region announces the warming state to screen readers. · *`SECURITY.md` — Stripe webhook clause updated to match shipped code* — The container's defense-in-depth verification clause used a placeholder phrase from before `lib/payment.js` shipped. Now describes the actual `b.webhook.verify` call (alg `hmac-sha256-stripe`) running inside `lib/payment.js` before any FSM transition. Every other audit point matched the workflow output as-is — SLSA L3 provenance, Sigstore-keyless SBOM signatures, SHA-256 + SHA3-512 digests, ML-DSA-65 PQC sidecar. · *`docs/deploy-cloudflare.md` — custom-domain section + 0001-0010 migration table + demo-seed step* — Adds a `Wire a custom domain` section (zone-add → Worker custom-domain bind → `wrangler.toml` route alternative → TLS verify). Replaces the bare-paragraph migration mention with the explicit 0001-0010 table (calling out the intentional `0007` gap from the abandoned discounts work). Adds a `Seed demo content` section that runs both seed SQL files via `wrangler d1 execute --remote`. Switches the existing migration-apply step to `--remote` for clarity.
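The warming-up page entry above describes an auto-refresh interval keyed on `Sec-Fetch-Mode`: 5 seconds for `navigate`, 8 seconds otherwise. A small sketch of that branch follows. The helper names, the plain-object header map, and the use of a `<meta http-equiv="refresh">` tag are assumptions for illustration; the changelog does not show how the Worker actually emits the refresh.

```javascript
// Hypothetical sketch. Only the header name and the 5 s / 8 s values
// come from the changelog entry; everything else is assumed.

// `headers` is assumed to be a plain object with lower-cased header names.
function warmingRefreshSeconds(headers) {
  var mode = String((headers && headers["sec-fetch-mode"]) || "").toLowerCase();
  // Real browser-tab navigations retry sooner than background fetches.
  return mode === "navigate" ? 5 : 8;
}

// One possible way to surface the interval on the placeholder page.
function refreshMetaTag(headers) {
  return '<meta http-equiv="refresh" content="' +
    warmingRefreshSeconds(headers) + '">';
}
```

A request with no `Sec-Fetch-Mode` header falls into the slower 8-second bucket in this sketch, which errs on the side of less load on a cold container.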
+
 - v0.0.52 (2026-05-22) — **Live deploy moves to the custom domain — README + wrangler config point at https://blamejs.shop.** The reference deploy now serves at `https://blamejs.shop` instead of the `*.workers.dev` subdomain. README header now reads 'Homepage: **https://blamejs.shop**' and the admin-API curl example uses a placeholder `your-shop.example.com` host since operators replace it with their own. `wrangler.toml#D1_BRIDGE_URL` switches to `https://blamejs.shop` so the container's externalDb adapter calls back through the canonical origin. **Changed:** *README header — 'Homepage' instead of 'Live demo', custom domain* — `Homepage: **https://blamejs.shop**` replaces the previous `Live demo: **https://blamejs-shop.coocoo.workers.dev/**`. Operators evaluating the framework now see the canonical address. The admin-API curl example in the operator quick-start drops the placeholder `<your-worker>.workers.dev` host in favour of a `your-shop.example.com` placeholder with a comment pointing at the reference deploy. · *`wrangler.toml#D1_BRIDGE_URL` → `https://blamejs.shop`* — The container's externalDb D1 adapter posts SQL to `<D1_BRIDGE_URL><D1_BRIDGE_PATH>` (the Worker's service-binding bridge endpoint). Updating the base URL routes those internal calls through the custom domain. Cloudflare resolves custom domains to the same Worker that serves `*.workers.dev`, so the bridge keeps working with no router change.
 
 - v0.0.51 (2026-05-22) — **SEO crawl surfaces — `/robots.txt` + `/sitemap.xml` listing every active product.** Crawlers landing on the live shop had no discovery surface — every product page was reachable only by being linked from the home grid, and there was no robots.txt to direct the crawl. This release wires `/robots.txt` (allow-everything-except-session-scoped, pointing at the sitemap) and `/sitemap.xml` (lists the home + /admin + every `status='active'` product with its `updated_at` lastmod, capped at 1000 rows). The XML is hand-rolled — no node:xml dep — with attribute-escaped values so a slug containing `&` or `<` can't break out of the document. **Added:** *`GET /robots.txt`* — `User-agent: *` + `Allow: /` for crawlable surfaces. Disallow paths cover the session-scoped + operator-only surfaces (`/admin`, `/cart`, `/checkout`, `/pay/`, `/orders/`, `/account`) — none of those have crawl value and exposing them risks crawlers tripping rate limits on the cart-bound DO. `Sitemap:` directive resolves from the request `Host` header so the same handler serves the right absolute URL for `localhost:8080` (dev), `blamejs-shop.coocoo.workers.dev` (live), or any custom domain. `Cache-Control: public, max-age=3600`. Operators with stricter requirements override by uploading a `robots.txt` key to R2 — the Worker's static-asset bridge serves R2 keys ahead of the storefront router, so the bucket file wins. · *`GET /sitemap.xml`* — Lists the home page (priority 1.0, daily changefreq), the `/admin` landing (priority 0.3, monthly changefreq), and every `status='active'` product (priority 0.8, weekly changefreq) with the product's `updated_at` or `created_at` as `lastmod` (ISO date). Capped at 1000 rows — the sitemap spec allows 50,000 but a shop at that scale should pre-segment into a sitemap index. The XML is hand-rolled, attribute-escape applies `&`, `<`, `>`, `"`, `'` so an operator-supplied slug can't break the document. 
`Cache-Control: public, max-age=300` so a catalog update propagates inside five minutes without an operator action.
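The v0.0.51 entry above describes a hand-rolled sitemap whose values are escaped for `&`, `<`, `>`, `"`, and `'`. A minimal sketch of that escaping, with hypothetical helper names (`escapeXml`, `sitemapUrlEntry`) that are not taken from the package source:

```javascript
// Hypothetical sketch of hand-rolled XML escaping for sitemap values.
// The five characters are the ones the changelog entry lists.
var XML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&apos;",
};

function escapeXml(value) {
  // A single pass, so an "&" introduced by one replacement is never re-escaped.
  return String(value).replace(/[&<>"']/g, function (ch) {
    return XML_ESCAPES[ch];
  });
}

// Build one <url> element; every interpolated value goes through escapeXml.
function sitemapUrlEntry(loc, lastmod, changefreq, priority) {
  return "<url>" +
    "<loc>" + escapeXml(loc) + "</loc>" +
    "<lastmod>" + escapeXml(lastmod) + "</lastmod>" +
    "<changefreq>" + escapeXml(changefreq) + "</changefreq>" +
    "<priority>" + escapeXml(priority) + "</priority>" +
    "</url>";
}
```

Escaping in one regex pass, instead of chaining five `replace` calls, avoids the ordering mistake where `&lt;` is later rewritten to `&amp;lt;`.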
package/SECURITY.md
CHANGED
@@ -123,9 +123,11 @@ node -e "
 - **Stripe webhook signature.** Inbound `POST` to
 `/api/webhooks/stripe` is signature-verified at the Worker edge
 (HMAC-SHA256 over `<timestamp>.<body>`, 5-minute tolerance window)
-before forwarding to the container. The container
-defense-in-depth
-
+before forwarding to the container. The container re-verifies the
+same signature defense-in-depth via `b.webhook.verify` (alg
+`hmac-sha256-stripe`) inside `lib/payment.js` before any FSM
+transition runs. An unsigned or out-of-window delivery never
+touches origin resources.
 - **Worker → Container trust boundary.** The Worker treats the
 container as untrusted-for-D1: it never proxies arbitrary headers
into D1, only the explicit `sql` + `params` from the bridge body.
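The Stripe clause above describes HMAC-SHA256 over `<timestamp>.<body>` with a 5-minute tolerance window, carried in a `t=<unix>,v1=<hmac>` header. A sketch of that check using only Node's built-in `crypto` module follows. `signPayload` and `verifyPayload` are hypothetical stand-ins for illustration; they are not `b.webhook.verify`, whose internals this diff does not show.

```javascript
var crypto = require("crypto");

// Hypothetical stand-ins for illustration; NOT the package's b.webhook.verify.

// Produce a `t=<unix>,v1=<hex hmac>` header for a raw body string.
function signPayload(body, secret, ts) {
  var mac = crypto.createHmac("sha256", secret)
    .update(ts + "." + body, "utf8")
    .digest("hex");
  return "t=" + ts + ",v1=" + mac;
}

// Return true only when the header carries a fresh timestamp and a
// matching v1 signature. `nowSeconds` is passed in to keep the check testable.
function verifyPayload(body, header, secret, toleranceSeconds, nowSeconds) {
  var ts = null;
  var sigs = [];
  String(header).split(",").forEach(function (part) {
    var eq = part.indexOf("=");
    if (eq === -1) return;
    var key = part.slice(0, eq).trim();
    var val = part.slice(eq + 1).trim();
    if (key === "t") ts = val;
    if (key === "v1") sigs.push(val);
  });
  if (ts === null || !/^\d+$/.test(ts) || sigs.length === 0) return false;
  // Reject deliveries outside the tolerance window (replay protection).
  if (Math.abs(nowSeconds - Number(ts)) > toleranceSeconds) return false;
  var expected = crypto.createHmac("sha256", secret)
    .update(ts + "." + body, "utf8")
    .digest();
  return sigs.some(function (hex) {
    // Length is checked first because timingSafeEqual throws on a mismatch.
    if (!/^[0-9a-f]{64}$/i.test(hex)) return false;
    return crypto.timingSafeEqual(Buffer.from(hex, "hex"), expected);
  });
}
```

The signature must be computed over the raw request body exactly as received; re-serialising parsed JSON before hashing would generally change the bytes and fail verification.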
package/lib/analytics.js
CHANGED
@@ -46,6 +46,45 @@
  * refunded). Every other status (pending / paid / fulfilling /
  * shipped / delivered) counts at face value because the operator
  * has either captured the funds or is committed to capturing them.
+ *
+ * Event-stream surface (writes against `analytics_events` from
+ * migration `0019_analytics_events.sql`):
+ *
+ *   analytics.recordEvent({ event_type, session_id?, customer_id?,
+ *                           product_id?, search_q?, page_url?,
+ *                           user_agent_class?, payload? })
+ *     → { id, occurred_at }
+ *
+ *   analytics.topSearchTerms({ from?, to?, limit? })
+ *     → [{ search_q, count }]
+ *
+ *   analytics.topViewedProducts({ from?, to?, limit? })
+ *     → [{ product_id, count }]
+ *
+ *   analytics.funnel({ from?, to? })
+ *     → { pdp_views, cart_adds, checkout_starts, checkout_completes,
+ *         conversion_rate }
+ *
+ *   analytics.sessionFlow(session_id, { limit? })
+ *     → [{ id, event_type, session_id_hash, customer_id_hash,
+ *          product_id, search_q, page_url, user_agent_class,
+ *          payload, occurred_at }]
+ *
+ *   analytics.dropAfter(ts)
+ *     → { deleted }
+ *
+ * Privacy posture: `session_id` and `customer_id` are hashed via
+ * `b.crypto.namespaceHash` (namespaces `"analytics-session"` /
+ * `"analytics-customer"`) before the row reaches the database. The
+ * primitive REFUSES — with a TypeError, at every write entry point
+ * — to accept a value that looks like a raw email (contains `@`
+ * between two non-whitespace runs) or a raw IP (dotted-quad or
+ * colon-delimited hextet). The same refusal applies to `search_q`
+ * and `page_url` so an operator who accidentally pipes a logged-in
+ * user's identifier into a search term hits a loud error instead
+ * of a quiet PII leak. `payload` is JSON-encoded and bounded to
+ * 4 KiB; the primitive does not introspect its contents — operator
+ * discipline owns what goes inside.
  */
 
var bShop;
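The header comment above notes that `payload` is JSON-encoded and bounded to 4 KiB. A short sketch of enforcing such a bound on the encoded UTF-8 bytes rather than on the JavaScript string length follows. `encodePayload` is a hypothetical helper, and measuring in bytes is an assumption about how the limit is meant to apply:

```javascript
// Hypothetical sketch; the 4096 figure mirrors the 4 KiB bound in the comment.
var MAX_PAYLOAD_BYTES = 4096;

// JSON-encode a payload and refuse it when the encoded form is too large.
function encodePayload(payload) {
  var json = JSON.stringify(payload == null ? {} : payload);
  // Count UTF-8 bytes: a multi-byte character occupies more storage
  // than `json.length` (UTF-16 code units) would suggest.
  var bytes = Buffer.byteLength(json, "utf8");
  if (bytes > MAX_PAYLOAD_BYTES) {
    throw new TypeError(
      "payload exceeds " + MAX_PAYLOAD_BYTES + " bytes (JSON-encoded), got " + bytes
    );
  }
  return json;
}
```

A payload of 2100 `é` characters is only 2108 code units once encoded, yet it occupies 4208 bytes, so a length check on the string alone would let it through.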
@@ -88,6 +127,103 @@ function _limit(n, label, max) {
   }
 }
 
+// ---- event-stream validators -------------------------------------------
+//
+// Event-stream writes go through `recordEvent`; every other event-
+// stream surface is read-only and reuses `_resolveWindow` /
+// `_limit`. The validators below are the write-site gates.
+
+var EVENT_TYPES = [
+  "pdp_view", "collection_view", "search_query",
+  "wishlist_add", "wishlist_remove", "cart_add", "cart_remove",
+  "checkout_start", "checkout_complete", "newsletter_signup",
+];
+
+var UA_CLASSES = ["desktop", "mobile", "bot", "other"];
+
+var SESSION_NAMESPACE = "analytics-session";
+var CUSTOMER_NAMESPACE = "analytics-customer";
+
+// Payload size bound — operators put refinement filters / source
+// attribution / variant ids here; 4 KiB is more than enough and
+// caps the per-row footprint. Bigger payloads belong in a
+// purpose-built table, not the event stream.
+var MAX_PAYLOAD_BYTES = 4096;
+// Bounded string lengths for the denormalised columns. The
+// primitive surfaces a TypeError rather than silently truncating
+// because a truncated identifier joins to nothing.
+var MAX_SEARCH_Q = 256;
+var MAX_PAGE_URL = 2048;
+var MAX_PRODUCT_ID = 128;
+var MAX_SESSION_ID = 512;
+var MAX_CUSTOMER_ID = 512;
+
+// Resolve a `{ from, to }` window for the event-stream queries.
+// Mirrors `_resolveWindow` but uses the `from` / `to` naming the
+// operator-facing surface advertises so a typo here doesn't tie a
+// `since` error message to a `from`-keyed call site.
+function _resolveEventWindow(opts) {
+  opts = opts || {};
+  var now = Date.now();
+  var from = opts.from == null ? (now - DEFAULT_WINDOW_MS) : opts.from;
+  var to = opts.to == null ? now : opts.to;
+  _epochMs(from, "from");
+  _epochMs(to, "to");
+  if (from >= to) {
+    throw new TypeError("analytics: from must be strictly less than to");
+  }
+  if ((to - from) > ONE_YEAR_MS) {
+    throw new TypeError("analytics: window (to - from) must be ≤ 1 year");
+  }
+  return { from: from, to: to };
+}
+
+// PII guard — refuse a value that looks like a raw email or IP.
+// Hashed identifiers (hex) never satisfy either shape, so the gate
+// is a one-way "operator handed us raw PII" detector, not a typing
+// constraint. The check is intentionally permissive on the email
+// side (we'd rather reject a borderline string than ingest a real
+// address) and uses the same dotted-quad / hextet shapes the
+// vendored zod schemas exercise.
+var RAW_EMAIL_RE = /[^\s@]+@[^\s@]+\.[^\s@]+/;
+// IPv4 dotted-quad or IPv6 with at least one colon-delimited
+// hextet. The IPv6 shape catches the common forms ("::1",
+// "2001:db8::1", full eight-group) without trying to be RFC-precise
+// — anything with two-or-more colons separated by hex is enough to
+// trigger the refusal.
+var RAW_IPV4_RE = /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/;
+var RAW_IPV6_RE = /(?:[0-9a-fA-F]{1,4}:){2,}[0-9a-fA-F]{0,4}/;
+
+function _refuseRawPii(label, value) {
+  if (typeof value !== "string" || value.length === 0) return;
+  if (RAW_EMAIL_RE.test(value)) {
+    throw new TypeError(
+      "analytics: " + label + " looks like a raw email — hash via " +
+      "b.crypto.namespaceHash before passing to recordEvent"
+    );
+  }
+  if (RAW_IPV4_RE.test(value) || RAW_IPV6_RE.test(value)) {
+    throw new TypeError(
+      "analytics: " + label + " looks like a raw IP — IPs must not " +
+      "reach the analytics_events table; hash or drop at the caller"
+    );
+  }
+}
+
+function _optString(value, label, max) {
+  if (value == null) return null;
+  if (typeof value !== "string") {
+    throw new TypeError("analytics: " + label + " must be a string when provided");
+  }
+  if (value.length === 0) {
+    throw new TypeError("analytics: " + label + " must be non-empty when provided");
+  }
+  if (value.length > max) {
+    throw new TypeError("analytics: " + label + " exceeds " + max + " chars");
+  }
+  return value;
+}
+
 // ---- factory ------------------------------------------------------------
 
function create(opts) {
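The `_refuseRawPii` guard in the hunk above relies on three regular expressions. The sketch below copies those patterns verbatim and runs a few sample strings through them to show which shapes they flag. `classifyPii` and the sample list are hypothetical and exist only for this illustration:

```javascript
// Patterns copied verbatim from the hunk above.
var RAW_EMAIL_RE = /[^\s@]+@[^\s@]+\.[^\s@]+/;
var RAW_IPV4_RE = /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/;
var RAW_IPV6_RE = /(?:[0-9a-fA-F]{1,4}:){2,}[0-9a-fA-F]{0,4}/;

// Hypothetical helper: report the matched shape instead of throwing.
function classifyPii(value) {
  if (RAW_EMAIL_RE.test(value)) return "email";
  if (RAW_IPV4_RE.test(value) || RAW_IPV6_RE.test(value)) return "ip";
  return "none";
}

// Sample inputs chosen for illustration only.
var samples = [
  "alice@example.com",
  "192.168.0.1",
  "2001:db8::1",
  "::1",
  "12:30:45",
  "blue hoodie",
];
samples.forEach(function (s) {
  console.log(JSON.stringify(s) + " -> " + classifyPii(s));
});
```

Two edge cases are worth noting when reading the guard. As written, the IPv6 pattern needs at least one hex digit before each colon, so a bare `::1` is not flagged by it, while a clock time such as `12:30:45` is. Callers that pass timestamps in `search_q` or `page_url` may want to account for that.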
|
|
@@ -235,6 +371,270 @@ function create(opts) {
|
|
|
235
371
|
});
|
|
236
372
|
},
|
|
237
373
|
|
|
374
|
+
// Record a single event-stream row. `session_id` and
|
|
375
|
+
// `customer_id` are hashed via `b.crypto.namespaceHash` before
|
|
376
|
+
// the row reaches the database — raw identifiers never persist.
|
|
377
|
+
// At least one of the two MUST be supplied; an anonymous row
|
|
378
|
+
// with no join key is useless for funnel debugging and a sign
|
|
379
|
+
// the caller forgot to wire the session middleware.
|
|
380
|
+
//
|
|
381
|
+
// Refusals (TypeError, before any I/O):
|
|
382
|
+
// - bad `event_type` (not in the allowed enum)
|
|
383
|
+
// - missing both `session_id` and `customer_id`
|
|
384
|
+
// - oversized `payload` (> 4 KiB after JSON-encode)
|
|
385
|
+
// - bad `occurred_at` (not a non-negative integer epoch-ms)
|
|
386
|
+
// - any string-typed field that looks like a raw email or IP
|
|
387
|
+
// - any string-typed field that exceeds its length bound
|
|
388
|
+
recordEvent: async function (input) {
|
|
389
|
+
if (!input || typeof input !== "object") {
|
|
390
|
+
throw new TypeError("analytics.recordEvent: input object required");
|
|
391
|
+
}
|
|
392
|
+
var eventType = input.event_type;
|
|
393
|
+
if (EVENT_TYPES.indexOf(eventType) === -1) {
|
|
394
|
+
throw new TypeError(
|
|
395
|
+
"analytics.recordEvent: event_type must be one of " +
|
|
396
|
+
EVENT_TYPES.join(", ")
|
|
397
|
+
);
|
|
398
|
+
}
|
|
399
|
+
// Both raw identifiers run through the PII guard first — an
|
|
400
|
+
// email-shaped session_id is a tell that the caller wired the
|
|
401
|
+
// wrong field. The guard runs before hashing because a
|
|
402
|
+
// namespaceHash output is hex (no `@`, no dotted-quad) and
|
|
403
|
+
// would slip past trivially.
|
|
404
|
+
_refuseRawPii("session_id", input.session_id);
|
|
405
|
+
_refuseRawPii("customer_id", input.customer_id);
|
|
406
|
+
|
|
407
|
+
var sessionId = _optString(input.session_id, "session_id", MAX_SESSION_ID);
|
|
408
|
+
var customerId = _optString(input.customer_id, "customer_id", MAX_CUSTOMER_ID);
|
|
409
|
+
if (sessionId == null && customerId == null) {
|
|
410
|
+
throw new TypeError(
|
|
411
|
+
"analytics.recordEvent: at least one of session_id / " +
|
|
412
|
+
"customer_id is required"
|
|
413
|
+
);
|
|
414
|
+
}
|
|
415
|
+
|
|
416
|
+
// Denormalised columns — each is optional and bounded. The
|
|
417
|
+
// PII guard runs on every string-typed value, so an operator
|
|
418
|
+
// who pipes `?q=alice@example.com` straight into `search_q`
|
|
419
|
+
// hits a loud refusal at the write site instead of leaking
|
|
420
|
+
// the address into the aggregate table.
|
|
421
|
+
_refuseRawPii("product_id", input.product_id);
|
|
422
|
+
_refuseRawPii("search_q", input.search_q);
|
|
423
|
+
_refuseRawPii("page_url", input.page_url);
|
|
424
|
+
var productId = _optString(input.product_id, "product_id", MAX_PRODUCT_ID);
+      var searchQ = _optString(input.search_q, "search_q", MAX_SEARCH_Q);
+      var pageUrl = _optString(input.page_url, "page_url", MAX_PAGE_URL);
+      var uaClass = input.user_agent_class;
+      if (uaClass != null && UA_CLASSES.indexOf(uaClass) === -1) {
+        throw new TypeError(
+          "analytics.recordEvent: user_agent_class must be one of " +
+          UA_CLASSES.join(", ")
+        );
+      }
+
+      // Payload — JSON-encode in-process so the size bound is
+      // enforced on the encoded bytes (matches what the row will
+      // hold). `undefined` → "{}" so the column NOT NULL default
+      // covers operators that don't pass a payload.
+      var payloadInput = input.payload == null ? {} : input.payload;
+      if (typeof payloadInput !== "object" || Array.isArray(payloadInput)) {
+        throw new TypeError(
+          "analytics.recordEvent: payload must be a plain object when provided"
+        );
+      }
+      var payloadJson;
+      try {
+        payloadJson = JSON.stringify(payloadInput);
+      } catch (e) {
+        throw new TypeError(
+          "analytics.recordEvent: payload not JSON-serialisable (" +
+          (e && e.message ? e.message : "unknown") + ")"
+        );
+      }
+      if (Buffer.byteLength(payloadJson, "utf8") > MAX_PAYLOAD_BYTES) {
+        throw new TypeError(
+          "analytics.recordEvent: payload exceeds " + MAX_PAYLOAD_BYTES +
+          " bytes (JSON-encoded)"
+        );
+      }
+
+      // `occurred_at` — defaults to now. Operators can pin it for
+      // backfills, but a NaN / float / string here is a typo and
+      // throws.
+      var occurredAt;
+      if (input.occurred_at == null) {
+        occurredAt = Date.now();
+      } else {
+        _epochMs(input.occurred_at, "occurred_at");
+        occurredAt = input.occurred_at;
+      }
+
+      var b = _b();
+      var sessionHash = sessionId == null ? null : b.crypto.namespaceHash(SESSION_NAMESPACE, sessionId);
+      var customerHash = customerId == null ? null : b.crypto.namespaceHash(CUSTOMER_NAMESPACE, customerId);
+      // `session_id_hash` is NOT NULL in the schema — fall back to
+      // a customer-scoped hash when only the customer is supplied
+      // so the join key column always has a value. The customer
+      // hash uses its own namespace so a session_id_hash composed
+      // this way can't accidentally collide with a real session.
+      if (sessionHash == null) sessionHash = customerHash;
+
+      var id = b.uuid.v7();
+      await query(
+        "INSERT INTO analytics_events " +
+        "(id, event_type, session_id_hash, customer_id_hash, payload_json, " +
+        " product_id, search_q, page_url, user_agent_class, occurred_at) " +
+        "VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10)",
+        [id, eventType, sessionHash, customerHash, payloadJson,
+         productId, searchQ, pageUrl, uaClass == null ? null : uaClass, occurredAt],
+      );
+      return { id: id, occurred_at: occurredAt };
+    },
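The payload handling in `recordEvent` above can be read in isolation as a small encode-then-measure step. The following is a minimal sketch, not the package's code: `encodePayload` is a hypothetical helper name, and the `MAX_PAYLOAD_BYTES` value of 4096 is an assumption, since the real constant is defined outside the lines shown in this diff.

```javascript
// Assumed value for illustration only; the package defines its own
// MAX_PAYLOAD_BYTES elsewhere in lib/analytics.js.
var MAX_PAYLOAD_BYTES = 4096;

// Hypothetical helper mirroring the checks shown in the diff:
// default to {}, reject non-objects and arrays, JSON-encode,
// then bound the size of the encoded UTF-8 bytes.
function encodePayload(payload) {
  var input = payload == null ? {} : payload;
  if (typeof input !== "object" || Array.isArray(input)) {
    throw new TypeError("payload must be a plain object when provided");
  }
  var json;
  try {
    json = JSON.stringify(input);
  } catch (e) {
    // Circular references and BigInt values end up here.
    throw new TypeError(
      "payload not JSON-serialisable (" +
      (e && e.message ? e.message : "unknown") + ")"
    );
  }
  // Measure bytes, not characters: multi-byte code points count fully.
  if (Buffer.byteLength(json, "utf8") > MAX_PAYLOAD_BYTES) {
    throw new TypeError(
      "payload exceeds " + MAX_PAYLOAD_BYTES + " bytes (JSON-encoded)"
    );
  }
  return json;
}
```

Measuring the encoded bytes rather than `json.length` matters for non-ASCII input: a 3000-character string of `é` occupies 6000 UTF-8 bytes, so it fails the assumed 4096-byte bound even though its character count is below it.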
+
+    // Top-N search terms by event count across the window. Filters
+    // out NULL `search_q` (other event types share the table) and
+    // GROUPs by the denormalised column so the query never decodes
+    // the JSON payload. Default limit 10; max 100 (same envelope as
+    // `topSKUs`).
+    topSearchTerms: async function (windowOpts) {
+      var w = _resolveEventWindow(windowOpts);
+      var limit = (windowOpts && windowOpts.limit) == null ? 10 : windowOpts.limit;
+      _limit(limit, "limit");
+      var r = await query(
+        "SELECT search_q AS search_q, COUNT(*) AS count " +
+        " FROM analytics_events " +
+        " WHERE event_type = 'search_query' " +
+        " AND search_q IS NOT NULL " +
+        " AND occurred_at >= ?1 AND occurred_at < ?2 " +
+        " GROUP BY search_q " +
+        " ORDER BY count DESC, search_q ASC " +
+        " LIMIT ?3",
+        [w.from, w.to, limit],
+      );
+      return r.rows.map(function (row) {
+        return { search_q: row.search_q, count: Number(row.count) || 0 };
+      });
+    },
+
+    // Top-N viewed products by PDP-view count across the window.
+    // Same shape as `topSearchTerms` — different event_type +
+    // group-by column.
+    topViewedProducts: async function (windowOpts) {
+      var w = _resolveEventWindow(windowOpts);
+      var limit = (windowOpts && windowOpts.limit) == null ? 10 : windowOpts.limit;
+      _limit(limit, "limit");
+      var r = await query(
+        "SELECT product_id AS product_id, COUNT(*) AS count " +
+        " FROM analytics_events " +
+        " WHERE event_type = 'pdp_view' " +
+        " AND product_id IS NOT NULL " +
+        " AND occurred_at >= ?1 AND occurred_at < ?2 " +
+        " GROUP BY product_id " +
+        " ORDER BY count DESC, product_id ASC " +
+        " LIMIT ?3",
+        [w.from, w.to, limit],
+      );
+      return r.rows.map(function (row) {
+        return { product_id: row.product_id, count: Number(row.count) || 0 };
+      });
+    },
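Both top-N queries above share one shape: filter by event type, skip NULL keys, keep the half-open window `[from, to)`, count per key, then order by count descending with the key as an ascending tie-break. The sketch below reproduces that shape over an in-memory array so the ordering and window semantics can be checked without a database. `topN` and the sample events are hypothetical, and the tie-break uses plain JavaScript string comparison, which may not match the database's collation for non-ASCII keys.

```javascript
// Hypothetical in-memory equivalent of the GROUP BY / ORDER BY / LIMIT
// queries shown in the diff. `column` is "search_q" or "product_id".
function topN(events, eventType, column, from, to, limit) {
  var counts = new Map();
  events.forEach(function (ev) {
    if (ev.event_type !== eventType) return;          // WHERE event_type = ...
    if (ev[column] == null) return;                   // AND column IS NOT NULL
    if (ev.occurred_at < from || ev.occurred_at >= to) return; // [from, to)
    counts.set(ev[column], (counts.get(ev[column]) || 0) + 1);
  });
  return Array.from(counts.entries())
    .map(function (entry) {
      var row = { count: entry[1] };
      row[column] = entry[0];
      return row;
    })
    .sort(function (a, b) {
      if (b.count !== a.count) return b.count - a.count; // count DESC
      if (a[column] < b[column]) return -1;              // key ASC tie-break
      if (a[column] > b[column]) return 1;
      return 0;
    })
    .slice(0, limit);
}

var sampleEvents = [
  { event_type: "search_query", search_q: "shoes", occurred_at: 10 },
  { event_type: "search_query", search_q: "shoes", occurred_at: 20 },
  { event_type: "search_query", search_q: "boots", occurred_at: 15 },
  { event_type: "search_query", search_q: "belts", occurred_at: 15 },
  { event_type: "search_query", search_q: "hats", occurred_at: 100 },
  { event_type: "search_query", search_q: null, occurred_at: 30 },
  { event_type: "pdp_view", product_id: "sku-1", occurred_at: 30 },
];

var top = topN(sampleEvents, "search_query", "search_q", 0, 100, 2);
```

With this sample, `hats` is excluded because the upper bound of the window is exclusive, and `belts` ranks ahead of `boots` because the tie on count is broken by the key.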
+
+    // PDP → cart → checkout-start → checkout-complete funnel for
+    // the window. Conversion rate is `completes / pdp_views`,
+    // clamped to [0, 1] — when there are zero PDP views the rate
+    // is reported as 0 (operators read the absolute counts to
+    // distinguish "no traffic" from "no conversion").
+    funnel: async function (windowOpts) {
+      var w = _resolveEventWindow(windowOpts);
+      var r = await query(
+        "SELECT event_type, COUNT(*) AS count " +
+        " FROM analytics_events " +
+        " WHERE occurred_at >= ?1 AND occurred_at < ?2 " +
+        " AND event_type IN ('pdp_view','cart_add','checkout_start','checkout_complete') " +
+        " GROUP BY event_type",
+        [w.from, w.to],
+      );
+      var counts = { pdp_view: 0, cart_add: 0, checkout_start: 0, checkout_complete: 0 };
+      for (var i = 0; i < r.rows.length; i += 1) {
+        counts[r.rows[i].event_type] = Number(r.rows[i].count) || 0;
+      }
+      var pdp = counts.pdp_view;
+      var carts = counts.cart_add;
+      var starts = counts.checkout_start;
+      var completes = counts.checkout_complete;
+      var rate = pdp > 0 ? (completes / pdp) : 0;
+      if (rate < 0) rate = 0;
+      if (rate > 1) rate = 1;
+      return {
+        pdp_views: pdp,
+        cart_adds: carts,
+        checkout_starts: starts,
+        checkout_completes: completes,
+        conversion_rate: rate,
+      };
+    },
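The funnel arithmetic above does not depend on the database once the grouped rows are in hand. As a rough sketch, the same math can be expressed as a pure function over rows of the form `{ event_type, count }`; `funnelFromRows` is a hypothetical name used only for illustration.

```javascript
// Hypothetical pure version of the funnel math shown in the diff.
// Missing event types stay at 0, and the rate is clamped to [0, 1].
function funnelFromRows(rows) {
  var counts = { pdp_view: 0, cart_add: 0, checkout_start: 0, checkout_complete: 0 };
  rows.forEach(function (row) {
    // Drivers may return counts as strings; coerce and default to 0.
    counts[row.event_type] = Number(row.count) || 0;
  });
  var rate = counts.pdp_view > 0
    ? counts.checkout_complete / counts.pdp_view
    : 0; // zero PDP views reports 0 rather than dividing by zero
  rate = Math.min(1, Math.max(0, rate));
  return {
    pdp_views: counts.pdp_view,
    cart_adds: counts.cart_add,
    checkout_starts: counts.checkout_start,
    checkout_completes: counts.checkout_complete,
    conversion_rate: rate,
  };
}

var sampleFunnel = funnelFromRows([
  { event_type: "pdp_view", count: "200" },
  { event_type: "cart_add", count: 40 },
  { event_type: "checkout_complete", count: 10 },
]);
```

In this sample no `checkout_start` row is returned, so that step reads as 0 while the overall rate is still 10 / 200. The clamp only matters when completes outnumber PDP views in the window, for example when a view fell just before the window's lower bound.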
+
+    // Per-session event sequence — operator hands the raw
+    // session_id, the primitive hashes it (same namespace as on
+    // write) and returns the chronological event list. Returning
+    // the `session_id_hash` alongside lets the operator confirm
+    // they're looking at the right session without ever seeing the
+    // raw id again. Default limit 100; max 500 (sessions don't
+    // legitimately emit more events than that in a single
+    // debugging window).
+    sessionFlow: async function (sessionId, opts) {
+      if (typeof sessionId !== "string" || sessionId.length === 0) {
+        throw new TypeError("analytics.sessionFlow: session_id required");
+      }
+      if (sessionId.length > MAX_SESSION_ID) {
+        throw new TypeError("analytics.sessionFlow: session_id exceeds " + MAX_SESSION_ID + " chars");
+      }
+      _refuseRawPii("session_id", sessionId);
+      var limit = (opts && opts.limit) == null ? 100 : opts.limit;
+      _limit(limit, "limit", 500);
+      var hash = _b().crypto.namespaceHash(SESSION_NAMESPACE, sessionId);
+      var r = await query(
+        "SELECT id, event_type, session_id_hash, customer_id_hash, " +
+        " payload_json, product_id, search_q, page_url, " +
+        " user_agent_class, occurred_at " +
+        " FROM analytics_events " +
+        " WHERE session_id_hash = ?1 " +
+        " ORDER BY occurred_at ASC, id ASC " +
+        " LIMIT ?2",
+        [hash, limit],
+      );
+      return r.rows.map(function (row) {
+        var payload;
+        try { payload = JSON.parse(row.payload_json || "{}"); }
+        catch (_e) { payload = {}; }
+        return {
+          id: row.id,
+          event_type: row.event_type,
+          session_id_hash: row.session_id_hash,
+          customer_id_hash: row.customer_id_hash,
+          product_id: row.product_id,
+          search_q: row.search_q,
+          page_url: row.page_url,
+          user_agent_class: row.user_agent_class,
+          payload: payload,
+          occurred_at: Number(row.occurred_at) || 0,
+        };
+      });
+    },
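The row mapping at the end of `sessionFlow` decodes each stored row defensively. A minimal sketch of that decoding step follows; `decodeEventRow` is a hypothetical helper name, and only a subset of the columns is carried over to keep the example short.

```javascript
// Hypothetical helper mirroring the tolerant row decoding in the diff:
// an empty or unparsable payload_json degrades to {} instead of throwing,
// and occurred_at is coerced to a number (0 when it cannot be).
function decodeEventRow(row) {
  var payload;
  try { payload = JSON.parse(row.payload_json || "{}"); }
  catch (_e) { payload = {}; }
  return {
    id: row.id,
    event_type: row.event_type,
    payload: payload,
    occurred_at: Number(row.occurred_at) || 0,
  };
}

var decodedOk = decodeEventRow({
  id: "evt-1",
  event_type: "cart_add",
  payload_json: '{"qty":2}',
  occurred_at: "1700000000000",
});
var decodedBad = decodeEventRow({
  id: "evt-2",
  event_type: "pdp_view",
  payload_json: "{not json",
  occurred_at: null,
});
```

The design choice this illustrates is that a single corrupt row does not fail the whole session listing: the caller still receives the event type and timestamp, with an empty payload in place of the damaged one.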
+
+    // Retention sweep — DELETE every event older than `ts`.
+    // Operators run this on a schedule (cron, queue worker) to
+    // satisfy data-minimisation obligations. The primitive returns
+    // the row count so the caller can log the size of each sweep.
+    dropAfter: async function (ts) {
+      _epochMs(ts, "ts");
+      var r = await query(
+        "DELETE FROM analytics_events WHERE occurred_at < ?1",
+        [ts],
+      );
+      return { deleted: Number(r.rowCount) || 0 };
+    },
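`dropAfter` takes an absolute epoch-millisecond cutoff, so a scheduled job has to turn a retention period into that cutoff first. The sketch below shows one way to do so, together with an in-memory stand-in for the `occurred_at < ts` predicate. `retentionCutoff`, `sweep`, and the 90-day period are assumptions for illustration and are not part of the package.

```javascript
var DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: convert "keep N days" into the epoch-ms cutoff
// that a retention sweep would be called with.
function retentionCutoff(nowMs, retentionDays) {
  if (!Number.isInteger(retentionDays) || retentionDays <= 0) {
    throw new TypeError("retentionDays must be a positive integer");
  }
  return nowMs - retentionDays * DAY_MS;
}

// In-memory stand-in for `DELETE ... WHERE occurred_at < ?1`:
// rows strictly older than the cutoff are dropped, and a row stamped
// exactly at the cutoff is kept.
function sweep(events, ts) {
  var kept = events.filter(function (ev) { return !(ev.occurred_at < ts); });
  return { kept: kept, deleted: events.length - kept.length };
}

var now = 1000000000000;
var cutoff = retentionCutoff(now, 90); // assumed 90-day retention period
var result = sweep(
  [
    { id: "old", occurred_at: cutoff - 1 },
    { id: "edge", occurred_at: cutoff },
    { id: "new", occurred_at: cutoff + 1 },
  ],
  cutoff
);
```

Because the comparison is strict, re-running a sweep with the same cutoff is harmless: the second pass finds nothing older than `ts` and reports zero deletions.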
+
     // Most-recent orders. No window — strictly most-recent-N. Used
     // by the dashboard's "Recent activity" sidebar.
     recentOrders: async function (recentOpts) {