@orgloop/transform-dedup 0.1.0 → 0.1.3

Files changed (2)
  1. package/README.md +14 -39
  2. package/package.json +2 -2
package/README.md CHANGED
@@ -1,6 +1,6 @@
  # @orgloop/transform-dedup
  
- Deduplicates events within a configurable time window. Events with the same key hash seen within the window are dropped; the first occurrence passes through.
+ Deduplicate events within a configurable time window using SHA-256 hashing.
  
  ## Install
  
@@ -16,56 +16,31 @@ transforms:
      type: package
      package: "@orgloop/transform-dedup"
      config:
-       key: # fields used to build the dedup hash
+       key:
          - source
          - type
-         - provenance.platform_event
          - payload.pr_number
-       window: "5m" # time window for dedup (default: 5m)
-       store: "memory" # storage backend (only "memory" for now)
+       window: "5m"
  ```
  
- ### Config options
+ ## Config options
  
  | Field | Type | Required | Default | Description |
  |-------|------|----------|---------|-------------|
- | `key` | `string[]` | yes | `["source", "type", "id"]` | Dot-path fields used to build the dedup hash. Values are concatenated and SHA-256 hashed |
- | `window` | `string` | yes | `"5m"` | Duration window. Supported units: `ms`, `s`, `m`, `h`, `d` |
- | `store` | `string` | no | `"memory"` | Storage backend. Only `"memory"` is supported in the current version |
+ | `key` | `string[]` | yes | -- | Dot-path fields used to build the dedup hash. |
+ | `window` | `string` | yes | `"5m"` | Duration window. Units: `ms`, `s`, `m`, `h`, `d`. |
+ | `store` | `string` | no | `"memory"` | Storage backend. Only `"memory"` currently. |
  
- ### How it works
+ ## Behavior
  
- 1. For each incoming event, the transform extracts values at the configured `key` field paths.
- 2. The values are concatenated (null-separated) and hashed with SHA-256.
- 3. If the hash has been seen within the `window` duration, the event is dropped (returns `null`).
- 4. If the hash is new or expired, the event passes through and the hash is recorded with the current timestamp.
+ For each event, values at the configured `key` paths are extracted, concatenated (null-separated), and SHA-256 hashed. If the hash was seen within the `window` duration, the event is dropped. Otherwise it passes through and the hash is recorded.
  
- A periodic cleanup timer evicts expired entries from the in-memory store.
+ A periodic cleanup timer (interval = max of window duration, 10s) evicts expired entries. State is in-memory only and lost on restart.
  
- ## Example route
+ ## Documentation
  
- ```yaml
- routes:
-   - name: deduped-pr-reviews
-     when:
-       source: github-eng
-       events:
-         - resource.changed
-     transforms:
-       - ref: dedup-5m
-       - ref: humans-only
-     then:
-       actor: openclaw-agent
- ```
-
- ## Auth / prerequisites
-
- None.
+ Full documentation at [orgloop.ai](https://orgloop.ai)
  
- ## Limitations / known issues
+ ## License
  
- - **Memory-only store** -- Dedup state is held entirely in memory and is lost on engine restart. After a restart, previously seen events may be processed again until the window catches up.
- - **No distributed dedup** -- The in-memory store does not support multiple engine instances. Running multiple instances results in each instance maintaining its own independent dedup state.
- - **Hash collisions** -- SHA-256 collisions are theoretically possible but practically negligible.
- - **Cleanup interval** -- Expired entries are cleaned up on a timer interval equal to the dedup window (minimum 10 seconds). Between cleanups, the memory footprint grows proportionally to event throughput.
- - **Key field ordering matters** -- The hash is built from key fields in the order specified. Changing the `key` array order produces different hashes.
+ MIT
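
The behavior described in the README diff above (extract the values at the `key` paths, join them with a null separator, SHA-256 hash the result, drop repeats seen within `window`) can be sketched in TypeScript roughly as follows. This is an illustrative sketch, not the package's actual source: `EventRecord`, `parseDuration`, `getPath`, `dedupHash`, and `DedupSketch` are hypothetical names, and several details are assumptions the README does not settle (missing fields hash as empty strings, a dropped event does not refresh the recorded timestamp, and an entry exactly `window` old counts as expired).

```typescript
import { createHash } from "node:crypto";

// Hypothetical event shape; the real OrgLoop SDK type is not shown in the diff.
type EventRecord = Record<string, unknown>;

// Units listed in the README's config table.
const UNIT_MS: Record<string, number> = {
  ms: 1, s: 1_000, m: 60_000, h: 3_600_000, d: 86_400_000,
};

// Hypothetical helper: parse "5m", "30s", "500ms" into milliseconds.
function parseDuration(text: string): number {
  const match = /^(\d+)(ms|s|m|h|d)$/.exec(text.trim());
  if (!match) throw new Error(`Unsupported duration: ${text}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}

// Hypothetical helper: resolve a dot-path such as "payload.pr_number".
function getPath(event: EventRecord, path: string): unknown {
  let current: unknown = event;
  for (const part of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[part];
  }
  return current;
}

// Join key values with a null separator, in the configured order, then SHA-256.
// Assumption: missing values are hashed as empty strings.
function dedupHash(event: EventRecord, key: string[]): string {
  const joined = key.map((p) => String(getPath(event, p) ?? "")).join("\0");
  return createHash("sha256").update(joined).digest("hex");
}

class DedupSketch {
  private readonly seen = new Map<string, number>(); // hash -> time recorded
  private readonly key: string[];
  private readonly windowMs: number;

  constructor(key: string[], window: string) {
    this.key = key;
    this.windowMs = parseDuration(window);
  }

  // Returns the event if it passes, or null if it is a duplicate in the window.
  process(event: EventRecord, now: number = Date.now()): EventRecord | null {
    const hash = dedupHash(event, this.key);
    const lastSeen = this.seen.get(hash);
    if (lastSeen !== undefined && now - lastSeen < this.windowMs) return null;
    this.seen.set(hash, now);
    return event;
  }

  // Cleanup interval as stated in the README: max(window, 10s).
  cleanupIntervalMs(): number {
    return Math.max(this.windowMs, 10_000);
  }

  // What a periodic cleanup pass might do: evict entries older than the window.
  evictExpired(now: number = Date.now()): void {
    for (const [hash, recordedAt] of this.seen) {
      if (now - recordedAt >= this.windowMs) this.seen.delete(hash);
    }
  }
}

// Usage with explicit timestamps so the outcome does not depend on the clock.
const dedup = new DedupSketch(["source", "type", "payload.pr_number"], "5m");
const event = { source: "github-eng", type: "resource.changed", payload: { pr_number: 42 } };
const first = dedup.process(event, 0);        // first occurrence
const repeat = dedup.process(event, 60_000);  // one minute later, inside the window
const later = dedup.process(event, 300_000);  // five minutes later, window elapsed
```

Under these assumptions `first` and `later` are the event itself and `repeat` is `null`. Because the values are joined in array order, reordering `key` changes every hash, which matches the "key field ordering matters" note removed from the 0.1.0 README.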
package/package.json CHANGED
@@ -1,12 +1,12 @@
  {
    "name": "@orgloop/transform-dedup",
-   "version": "0.1.0",
+   "version": "0.1.3",
    "description": "OrgLoop dedup transform — deduplicate events within a time window",
    "type": "module",
    "main": "dist/index.js",
    "types": "dist/index.d.ts",
    "dependencies": {
-     "@orgloop/sdk": "0.1.0"
+     "@orgloop/sdk": "0.1.3"
    },
    "orgloop": {
      "type": "transform",