allus-company-data 0.0.3__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (27)
  1. allus_company_data-0.0.3/LICENSE +21 -0
  2. allus_company_data-0.0.3/PKG-INFO +666 -0
  3. allus_company_data-0.0.3/README.md +648 -0
  4. allus_company_data-0.0.3/pyproject.toml +32 -0
  5. allus_company_data-0.0.3/setup.cfg +4 -0
  6. allus_company_data-0.0.3/src/allus_company_data/__init__.py +63 -0
  7. allus_company_data-0.0.3/src/allus_company_data/buffer.py +325 -0
  8. allus_company_data-0.0.3/src/allus_company_data/client.py +399 -0
  9. allus_company_data-0.0.3/src/allus_company_data/config.py +234 -0
  10. allus_company_data-0.0.3/src/allus_company_data/crypto.py +285 -0
  11. allus_company_data-0.0.3/src/allus_company_data/errors.py +103 -0
  12. allus_company_data-0.0.3/src/allus_company_data/http.py +301 -0
  13. allus_company_data-0.0.3/src/allus_company_data/models.py +404 -0
  14. allus_company_data-0.0.3/src/allus_company_data/pump.py +360 -0
  15. allus_company_data-0.0.3/src/allus_company_data/webhooks.py +370 -0
  16. allus_company_data-0.0.3/src/allus_company_data.egg-info/PKG-INFO +666 -0
  17. allus_company_data-0.0.3/src/allus_company_data.egg-info/SOURCES.txt +25 -0
  18. allus_company_data-0.0.3/src/allus_company_data.egg-info/dependency_links.txt +1 -0
  19. allus_company_data-0.0.3/src/allus_company_data.egg-info/requires.txt +5 -0
  20. allus_company_data-0.0.3/src/allus_company_data.egg-info/top_level.txt +1 -0
  21. allus_company_data-0.0.3/tests/test_client.py +427 -0
  22. allus_company_data-0.0.3/tests/test_config.py +135 -0
  23. allus_company_data-0.0.3/tests/test_crypto.py +267 -0
  24. allus_company_data-0.0.3/tests/test_http.py +338 -0
  25. allus_company_data-0.0.3/tests/test_models.py +370 -0
  26. allus_company_data-0.0.3/tests/test_pump.py +709 -0
  27. allus_company_data-0.0.3/tests/test_webhooks.py +563 -0
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 allme.fyi
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,666 @@
1
+ Metadata-Version: 2.4
2
+ Name: allus-company-data
3
+ Version: 0.0.3
4
+ Summary: Reference Python SDK for the allus company-data API: typed, plaintext, slug-keyed conclusions with transparent decryption.
5
+ Author: allme.fyi
6
+ License: MIT
7
+ Project-URL: Homepage, https://github.com/allus-fyi/company-data-python
8
+ Project-URL: Repository, https://github.com/allus-fyi/company-data-python
9
+ Keywords: allus,allme,company-data,sdk
10
+ Requires-Python: >=3.11
11
+ Description-Content-Type: text/markdown
12
+ License-File: LICENSE
13
+ Requires-Dist: cryptography>=42
14
+ Requires-Dist: requests>=2.31
15
+ Provides-Extra: dev
16
+ Requires-Dist: pytest>=8; extra == "dev"
17
+ Dynamic: license-file
18
+
19
+ # allus-company-data (Python)
20
+
21
+ The Python SDK for the **allus company-data API**. Point it at a JSON
22
+ config file and it hands back typed, plaintext, **your-slug-keyed conclusions**:
23
+ for each connected person, a map of *your request-field slug → plaintext value*
24
+ (plus whether the value is live and when it last changed).
25
+
26
+ The SDK hides everything else — the OAuth token, the field catalog, the id
27
+ plumbing, the hybrid decryption, binary fetching, the changes-queue mechanics,
28
+ JSON-vs-XML. The platform is **zero-knowledge**: the API only ever holds
29
+ ciphertext, so all decryption happens inside the SDK with your service private
30
+ key. **The person's own field choices are never exposed** — you only ever see
31
+ the request slots you configured.
32
+
33
+ > This SDK is one of six language ports that share an identical API surface.
34
+ > This manual is the Python view of it.
35
+
36
+ **Contents:** [TL;DR — fetch new updates](#tldr--fetch-new-updates) ·
37
+ [Quickstart](#quickstart) · [Every call](#every-call) ·
38
+ [The typed value model](#the-typed-value-model) ·
39
+ [The changes pump](#the-changes-pump) · [Webhooks](#webhooks) ·
40
+ [Rate limits](#rate-limits) · [Errors](#errors) ·
41
+ [How it's wired](#how-its-wired)
42
+
43
+ Deeper reference pages live in [`docs/`](docs/):
44
+ [config](docs/config.md) · [model](docs/model.md) · [pump](docs/pump.md) ·
45
+ [webhooks](docs/webhooks.md) · [errors](docs/errors.md).
46
+
47
+ ---
48
+
49
+ ## TL;DR — fetch new updates
50
+
51
+ ```bash
52
+ pip install allus-company-data
53
+ ```
54
+
55
+ Point a config.json at your service keys:
56
+
57
+ ```json
58
+ {
59
+ "api_url": "https://api.allme.fyi",
60
+ "client_id": "svc_xxx",
61
+ "client_secret": "xxx",
62
+ "service_private_key": "/path/to/service.pem",
63
+ "key_passphrase": "xxx",
64
+ "cache_dir": "./allus-cache"
65
+ }
66
+ ```
67
+
68
+ Drain everything new, handled one update at a time:
69
+
70
+ ```python
71
+ from allus_company_data import Client
72
+
73
+ client = Client.from_config("config.json")
74
+
75
+ def handle(change):
76
+ # one update at a time: event, person, slug, value, live, at
77
+ print(change.event, change.person_id, change.slug, change.value,
78
+ "live" if change.live else "snapshot", change.at)
79
+
80
+ client.process_changes(handle) # returns when the feed is empty
81
+ ```
82
+
83
+ `process_changes` pulls every pending change, decrypts each one, and hands them to your
84
+ callback ONE BY ONE, acking each only after your code returns. Crash mid-batch?
85
+ The next run replays exactly what wasn't acked — nothing is lost, and the API
86
+ keeps no backlog of its own. Run it on a schedule (cron / systemd timer); there
87
+ is no daemon/follow mode by design. Connections, binary values, and webhooks are
88
+ documented below.
89
+
90
+ ---
91
+
92
+ ## Quickstart
93
+
94
+ Requires **Python ≥ 3.11**.
95
+
96
+ ```bash
97
+ pip install allus-company-data
98
+ # or, working from this repo: pip install -e '.[dev]' # from sdks/python/
99
+ python -c "import allus_company_data; print(allus_company_data.__version__)"
100
+ ```
101
+
102
+ ### 1. Write a config file
103
+
104
+ A single JSON file holds everything. Any field can be overridden by an `ALLUS_*`
105
+ env var, so secrets needn't live in the file. **No SDK method ever takes a key,
106
+ passphrase, or secret as an argument** — they all come from here.
107
+
108
+ `allus.json`:
109
+
110
+ ```json
111
+ {
112
+ "api_url": "https://api.allme.fyi",
113
+ "client_id": "svc_1a2b3c…",
114
+ "client_secret": "…",
115
+ "service_private_key": "./service-CRM.pem",
116
+ "key_passphrase": "…",
117
+
118
+ "account_private_key": "./account.pem",
119
+ "account_passphrase": "…",
120
+
121
+ "webhooks": {
122
+ "wh_abc123": "hmac_secret_for_that_webhook"
123
+ },
124
+
125
+ "cache_dir": "./allus-cache",
126
+ "format": "json"
127
+ }
128
+ ```
129
+
130
+ | Field | Required | Meaning |
131
+ |-------|----------|---------|
132
+ | `api_url` | yes | API base, e.g. `https://api.allme.fyi`. |
133
+ | `client_id` / `client_secret` | yes | The registered `client_credentials` credentials for **one** service. |
134
+ | `service_private_key` | yes | Path to the OpenSSL-encrypted PKCS#8 PEM you downloaded from the portal. |
135
+ | `key_passphrase` | yes | Decrypts that PEM in memory at startup. |
136
+ | `account_private_key` / `account_passphrase` | only for `encrypt_payload` webhooks | The company **account** key, used to unwrap an encrypted webhook envelope. |
137
+ | `webhooks` / `webhook_secret` | webhook auth — HMAC (default) | Per-webhook HMAC secrets keyed by webhook id (matched via the `X-Allus-Webhook-Id` header). A single-webhook service can use a flat `"webhook_secret": "…"` instead of the map. |
138
+ | `webhook_bearer_token` | webhook auth — bearer | Verify `Authorization: Bearer <token>` deliveries. |
139
+ | `webhook_basic` | webhook auth — basic | `{"username","password"}` — verify HTTP Basic deliveries. |
140
+ | `webhook_header` | webhook auth — header | `{"name","value"}` — verify a custom-header delivery. |
141
+ | `webhook_auth_none` | webhook auth — none | `true` — explicit opt-out; `verify_webhook` always passes (use only behind your own gateway). **Configure at most one** webhook auth method (two+ → `ConfigError`). |
142
+ | `cache_dir` | no (default `./allus-cache`) | Durable local buffer for the changes pump. Must be writable + durable. |
143
+ | `format` | no (default `json`) | Wire format `json` or `xml`. Invisible in the output. |
144
+
145
+ Env overrides use the `ALLUS_` prefix of the field name, e.g.
146
+ `ALLUS_CLIENT_SECRET`, `ALLUS_KEY_PASSPHRASE`, `ALLUS_ACCOUNT_PASSPHRASE`,
147
+ `ALLUS_WEBHOOK_SECRET`. A missing/invalid config (or an unreadable PEM / wrong
148
+ passphrase) raises `ConfigError` at construction — fail fast.
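The override rule can be pictured as a small merge step. The following is a minimal sketch, not the SDK's actual loader: `load_config` is a hypothetical helper, and the mapping "strip `ALLUS_`, lower-case the rest" is an assumption inferred from the variable names listed above.

```python
import json
import os
import tempfile

# Hypothetical helper, not part of the SDK: merge a JSON config file with
# ALLUS_* environment overrides (assumed rule: ALLUS_ + upper-cased field name).
def load_config(path, environ=None):
    environ = os.environ if environ is None else environ
    with open(path, encoding="utf-8") as fh:
        config = json.load(fh)
    for name, value in environ.items():
        if name.startswith("ALLUS_"):
            # ALLUS_CLIENT_SECRET -> client_secret; the env value wins over the file.
            config[name[len("ALLUS_"):].lower()] = value
    return config

with tempfile.TemporaryDirectory() as tmp:
    cfg_path = os.path.join(tmp, "allus.json")
    with open(cfg_path, "w", encoding="utf-8") as fh:
        json.dump({"api_url": "https://api.allme.fyi", "client_id": "svc_xxx"}, fh)
    merged = load_config(cfg_path, {"ALLUS_CLIENT_SECRET": "from-env"})
```

Keeping the file free of secrets and supplying them through the environment is the pattern the paragraph above recommends; the sketch only shows the precedence.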
149
+
150
+ ### 2. First call — list a connection's values
151
+
152
+ ```python
153
+ from allus_company_data import Client
154
+
155
+ client = Client.from_config("allus.json")
156
+
157
+ # Iterate every connected person (lazy, auto-paged).
158
+ for conn in client.connections():
159
+ print(conn.display_name, conn.person_id)
160
+ for slug, val in conn.values.items():
161
+ print(f" {slug} = {val.value!r} (live={val.live}, updated={val.updated_at})")
162
+ break # just the first one for the demo
163
+ ```
164
+
165
+ Or fetch one connection by id:
166
+
167
+ ```python
168
+ conn = client.connection("019xxxxxxxxxxxxxxxxxxxxxxxxx")
169
+ email = conn.values["work_email"].value # "alice@acme.com" (a str)
170
+ ```
171
+
172
+ `client = Client.from_env()` builds the same client entirely from `ALLUS_*`
173
+ env vars (no file).
174
+
175
+ ---
176
+
177
+ ## Every call
178
+
179
+ `Client` is the only object you construct. Build it from config, then:
180
+
181
+ ```python
182
+ Client.from_config(path, **kwargs) -> Client # from a JSON file (env overrides secrets)
183
+ Client.from_env(**kwargs) -> Client # entirely from ALLUS_* env vars
184
+ ```
185
+
186
+ `kwargs` are advanced/optional: `http` (an injected `HttpClient`), `logger` (a
187
+ `logging.Logger`), `sleep` (a `Callable[[float], None]`, for tests).
188
+
189
+ ### `request_fields()`
190
+
191
+ ```python
192
+ request_fields() -> list[RequestField]
193
+ ```
194
+
195
+ Your request-field **definitions** — fetched once from
196
+ `GET /api/company-data/request-fields` and cached for the life of the client (it
197
+ types every value). Returns *your* request config, never the person's fields.
198
+
199
+ * **Params:** none.
200
+ * **Returns:** `list[RequestField]` — each `RequestField(slug, label, type, one_time, mandatory, raw)`. `mandatory` is true when the field is mandatory-to-provide **or** mandatory-to-stay-connected.
201
+ * **Raises:** `AuthError`, `ApiError`, `RateLimitError`.
202
+
203
+ ```python
204
+ for f in client.request_fields():
205
+ flag = "mandatory" if f.mandatory else "optional"
206
+ print(f"{f.slug:20} {f.type:10} {flag}{' (one-time)' if f.one_time else ''}")
207
+ ```
208
+
209
+ ### `connections(limit, offset)`
210
+
211
+ ```python
212
+ connections(limit: int = 100, offset: int = 0) -> Iterator[Connection]
213
+ ```
214
+
215
+ A **lazy generator** that auto-pages `GET /api/company-data/connections?limit&offset`
216
+ and yields one typed `Connection` at a time (bounded memory for a large book).
217
+ Each `conn.values[slug]` is already decrypted (or a lazy binary handle).
218
+
219
+ * **Params:** `limit` — page size (default 100); `offset` — starting offset.
220
+ * **Returns:** `Iterator[Connection]`.
221
+ * **Raises:** `AuthError`, `ApiError`, `DecryptError` (per value, at access), `RateLimitError` (after the iterator's bounded internal backoff — see [Rate limits](#rate-limits)).
222
+
223
+ > **Heavily rate-limited.** Use for the initial full sync + occasional
224
+ > reconciliation only — never as a poll substitute for the changes feed. The
225
+ > generator paces itself within the limit (backs off on `Retry-After`).
226
+
227
+ ```python
228
+ # Initial full sync, streaming so a 100k-connection book never lands in memory.
229
+ for conn in client.connections(limit=200):
230
+ upsert_local_record(conn)
231
+ ```
232
+
233
+ ### `connection(id)`
234
+
235
+ ```python
236
+ connection(id: str) -> Connection
237
+ ```
238
+
239
+ Fetch one connection by its connection id (`GET /api/company-data/connections/{id}`).
240
+
241
+ * **Params:** `id` — the connection id (`Connection.id`).
242
+ * **Returns:** one `Connection`. Note: this endpoint returns `{connection_id, user_id, values}` and **no** `display_name`/`connected_at`, so those identity fields are `None` here (the list endpoint carries them).
243
+ * **Raises:** `AuthError`, `ApiError` (404 if unknown), `DecryptError`, `RateLimitError`.
244
+
245
+ ```python
246
+ conn = client.connection(conn_id)
247
+ phone = conn.values.get("mobile")
248
+ if phone:
249
+ print(phone.value, "live" if phone.live else "snapshot")
250
+ ```
251
+
252
+ ### `logs(limit, offset)`
253
+
254
+ ```python
255
+ logs(limit: int = 50, offset: int = 0) -> list[LogEntry]
256
+ ```
257
+
258
+ The service's activity log (`GET /api/company-data/logs?limit&offset`) — **ops
259
+ events only** (email / purge / webhook), never person field data.
260
+
261
+ * **Params:** `limit` (default 50), `offset` (default 0).
262
+ * **Returns:** `list[LogEntry]` — each `LogEntry(type, message, metadata, at, raw)`.
263
+ * **Raises:** `AuthError`, `ApiError`, `RateLimitError`.
264
+
265
+ ```python
266
+ for entry in client.logs(limit=20):
267
+ print(entry.at, entry.type, entry.message)
268
+ ```
269
+
270
+ ### `process_changes(handler, **options)`
271
+
272
+ ```python
273
+ process_changes(handler: Callable[[Change], None], **options) -> None
274
+ ```
275
+
276
+ The crash-safe changes pump: drains the feed through `handler` **one `Change` at
277
+ a time**, durably buffering each batch before delivery, with per-item ack and
278
+ retry → dead-letter → continue. Runs **until the feed is empty, then returns** —
279
+ there is **no follow/daemon mode** (you schedule re-runs yourself). Delivery is
280
+ **at-least-once**, so your handler **must be idempotent** (dedup on `Change.id`).
281
+ See [The changes pump](#the-changes-pump) for the full model.
282
+
283
+ * **Params:** `handler` — your callback; called with one `Change`. A return is an ack; an exception triggers retry.
284
+ * **Options** (keyword-only): `batch_size` (clamped to ≤ 500, default 100), `max_retries` (default 3), `on_error` (`"deadletter"` — default — or `"halt"`), `backoff` (`Callable[[int], float]`, attempt → seconds).
285
+ * **Returns:** `None` (when the feed is empty + the buffer is drained).
286
+ * **Raises:** `AuthError`, `ApiError`, `RateLimitError` (during a drain); `ValueError` (bad `on_error`); whatever the handler raises if `on_error="halt"` and retries are exhausted.
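The `backoff` option is described above only as a `Callable[[int], float]` mapping an attempt number to seconds. A minimal sketch of one such callable follows; the function name, the 1-based attempt numbering, and the base and cap values are all assumptions for illustration, not the SDK's defaults.

```python
# Hypothetical backoff callable: attempt number -> seconds to wait.
# Assumes attempts are numbered from 1; the README does not say.
def capped_exponential_backoff(attempt, base=0.5, cap=30.0):
    # Double the delay on each attempt, but never wait longer than `cap`.
    return min(cap, base * (2 ** (attempt - 1)))

delays = [capped_exponential_backoff(n) for n in range(1, 9)]
```

It would be passed as `client.process_changes(handle, backoff=capped_exponential_backoff)`. The cap keeps a persistently failing handler from stalling a run for minutes per event.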
287
+
288
+ ```python
289
+ def handle(change):
290
+ if already_processed(change.id): # idempotency — dedup on the stable id
291
+ return
292
+ if change.event == "field_updated":
293
+ store(change.person_id, change.slug, change.value)
294
+ elif change.event in ("connection_deleted", "field_deleted"):
295
+ remove(change.person_id, change.slug)
296
+ mark_processed(change.id)
297
+
298
+ client.process_changes(handle) # returns when the feed is empty
299
+ ```
300
+
301
+ > `logger` is **not** a `process_changes` option in this SDK — pass it once to
302
+ > the `Client` constructor (`Client.from_config("allus.json", logger=my_logger)`).
303
+
304
+ ### Advanced changes primitives
305
+
306
+ ```python
307
+ drain_batch(max: int = 100) -> list[Change] # raw, UNBUFFERED — you own durability
308
+ dead_letters() -> list[dict] # the local dead-letter store
309
+ retry_dead_letters(handler, **options) -> int # re-drive dead-lettered events; returns count re-driven
310
+ ```
311
+
312
+ * `drain_batch(max)` — fetches one batch (clamped ≤ 500) and returns the decrypted `Change`s directly. It does **not** persist anything, so a crash loses what the API already deleted. Prefer `process_changes` for safe consumption.
313
+ * `dead_letters()` — each dict is the stored (ciphertext) event plus a flattened `error` and `attempts`.
314
+ * `retry_dead_letters(handler, **options)` — same `max_retries` / `on_error` / `backoff` options as `process_changes`; on success a record is removed, on repeated failure it stays dead-lettered (or re-raises under `"halt"`). Dead letters are never re-fetched from the API — the local store is their only home.
315
+
316
+ ```python
317
+ for dl in client.dead_letters():
318
+ print("stuck:", dl["id"], dl["error"], "after", dl["attempts"], "attempts")
319
+
320
+ n = client.retry_dead_letters(handle) # after you've fixed the bug
321
+ print(f"re-drove {n} dead letters")
322
+ ```
323
+
324
+ ### Webhook helpers (on the client)
325
+
326
+ The webhook receiver helpers are also exposed as `Client` methods (they delegate
327
+ to the module functions, fully config-driven — no key/secret arguments):
328
+
329
+ ```python
330
+ client.verify_webhook(raw_body: bytes, headers: dict) -> bool
331
+ client.parse_webhook(raw_body: bytes, headers: dict) -> Change
332
+ client.handle_webhook(raw_body: bytes, headers: dict) -> Change # verify + parse
333
+ ```
334
+
335
+ * `verify_webhook` — recomputes `HMAC-SHA256(raw_body, secret)` and constant-time-compares it to `X-Allus-Signature`. Returns `True`/`False`; **never raises** for a bad signature.
336
+ * `parse_webhook` — body → a typed `Change`. Does **not** verify. Handles JSON, XML, and the `encrypt_payload` account-key envelope. Raises `WebhookError` on a malformed/unparseable body.
337
+ * `handle_webhook` — verify **then** parse; raises `WebhookError` on a bad/unknown signature, otherwise returns the `Change`. The typical one-liner inside a route.
338
+
339
+ The same three are importable as standalone functions
340
+ (`from allus_company_data import verify_webhook, parse_webhook, handle_webhook`),
341
+ which take the `config` and the decrypt/type closures explicitly — but inside an
342
+ app you'll almost always use the client methods. See [Webhooks](#webhooks).
343
+
344
+ ---
345
+
346
+ ## The typed value model
347
+
348
+ You work with these objects and nothing else (`from allus_company_data import …`):
349
+
350
+ ```text
351
+ RequestField { slug, label, type, one_time, mandatory } # YOUR request config
352
+ Connection { id, person_id, display_name, connected_at, values: {<slug>: Value} }
353
+ Value { value, live, updated_at }
354
+ Change { id, event, person_id, slug?, value?, live?, at }
355
+ LogEntry { type, message, metadata, at }
356
+ ```
357
+
358
+ ### Keyed by *your* slug
359
+
360
+ `conn.values["work_email"].value` → `"alice@acme.com"`. The key is the stable,
361
+ explicit slug you set per request field in the portal — rename the label freely,
362
+ the slug is the contract. **The person's source field is never exposed**: no
363
+ source slug, no `field_id`, not even via `.raw`.
364
+
365
+ ### `Value(value, live, updated_at)`
366
+
367
+ | Attribute | Meaning |
368
+ |-----------|---------|
369
+ | `value` | The typed plaintext (see the table below). |
370
+ | `live` | `True` if the person chose "keep connected" (auto-updates); `False` for a one-time snapshot. |
371
+ | `updated_at` | `datetime` of when this answer last changed (per-answer, rides on the `Value`). |
372
+
373
+ ### Value types (from the field's `type`)
374
+
375
+ | Field type | Python `value` |
376
+ |------------|----------------|
377
+ | `email`, `phone`, `url`, `text` | `str` |
378
+ | `address`, `bank`, `creditcard` | `dict` — the decrypted plaintext is a JSON object, parsed for you |
379
+ | `date`, `date_of_birth` | `datetime.date` (falls back to the raw string if it can't be parsed) |
380
+ | `photo`, `document`, `legal_document` | a lazy `BinaryHandle` — see below |
381
+
382
+ ```python
383
+ addr = conn.values["home_address"].value # dict, e.g. {"street": "...", "city": "...", ...}
384
+ dob = conn.values["birthday"].value # datetime.date(1990, 5, 17)
385
+ ```
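The type table can be read as a dispatch on the field's `type`. The sketch below illustrates that mapping for the non-binary types only. `type_plaintext` is a hypothetical name, and ISO 8601 (`YYYY-MM-DD`) as the date wire format is an assumption; the README only states that unparseable dates fall back to the raw string.

```python
import json
from datetime import date

OBJECT_TYPES = {"address", "bank", "creditcard"}
DATE_TYPES = {"date", "date_of_birth"}

# Hypothetical helper mirroring the table above; binary types are left out.
def type_plaintext(field_type, plaintext):
    if field_type in OBJECT_TYPES:
        return json.loads(plaintext)          # JSON object -> dict
    if field_type in DATE_TYPES:
        try:
            return date.fromisoformat(plaintext)
        except ValueError:
            return plaintext                  # fall back to the raw string
    return plaintext                          # email, phone, url, text -> str

dob = type_plaintext("date_of_birth", "1990-05-17")
odd = type_plaintext("date", "17 May 1990")
addr = type_plaintext("address", '{"street": "1 Main St", "city": "Utrecht"}')
```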
386
+
387
+ ### Binary fields — the lazy `BinaryHandle`
388
+
389
+ A photo/document value is a `BinaryHandle`. Nothing is fetched or decrypted until
390
+ you call `.bytes()` or `.save()`:
391
+
392
+ ```python
393
+ handle = conn.values["passport_scan"].value # BinaryHandle (no network yet)
394
+
395
+ data = handle.bytes() # GET the slot file → decrypt → file bytes
396
+ n = handle.save("/tmp/passport.jpg") # same, written to disk; returns bytes written
397
+ print(handle.value_url) # the opaque slot-keyed URL it fetches from
398
+ ```
399
+
400
+ `.bytes()` GETs the slot-keyed file endpoint, unwraps the API's
401
+ `{"encrypted": true, "value": <wrapper>}` envelope, decrypts with your service
402
+ key, parses the inner JSON envelope (`{"full": "data:…"}` for photos,
403
+ `{"file": "data:…"}` for documents) and base64-decodes the data URI into the
404
+ file bytes. The result is cached on the handle, so repeated calls don't re-fetch.
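The last two steps of that chain (parse the inner JSON envelope, then base64-decode the data URI) need only the standard library. The sketch below covers just those steps, starting from an already-decrypted JSON string. `decode_file_envelope` is a hypothetical helper, not an SDK function, and it assumes the data URI always carries a `;base64` marker.

```python
import base64
import json

# Hypothetical helper: decrypted inner envelope (JSON text) -> file bytes.
def decode_file_envelope(decrypted_json):
    envelope = json.loads(decrypted_json)
    # Photos use "full", documents use "file" (per the description above).
    data_uri = envelope.get("full") or envelope.get("file")
    if not data_uri or not data_uri.startswith("data:"):
        raise ValueError("no data URI in file envelope")
    header, _, payload = data_uri.partition(",")
    if not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    return base64.b64decode(payload)

sample = json.dumps(
    {"file": "data:application/pdf;base64," + base64.b64encode(b"%PDF-1.7").decode("ascii")}
)
pdf_bytes = decode_file_envelope(sample)
```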
405
+
406
+ ### `Change(id, event, person_id, slug?, value?, live?, at)`
407
+
408
+ A change-feed / webhook event.
409
+
410
+ | Attribute | Meaning |
411
+ |-----------|---------|
412
+ | `id` | **The stable server change-row id — your dedup key** (captured before the server delete). |
413
+ | `event` | `connection_created`, `connection_deleted`, `field_updated`, `field_deleted`, `consent_accepted`, `consent_declined`. |
414
+ | `person_id` | The person the change is about (may be `None`). |
415
+ | `slug`, `value`, `live` | Present only on `field_updated`; `value` is typed exactly like `Value.value` (incl. a lazy `BinaryHandle` for binaries). Connection/consent events carry no slot/value. |
416
+ | `at` | `datetime` of the change. (There is no separate `updated_at` on a change.) |
417
+
418
+ ### `.raw`
419
+
420
+ Every model carries `.raw` — the underlying *hardened* API dict — for debugging
421
+ or an edge case the SDK didn't model. It still never contains the person's source
422
+ field.
423
+
424
+ See [`docs/model.md`](docs/model.md) for the full reference.
425
+
426
+ ---
427
+
428
+ ## The changes pump
429
+
430
+ The changes feed is a server-side **drain-on-fetch queue**:
431
+ `GET /api/company-data/changes?limit=N` returns up to N events (default 100, max
432
+ 500) **and deletes exactly those rows in the same transaction** — no
433
+ offset/cursor, and the API keeps no copy afterward. So consumption can't be a
434
+ plain list: a consumer crash mid-batch would lose events the API already deleted,
435
+ and a huge backlog must not materialize in memory. `process_changes` solves both.
436
+
437
+ **Per run, repeating until the feed is empty then returning:**
438
+
439
+ 1. **Replay first.** Deliver any un-acked events already in the local buffer (from a previous crashed run), oldest-first.
440
+ 2. **Drain.** When the buffer is empty, fetch one batch and **persist it to the durable file buffer (fsync) BEFORE handing anything out.** This is the backup the API no longer has.
441
+ 3. **Deliver one-by-one.** For each buffered event, oldest-first: decrypt its value *at delivery* (never on disk), build the typed `Change`, call `handler`.
442
+ 4. **Ack / retry / dead-letter.** On success, remove the event from the buffer (ack). On a handler error, retry with backoff up to `max_retries`; then either move it to the dead-letter store and continue (`on_error="deadletter"`, default — one poison event never wedges the stream) or stop and re-raise (`on_error="halt"`). A `DecryptError` on a buffered event (corrupt/truncated ciphertext, rotated key) is **dead-lettered immediately** — re-decrypting can't fix it, so it does *not* burn retries (under `on_error="halt"` it re-raises). Either way it never propagates out and wedges replay.
443
+ 5. Repeat until a drain returns empty **and** the buffer is drained → return.
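The five steps can be modelled in a few lines. What follows is an in-memory sketch of the control flow only, not the SDK's `Pump`: `run_pump` and its parameters are hypothetical, the buffer is a `deque` rather than fsynced files, decryption and backoff sleeps are omitted, and "`max_retries` retries after the first attempt" is an assumed reading of the option.

```python
from collections import deque

def run_pump(fetch_batch, handler, buffered=(), max_retries=3, on_error="deadletter"):
    """In-memory model of one run; a real pump keeps `pending` on disk."""
    if on_error not in ("deadletter", "halt"):
        raise ValueError("on_error must be 'deadletter' or 'halt'")
    pending = deque(buffered)               # step 1: replay leftovers first
    dead_letters = []
    while True:
        if not pending:
            pending.extend(fetch_batch())   # step 2: drain one batch into the buffer
            if not pending:
                return dead_letters         # step 5: feed empty and buffer drained
        event = pending[0]                  # step 3: deliver oldest-first
        failures = 0
        while True:
            try:
                handler(event)
                break
            except Exception as exc:
                failures += 1
                if failures <= max_retries:
                    continue                # retry (backoff sleep omitted here)
                if on_error == "halt":
                    raise
                dead_letters.append(
                    {"event": event, "error": str(exc), "attempts": failures}
                )
                break
        pending.popleft()                   # step 4: ack, or moved to dead letters

batches = iter([["c2", "c3"], ["c4"], []])
delivered = []

def handler(event):
    if event == "c3":
        raise RuntimeError("poison event")
    delivered.append(event)

dead = run_pump(lambda: next(batches), handler, buffered=["c1"], max_retries=2)
```

In the demo the poison event `c3` ends up dead-lettered while `c4` is still delivered, which is the "one poison event never wedges the stream" behaviour described in step 4.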
444
+
445
+ ### The durable buffer
446
+
447
+ * Plain files under `cache_dir` (zero extra dependencies): `pending/` for un-acked events, `deadletter/` for ones that exhausted retries.
448
+ * Stored events keep their **ciphertext** value — **no plaintext PII is ever written to disk**. Decryption happens only at delivery.
449
+ * Writes are crash-safe (temp file → fsync → atomic rename → dir fsync). Files are named with a monotonic, zero-padded sequence so they replay oldest-first.
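The write sequence in the last bullet (temp file, fsync, atomic rename, directory fsync) is a standard pattern. A minimal sketch follows, assuming a POSIX filesystem (opening a directory for fsync does not work on Windows). `write_event_durably`, the 12-digit padding, and the `.json` suffix are illustrative choices, not the SDK's actual layout.

```python
import json
import os
import tempfile

# Hypothetical helper: persist one (ciphertext) event so it survives a crash.
def write_event_durably(pending_dir, seq, event):
    os.makedirs(pending_dir, exist_ok=True)
    final_path = os.path.join(pending_dir, f"{seq:012d}.json")
    fd, tmp_path = tempfile.mkstemp(dir=pending_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(event, fh)
            fh.flush()
            os.fsync(fh.fileno())           # file contents reach the disk
        os.replace(tmp_path, final_path)    # atomic rename within one directory
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    dir_fd = os.open(pending_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)                    # the rename itself reaches the disk
    finally:
        os.close(dir_fd)
    return final_path

with tempfile.TemporaryDirectory() as tmp:
    pending = os.path.join(tmp, "pending")
    write_event_durably(pending, 2, {"id": "b", "value": {"_enc": 1}})
    write_event_durably(pending, 1, {"id": "a", "value": {"_enc": 1}})
    # Zero-padded names sort lexicographically in sequence order.
    replay_order = sorted(n for n in os.listdir(pending) if n.endswith(".json"))
```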
450
+
451
+ ### Crash safety, at-least-once, and idempotency
452
+
453
+ A batch is durably buffered *before* any delivery, and acked per-item only *after*
454
+ the handler succeeds. The ack can't be atomic with your side-effects — a crash
455
+ between your handler's success and its ack re-delivers that event on the next run.
456
+ That makes delivery **at-least-once**, so:
457
+
458
+ > **Your handler must be idempotent. Dedup on `Change.id`.**
459
+
460
+ `Change.id` is the stable server change-row id, captured before the server delete,
461
+ so it survives crash + replay unchanged.
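One way to get idempotency is to record the change id and apply the side effect in the same database transaction. The sketch below does this with SQLite. The table names are hypothetical, and `SimpleNamespace` stands in for the SDK's `Change` object (only the attributes documented above are used).

```python
import sqlite3
from types import SimpleNamespace

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE seen_changes (id TEXT PRIMARY KEY)")
db.execute(
    "CREATE TABLE field_values ("
    "person_id TEXT, slug TEXT, value TEXT, PRIMARY KEY (person_id, slug))"
)

def handle(change):
    # One transaction: the dedup marker and the side effect commit together,
    # so a crash leaves either both or neither.
    with db:
        cur = db.execute(
            "INSERT OR IGNORE INTO seen_changes (id) VALUES (?)", (change.id,)
        )
        if cur.rowcount == 0:
            return                          # already applied on an earlier run
        if change.event == "field_updated":
            db.execute(
                "INSERT OR REPLACE INTO field_values VALUES (?, ?, ?)",
                (change.person_id, change.slug, change.value),
            )

change = SimpleNamespace(
    id="chg_1", event="field_updated",
    person_id="p1", slug="work_email", value="alice@acme.com",
)
handle(change)
handle(change)  # replayed after a crash: ignored
rows = db.execute("SELECT COUNT(*) FROM field_values").fetchone()[0]
seen = db.execute("SELECT COUNT(*) FROM seen_changes").fetchone()[0]
```

Putting the marker in the same store as the data closes the gap between "handler succeeded" and "recorded as seen" that a separate `seen()` / `record_seen()` pair would leave open.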
462
+
463
+ ### No follow mode
464
+
465
+ `process_changes` returns when the feed empties. **You** schedule re-runs — a
466
+ cron job, a `while True: client.process_changes(handle); time.sleep(5)` loop, a
467
+ worker queue, whatever fits. The feed is cheap to poll (see
468
+ [Rate limits](#rate-limits)).
469
+
470
+ ### Worked example
471
+
472
+ ```python
473
+ import time
474
+ from allus_company_data import Client
475
+
476
+ client = Client.from_config("allus.json")
477
+
478
+ def handle(change):
479
+ # Idempotent: skip anything we've already applied.
480
+ if seen(change.id):
481
+ return
482
+ match change.event:
483
+ case "field_updated":
484
+ store_value(change.person_id, change.slug, change.value, live=change.live)
485
+ case "field_deleted":
486
+ clear_value(change.person_id, change.slug)
487
+ case "connection_deleted":
488
+ drop_person(change.person_id)
489
+ case "connection_created" | "consent_accepted" | "consent_declined":
490
+ note_event(change.person_id, change.event, change.at)
491
+ record_seen(change.id)
492
+
493
+ # Schedule your own re-runs; process_changes itself returns when empty.
494
+ while True:
495
+ client.process_changes(handle, batch_size=200, max_retries=5)
496
+ time.sleep(5)
497
+ ```
498
+
499
+ If a handler keeps failing, the event lands in the dead-letter store instead of
500
+ blocking the stream; inspect with `client.dead_letters()` and re-drive with
501
+ `client.retry_dead_letters(handle)` after fixing the cause. See
502
+ [`docs/pump.md`](docs/pump.md).
503
+
504
+ ---
505
+
506
+ ## Webhooks
507
+
508
+ Webhooks are the lower-latency push alternative to polling the changes feed. The
509
+ platform POSTs each change event to your configured webhook URL with:
510
+
511
+ * `X-Allus-Webhook-Id` — which webhook this is (selects the HMAC secret from config).
512
+ * `X-Allus-Signature` — `HMAC-SHA256(rawBody, secret)` as lowercase hex.
513
+ * the body — the same slug-keyed `Change` shape as the pull feed (JSON or XML).
514
+
515
+ All secrets/keys come from config; the helpers take **no key or secret
516
+ arguments**. Use the raw request body bytes (do not re-serialize a parsed body —
517
+ the HMAC is over the exact bytes the platform sent).
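For readers who want to see what the HMAC check amounts to, here is a standard-library sketch. It is not the SDK's implementation: `signature_is_valid` is a hypothetical function, it takes the secrets map explicitly (the SDK reads it from config), and encoding the secret as UTF-8 is an assumption.

```python
import hashlib
import hmac

# Hypothetical standalone check of X-Allus-Signature over the raw body bytes.
def signature_is_valid(raw_body, headers, webhook_secrets):
    lowered = {k.lower(): v for k, v in headers.items()}   # header names vary in case
    secret = webhook_secrets.get(lowered.get("x-allus-webhook-id", ""))
    received = lowered.get("x-allus-signature")
    if secret is None or received is None:
        return False                                       # unknown id or missing header
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids leaking how many characters matched.
    return hmac.compare_digest(
        expected.encode("ascii"), received.strip().lower().encode("utf-8")
    )

webhook_secrets = {"wh_abc123": "hmac_secret_for_that_webhook"}
body = b'{"id":"chg_1","event":"connection_created"}'
good_headers = {
    "X-Allus-Webhook-Id": "wh_abc123",
    "X-Allus-Signature": hmac.new(
        b"hmac_secret_for_that_webhook", body, hashlib.sha256
    ).hexdigest(),
}
```

Re-serializing the parsed body before checking would change the bytes and make a valid delivery fail, which is why the raw bytes matter.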
518
+
519
+ ### In a web route (Flask)
520
+
521
+ ```python
522
+ from flask import Flask, request, abort
523
+ from allus_company_data import Client, WebhookError
524
+
525
+ app = Flask(__name__)
526
+ client = Client.from_config("allus.json")
527
+
528
+ @app.post("/allus/webhook")
529
+ def allus_webhook():
530
+ try:
531
+ change = client.handle_webhook(request.get_data(), dict(request.headers))
532
+ except WebhookError:
533
+ abort(401) # bad / unknown signature, or unparseable envelope
534
+
535
+ # Same idempotency rule as the pump: dedup on change.id.
536
+ if not seen(change.id):
537
+ apply_change(change)
538
+ record_seen(change.id)
539
+ return ("", 204)
540
+ ```
541
+
542
+ `verify_webhook` / `parse_webhook` let you split the steps if you prefer:
543
+
544
+ ```python
545
+ if not client.verify_webhook(raw_body, headers):
546
+ abort(401)
547
+ change = client.parse_webhook(raw_body, headers)
548
+ ```
549
+
550
+ ### Config-driven secrets
551
+
552
+ Per-webhook HMAC secrets live in the config `webhooks` map, keyed by webhook id;
553
+ the SDK reads `X-Allus-Webhook-Id` off the request and looks up the matching
554
+ secret. A single-webhook service can use the flat `"webhook_secret": "…"`
555
+ shortcut (or `ALLUS_WEBHOOK_SECRET`). An unknown/unconfigured id ⇒ verification
556
+ returns `False` (and `handle_webhook` raises `WebhookError`).
557
+
558
+ ### The `encrypt_payload` account-key envelope
559
+
560
+ If a webhook has `encrypt_payload` enabled, the body is **replaced** by a
561
+ `{"_enc":1,…}` envelope encrypted to your company **account** key (and the HMAC is
562
+ over that envelope — the final bytes sent). `parse_webhook`/`handle_webhook`
563
+ unwrap it transparently using the configured `account_private_key` +
564
+ `account_passphrase`, then decrypt the inner field value with the service key — so
565
+ an encrypted-payload `Change` is identical to a plain one. If you receive such a
566
+ webhook without an `account_private_key` configured, you get a `WebhookError`.
567
+
568
+ > The account-key envelope uses OAEP-**SHA1** (OpenSSL's default), distinct from
569
+ > the OAEP-SHA256 used for person field values — the SDK handles this difference
570
+ > internally; you only supply the account key in config.
571
+
572
+ See [`docs/webhooks.md`](docs/webhooks.md).
573
+
574
+ ---
575
+
576
+ ## Rate limits
577
+
578
+ | Endpoint | Limit | Use it for |
579
+ |----------|-------|-----------|
580
+ | `changes` (the pump) | **generous** | Poll **as often as you like** — it's a cheap drain-on-fetch queue. |
581
+ | `request-fields`, `logs` | moderate | Occasional reads. |
582
+ | `connections`, `connection(id)`, binary `/file` | **heavily limited** | Initial full sync + occasional reconciliation **only** — never as a poll substitute. |
583
+
584
+ A 429 carries `Retry-After`. The SDK backs off and retries automatically:
585
+
586
+ * The transport (`HttpClient`) retries a 429 a bounded number of times honoring `Retry-After`, then surfaces `RateLimitError`.
587
+ * The `connections(...)` generator additionally backs off per `Retry-After` on a surfaced `RateLimitError` and retries the page a bounded number of times before re-raising — so it paces itself within the limit instead of hammering.
588
+
589
+ If you catch a `RateLimitError`, its `.retry_after` is the seconds to wait
590
+ (or `None` when the header was absent).
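If a `RateLimitError` still reaches your code, a small wrapper can honor `retry_after` before trying again. The sketch below is self-contained: `RateLimited` is a stand-in for the SDK's exception (only `.retry_after` is modelled), and `call_with_retry_after`, the attempt limit, and the 60-second fallback are assumptions for illustration.

```python
import time

class RateLimited(Exception):
    """Stand-in for the SDK's RateLimitError; only `.retry_after` is modelled."""
    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after

def call_with_retry_after(fn, max_attempts=3, default_wait=60.0, sleep=time.sleep):
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimited as exc:
            if attempt == max_attempts:
                raise                       # give up after the last attempt
            # Use the server's hint when present, else the fallback.
            sleep(exc.retry_after if exc.retry_after is not None else default_wait)

waits = []
responses = iter([RateLimited(2), RateLimited(None), "ok"])

def flaky():
    item = next(responses)
    if isinstance(item, Exception):
        raise item
    return item

# `sleep` is injected so the demo records the waits instead of blocking.
result = call_with_retry_after(flaky, sleep=waits.append)
```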
591
+
592
+ ---
593
+
594
+ ## Errors
595
+
596
+ All from `allus_company_data`. Same taxonomy + names across all six SDKs.
597
+
598
+ | Error | When |
599
+ |-------|------|
600
+ | `ConfigError` | Missing/invalid config, unreadable key file, or wrong passphrase — at construction (fail fast). |
601
+ | `AuthError` | Token fetch/refresh failed (bad `client_id`/`secret`, revoked client); or a 401 survives the one automatic refresh-and-retry. |
602
+ | `ApiError(status, error_key, message)` | Any non-2xx from the API; carries the HTTP `status`, the platform `error_key` (when present), and `message`. |
603
+ | `DecryptError` | A ciphertext wrapper is malformed, the key is wrong, or the GCM tag mismatches. Surfaces when a value is accessed/decrypted. |
604
+ | `WebhookError` | Signature verification failed, or an envelope couldn't be unwrapped/parsed. |
605
+ | `RateLimitError(retry_after)` | A 429 from a rate-limited endpoint. Subclass of `ApiError` (status fixed at 429); carries `retry_after` (seconds, or `None`). |
606
+
607
+ ```python
608
+ from allus_company_data import (
609
+ Client, ConfigError, AuthError, ApiError,
610
+ DecryptError, WebhookError, RateLimitError,
611
+ )
612
+
613
+ try:
614
+ client = Client.from_config("allus.json")
615
+ for conn in client.connections():
616
+ ...
617
+ except ConfigError as e:
618
+ ... # fix the config / key file
619
+ except RateLimitError as e:
620
+ wait(e.retry_after or 60)
621
+ except ApiError as e:
622
+ log(e.status, e.error_key, e.message)
623
+ ```
624
+
625
+ See [`docs/errors.md`](docs/errors.md).
626
+
627
+ ---
628
+
629
+ ## How it's wired
630
+
631
+ Everything below is what the SDK hides so your code only ever sees conclusions.
632
+
633
+ **Auth / token.** An `HttpClient` owns a `client_credentials`-only token. On the
634
+ first call (or when the cached token nears expiry) it POSTs
635
+ `client_id`/`client_secret` to `{api_url}/oauth2/token` and caches the bearer
636
+ token + its expiry; refresh is automatic. A mid-flight 401 triggers exactly one
637
+ refresh-and-retry, then `AuthError`. The token is scoped server-side to **one**
638
+ service, so every call is implicitly that service's data.
639
+
640
+ **Slug resolution.** `request_fields()` is fetched once and cached; its slug→type
641
+ map types every value (so `address` parses to a dict, `photo` becomes a lazy
642
+ binary handle, etc.). The connection/changes endpoints return values keyed by
643
+ **your** request slug — the person's source field is dropped server-side and
644
+ never reaches the SDK.
645
+
646
+ **Decryption (zero-knowledge).** The service private key is loaded **once** at
647
+ construction from the configured encrypted PEM + passphrase into an in-memory RSA
648
+ key. A `decrypt` closure over it is handed to every model factory and the pump —
649
+ the key never appears in a method signature. Each value is a hybrid wrapper
650
+ (`{"_enc":1,"k":rsa_oaep_sha256(aesKey),"iv":…,"d":aes256gcm(…)}`); the SDK
651
+ RSA-OAEP-SHA256 unwraps the AES key, then AES-256-GCM decrypts the payload. **The
652
+ platform only ever holds ciphertext — it never sees your plaintext.**
653
+
654
+ **Binary fetch.** A binary value is a lazy `BinaryHandle` over a slot-keyed
655
+ `value_url`. On `.bytes()`/`.save()` it GETs that file endpoint, unwraps the
656
+ `{"encrypted":true,"value":<wrapper>}` envelope, runs the same service-key
657
+ decrypt to a JSON file-envelope, and base64-decodes its data URI to the file
658
+ bytes. (Slot-keyed, never source-field-keyed.)
659
+
660
+ **The drain-on-fetch feed.** `process_changes` delegates to a `Pump` wired to a
661
+ `fetch_changes` closure (`GET /changes?limit=`, returning raw ciphertext events)
662
+ and a `decrypt` closure (builds a typed `Change`). Because the fetch deletes the
663
+ rows it returns, the pump persists each batch to the durable file buffer
664
+ (ciphertext at rest) before delivery, acks per-item after your handler succeeds,
665
+ and replays the buffer on restart — see [The changes pump](#the-changes-pump).
666
+ ```