postulator 0.1.1__tar.gz

@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Arved Klöhn
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,513 @@
1
+ Metadata-Version: 2.4
2
+ Name: postulator
3
+ Version: 0.1.1
4
+ Summary: Programmatically create, read, and publish Audible blog posts to Contentful CMA.
5
+ Author-email: Arved Klöhn <arved.kloehn@gmail.com>
6
+ License: MIT
7
+ Project-URL: Homepage, https://github.com/Redundando/postulator
8
+ Project-URL: Repository, https://github.com/Redundando/postulator
9
+ Project-URL: Issues, https://github.com/Redundando/postulator/issues
10
+ Keywords: contentful,cms,audible,blog,pydantic
11
+ Classifier: Development Status :: 4 - Beta
12
+ Classifier: Intended Audience :: Developers
13
+ Classifier: License :: OSI Approved :: MIT License
14
+ Classifier: Programming Language :: Python :: 3
15
+ Classifier: Programming Language :: Python :: 3.11
16
+ Classifier: Programming Language :: Python :: 3.12
17
+ Classifier: Programming Language :: Python :: 3.13
18
+ Classifier: Topic :: Software Development :: Libraries
19
+ Classifier: Typing :: Typed
20
+ Requires-Python: >=3.11
21
+ Description-Content-Type: text/markdown
22
+ License-File: LICENSE
23
+ Requires-Dist: pydantic>=2.0
24
+ Requires-Dist: httpx>=0.27
25
+ Requires-Dist: python-dotenv>=1.0
26
+ Requires-Dist: scraperator
27
+ Dynamic: license-file
28
+
29
+ # Postulator
30
+
31
+ ## Overview
32
+
33
+ Postulator is a Python library for programmatically creating, reading, and publishing blog posts to Contentful CMA (Content Management API). It provides Pydantic models for posts, rich-text body nodes, audiobook embeds, SEO settings, and authors — plus an async Contentful client that handles ASIN resolution, asset uploads, and entry publishing in a single pipeline.
34
+
35
+ The primary consumers are LLMs and automation scripts that need to compose and publish Audible blog content to a Contentful space.
36
+
37
+ ## Installation
38
+
39
+ ```bash
40
+ pip install postulator
41
+ ```
42
+
43
+ Dependencies: `pydantic`, `httpx`, `python-dotenv`, `scraperator`.
44
+
45
+ ## Configuration
46
+
47
+ Set these environment variables (or use a `.env` file with `python-dotenv`):
48
+
49
+ | Variable | Required | Description |
50
+ |---|---|---|
51
+ | `CONTENTFUL_TOKEN` | Yes | Contentful CMA personal access token |
52
+ | `CONTENTFUL_SPACE_ID` | Yes | Contentful space ID |
53
+ | `CONTENTFUL_ENVIRONMENT` | No | Contentful environment (defaults to `"master"`) |
54
+
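A minimal sketch of how these variables might be read before constructing the client. The helper name `load_contentful_settings` is hypothetical and not part of Postulator; only the three variable names and the `"master"` default come from the table above. If a `.env` file is used, `load_dotenv()` from `python-dotenv` could be called first.

```python
import os
from collections.abc import Mapping


def load_contentful_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the Contentful settings listed in the configuration table.

    Hypothetical helper for illustration; not part of Postulator.
    """
    required = ("CONTENTFUL_TOKEN", "CONTENTFUL_SPACE_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return {
        "token": env["CONTENTFUL_TOKEN"],
        "space_id": env["CONTENTFUL_SPACE_ID"],
        # Optional variable; falls back to the documented default.
        "environment": env.get("CONTENTFUL_ENVIRONMENT", "master"),
    }
```

The returned keys are chosen to match the `ContentfulClient` constructor parameters (`space_id`, `environment`, `token`), so the dict could be unpacked into the constructor.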
55
+ ## Quick Start
56
+
57
+ ```python
58
+ import asyncio
59
+ from datetime import datetime, timezone
60
+ from postulator import Post, ParagraphNode, TextNode, HeadingNode, AudiobookNode
61
+ from postulator.adapters.contentful import ContentfulClient
62
+
63
+ post = Post(
64
+ slug="my-first-post",
65
+ locale="fr-FR",
66
+ title="My First Post",
67
+ date=datetime.now(timezone.utc),
68
+ body=[
69
+ HeadingNode(level=2, children=[TextNode(value="Hello")]),
70
+ ParagraphNode(children=[TextNode(value="This is a paragraph.")]),
71
+ AudiobookNode(asin="B0D53WYQ3S", marketplace="FR"),
72
+ ],
73
+ )
74
+
75
+ async def main():
76
+ async with ContentfulClient(
77
+ space_id="<space_id>",
78
+ environment="master",
79
+ token="<token>",
80
+ ) as client:
81
+ created = await client.create_post(post, publish=True)
82
+ print(created.source_id)
83
+
84
+ asyncio.run(main())
85
+ ```
86
+
87
+ The pipeline automatically:
88
+ 1. Enriches `AudiobookNode`s by scraping Audible (title, cover, PDP URL, authors, etc.)
89
+ 2. Creates/reuses `asin` entries in Contentful
90
+ 3. Creates `asinsList` / `asinsCarousel` entries for list/carousel nodes
91
+ 4. Uploads any `LocalAsset` images
92
+ 5. Creates/updates the `seoSettings` entry if `post.seo` is set
93
+ 6. Creates the `post` entry with rich-text body referencing all embedded entries
94
+ 7. Publishes everything
95
+
96
+ ## Post Model
97
+
98
+ `Post` — the top-level model representing a blog post.
99
+
100
+ | Field | Type | Default | Description |
101
+ |---|---|---|---|
102
+ | `source_id` | `str \| None` | `None` | Contentful entry ID. Required for `write_post`, auto-set by `create_post`. |
103
+ | `slug` | `str` | — | URL slug |
104
+ | `locale` | `str` | — | BCP-47 locale (e.g. `"fr-FR"`, `"en-GB"`). Controls `countryCode` and Audible marketplace — does **not** affect Contentful field locale (always `en-US`). |
105
+ | `title` | `str` | — | Post title |
106
+ | `date` | `datetime` | — | Publish date |
107
+ | `introduction` | `str \| None` | `None` | Short intro text |
108
+ | `body` | `DocumentNode` | — | List of `BlockNode` (the rich-text body) |
109
+ | `featured_image` | `AssetRef \| LocalAsset \| None` | `None` | Hero image |
110
+ | `authors` | `list[AuthorRef]` | `[]` | Author references (must have `source_id` set for write) |
111
+ | `tags` | `list[TagRef]` | `[]` | Tag references (must have `source_id` set for write) |
112
+ | `update_date` | `datetime \| None` | `None` | Last-updated date |
113
+ | `seo` | `SeoMeta \| None` | `None` | SEO settings (created/updated automatically during write) |
114
+ | `custom_recommended_title` | `str \| None` | `None` | Override title for recommended content widgets |
115
+ | `show_in_feed` | `bool` | `True` | Show in blog feed (maps to `hideFromBlogFeed` inverted) |
116
+ | `show_publish_date` | `bool` | `True` | Show publish date on page |
117
+ | `show_hero_image` | `bool` | `True` | Show hero image on page |
118
+ | `related_posts` | `list[str]` | `[]` | Contentful entry IDs of related posts |
119
+
120
+ ## Author Model
121
+
122
+ `Author` — represents a blog author entry. Used with `create_author` / `write_author`.
123
+
124
+ | Field | Type | Default | Description |
125
+ |---|---|---|---|
126
+ | `source_id` | `str \| None` | `None` | Contentful entry ID. Required for `write_author`. |
127
+ | `country_code` | `str \| None` | `None` | e.g. `"FR"`, `"UK"` |
128
+ | `slug` | `str` | — | URL slug |
129
+ | `name` | `str` | — | Display name |
130
+ | `short_name` | `str \| None` | `None` | Abbreviated name |
131
+ | `title` | `str \| None` | `None` | Job title / role |
132
+ | `bio` | `str \| None` | `None` | Biography text |
133
+ | `picture` | `AssetRef \| LocalAsset \| None` | `None` | Profile picture |
134
+ | `seo` | `SeoMeta \| None` | `None` | SEO settings for the author page |
135
+
136
+ ## Body Nodes
137
+
138
+ `DocumentNode` is `list[BlockNode]`. Each `BlockNode` is a discriminated union (on `type`).
139
+
140
+ ### Standard Block Nodes
141
+
142
+ **ParagraphNode** (`type="paragraph"`)
143
+ - `children: list[InlineNode]` — list of `TextNode` and/or `HyperlinkNode`
144
+
145
+ **HeadingNode** (`type="heading"`)
146
+ - `level: int` — 1–6
147
+ - `children: list[InlineNode]`
148
+
149
+ **ListNode** (`type="list"`)
150
+ - `ordered: bool` — `False` for bullet list, `True` for numbered
151
+ - `children: list[ListItemNode]` — each `ListItemNode` contains `list[ParagraphNode]`
152
+
153
+ **BlockquoteNode** (`type="blockquote"`)
154
+ - `children: list[ParagraphNode]`
155
+
156
+ **HrNode** (`type="hr"`)
157
+ - No fields. Horizontal rule.
158
+
159
+ **TableNode** (`type="table"`)
160
+ - `children: list[TableRowNode]` — each row contains `list[TableCellNode]`
161
+ - `TableCellNode` has `is_header: bool` and `children: list[BlockNode]`
162
+
163
+ ### Inline Nodes
164
+
165
+ **TextNode** (`type="text"`)
166
+ - `value: str`
167
+ - `marks: list[Literal["bold", "italic", "underline", "code", "superscript", "subscript"]]`
168
+
169
+ **HyperlinkNode** (`type="hyperlink"`)
170
+ - `url: str`
171
+ - `children: list[TextNode]`
172
+
173
+ ### Embed Block Nodes
174
+
175
+ **AudiobookNode** (`type="audiobook"`)
176
+
177
+ Represents a single Audible product embed. You only need to provide `asin` and `marketplace` — the rest is auto-populated by scraping Audible during write.
178
+
179
+ | Field | Type | Required for render | Description |
180
+ |---|---|---|---|
181
+ | `asin` | `str` | Yes | Audible ASIN |
182
+ | `marketplace` | `str` | Yes | e.g. `"FR"`, `"US"`, `"DE"` |
183
+ | `source_id` | `str \| None` | — | Contentful entry ID (auto-set during write) |
184
+ | `title` | `str \| None` | Yes | Book title (auto-scraped) |
185
+ | `cover_url` | `str \| None` | Yes | Cover image URL (auto-scraped) |
186
+ | `pdp` | `str \| None` | Yes | Product detail page URL (auto-scraped) |
187
+ | `authors` | `list[AudiobookAuthor]` | Yes (name + pdp) | Author names and links (auto-scraped) |
188
+ | `summary` | `str \| None` | No | Publisher summary HTML |
189
+ | `label` | `str \| None` | No | Display label |
190
+ | `release_date` | `str \| None` | No | `YYYY-MM-DD` format |
191
+ | `narrators` | `list[AudiobookNarrator]` | No | Narrator names |
192
+ | `series` | `list[AudiobookSeries]` | No | Series info |
193
+
194
+ **AudiobookListNode** (`type="audiobook-list"`)
195
+
196
+ A list of audiobooks rendered as a grid. Maps to the `asinsList` content type.
197
+
198
+ | Field | Type | Default | Description |
199
+ |---|---|---|---|
200
+ | `asins` | `list[str]` | `[]` | ASINs to include |
201
+ | `asin_entry_ids` | `list[str]` | `[]` | Preserved Contentful entry IDs (used on read round-trip) |
202
+ | `asin_items` | `list[AudiobookListItem]` | `[]` | Per-item overrides for `descriptions="Custom"` mode |
203
+ | `title` | `str \| None` | `None` | Section title |
204
+ | `label` | `str \| None` | `None` | Display label |
205
+ | `body_copy` | `str \| None` | `None` | Intro copy |
206
+ | `player_type` | `str` | `"Cover"` | Player display type |
207
+ | `asins_per_row` | `int` | `1` | Items per row. Must be 1, 3, 4, or 5. |
208
+ | `descriptions` | `str` | `"Full"` | `"Full"`, `"Short"`, or `"Custom"` |
209
+ | `filters` | `list[str] \| None` | `None` | Filter options |
210
+ | `options` | `list[str]` | `[]` | Display options |
211
+
212
+ **AudiobookCarouselNode** (`type="audiobook-carousel"`)
213
+
214
+ A carousel of audiobooks. Maps to the `asinsCarousel` content type. Requires at least 4 ASINs.
215
+
216
+ | Field | Type | Default | Description |
217
+ |---|---|---|---|
218
+ | `asins` | `list[str]` | — | ASINs to include (minimum 4) |
219
+ | `asin_entry_ids` | `list[str]` | `[]` | Preserved Contentful entry IDs |
220
+ | `items_per_slide` | `int \| None` | `None` | Items visible per slide |
221
+ | `title` | `str \| None` | `None` | Carousel title |
222
+ | `subtitle` | `str \| None` | `None` | Subtitle |
223
+ | `body_copy` | `str \| None` | `None` | Intro copy |
224
+ | `cta_text` | `str \| None` | `None` | Call-to-action button text |
225
+ | `cta_url` | `str \| None` | `None` | CTA link URL |
226
+ | `options` | `list[str]` | `[]` | Display options |
227
+
228
+ **ContentImageNode** (`type="content-image"`)
229
+
230
+ An inline image embed. Maps to the `contentImage` content type.
231
+
232
+ | Field | Type | Default | Description |
233
+ |---|---|---|---|
234
+ | `source_id` | `str \| None` | `None` | Contentful entry ID (required for write) |
235
+ | `image` | `AssetRef \| LocalAsset \| None` | `None` | The image asset |
236
+ | `href` | `str \| None` | `None` | Link URL when image is clicked |
237
+ | `alignment` | `str \| None` | `None` | Image alignment |
238
+ | `size` | `str \| None` | `None` | Image size |
239
+
240
+ **UnknownNode** (`type="unknown"`)
241
+ - `raw: dict` — raw Contentful JSON for unrecognized content types. Written back as-is.
242
+
243
+ ## Assets
244
+
245
+ Two asset types:
246
+
247
+ **AssetRef** — references an existing Contentful asset (returned by reads and after upload).
248
+
249
+ | Field | Type | Description |
250
+ |---|---|---|
251
+ | `source_id` | `str \| None` | Contentful asset ID |
252
+ | `url` | `str \| None` | Public URL (always `https://`) |
253
+ | `title` | `str \| None` | Asset title |
254
+ | `alt` | `str \| None` | Alt text |
255
+ | `file_name` | `str \| None` | Original file name |
256
+ | `content_type` | `str \| None` | MIME type |
257
+ | `width` | `int \| None` | Image width in px |
258
+ | `height` | `int \| None` | Image height in px |
259
+ | `size` | `int \| None` | File size in bytes |
260
+
261
+ **LocalAsset** — a local file to upload during write.
262
+
263
+ | Field | Type | Description |
264
+ |---|---|---|
265
+ | `local_path` | `str` | Absolute or relative path to the file on disk |
266
+ | `title` | `str` | Asset title in Contentful |
267
+ | `alt` | `str \| None` | Alt text |
268
+ | `file_name` | `str \| None` | Override file name (defaults to basename of `local_path`) |
269
+ | `content_type` | `str \| None` | Override MIME type (auto-detected if omitted) |
270
+
271
+ During `create_post` / `write_post`, any `LocalAsset` on `featured_image`, `seo.og_image`, or `ContentImageNode.image` is automatically uploaded via `upload_local_asset`, which:
272
+ 1. Reads the file from disk
273
+ 2. Uploads bytes to Contentful's upload endpoint
274
+ 3. Creates an asset entry linking to the upload
275
+ 4. Processes the asset (Contentful server-side)
276
+ 5. Polls until processing completes
277
+ 6. Publishes the asset
278
+ 7. Returns an `AssetRef` that replaces the `LocalAsset` in-place
279
+
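The polling in step 5 can be pictured roughly as follows. This is a synchronous sketch under stated assumptions, not the library's code: `wait_until_processed` and its `is_processed` callback are hypothetical names, and the real client is async. The defaults mirror the `asset_poll_attempts` (`10`) and `asset_poll_interval` (`1.0`) constructor parameters.

```python
import time
from typing import Callable


def wait_until_processed(
    is_processed: Callable[[], bool],
    attempts: int = 10,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `is_processed` up to `attempts` times.

    Returns True as soon as a poll succeeds, or False if every poll fails
    (the situation the `asset_processing_timeout` event describes).
    """
    for attempt in range(attempts):
        if is_processed():
            return True
        # Wait between polls, but not after the final one.
        if attempt < attempts - 1:
            sleep(interval)
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.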
280
+ ## SEO Settings
281
+
282
+ `SeoMeta` — maps to the `seoSettings` content type.
283
+
284
+ | Field | Type | Default | Description |
285
+ |---|---|---|---|
286
+ | `source_id` | `str \| None` | `None` | Contentful entry ID (auto-set after write) |
287
+ | `label` | `str \| None` | `None` | Internal label (falls back to `"SEO Settings: {post.title}"`) |
288
+ | `slug_replacement` | `str \| None` | `None` | Override slug |
289
+ | `slug_redirect` | `str \| None` | `None` | Redirect slug |
290
+ | `no_index` | `bool \| None` | `None` | Set `noindex` meta tag |
291
+ | `meta_title` | `str \| None` | `None` | `<title>` tag |
292
+ | `meta_description` | `str \| None` | `None` | Meta description |
293
+ | `og_title` | `str \| None` | `None` | Open Graph title |
294
+ | `og_description` | `str \| None` | `None` | Open Graph description |
295
+ | `og_image` | `AssetRef \| LocalAsset \| None` | `None` | Open Graph image (LocalAsset auto-uploaded) |
296
+ | `schema_type` | `str \| None` | `None` | Schema.org type |
297
+ | `json_ld_id` | `str \| None` | `None` | Linked `jsonLd` entry ID |
298
+ | `similar_content_ids` | `list[str]` | `[]` | Entry IDs for similar content links |
299
+ | `external_links_source_code` | `str \| None` | `None` | Tracking source code for external links |
300
+
301
+ `write_seo` creates a new `seoSettings` entry if `seo.source_id` is `None`, or updates the existing one. It publishes the entry and sets `seo.source_id` in-place.
302
+
303
+ ## Authors & Tags
304
+
305
+ **AuthorRef** and **TagRef** are lightweight references used on `Post`. Both require `source_id` to be set to an existing Contentful entry ID for writes.
306
+
307
+ ```python
308
+ from postulator import AuthorRef, TagRef
309
+
310
+ post.authors = [
311
+ AuthorRef(slug="fr-author", locale="fr-FR", name="FR Author", source_id="52621970-fr-author"),
312
+ ]
313
+ post.tags = [
314
+ TagRef(slug="fr-tag", locale="fr-FR", name="FR Tag", source_id="2093616522-fr-tag"),
315
+ ]
316
+ ```
317
+
318
+ To discover existing author/tag IDs, use:
319
+ - `client.list_authors(country_code="FR")` — returns `list[Author]`
320
+ - `client.list_tags(country_code="FR")` — returns `list[TagRef]`
321
+
322
+ ## Locale & Marketplace Mapping
323
+
324
+ `Post.locale` determines the `countryCode` written to Contentful and the Audible marketplace used for ASIN scraping.
325
+
326
+ | Locale | Country Code | Audible TLD |
327
+ |---|---|---|
328
+ | `de-DE` | `DE` | `audible.de` |
329
+ | `en-GB` | `UK` | `audible.co.uk` |
330
+ | `fr-FR` | `FR` | `audible.fr` |
331
+ | `it-IT` | `IT` | `audible.it` |
332
+ | `en-CA` | `CA_EN` | `audible.ca` |
333
+ | `fr-CA` | `CA_FR` | `audible.ca` |
334
+ | `es-ES` | `ES` | `audible.es` |
335
+ | `en-US` | `US` | `audible.com` |
336
+ | `en-AU` | `AU` | `audible.com.au` |
337
+
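The table above can be expressed as a plain lookup. The names `LOCALE_MAP` and `resolve_locale` are hypothetical, and raising `ValueError` for an unlisted locale is an assumption; the README does not say how Postulator handles unsupported locales.

```python
# Locale -> (country code, Audible domain), copied from the table above.
LOCALE_MAP: dict[str, tuple[str, str]] = {
    "de-DE": ("DE", "audible.de"),
    "en-GB": ("UK", "audible.co.uk"),
    "fr-FR": ("FR", "audible.fr"),
    "it-IT": ("IT", "audible.it"),
    "en-CA": ("CA_EN", "audible.ca"),
    "fr-CA": ("CA_FR", "audible.ca"),
    "es-ES": ("ES", "audible.es"),
    "en-US": ("US", "audible.com"),
    "en-AU": ("AU", "audible.com.au"),
}


def resolve_locale(locale: str) -> tuple[str, str]:
    """Return (country_code, audible_domain) for a supported locale."""
    try:
        return LOCALE_MAP[locale]
    except KeyError:
        # Assumption: unknown locales are rejected rather than defaulted.
        raise ValueError(f"Unsupported locale: {locale!r}") from None
```

Note that `en-GB` maps to the country code `UK`, not `GB`, and both Canadian locales share `audible.ca`.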
338
+ ## ContentfulClient
339
+
340
+ Async HTTP client wrapping the Contentful CMA. Must be used as an async context manager.
341
+
342
+ ```python
343
+ from postulator.adapters.contentful import ContentfulClient
344
+
345
+ async with ContentfulClient(
346
+ space_id="<space_id>",
347
+ environment="master",
348
+ token="<token>",
349
+ on_progress=lambda e: print(e),
350
+ ) as client:
351
+ ...
352
+ ```
353
+
354
+ ### Constructor Parameters
355
+
356
+ | Parameter | Type | Default | Description |
357
+ |---|---|---|---|
358
+ | `space_id` | `str` | — | Contentful space ID |
359
+ | `environment` | `str` | — | Environment name |
360
+ | `token` | `str` | — | CMA access token |
361
+ | `batch_size` | `int` | `200` | Max entries per batch request |
362
+ | `asset_poll_attempts` | `int` | `10` | Polls before asset processing timeout |
363
+ | `asset_poll_interval` | `float` | `1.0` | Seconds between asset processing polls |
364
+ | `on_progress` | `Callable \| None` | `None` | Progress callback (receives `dict` with `event`, `ts`, and extra keys) |
365
+
366
+ ### High-Level Methods
367
+
368
+ **Posts:**
369
+ - `create_post(post, publish=False) -> Post` — full pipeline: enrich ASINs, upload assets, create all entries, create post. Returns the round-tripped `Post`.
370
+ - `write_post(post, publish=True) -> Post` — same pipeline but updates an existing post (`post.source_id` required).
371
+ - `read_post(entry_id, locale="en-US") -> Post` — reads a post and all its linked entries/assets into a `Post` model.
372
+
373
+ **Authors:**
374
+ - `create_author(author, publish=False) -> Author` — creates a new author entry.
375
+ - `write_author(author, publish=True) -> Author` — updates an existing author (`author.source_id` required).
376
+ - `read_author(entry_id, locale="en-US") -> Author` — reads an author entry.
377
+ - `list_authors(country_code, locale="en-US") -> list[Author]` — lists all authors for a country code.
378
+
379
+ **Tags:**
380
+ - `list_tags(country_code, locale="en-US") -> list[TagRef]` — lists all tags for a country code.
381
+
382
+ **Lookup:**
383
+ - `find_entry_by_slug(slug, locale) -> dict | None` — finds a `post` or `category` entry by slug and country code.
384
+
385
+ **SEO:**
386
+ - `write_seo(seo, fallback_label) -> str` — creates or updates a `seoSettings` entry. Returns entry ID.
387
+
388
+ **Assets:**
389
+ - `upload_local_asset(asset: LocalAsset) -> AssetRef` — uploads, processes, publishes a local file. Returns the resulting `AssetRef`.
390
+
391
+ **Embeds (usually called automatically by the post pipeline):**
392
+ - `write_asin(node: AudiobookNode) -> str` — creates or reuses an `asin` entry. Returns entry ID.
393
+ - `write_asin_list(node: AudiobookListNode, asin_nodes) -> str` — creates or updates an `asinsList` entry.
394
+ - `write_asin_carousel(node: AudiobookCarouselNode, asin_nodes) -> str` — creates or updates an `asinsCarousel` entry.
395
+
396
+ ### Low-Level Methods
397
+
398
+ - `get_entry(entry_id) -> dict`
399
+ - `get_entries(entry_ids) -> dict[str, dict]` — batch fetch, auto-paginated
400
+ - `create_entry(content_type, fields) -> dict`
401
+ - `create_entry_with_id(entry_id, content_type, fields) -> dict`
402
+ - `update_entry(entry_id, version, fields) -> dict`
403
+ - `publish_entry(entry_id, version) -> dict`
404
+ - `delete_entry(entry_id, version) -> None`
405
+ - `find_entries(content_type, filters, limit=1) -> list[dict]` — auto-paginated
406
+ - `get_asset(asset_id) -> dict`
407
+ - `get_assets(asset_ids) -> dict[str, dict]` — batch fetch
408
+ - `upload_file(data, content_type) -> str` — returns upload ID
409
+ - `create_asset(fields) -> dict`
410
+ - `process_asset(asset_id, locale) -> None`
411
+ - `publish_asset(asset_id, version) -> dict`
412
+ - `get_content_type(content_type_id) -> dict`
413
+
414
+ ### Retry Behaviour
415
+
416
+ All HTTP requests retry up to 3 times on status codes `429`, `500`, `502`, `503`, `504` with exponential backoff (`2^attempt` seconds). Non-retryable errors raise `httpx.HTTPStatusError` immediately.
417
+
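The retry policy described above could look roughly like this. It is a library-independent sketch: `StatusError` stands in for `httpx.HTTPStatusError`, `call_with_retry` is a hypothetical name, and it assumes the attempt counter starts at 0 (delays of 1, 2, and 4 seconds), which the README does not specify.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

RETRYABLE_STATUS = {429, 500, 502, 503, 504}


class StatusError(Exception):
    """Illustrative stand-in for an HTTP error that carries a status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def call_with_retry(
    request: Callable[[], T],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying retryable status codes with exponential backoff."""
    attempt = 0
    while True:
        try:
            return request()
        except StatusError as exc:
            # Non-retryable codes and exhausted retries propagate immediately.
            if exc.status_code not in RETRYABLE_STATUS or attempt >= max_retries:
                raise
            sleep(2 ** attempt)  # assumed schedule: 1s, 2s, 4s
            attempt += 1
```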
418
+ ### Progress Events
419
+
420
+ The `on_progress` callback receives dicts with an `event` key. Events emitted:
421
+
422
+ | Event | When | Extra keys |
423
+ |---|---|---|
424
+ | `fetching_entries` | Before batch-fetching linked entries during read | `count` |
425
+ | `fetching_nested` | Before fetching nested linked entries | `count` |
426
+ | `parsing` | Before parsing raw Contentful data into models | — |
427
+ | `resolving_asins` | Before batch-resolving existing ASIN entries | `count` |
428
+ | `enriching_asins` | Before scraping Audible for missing ASINs | `count` |
429
+ | `writing_asin` | Before creating/reusing a single ASIN entry | `asin`, `marketplace` |
430
+ | `asin_publish_conflict` | When a uniqueKey conflict is detected and resolved | `asin`, `entry_id` |
431
+ | `asin_publish_failed` | When publishing an ASIN entry fails | `asin`, `message` |
432
+ | `uploading_asset` | Before uploading a local asset | `title`, `file_name` |
433
+ | `asset_upload_failed` | When asset upload fails | `title`, `message` |
434
+ | `asset_processing_timeout` | When asset processing polling times out | `asset_id` |
435
+ | `writing_post` | Before updating a post entry | `entry_id` |
436
+ | `creating_post` | Before creating a new post entry | `slug`, `locale` |
437
+ | `writing_author` | Before updating an author entry | `entry_id` |
438
+ | `creating_author` | Before creating a new author entry | `slug` |
439
+ | `post_invalid` | When post validation fails | `slug`, `reason` |
440
+ | `list_skipped` | When an AudiobookListNode is skipped (0 ASINs) | `reason` |
441
+ | `carousel_skipped` | When a carousel is skipped (<4 ASINs) | `reason`, `asins` |
442
+ | `request_failed` | When an HTTP request fails (non-retryable or after retries) | `method`, `url`, `status_code` |
443
+
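One way to consume these events is a small callable that records everything and separates out the failure-like events. `ProgressRecorder` is a hypothetical helper, and the choice of which events count as failures is this sketch's own grouping, not something the library defines.

```python
from typing import Any

# Assumed grouping of the events from the table that signal a problem.
FAILURE_EVENTS = {
    "asin_publish_failed",
    "asset_upload_failed",
    "asset_processing_timeout",
    "post_invalid",
    "request_failed",
}


class ProgressRecorder:
    """Collects progress events and keeps failures in a separate list."""

    def __init__(self) -> None:
        self.events: list[dict[str, Any]] = []
        self.failures: list[dict[str, Any]] = []

    def __call__(self, event: dict[str, Any]) -> None:
        self.events.append(event)
        if event.get("event") in FAILURE_EVENTS:
            self.failures.append(event)
```

An instance could then be passed as `on_progress=recorder` when constructing `ContentfulClient`, and `recorder.failures` inspected after the write completes.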
444
+ ## Scraperator Adapter
445
+
446
+ The `postulator.adapters.scraperator` module wraps the `scraperator` library to batch-scrape Audible product pages and populate `AudiobookNode` fields.
447
+
448
+ `enrich_audiobook_nodes(nodes, on_progress=None)` fills in `title`, `pdp`, `cover_url`, `summary`, `release_date`, `authors`, and `narrators` on each node, but only for fields that are `None` or empty; it never overwrites manually set data.
449
+
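The "fill only what is missing" rule can be illustrated with a standalone sketch. `BookStub` and `fill_missing` are hypothetical stand-ins that use a dataclass instead of the real Pydantic `AudiobookNode`; the actual enrichment logic is not shown in this README.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class BookStub:
    """Hypothetical stand-in for AudiobookNode with only a few fields."""

    asin: str
    title: str | None = None
    cover_url: str | None = None
    narrators: list[str] = field(default_factory=list)


def fill_missing(node: BookStub, scraped: dict[str, Any]) -> BookStub:
    """Copy scraped values onto `node` only where the field is None or empty."""
    for name, value in scraped.items():
        if not hasattr(node, name):
            continue  # ignore scraped keys the node does not model
        current = getattr(node, name)
        if current is None or current == []:
            setattr(node, name, value)
    return node
```

A title set by hand therefore survives enrichment, while an unset `cover_url` is filled from the scraped data.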
450
+ To configure caching:
451
+
452
+ ```python
453
+ from postulator.adapters.scraperator import configure
454
+
455
+ configure(
456
+ cache="local", # "local" or "dynamodb"
457
+ cache_directory="cache", # local cache dir
458
+ cache_table=None, # DynamoDB table name
459
+ scrape_cache="none", # raw scrape cache
460
+ )
461
+ ```
462
+
463
+ ## Known Quirks
464
+
465
+ ### All fields are written under `en-US`
466
+
467
+ Contentful fields are always stored under the `"en-US"` locale key regardless of `post.locale`.
468
+ The `locale` field on `Post` controls `countryCode` (e.g. `FR`, `UK`) and determines which
469
+ Audible marketplace is used for ASIN scraping — it does not affect the Contentful field locale.
470
+ This is intentional given the current space setup but worth keeping in mind if multi-locale
471
+ field storage is ever needed.
472
+
473
+ ### `asinDescriptions` — hybrid inline overrides
474
+
475
+ The `asinDescriptions` field on an `asinsList` entry stores a hybrid structure: each item contains
476
+ both a `sys` link pointing to the underlying `asin` entry **and** inline field overrides (`summary`,
477
+ `cover`, `title`, `editorBadge`, etc.) that take precedence over what is stored on the linked entry.
478
+
479
+ The `descriptions` field controls which data the frontend uses:
480
+ - `"Full"` / `"Short"` — reads summary from the linked `asin` entry directly
481
+ - `"Custom"` — reads the inline overrides from `asinDescriptions` instead
482
+
483
+ When writing an `AudiobookListNode` with custom per-item summaries, populate `asin_items` with
484
+ `AudiobookListItem` instances and set `descriptions="Custom"`. `write_asin_list` will resolve the
485
+ underlying `asin` entry IDs automatically and embed them alongside the inline overrides.
486
+
487
+ ### ASIN deduplication
488
+
489
+ The write pipeline collects all ASINs across the entire post body (single embeds, lists, carousels),
490
+ deduplicates by `{ASIN}-{MARKETPLACE}` key, batch-resolves existing entries, and only scrapes/creates
491
+ missing ones. Duplicate `AudiobookNode`s referencing the same ASIN reuse the same `source_id`.
492
+
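The deduplication step can be sketched as follows. `dedupe_asins` is a hypothetical function that works on plain `(asin, marketplace)` tuples; whether the library normalises case before building the key is not stated, so this sketch uses the values as given.

```python
from collections.abc import Iterable


def dedupe_asins(pairs: Iterable[tuple[str, str]]) -> dict[str, tuple[str, str]]:
    """Map '{ASIN}-{MARKETPLACE}' keys to their first-seen (asin, marketplace) pair."""
    unique: dict[str, tuple[str, str]] = {}
    for asin, marketplace in pairs:
        key = f"{asin}-{marketplace}"
        # setdefault keeps the first occurrence and ignores later duplicates.
        unique.setdefault(key, (asin, marketplace))
    return unique
```

Because the marketplace is part of the key, the same ASIN in two marketplaces yields two entries.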
493
+ ### Carousel minimum
494
+
495
+ `AudiobookCarouselNode` requires at least 4 ASINs. A carousel with fewer is skipped during write,
496
+ and the client emits a `carousel_skipped` event.
497
+
498
+ ### `asins_per_row` validation
499
+
500
+ `AudiobookListNode.asins_per_row` must be one of `1`, `3`, `4`, `5`. Other values raise `ValueError`.
501
+
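A standalone sketch of this check. The function name `validate_asins_per_row` is hypothetical; the library presumably enforces the rule inside the model itself, which this README does not show.

```python
ALLOWED_ASINS_PER_ROW = (1, 3, 4, 5)


def validate_asins_per_row(value: int) -> int:
    """Return `value` unchanged if it is an allowed row size, else raise ValueError."""
    if value not in ALLOWED_ASINS_PER_ROW:
        raise ValueError(
            f"asins_per_row must be one of {ALLOWED_ASINS_PER_ROW}, got {value}"
        )
    return value
```

Note that `2` is rejected even though it lies between two allowed values.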
502
+ ### `source_id` requirements for write
503
+
504
+ - `write_post` requires `post.source_id` (use `create_post` for new posts)
505
+ - `write_author` requires `author.source_id` (use `create_author` for new authors)
506
+ - `ContentImageNode` requires `source_id` for write (must reference an existing `contentImage` entry)
507
+ - `AudiobookListNode` and `AudiobookCarouselNode` get `source_id` auto-set during the post pipeline; when calling `write_asin_list` / `write_asin_carousel` directly, set `source_id` to update or leave `None` to create
508
+
509
+ ### ASIN uniqueKey conflict resolution
510
+
511
+ When publishing an `asin` entry whose `uniqueKey` conflicts with an already-published entry,
512
+ the writer detects the conflict from the Contentful error response, deletes the duplicate,
513
+ and returns the ID of the existing entry.