jmap-email 0.1.0__tar.gz

@@ -0,0 +1,13 @@
1
+ # Changelog
2
+
3
+ All notable changes to `jmap-email` are documented here.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.1.0] - 2026-06-08
9
+
10
+ Initial release. Extracted from the
11
+ [Messages](https://github.com/suitenumerique/messages) project.
12
+
13
+ [0.1.0]: https://github.com/suitenumerique/messages/releases/tag/jmap-email-0.1.0
@@ -0,0 +1,79 @@
1
+ # Contributing to `jmap-email`
2
+
3
+ Thanks for considering a contribution. This package is small and
4
+ focused; the bar for accepting changes is that they make the library
5
+ more correct, more spec-conformant, or better-documented without
6
+ adding runtime dependencies.
7
+
8
+ ## Development environment
9
+
10
+ Two paths are supported.
11
+
12
+ ### Docker (matches CI)
13
+
14
+ From the repository root:
15
+
16
+ ```bash
17
+ make test-jmap-email # full test suite
18
+ make typecheck-jmap-email # ty (Astral)
19
+ ```
20
+
21
+ These spin up the same container image CI uses, so the only divergence
22
+ between local results and CI is the host architecture (arm64 vs x86_64).
23
+
24
+ ### Native Python 3.14.5+
25
+
26
+ ```bash
27
+ cd src/jmap-email
28
+ pip install -e '.[dev]'
29
+
30
+ pytest # default selection — fuzz tests excluded
31
+ pytest -m fuzz # property-based / Hypothesis fuzz tests
32
+ ruff check .
33
+ ruff format --check .
34
+ ```
35
+
36
+ ## Pull-request checklist
37
+
38
+ Every PR should:
39
+
40
+ - Add or update a test that fails without the change and passes with
41
+ it. For parser fixes, the test goes in `tests/test_parser.py` near
42
+ the closest existing class; for composer fixes,
43
+ `tests/test_composer.py`; for shape-helper fixes,
44
+ `tests/test_helpers.py`.
45
+ - Keep `make typecheck-jmap-email` green. `ty` is the source of truth
46
+ for type contracts.
47
+ - Not introduce a runtime dependency. The package's value comes
48
+ from being a clean stdlib wrapper; new deps are rejected unless they
49
+ ship a CVE fix the stdlib won't.
50
+ - Update `CHANGELOG.md` under the `Unreleased` heading when the change
51
+ is user-visible.
52
+ - Update `README.md` when public API surface, conformance status, or
53
+ resource defaults move.
54
+
55
+ ## Coding conventions
56
+
57
+ - **PEP 585 / PEP 604 typing.** Use `list[X]` / `dict[K, V]` /
58
+ `X | None` rather than `typing.List[X]` etc. No `from __future__
59
+ import annotations` — the supported floor is 3.14.5.
60
+ - **No legacy stdlib imports inside hot paths.** If a regex or
61
+ `email.utils` helper is the wrong tool, write the loop.
62
+ - **Module-private symbols** are prefixed with `_`. Anything in
63
+ `__all__` is part of the wire contract — changes that rename or
64
+ remove a name require a major-version bump (post-1.0) or a clear
65
+ `CHANGELOG` Removed entry (during 0.x).
66
+
67
+ ## Adding a regression test for a new CVE / paper
68
+
69
+ 1. Add the test to the appropriate `tests/` module under the
70
+ `TestParserSecurityRegressions` or `TestComposerRFCAudit` class
71
+ (whichever fits).
72
+ 2. Reference the CVE / paper by id in the test docstring.
73
+ 3. Add the entry to the [defense matrix](README.md#defense-matrix) in
74
+ the README.
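A regression test following the steps above might be shaped roughly like this. This is only a sketch: it exercises the stdlib `EmailMessage` as a stand-in so that it runs without the package installed, whereas a real test would call the `jmap-email` composer. The method name and the assertion style are assumptions, not the project's conventions.

```python
from email.message import EmailMessage


class TestComposerRFCAudit:
    """Stand-in for the real test class; the actual suite targets jmap_email."""

    def test_header_value_rejects_embedded_newline(self):
        """CVE-2024-6923: header injection via embedded newlines."""
        msg = EmailMessage()
        try:
            # The modern stdlib policy refuses CR/LF in a header value.
            msg["Subject"] = "hello\r\nBcc: attacker@example.com"
        except ValueError:
            return  # rejected at assignment time, as expected
        raise AssertionError("embedded CRLF was accepted in a header value")
```

The point of the shape is that the CVE id sits in the docstring, so a search for the id finds the test that covers it.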
75
+
76
+ ## Security-sensitive changes
77
+
78
+ See `SECURITY.md`. Don't open a public PR or issue for a vulnerability
79
+ before coordinating disclosure.
@@ -0,0 +1,22 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Agence Nationale de la Cohésion des Territoires (ANCT)
4
+ and contributors.
5
+
6
+ Permission is hereby granted, free of charge, to any person obtaining a copy
7
+ of this software and associated documentation files (the "Software"), to deal
8
+ in the Software without restriction, including without limitation the rights
9
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
10
+ copies of the Software, and to permit persons to whom the Software is
11
+ furnished to do so, subject to the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be included in all
14
+ copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
17
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
18
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
19
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
20
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
21
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
22
+ SOFTWARE.
@@ -0,0 +1,472 @@
1
+ Metadata-Version: 2.4
2
+ Name: jmap-email
3
+ Version: 0.1.0
4
+ Summary: A strict-JMAP RFC 8621 Email object library for Python 3.14+ with lenient RFC 5322 / MIME parsing and strict-by-design composition. Zero runtime dependencies.
5
+ Project-URL: Homepage, https://github.com/suitenumerique/messages
6
+ Project-URL: Repository, https://github.com/suitenumerique/messages
7
+ Project-URL: Bug Tracker, https://github.com/suitenumerique/messages/issues
8
+ Project-URL: Changelog, https://github.com/suitenumerique/messages/blob/main/src/jmap-email/CHANGELOG.md
9
+ Author-email: ANCT <contact@suite.anct.gouv.fr>
10
+ License-Expression: MIT
11
+ License-File: LICENSE
12
+ Keywords: composer,email,jmap,mime,parser,rfc5322,rfc8621
13
+ Classifier: Development Status :: 4 - Beta
14
+ Classifier: Intended Audience :: Developers
15
+ Classifier: License :: OSI Approved :: MIT License
16
+ Classifier: Natural Language :: English
17
+ Classifier: Programming Language :: Python :: 3
18
+ Classifier: Programming Language :: Python :: 3.14
19
+ Classifier: Topic :: Communications :: Email
20
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
21
+ Requires-Python: <4.0,>=3.14.5
22
+ Provides-Extra: dev
23
+ Requires-Dist: hypothesis>=6.151.0; extra == 'dev'
24
+ Requires-Dist: pylint>=4.0.4; extra == 'dev'
25
+ Requires-Dist: pytest-cov>=7.0.0; extra == 'dev'
26
+ Requires-Dist: pytest>=9.0.0; extra == 'dev'
27
+ Requires-Dist: ruff>=0.15.0; extra == 'dev'
28
+ Requires-Dist: ty>=0.0.44; extra == 'dev'
29
+ Description-Content-Type: text/markdown
30
+
31
+ # jmap-email
32
+
33
+ A strict-JMAP RFC 8621 Email object library for Python 3.14+, with
34
+ lenient RFC 5322 / MIME parsing and strict-by-design composition.
35
+ **Zero runtime dependencies** — the package is a clean wrapper around
36
+ the Python stdlib `email` package, plus null-safe shape accessors over
37
+ the JMAP Email object.
38
+
39
+ The codebase came out of operating an inbound mail pipeline; every CVE
40
+ and research result in the [defense matrix](#defense-matrix) below has
41
+ a regression test under `tests/`.
42
+
43
+ > Status: **beta** while the public API stabilizes. Wire shape
44
+ > conforms to RFC 8621 §4 today; future 0.1.x releases will only add
45
+ > fields, never remove or rename them.
46
+
47
+ ## Why a Python 3.14.5 floor?
48
+
49
+ The standard library `email` package receives frequent bug fixes
50
+ between patch releases, and this library wraps it directly — every fix
51
+ to header parsing, RFC 2047 encoded-words, address-list defects, etc.
52
+ surfaces immediately in our output. The 3.14.5 floor is not arbitrary:
53
+ it carries
54
+ [gh-128110](https://github.com/python/cpython/issues/128110)
55
+ (RFC 2047 §6.2 encoded-word adjacent-pair spacing under modern
56
+ policies), which materially affects the composer.
57
+
58
+ **Aligning on the latest 3.14.x patch is recommended for any
59
+ production deployment.** Each CPython patch release that touches
60
+ `email` is one less class of malformed-input edge case downstream
61
+ pipelines need to paper over manually.
62
+
63
+ ## Quick start
64
+
65
+ ```bash
66
+ pip install jmap-email
67
+ ```
68
+
69
+ ```python
70
+ import jmap_email
71
+
72
+ # Parse raw RFC 5322 bytes → JMAP Email object dict (RFC 8621 §4),
73
+ # or None when the input is fundamentally unparseable (empty, non-bytes,
74
+ # stdlib produced no Message, etc.). parse_email never raises — the
75
+ # failure mode is a single `is None` check at the call site.
76
+ email = jmap_email.parse_email(raw_bytes)
77
+ if email is None:
78
+ ... # log + skip / 400 / quarantine — caller's choice
79
+
80
+ # Recoverable damage (a salvageable malformed header, an unknown
81
+ # charset that fell back to utf-8/replace, …) surfaces in
82
+ # email["_ext"]["defects"] when you opt into the project-extension
83
+ # namespace:
84
+ email_with_ext = jmap_email.parse_email(raw_bytes, extensions=True)
85
+ defects = (email_with_ext or {}).get("_ext", {}).get("defects") or []
86
+ email["subject"] # str | None (NFC normalised)
87
+ email["from"] # [{"name": str | None, "email": str}, ...] | None
88
+ email["sentAt"] # ISO-8601 with offset, e.g. "2026-06-08T14:30:00+02:00"
89
+ email["textBody"] # JMAP EmailBodyPart[]
90
+ email["bodyValues"] # {partId: {"value", "isEncodingProblem", "isTruncated"}}
91
+ email["headers"] # [{"name": "<wire-case>", "value": "<raw>"}, ...]
92
+ email["hasAttachment"] # bool
93
+ email["preview"] # str (≤256 chars, plain-text)
94
+
95
+ # Strict-by-design composer accepts the same JMAP shape on input.
96
+ # sentAt is required (RFC 5322 §3.6.1) — pass it explicitly.
97
+ raw = jmap_email.compose_email({
98
+ "from": [{"name": "Alice", "email": "alice@example.com"}],
99
+ "to": [{"name": "Bob", "email": "bob@example.com"}],
100
+ "subject": "hi",
101
+ "sentAt": "2026-06-08T12:00:00+00:00",
102
+ "textBody": [{"partId": "1", "type": "text/plain", "content": "hello"}],
103
+ })
104
+ # raw is RFC 5322 bytes ready for SMTP delivery (e.g.
105
+ # smtplib.SMTP.sendmail handles dot-stuffing for you).
106
+ ```
107
+
108
+ ## Conformance
109
+
110
+ `parse_email()` produces a JMAP Email object per RFC 8621 §4 with the
111
+ following defaults, matching `Email/get` `defaultProperties`:
112
+
113
+ | Property | Default emitted? | Notes |
114
+ | ------------------- | ---------------- | -------------------------------------- |
115
+ | Email metadata (`id`, `blobId`, `threadId`, `mailboxIds`, `keywords`, `size`, `receivedAt`) | No | Server-set; out of parser scope |
116
+ | `subject` | Yes | NFC-normalised; `null` when absent |
117
+ | `from` / `sender` / `to` / `cc` / `bcc` / `replyTo` | Yes | `EmailAddress[]` or `null` |
118
+ | `messageId` / `inReplyTo` / `references` | Yes | `String[]` (no `<>`) or `null` |
119
+ | `sentAt` | Yes | ISO-8601 with offset; `null` when absent |
120
+ | `headers` | Yes | `[{name, value}]` ordered; `value` is RFC 8621 Raw form (byte-faithful, NOT encoded-word-decoded) |
121
+ | `textBody` / `htmlBody` / `attachments` | Yes | `EmailBodyPart[]` per RFC 8621 §4.1.4 |
122
+ | `hasAttachment` | Yes | |
123
+ | `preview` | Yes | ≤256-char plain-text excerpt; HTML-stripped + whitespace-normalised |
124
+ | `bodyValues` | Yes | `{partId: EmailBodyValue}` per §4.1.5; text-body parts then carry metadata only |
125
+ | `bodyStructure` | Opt-in | `parse_email(raw, body_structure=True)` |
126
+ | `_ext` | Opt-in | `parse_email(raw, extensions=True)` — project extensions; see below |
127
+
128
+ Parser-only fields (`preview`, `bodyValues`, `bodyStructure`,
129
+ `hasAttachment`, `_ext`) are ignored on composer input; passing them
130
+ through `compose_email` is harmless.
131
+
132
+ ### Project extensions (`_ext`)
133
+
134
+ `extensions=True` adds a single `_ext` sub-dict to the output.
135
+ These fields are NOT in RFC 8621 — they expose information the parser
136
+ already computes so consumers don't have to re-walk the message:
137
+
138
+ - `_ext.defects` — stdlib `MessageDefect` class names collected during
139
+ the parse walk; useful for message-store quarantine policies (the
140
+ Mailman pattern).
141
+ - `_ext.resent` — Resent-* typed projection (see below). Present only
142
+ when the wire carries at least one Resent-* header.
143
+
144
+ ### `EmailBodyPart` extensions
145
+
146
+ RFC 8621 §4.1.4 lists the `EmailBodyPart` shape as `partId`, `blobId`,
147
+ `size`, `headers`, `name`, `type`, `charset`, `disposition`, `cid`,
148
+ `language`, `location`, `subParts`. The library extends that shape
149
+ with two project fields. Where each shows up:
150
+
151
+ | Location | `content` | `sha256` |
152
+ |------------------------|--------------------------|----------|
153
+ | `attachments[i]` | always (`bytes`) | always |
154
+ | `textBody[i]` / `htmlBody[i]` with `body_values=False` | yes (`str` for text/*, base64 `str` for inline media) | no |
155
+ | `textBody[i]` / `htmlBody[i]` with `body_values=True` | absent — content moves to `bodyValues` per §4.1.4 | no |
156
+ | `bodyStructure` and its `subParts` tree | never | never |
157
+
158
+ - `content` exists because the library has no blob store to satisfy
159
+ the spec's `blobId` → fetch-by-blob contract. Callers need the
160
+ bytes somewhere on the part. Attachment `content` is never
161
+ stripped; text/html `content` follows the `body_values` flag.
162
+ - `sha256` is the hex digest of the part's decoded bytes — useful
163
+ for dedup / blob storage. Attachment parts only.
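To make the `sha256` description concrete, here is a minimal sketch of hashing a part's decoded bytes with the stdlib. The helper name `sha256_of_decoded` is hypothetical, and the sketch assumes a base64 transfer encoding; how the library computes the digest internally is not shown in this README.

```python
import base64
import hashlib


def sha256_of_decoded(payload_b64: str) -> str:
    """Hex SHA-256 of the bytes obtained by base64-decoding `payload_b64`."""
    decoded = base64.b64decode(payload_b64, validate=True)
    return hashlib.sha256(decoded).hexdigest()


# Simulate the transfer-encoded form of an attachment as it appears on the wire.
wire_payload = base64.b64encode(b"%PDF-1.7 example bytes").decode("ascii")
digest = sha256_of_decoded(wire_payload)
```

Hashing the decoded bytes rather than the wire form means the digest does not change with line wrapping or the choice of transfer encoding, which is what makes it usable as a dedup key.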
164
+
165
+ `bodyStructure` is pure RFC 8621 shape — no project fields appear
166
+ in that tree, so a strict JMAP consumer can ingest it as-is. Strict
167
+ consumers should ignore unknown keys elsewhere. Composer input that
168
+ includes these fields is harmless — the composer ignores parser-only
169
+ metadata.
170
+
171
+ ### Duplicate scalar headers
172
+
173
+ RFC 5322 §3.6 marks From / Sender / Reply-To / To / Cc / Bcc /
174
+ Message-ID / In-Reply-To / References / Subject / Date as `max=1` —
175
+ each may appear at most once. Real-world senders sometimes emit
176
+ duplicates anyway. The parser follows the stdlib
177
+ `email.message.Message[name]` convention: when a header is repeated,
178
+ the first occurrence wins for the scalar JMAP projection. Every
179
+ occurrence still appears in the `headers` list in document order.
180
+ Background: see "Detection of Weak Links in Authentication Chains",
181
+ USENIX Security 2020.
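The stdlib convention referred to above can be observed directly. This sketch uses only `email.message_from_bytes` with the `compat32` policy; it demonstrates the stdlib behavior the parser follows, not the library's own code.

```python
from email import message_from_bytes, policy

raw = (
    b"From: first@example.com\r\n"
    b"From: second@example.com\r\n"
    b"Subject: duplicate From demo\r\n"
    b"\r\n"
    b"body\r\n"
)
msg = message_from_bytes(raw, policy=policy.compat32)

scalar_from = msg["From"]         # Message[name] yields the first occurrence
every_from = msg.get_all("From")  # every occurrence, in document order
```

A consumer that needs to detect the duplicate can compare the length of the `get_all` result (or, in the JMAP shape, count matching entries in `headers`) against one.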
182
+
183
+ ### Resent-* projection (`_ext.resent`)
184
+
185
+ RFC 8621 §4.1.3 names only the 11 base header convenience properties;
186
+ Resent-* is not on that list. The library pre-computes it as a §4.1.2
187
+ typed-projection idiom and exposes it under `_ext.resent` so forwarded /
188
+ resent mail handling doesn't need to walk `parsed["headers"]`. Sub-fields
188
+ mirror the base properties (`_ext.resent["from"]`,
189
+ `["sender"]`, `["replyTo"]`, `["to"]`, `["cc"]`, `["bcc"]`,
190
+ `["messageId"]`, `["date"]`), and the sub-dict is omitted entirely
192
+ when no Resent-* header is present on the wire.
193
+
194
+ ### Pragmatic deviations from RFC 8621
195
+
196
+ Two places where the parser knowingly deviates from the spec text.
197
+ Both are conscious choices for downstream safety; flagging them so
198
+ the contract is explicit:
199
+
200
+ - **`headers[i].value` is not strictly "Raw" form.** RFC 8621 §4.1.2
201
+ defines "Raw" as byte-faithful except for `CRLF+WSP` unfolding.
202
+ We additionally:
203
+ - Strip NUL (`\x00`) bytes — PostgreSQL `TEXT` cannot store NUL, so a
204
+ spec-faithful value would crash any downstream insert. Carrying
205
+ them through and dropping them at the storage boundary would also
206
+ be wrong (different stores would handle them differently).
207
+ - Truncate at `max_header_value_bytes` (default 102 400) — the stdlib
208
+ `_header_value_parser` has quadratic-time hot spots on adversarial
209
+ inputs (gh-136063); truncating early bounds wall-clock.
210
+ The `EmailBodyPart.headers[i].value` field follows the same policy.
211
+
212
+ - **Inline media isn't added to `attachments` in the `multipart/alternative`
213
+ nullified-branch case.** The spec algorithm in §4.1.4 has a clause
214
+ `if ((!htmlBody || !textBody) && isInlineMediaType(part)) attachments.push(part)`.
215
+ We don't honor it. Effect: in the narrow case where a `multipart/
216
+ alternative` ancestor has nullified one body branch and the message
217
+ contains inline `image/*` / `audio/*` / `video/*`, the inline media
218
+ appears in the surviving body but not in `attachments`. Matches what
219
+ Gmail / Apple Mail render; differs from a strict spec walker.
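The first deviation (NUL stripping plus a byte cap on header values) can be sketched as follows. The function name `sanitize_header_value` is hypothetical, and the choice to cut on a character boundary is an assumption of this sketch; the README does not say how the library handles a multi-byte character that straddles the limit.

```python
def sanitize_header_value(value: str, max_bytes: int = 102_400) -> str:
    """Strip NUL characters, then cap the UTF-8 encoded size at `max_bytes`."""
    cleaned = value.replace("\x00", "")
    encoded = cleaned.encode("utf-8", errors="replace")
    if len(encoded) <= max_bytes:
        return cleaned
    # errors="ignore" drops a multi-byte character that was cut in half
    # at the boundary instead of raising UnicodeDecodeError.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

Capping before any structured header parsing runs is what bounds the work done on adversarial input; the default of 102 400 mirrors the value given in the text.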
220
+
221
+ ## Resource limits
222
+
223
+ The parser enforces hard caps against adversarial input. Caps are
224
+ passed per-call via a frozen `ParseLimits` instance; the default
225
+ applies when no value is supplied.
226
+
227
+ | Attribute | Default | Source |
228
+ | ---------------------------- | ------- | ---------------------------------------- |
229
+ | `max_mime_nesting_depth` | 100 | Postfix `mime_nesting_limit` |
230
+ | `max_mime_parts` | 1000 | Go `multipartmaxparts` |
231
+ | `max_header_value_bytes` | 102 400 | Postfix `header_size_limit` |
232
+ | `max_address_list_bytes` | 100 000 | Dovecot CVE-2024-23184 analogue |
233
+
234
+ Excess input is truncated without raising; each truncation is logged at WARNING level.
235
+
236
+ A single process can host multiple workloads with different caps —
237
+ the limits travel with the call, never via shared module state:
238
+
239
+ ```python
240
+ from jmap_email import ParseLimits, parse_email
241
+
242
+ bulk = ParseLimits(max_mime_parts=5000, max_mime_nesting_depth=200)
243
+ gateway = ParseLimits(max_mime_parts=500)
244
+
245
+ parse_email(big_archive_message, limits=bulk)
246
+ parse_email(inbound_smtp_bytes, limits=gateway)
247
+ ```
248
+
249
+ `ParseLimits` is frozen and hashable; instances can be reused freely
250
+ across threads and as cache keys.
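The "frozen and hashable" property can be illustrated with a plain frozen dataclass. `ParseLimitsSketch` below is a stand-in written for this example, with defaults copied from the resource-limits table; whether the real `ParseLimits` is implemented as a dataclass is an assumption.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class ParseLimitsSketch:
    """Illustrative stand-in; defaults copied from the resource-limits table."""

    max_mime_nesting_depth: int = 100
    max_mime_parts: int = 1000
    max_header_value_bytes: int = 102_400
    max_address_list_bytes: int = 100_000


gateway = ParseLimitsSketch(max_mime_parts=500)
cache = {gateway: "gateway profile"}  # hashable, so usable as a dict key

try:
    gateway.max_mime_parts = 9999  # type: ignore[misc]
except FrozenInstanceError:
    mutated = False  # frozen instances reject attribute assignment
else:
    mutated = True
```

Because equal instances hash equally, two call sites that build the same limits independently hit the same cache entry.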
251
+
252
+ ## Strict-compose, lenient-parse
253
+
254
+ The two entry points use **different stdlib `email.policy` instances
255
+ on purpose**:
256
+
257
+ | Direction | Policy | Why |
258
+ |---|---|---|
259
+ | **Compose** (`compose_email`) | `email.policy.SMTP` (cloned, CTE 7-bit) | Caller-controlled input → must produce strictly RFC-compliant output. Enforces address-list folding, RFC 2047 / 2231 encoding, CRLF, line-length limits. |
260
+ | **Parse** (`parse_email`) | `email.policy.compat32` | Real-world inbound MIME violates the spec routinely. `compat32` is lenient: it returns raw header strings and recovers what it can from broken Content-Transfer-Encoding, missing charsets, malformed structural delimiters. |
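The difference between the two policies is visible with the stdlib alone. The sketch below does not call `jmap-email`; it only shows what `compat32` returns on the parse side and what an `email.policy.SMTP` clone emits on the compose side. The sample addresses and subject are made up for the example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Lenient side: compat32 hands back the header as it appeared on the wire,
# without decoding the RFC 2047 encoded-word.
inbound = b"Subject: =?utf-8?q?caf=C3=A9?=\r\n\r\nbody\r\n"
lenient = message_from_bytes(inbound, policy=policy.compat32)
raw_subject = lenient["Subject"]

# Strict side: an SMTP policy clone restricted to 7-bit output.
strict_policy = policy.SMTP.clone(cte_type="7bit")
outbound = EmailMessage(policy=strict_policy)
outbound["From"] = "alice@example.com"
outbound["To"] = "bob@example.com"
outbound["Subject"] = "caf\u00e9"
outbound.set_content("hello")
wire = outbound.as_bytes()  # CRLF line endings, non-ASCII header encoded
```

On the strict side the non-ASCII subject never reaches the wire as raw UTF-8; the policy encodes it, and every line ends in CRLF.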
261
+
262
+ ### Parser failure mode
263
+
264
+ `parse_email` is total: it returns a `JmapEmail` dict on success or
265
+ `None` on fundamental failure (empty bytes, wrong type, stdlib
266
+ producing no `Message`, or any unhandled internal error). All failures
267
+ log at WARNING level. No exception escapes.
268
+
269
+ ```python
270
+ parsed = parse_email(raw)
271
+ if parsed is None:
272
+ logger.warning("dropped unparseable message")
273
+ return
274
+ ... # use parsed
275
+ ```
276
+
277
+ Recoverable damage (a salvageable malformed header, an unknown
278
+ charset, etc.) keeps the parse on track — those are surfaced in
279
+ `parsed["_ext"]["defects"]` when the caller opts in via
280
+ `parse_email(raw, extensions=True)`.
281
+
282
+ ### Composer error hierarchy
283
+
284
+ `compose_email` raises a typed exception that subclasses `ComposeError`.
285
+ Callers that don't want to discriminate can catch `ComposeError` only;
286
+ callers that do can dispatch on the subclass:
287
+
288
+ ```text
289
+ ComposeError
290
+ ├── InvalidAddressError # missing/malformed `from`, `to`, …
291
+ ├── InvalidMessageIdError # Message-ID / In-Reply-To / References / Content-ID
292
+ ├── InvalidDateError # `sentAt` missing or unparseable
293
+ ├── AttachmentError # missing content, bad base64, bad MIME type, …
294
+ └── HeaderInjectionError # custom-header name not RFC 5322 ftext
295
+ ```
296
+
297
+ The composer is strict on every input the caller controls. Silently
298
+ substituting `now()` for a missing `sentAt`, or quietly dropping a
299
+ broken attachment, would be invisible data loss for the sender.
300
+
301
+ - Want "now" for `sentAt`? Use the `now_sent_at()` helper:
302
+ `compose_email({..., "sentAt": now_sent_at(), ...})`.
303
+ - Handling flaky attachment input? Wrap the compose call in
304
+ `try / except ComposeError` (the base class catches every
305
+ composer error subclass — `InvalidAddressError`,
306
+ `AttachmentError`, etc. — at once).
307
+
308
+ ## Shape helpers
309
+
310
+ Many JMAP fields are lists, often nullable: `from`, `to`, `messageId`, `headers`, …
311
+ Reading them safely usually means writing `parsed.get("from") or []`,
312
+ then indexing, then `.get`. Skip that with these helpers:
313
+
314
+ ```python
315
+ from jmap_email import (
316
+ first_address, first_address_email, first_address_name,
317
+ first_msgid, msgid_chain, sent_at_to_datetime,
318
+ find_header, find_headers, has_header,
319
+ body_part_text, body_text_joined,
320
+ )
321
+ ```
322
+
323
+ About `body_part_text(parsed, part)`: a text body part can have its
324
+ text stored two ways depending on how `parse_email` was called. Either
325
+ the text is right on the part (`part["content"]`), or it's in a
326
+ separate map (`parsed["bodyValues"][part["partId"]]["value"]`). This
327
+ helper checks both, so your code keeps working if the parser default
328
+ ever flips.
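The two storage locations described above can be handled with a lookup like this one. It is a sketch of the idea, not the library's implementation: `body_part_text_sketch` is a hypothetical name, and returning an empty string when neither location holds text is an assumption.

```python
from typing import Any


def body_part_text_sketch(parsed: dict[str, Any], part: dict[str, Any]) -> str:
    """Return a text part's content from either storage location, or ''."""
    inline = part.get("content")
    if isinstance(inline, str):
        return inline  # text stored directly on the part
    body_values = parsed.get("bodyValues") or {}
    entry = body_values.get(part.get("partId")) or {}
    value = entry.get("value")
    return value if isinstance(value, str) else ""


# The same text, stored each of the two ways.
inline_part = {"partId": "1", "type": "text/plain", "content": "hello"}
mapped_part = {"partId": "1", "type": "text/plain"}
mapped_email = {"bodyValues": {"1": {"value": "hello"}}}
```

Both `body_part_text_sketch({}, inline_part)` and `body_part_text_sketch(mapped_email, mapped_part)` yield the same string, which is the property that keeps calling code independent of the parser flag.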
329
+
330
+ About `now_sent_at()`: returns the current UTC time formatted as the
331
+ ISO-8601 string `compose_email` expects for `sentAt`. One-liner instead
332
+ of `datetime.now(timezone.utc).isoformat()`.
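For reference, the stdlib expression mentioned above and its inverse behave as follows. `now_sent_at_sketch` is a local stand-in; whether the real helper keeps microseconds or formats the offset identically is not stated in this README.

```python
from datetime import datetime, timedelta, timezone


def now_sent_at_sketch() -> str:
    """Current UTC time as an ISO-8601 string with an explicit offset."""
    return datetime.now(timezone.utc).isoformat()


stamp = now_sent_at_sketch()
round_tripped = datetime.fromisoformat(stamp)

# The offset-carrying form shown in the quick start parses the same way.
with_offset = datetime.fromisoformat("2026-06-08T14:30:00+02:00")
```

Both values are timezone-aware, so comparisons and arithmetic between them are well-defined.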
333
+
334
+ ## Validators
335
+
336
+ Want to know if a string would be accepted by `compose_email` as a
337
+ Message-ID without actually trying to compose? Use `is_valid_msg_id`:
338
+
339
+ ```python
340
+ from jmap_email import is_valid_msg_id
341
+
342
+ if is_valid_msg_id(parent_header):
343
+ reply["inReplyTo"] = [parent_header]
344
+ ```
345
+
346
+ It applies exactly the same checks `compose_email` does — shape,
347
+ length ceiling, no embedded whitespace — but returns `True`/`False`
348
+ instead of raising. Useful for lenient parse paths (archive importers,
349
+ inbound salvaging) that need to decide between keeping a raw id and
350
+ falling back to synthesis without catching an exception.
351
+
352
+ ## Strict vs. lenient `parse_address`
353
+
354
+ `parse_address(s)` is **strict by default**: an input that can't be
355
+ parsed into a valid addr-spec returns `("", "")`. Use this for entry-
356
+ point validation (CLI flags, web form input) — `parse_address("no-at")`
357
+ returning `("", "")` lets the caller reject garbage without a second
358
+ `"@" in result` check.
359
+
360
+ Pass `lenient=True` for archive-import paths that must preserve the
361
+ original wire bytes even when invalid:
362
+
363
+ ```python
364
+ parse_address("no-at-sign") # → ("", "")
365
+ parse_address("no-at-sign", lenient=True) # → ("", "no-at-sign")
366
+ ```
367
+
368
+ `parse_addresses(s)` is always strict per-entry: tuples whose addr-spec
369
+ fails the shape check are silently dropped — so
370
+ `len(parse_addresses(header)) != header.count(",") + 1` is expected
371
+ when the header carries garbage between real entries.
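The strict and lenient return shapes can be mimicked with a thin wrapper over the stdlib. This sketch uses `email.utils.parseaddr` purely for brevity; the library very likely does its own parsing, and the `local@domain` check below is a deliberately crude stand-in for a real addr-spec validation.

```python
from email.utils import parseaddr


def parse_address_sketch(value: str, *, lenient: bool = False) -> tuple[str, str]:
    """Split 'Name <addr>' into (name, addr).

    An addr without a non-empty local part and domain is rejected as
    ("", "") unless `lenient` is set, in which case the input is preserved.
    """
    name, addr = parseaddr(value)
    local, sep, domain = addr.rpartition("@")
    if sep and local and domain:
        return name, addr
    return ("", value.strip()) if lenient else ("", "")
```

The strict form lets a caller test `if not email:` on the second tuple element, while the lenient form keeps the original text available for archival.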
372
+
373
+ ## Defense matrix
374
+
375
+ The parser explicitly defends against the documented attack classes
376
+ below. See the `tests/` directory for regression coverage of each.
377
+
378
+ - **CVE-2023-27043** — `parseaddr`/`getaddresses` display-name confusion
379
+ - **CVE-2024-6923** — header-injection via embedded newlines (compose)
380
+ - **CVE-2024-21742** — Apache James `\r\n` in fields
381
+ - **CVE-2024-23184** — Dovecot unbounded address-list allocation
382
+ - **CVE-2002-1337** — Sendmail `crackaddr` nested-comments shape
383
+ - **CVE-2002-2325** — Pine empty-boundary infinite loop
384
+ - **gh-114906** — embedded newline in RFC 2047 encoded-word
385
+ - **gh-136063** — quadratic-time hot spots in `_header_value_parser`
386
+ - **gh-137687** — base64 padding `==` truncation
387
+ - **PortSwigger "Splitting the Email Atom"** (DEF CON 32 2024) —
388
+ encoded-word smuggling of structural chars (`@`, `,`, `<`, `>`, NUL)
389
+ - **Inbox Invasion (CCS '24)** — duplicate boundary parser confusion
390
+ - **Mailsploit** — NUL-byte truncation in encoded-words
391
+ - **USENIX 2020 "Weak Links in Auth Chains"** — duplicate `From:`,
392
+ group-syntax, CFWS-in-address handling
393
+
394
+ ## Compatibility
395
+
396
+ - **Python** 3.14.5+ (see [Why a Python 3.14.5 floor?](#why-a-python-3145-floor))
397
+ - **Platforms tested in CI:** Linux on x86_64 and arm64
398
+ - **macOS / Windows / PyPy / free-threaded build:** untested; expected
399
+ to work since the package has zero compiled extensions and zero
400
+ runtime dependencies. Reports of breakage welcome via the issue
401
+ tracker.
402
+
403
+ ## Performance and concurrency
404
+
405
+ - **Thread-safe** at the public API level. Module-level state
406
+ (`_HEADER_FACTORY`, `_POLICY`) is constructed once at import and
407
+ never mutated after.
408
+ - **No I/O.** Every entry point operates on in-memory bytes or dicts.
409
+ - **No global rate limits or singletons** beyond the immutable
410
+ registries above. Multiple processes / asyncio tasks may call
411
+ `parse_email` / `compose_email` concurrently without coordination.
412
+
413
+ Ballpark wall time on an Apple M2 (single thread, in-process):
414
+ ≈ 0.4 ms per typical 5 kB inbound message; ≈ 1 ms per 100 kB MIME
415
+ multipart with embedded images. Use your own corpus to measure for
416
+ your workload — message-shape variation dominates.
417
+
418
+ ## Examples
419
+
420
+ Runnable scripts under `examples/`:
421
+
422
+ - `examples/parse_and_print.py` — parse raw bytes and pretty-print the
423
+ JMAP shape
424
+ - `examples/import_eml_safely.py` — read an `.eml` off disk, handle
425
+ the `None` failure path, surface defects, print key fields
426
+ - `examples/compose_with_attachment.py` — compose a multipart message
427
+ with a regular attachment
428
+ - `examples/inline_image_roundtrip.py` — compose + re-parse a message
429
+ with an inline image, asserting the CID survives
430
+ - `examples/encoded_word_subject.py` — compose a non-ASCII Subject
431
+ and re-parse it
432
+
433
+ ## Development
434
+
435
+ The repository ships a docker-compose-based test environment so the
436
+ package can be exercised against the exact Python / pytest / hypothesis
437
+ versions CI uses:
438
+
439
+ ```bash
440
+ make test-jmap-email # run the full test suite (zero infra deps)
441
+ make typecheck-jmap-email # static check via Astral's `ty` (Rust)
442
+ ```
443
+
444
+ To run tests outside docker:
445
+
446
+ ```bash
447
+ cd src/jmap-email
448
+ pip install -e '.[dev]'
449
+ pytest # default selection, fuzz tests excluded
450
+ pytest -m fuzz # property-based / Hypothesis fuzz
451
+ ruff check .
452
+ ruff format --check .
453
+ ```
454
+
455
+ See `CONTRIBUTING.md` for the contribution workflow.
456
+
457
+ ## License
458
+
459
+ MIT — see `LICENSE`.
460
+
461
+ ## Versioning
462
+
463
+ Semantic. Public API is everything exported in `jmap_email.__all__`;
464
+ anything prefixed with `_` is internal and may change between patch
465
+ releases.
466
+
467
+ `__version__` is exposed at the module level.
468
+
469
+ ## Security
470
+
471
+ Security-sensitive reports go through GitHub Security Advisories — see
472
+ `SECURITY.md` for the disclosure policy.