mutable-url 0.1.0__tar.gz

@@ -0,0 +1,83 @@
1
+ Metadata-Version: 2.4
2
+ Name: mutable-url
3
+ Version: 0.1.0
4
+ Summary: MutableURL class for editing URLs
5
+ Keywords: url,mutability,mutable
6
+ Author: Phil Pennock
7
+ Author-email: Phil Pennock <python-pkgs@pennock-tech.com>
8
+ License-Expression: ISC
9
+ Classifier: Development Status :: 5 - Production/Stable
10
+ Classifier: Intended Audience :: Developers
11
+ Classifier: Programming Language :: Python :: 3
12
+ Classifier: Operating System :: OS Independent
13
+ Classifier: Topic :: Internet
14
+ Requires-Python: >=3.14
15
+ Project-URL: repository, https://github.com/PennockTech/mutable_url.py
16
+ Project-URL: issues, https://github.com/PennockTech/mutable_url.py/issues
17
+ Description-Content-Type: text/markdown
18
+
19
+ mutable_url.py
20
+ ==============
21
+
22
+ This repository holds the source for the `mutable_url` Python package.
23
+
24
+ This provides one utility class, `MutableURL`.
25
+
26
+ It also has one hook-point, `configure_idna`, to allow callers to opt into
27
+ IDNA2008 handling of internationalised hostnames; the default is to use
28
+ only the IDNA2003 support in Python stdlib, which is out of date _but_ usually
29
+ good enough.
30
+
31
+ ```python
32
+ from mutable_url import MutableURL
33
+
34
+ u = MutableURL('http://www.example.org/hum')
35
+ u.scheme = 'https'
36
+
37
+ print(u)
38
+ call_func_wanting_url(u.url)
39
+
40
+ u2 = MutableURL.from_parts(scheme='https', host='www.example.com',
41
+ query_params={'foo': 'bar', 'baz': '3'})
42
+ ```
43
+
44
+ The `from_parts` class-method constructor requires keyword invocation. You
45
+ can start with an empty URL. There is no default scheme. At present, the
46
+ query_params values must be strings.
47
+
48
+ A `MutableURL` can be reconstructed into a string form via `str()` or by using
49
+ the `.url` property (which just does that for you). This is the most stable
50
+ interface for passing into other URL handling classes: I haven't found any
51
+ other intermediate representation for cross-API compatibility worth the added
52
+ complexity.
53
+
54
+ The Authority section is supported, including further virtualized accessors to
55
+ allow individual access to `username` and `password` fields. Either one can
56
+ be empty, to support auth schemes which only use one or the other (such as
57
+ issued tokens provided as a password for an empty user).
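
The empty-username case can be illustrated with the stdlib alone. This is a minimal sketch, not the package's code: `build_userinfo` is a hypothetical helper, and the safe-character sets are an assumption based on RFC 3986 userinfo rules, with `:` left unencoded only in the password half.

```python
from typing import Optional
from urllib.parse import quote

# RFC 3986 sub-delims may appear unencoded in userinfo. ':' separates the
# user from the password, so it is only treated as safe inside the password.
SAFE_USER = "!$&'()*+,;="
SAFE_PASS = SAFE_USER + ":"


def build_userinfo(username: Optional[str], password: Optional[str]) -> Optional[str]:
    """Hypothetical helper: percent-encode each half and join with ':'."""
    user = quote(username, safe=SAFE_USER) if username is not None else ''
    if password is not None:
        return f"{user}:{quote(password, safe=SAFE_PASS)}"
    return user or None


print(build_userinfo('user@corp', 's3cr:t'))  # user%40corp:s3cr:t
print(build_userinfo(None, 'issued-token'))   # :issued-token
print(build_userinfo('alice', None))          # alice
```

Encoding the two halves separately is what allows one of them to be replaced later without re-encoding the other.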
58
+
59
+ The hostname part has two forms, which differ only when IDNA
60
+ internationalisation is in play:
61
+ * `host`: the on-the-wire ASCII form (ACE), which is also what appears in the URL
62
+ * `hostname`: the presentation-layer form, as UTF-8
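
The two forms can be seen using only the stdlib `idna` codec, which is the default described above. This is a rough sketch rather than the package's implementation: `to_ace` and `to_unicode` are hypothetical names, and the stdlib codec follows the older IDNA2003 rules.

```python
# Sketch using only the stdlib 'idna' codec (IDNA2003).
# to_ace / to_unicode are illustrative names, not part of mutable_url.

def to_ace(hostname: str) -> str:
    """Unicode hostname -> ASCII-Compatible Encoding (the `host` form)."""
    return hostname.encode('idna').decode('ascii')


def to_unicode(host: str) -> str:
    """ACE hostname -> presentation form (the `hostname` form)."""
    return host.encode('ascii').decode('idna')


ace = to_ace('münchen.de')
print(ace)                              # xn--mnchen-3ya.de
print(to_unicode(ace) == 'münchen.de')  # True
print(to_ace('www.example.org'))        # www.example.org
```

For a pure-ASCII name both forms are identical, which is why the distinction only matters when internationalised names are involved.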
63
+
64
+ There are multiple accessors for query-parameter handling:
65
+ * `query_params`: a simple `dict` view which assumes keys are not repeated
66
+ * `query_params_multi`: a dict where the value is a list of strings, one for
67
+ each instance
68
+ * `query_params_list`: a list of `(key,value)` tuples.
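
The difference between the three views only shows once a key repeats. The sketch below uses the stdlib `urllib.parse.parse_qsl` instead of the package itself, so it approximates the shape of each accessor rather than reproducing its exact behaviour; the variable names are illustrative.

```python
from urllib.parse import parse_qsl

raw_query = 'tag=a&page=2&tag=b'

# Ordered, lossless list of pairs (compare `query_params_list`).
as_list = parse_qsl(raw_query, keep_blank_values=True)

# One list of values per key (compare `query_params_multi`).
as_multi = {}
for key, value in as_list:
    as_multi.setdefault(key, []).append(value)

# Plain dict (compare `query_params`): a repeated key keeps only its last value.
as_dict = dict(as_list)

print(as_list)   # [('tag', 'a'), ('page', '2'), ('tag', 'b')]
print(as_multi)  # {'tag': ['a', 'b'], 'page': ['2']}
print(as_dict)   # {'tag': 'b', 'page': '2'}
```

The plain dict is the most convenient view but loses the first `tag` value, so the list or multi view is the safer choice whenever keys may repeat.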
69
+
70
+ Fragments are supported.
71
+
72
+ ## AI Disclosures
73
+
74
+ The original implementation of `MutableURL` was written by a human and
75
+ committed to a private repository on 2018-04-27.
76
+ That initial version depended upon `urllib3` (by way of `requests`).
77
+
78
+ On 2026-02-18, Anthropic's Claude was used to rewrite the implementation; the
79
+ code was subjected to thorough human code-review and there were many
80
+ iterations as aspects were refined. The goal was to move to only depending
81
+ upon the Python stdlib, to add more accessors (for compatibility with API
82
+ expectations of _both_ `urllib` _and_ `urllib3`), and to add tests. Along the
83
+ way, we also collected IDNA support, while keeping the default as stdlib-only.
@@ -0,0 +1,65 @@
1
+ mutable_url.py
2
+ ==============
3
+
4
+ This repository holds the source for the `mutable_url` Python package.
5
+
6
+ This provides one utility class, `MutableURL`.
7
+
8
+ It also has one hook-point, `configure_idna`, to allow callers to opt into
9
+ IDNA2008 handling of internationalised hostnames; the default is to use
10
+ only the IDNA2003 support in Python stdlib, which is out of date _but_ usually
11
+ good enough.
12
+
13
+ ```python
14
+ from mutable_url import MutableURL
15
+
16
+ u = MutableURL('http://www.example.org/hum')
17
+ u.scheme = 'https'
18
+
19
+ print(u)
20
+ call_func_wanting_url(u.url)
21
+
22
+ u2 = MutableURL.from_parts(scheme='https', host='www.example.com',
23
+ query_params={'foo': 'bar', 'baz': '3'})
24
+ ```
25
+
26
+ The `from_parts` class-method constructor requires keyword invocation. You
27
+ can start with an empty URL. There is no default scheme. At present, the
28
+ query_params values must be strings.
29
+
30
+ A `MutableURL` can be reconstructed into a string form via `str()` or by using
31
+ the `.url` property (which just does that for you). This is the most stable
32
+ interface for passing into other URL handling classes: I haven't found any
33
+ other intermediate representation for cross-API compatibility worth the added
34
+ complexity.
35
+
36
+ The Authority section is supported, including further virtualized accessors to
37
+ allow individual access to `username` and `password` fields. Either one can
38
+ be empty, to support auth schemes which only use one or the other (such as
39
+ issued tokens provided as a password for an empty user).
40
+
41
+ The hostname part has two forms, which differ only when IDNA
42
+ internationalisation is in play:
43
+ * `host`: the on-the-wire ASCII form (ACE), which is also what appears in the URL
44
+ * `hostname`: the presentation-layer form, as UTF-8
45
+
46
+ There are multiple accessors for query-parameter handling:
47
+ * `query_params`: a simple `dict` view which assumes keys are not repeated
48
+ * `query_params_multi`: a dict where the value is a list of strings, one for
49
+ each instance
50
+ * `query_params_list`: a list of `(key,value)` tuples.
51
+
52
+ Fragments are supported.
53
+
54
+ ## AI Disclosures
55
+
56
+ The original implementation of `MutableURL` was written by a human and
57
+ committed to a private repository on 2018-04-27.
58
+ That initial version depended upon `urllib3` (by way of `requests`).
59
+
60
+ On 2026-02-18, Anthropic's Claude was used to rewrite the implementation; the
61
+ code was subjected to thorough human code-review and there were many
62
+ iterations as aspects were refined. The goal was to move to only depending
63
+ upon the Python stdlib, to add more accessors (for compatibility with API
64
+ expectations of _both_ `urllib` _and_ `urllib3`), and to add tests. Along the
65
+ way, we also collected IDNA support, while keeping the default as stdlib-only.
@@ -0,0 +1,38 @@
1
+ [project]
2
+ name = "mutable-url"
3
+ version = "0.1.0" # will bump to 1.0.0 via trusted publishing
4
+ description = "MutableURL class for editing URLs"
5
+ readme = "README.md"
6
+ license = "ISC"
7
+ authors = [
8
+ { name="Phil Pennock", email="python-pkgs@pennock-tech.com" },
9
+ ]
10
+ classifiers = [
11
+ "Development Status :: 5 - Production/Stable", # FIXME: it's 0.1.0 but only until we switch to trusted publishing
12
+ "Intended Audience :: Developers",
13
+ "Programming Language :: Python :: 3",
14
+ "Operating System :: OS Independent",
15
+ "Topic :: Internet",
16
+ ]
17
+ keywords = [
18
+ "url", "mutability", "mutable"
19
+ ]
20
+
21
+ requires-python = ">=3.14"
22
+ dependencies = []
23
+
24
+ [project.urls]
25
+ repository = "https://github.com/PennockTech/mutable_url.py"
26
+ issues = "https://github.com/PennockTech/mutable_url.py/issues"
27
+
28
+ [build-system]
29
+ requires=["uv_build<=0.11.0"]
30
+ build-backend = "uv_build"
31
+
32
+ [dependency-groups]
33
+ dev = [
34
+ "ruff",
35
+ "ty",
36
+
37
+ "pytest"
38
+ ]
@@ -0,0 +1,730 @@
1
+ # Networking support
2
+
3
+ """mutable_url library
4
+
5
+ Provides the MutableURL class for URL manipulation tasks.
6
+ - regular constructor takes a string URL
7
+ - from_parts constructor takes named parameters to set via fields
8
+ - MutableURL('https://www.spodhuis.org')
9
+ - MutableURL.from_parts(host='www.spodhuis.org', scheme='https')
10
+
11
+ IDNA note: by default hostname encode/decode uses Python's stdlib 'idna'
12
+ codec, which implements IDNA2003 (RFC 3490). For IDNA2008 (RFC 5891) -
13
+ needed for some newer TLDs and stricter validity rules - call
14
+ configure_idna() with encode/decode callables backed by the third-party
15
+ 'idna' package (pip install idna). Example:
16
+
17
+ import idna
18
+ import mutable_url
19
+ mutable_url.configure_idna(
20
+ encode=lambda h: idna.encode(h).decode('ascii'),
21
+ decode=lambda h: idna.decode(h),
22
+ )
23
+
24
+ The module never imports 'idna' itself; configure_idna() is the sole
25
+ injection point for that dependency.
26
+ """
27
+
28
+ __author__ = 'phil@pennock-tech.com (Phil Pennock)'
29
+ __credits__ = [
30
+ 'Claude Sonnet (Anthropic) — urllib.parse rewrite, IDNA/encoding logic',
31
+ ]
32
+
33
+ # Uses urllib.parse (stdlib) throughout; no third-party dependencies by default.
34
+ #
35
+ # Field naming conventions supported:
36
+ # urllib3-style : scheme, auth, host, port, path, query, fragment
37
+ # urllib.parse-style : hostname (≡ host but unicode/IDNA-decoded, writable),
38
+ # username, password (percent-decoded, writable),
39
+ # netloc, request_uri
40
+
41
+ import typing
42
+ import urllib.parse
43
+
44
+ QueryParamList = list[tuple[str, str | None]]
45
+ QueryParamDict = dict[str, str | None]
46
+ QueryParamMulti = dict[str, list[str | None]]
47
+
48
+ __all__ = []
49
+
50
+
51
+ def export(f):
52
+ __all__.append(f.__name__)
53
+ return f
54
+
55
+
56
+ # ---------------------------------------------------------------------------
57
+ # Userinfo (username / password) percent-encoding helpers
58
+ # ---------------------------------------------------------------------------
59
+
60
+ # RFC 3986 §3.2.1 — characters that may appear unencoded in userinfo:
61
+ # unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
62
+ # sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
63
+ # ":" additionally separates user from password, so it must be encoded within
64
+ # the username component but may appear unencoded in the password component.
65
+ _USERINFO_SAFE_USER = "-._~!$&'()*+,;=" # colon NOT safe in username
66
+ _USERINFO_SAFE_PASS = "-._~!$&'()*+,;=:" # colon safe in password
67
+
68
+
69
+ def _encode_userinfo(value: str, *, is_password: bool = False) -> str:
70
+ """Percent-encode a plain-text username or password for embedding in a URL."""
71
+ safe = _USERINFO_SAFE_PASS if is_password else _USERINFO_SAFE_USER
72
+ return urllib.parse.quote(value, safe=safe)
73
+
74
+
75
+ def _decode_userinfo(raw: str) -> str:
76
+ """Percent-decode a raw userinfo component to a plain-text string."""
77
+ return urllib.parse.unquote(raw)
78
+
79
+
80
+ # ---------------------------------------------------------------------------
81
+ # Hostname / IDNA helpers — default IDNA2003 (stdlib) implementations
82
+ # ---------------------------------------------------------------------------
83
+
84
+ def _is_ip_literal(host: str) -> bool:
85
+ """Return True for IPv6 bracket literals or bare IPv4 dotted-decimal."""
86
+ if not host:
87
+ return False
88
+ if host.startswith('['): # IPv6: [::1]
89
+ return True
90
+ parts = host.rstrip('.').split('.')
91
+ if len(parts) == 4:
92
+ try:
93
+ return all(0 <= int(p) <= 255 for p in parts)
94
+ except ValueError:
95
+ pass
96
+ return False
97
+
98
+
99
+ def _default_encode_host(host: str | None) -> str | None:
100
+ """Encode a (possibly unicode) hostname to ASCII-Compatible Encoding.
101
+
102
+ Pure-ASCII hostnames and IP literals pass through unchanged. Encoding
103
+ failures fall back to the original string so that already-encoded or
104
+ otherwise non-encodable hosts are not silently lost.
105
+
106
+ Uses IDNA2003 (Python stdlib 'idna' codec). Replace via configure_idna()
107
+ if you need IDNA2008.
108
+ """
109
+ if not host or _is_ip_literal(host):
110
+ return host
111
+ try:
112
+ host.encode('ascii')
113
+ return host # already ASCII — nothing to do
114
+ except UnicodeEncodeError:
115
+ pass
116
+ try:
117
+ return host.encode('idna').decode('ascii')
118
+ except UnicodeError:
119
+ return host # best-effort fallback
120
+
121
+
122
+ def _default_decode_host(host: str | None) -> str | None:
123
+ """Decode a punycode / ACE hostname to its unicode form.
124
+
125
+ Pure-unicode or non-punycode labels pass through unchanged. IP literals
126
+ are returned as-is.
127
+
128
+ Uses IDNA2003 (Python stdlib 'idna' codec). Replace via configure_idna()
129
+ if you need IDNA2008.
130
+ """
131
+ if not host or _is_ip_literal(host):
132
+ return host
133
+ try:
134
+ return host.encode('ascii').decode('idna')
135
+ except UnicodeError:
136
+ return host
137
+
138
+
139
+ # ---------------------------------------------------------------------------
140
+ # Module-level IDNA dispatch hooks
141
+ #
142
+ # All internal code calls _encode_host / _decode_host rather than the
143
+ # _default_* functions directly, so that configure_idna() takes effect
144
+ # everywhere without callers needing to do anything.
145
+ # ---------------------------------------------------------------------------
146
+
147
+ _encode_host = _default_encode_host
148
+ _decode_host = _default_decode_host
149
+
150
+
151
+ @export
152
+ def configure_idna(
153
+ *,
154
+ encode: 'typing.Callable[[str | None], str | None]',
155
+ decode: 'typing.Callable[[str | None], str | None]',
156
+ ) -> None:
157
+ """Replace the module-level IDNA encode/decode hooks.
158
+
159
+ Both callables receive a hostname string (or None) and must return a
160
+ hostname string (or None). They are responsible for handling IP literals
161
+ and None themselves, or they may delegate to _is_ip_literal() for that
162
+ guard.
163
+
164
+ Typical usage with the 'idna' package (IDNA2008):
165
+
166
+ import idna
167
+ import mutable_url
168
+
169
+ def _enc(host):
170
+ if host is None or mutable_url._is_ip_literal(host):
171
+ return host
172
+ try:
173
+ return idna.encode(host).decode('ascii')
174
+ except idna.core.InvalidCodepoint:
175
+ return host # or raise, depending on your policy
176
+
177
+ def _dec(host):
178
+ if host is None or mutable_url._is_ip_literal(host):
179
+ return host
180
+ try:
181
+ return idna.decode(host)
182
+ except (idna.core.InvalidCodepoint, UnicodeError):
183
+ return host
184
+
185
+ mutable_url.configure_idna(encode=_enc, decode=_dec)
186
+
187
+ This function is the *only* place where an IDNA2008 dependency is wired
188
+ in; the module itself never imports 'idna'.
189
+ """
190
+ global _encode_host, _decode_host
191
+ _encode_host = encode
192
+ _decode_host = decode
193
+
194
+
195
+ # ---------------------------------------------------------------------------
196
+ # Query-string parsing / serialisation helpers
197
+ # ---------------------------------------------------------------------------
198
+
199
+ def _parse_query_params(query: str | None) -> QueryParamList:
200
+ """Parse a raw query string into an ordered list of (key, value) pairs.
201
+
202
+ Decoding uses ``urllib.parse.unquote_plus``, which converts ``%XX``
203
+ sequences *and* ``+`` to spaces — the standard behaviour for
204
+ ``application/x-www-form-urlencoded`` data (HTML forms, most REST APIs).
205
+ Use ``MutableURL.query`` when you need the raw, still-encoded string.
206
+
207
+ Value semantics:
208
+
209
+ * ``?flag`` → ``("flag", None)`` — no ``=`` sign at all
210
+ * ``?flag=`` → ``("flag", "")`` — ``=`` present but value is empty
211
+ * ``?flag=v`` → ``("flag", "v")`` — normal key-value pair
212
+ * Empty segments (consecutive ``&`` separators) are silently skipped.
213
+ """
214
+ if not query:
215
+ return []
216
+ result: QueryParamList = []
217
+ for part in query.split('&'):
218
+ if not part:
219
+ continue
220
+ if '=' in part:
221
+ raw_k, raw_v = part.split('=', 1)
222
+ result.append((
223
+ urllib.parse.unquote_plus(raw_k),
224
+ urllib.parse.unquote_plus(raw_v),
225
+ ))
226
+ else:
227
+ result.append((urllib.parse.unquote_plus(part), None))
228
+ return result
229
+
230
+
231
+ class _QueryParamView(dict):
232
+ """dict subclass returned by ``query_params``; subscript assignment writes back to the URL.
233
+
234
+ This ensures that ``u.query_params['key'] = value`` actually mutates the URL
235
+ rather than silently mutating a discarded copy. Values are coerced to ``str``
236
+ (or left as ``None`` to produce a valueless ``key``-only parameter).
237
+ """
238
+
239
+ __slots__ = ('_setter',)
240
+
241
+ def __init__(self, setter, data: 'QueryParamDict') -> None:
242
+ super().__init__(data)
243
+ self._setter = setter # callable: _set_query_params(dict)
244
+
245
+ def __setitem__(self, key: str, value) -> None:
246
+ coerced = None if value is None else str(value)
247
+ super().__setitem__(key, coerced)
248
+ self._setter(self)
249
+
250
+
251
+ def _encode_query_params(params: QueryParamList) -> str | None:
252
+ """Encode a list of (key, value) pairs into a raw query string.
253
+
254
+ Encoding uses ``urllib.parse.quote_plus``, which encodes spaces as ``+``
255
+ and percent-encodes everything else — the inverse of ``unquote_plus``.
256
+
257
+ * A ``None`` value produces a key-only parameter (no ``=`` sign).
258
+ * An empty-string value produces ``key=``.
259
+ * Returns ``None`` (not ``""``) when *params* is empty, consistent with
260
+ how the rest of ``MutableURL`` represents absent components.
261
+ """
262
+ if not params:
263
+ return None
264
+ parts: list[str] = []
265
+ for k, v in params:
266
+ k_enc = urllib.parse.quote_plus(k)
267
+ if v is None:
268
+ parts.append(k_enc)
269
+ else:
270
+ parts.append(f'{k_enc}={urllib.parse.quote_plus(v)}')
271
+ return '&'.join(parts) or None
272
+
273
+
274
+ # ---------------------------------------------------------------------------
275
+ # Internal URL value object
276
+ # ---------------------------------------------------------------------------
277
+
278
+ class _URL:
279
+ """Lightweight URL value object backed by individual RFC 3986 components.
280
+
281
+ ``auth`` stores the raw percent-encoded userinfo string exactly as it
282
+ appears in the URL (e.g. ``"user%40corp:p%40ss"``). Callers that need
283
+ decoded values should use MutableURL's ``username``/``password``
284
+ properties.
285
+
286
+ ``host`` stores the ASCII/punycode (ACE) form of the hostname so that
287
+ ``__str__`` always produces a valid ASCII URL. MutableURL's ``hostname``
288
+ property handles unicode ↔ IDNA conversion for human-facing access.
289
+ """
290
+ __slots__ = ('scheme', 'auth', 'host', 'port', 'path', 'query', 'fragment')
291
+
292
+ def __init__(self, scheme=None, auth=None, host=None, port=None,
293
+ path=None, query=None, fragment=None):
294
+ self.scheme = scheme or None
295
+ self.auth = auth or None
296
+ self.host = host or None
297
+ self.port = int(port) if port is not None else None
298
+ self.path = path or None
299
+ self.query = query or None
300
+ self.fragment = fragment or None
301
+
302
+ @property
303
+ def netloc(self) -> str | None:
304
+ """Reconstructed ``[userinfo@]host[:port]`` component."""
305
+ if not self.host:
306
+ return None
307
+ nl = self.host
308
+ if self.port is not None:
309
+ nl = f'{nl}:{self.port}'
310
+ if self.auth:
311
+ nl = f'{self.auth}@{nl}'
312
+ return nl
313
+
314
+ @property
315
+ def request_uri(self) -> str:
316
+ """Path and query string combined, as used in an HTTP request line."""
317
+ uri = self.path or '/'
318
+ if self.query:
319
+ uri = f'{uri}?{self.query}'
320
+ return uri
321
+
322
+ @property
323
+ def url(self) -> str:
324
+ return str(self)
325
+
326
+ def __str__(self) -> str:
327
+ return urllib.parse.urlunsplit(urllib.parse.SplitResult(
328
+ scheme = self.scheme or '',
329
+ netloc = self.netloc or '',
330
+ path = self.path or '',
331
+ query = self.query or '',
332
+ fragment = self.fragment or '',
333
+ ))
334
+
335
+ def __repr__(self) -> str:
336
+ return (f'_URL(scheme={self.scheme!r}, auth={self.auth!r}, '
337
+ f'host={self.host!r}, port={self.port!r}, path={self.path!r}, '
338
+ f'query={self.query!r}, fragment={self.fragment!r})')
339
+
340
+
341
+ def _parse_url(u: str) -> _URL:
342
+ """Parse a URL string into a _URL, preserving percent-encoding in auth."""
343
+ p = urllib.parse.urlsplit(u)
344
+
345
+ # Extract raw userinfo from netloc rather than using p.username / p.password:
346
+ # the latter are *decoded* by urllib.parse, so reconstructing auth from them
347
+ # would corrupt credentials containing percent-encoded characters (e.g. a
348
+ # literal '@' encoded as '%40').
349
+ auth = None
350
+ if '@' in p.netloc:
351
+ raw_userinfo, _ = p.netloc.rsplit('@', 1)
352
+ auth = raw_userinfo or None
353
+
354
+ # Always store host in ASCII/punycode form so __str__ produces a valid URL
355
+ # even when the caller supplies a unicode (IDN) hostname. Goes through the
356
+ # hook so any configure_idna() call is respected at parse time too.
357
+ #
358
+ # Re-wrap IPv6 addresses in brackets: urlsplit strips them from
359
+ # p.hostname (e.g. '[::1]' in the URL becomes '::1' in p.hostname),
360
+ # but we need brackets for correct URL reconstruction via netloc.
361
+ host_str = p.hostname or None
362
+ if host_str and ':' in host_str:
363
+ host_str = f'[{host_str}]'
364
+ host = _encode_host(host_str)
365
+
366
+ return _URL(
367
+ scheme = p.scheme or None,
368
+ auth = auth,
369
+ host = host,
370
+ port = p.port, # already int-or-None from urlsplit
371
+ path = p.path or None,
372
+ query = p.query or None,
373
+ fragment = p.fragment or None,
374
+ )
375
+
376
+
377
+ # ---------------------------------------------------------------------------
378
+ # Public API
379
+ # ---------------------------------------------------------------------------
380
+
381
+ @export
382
+ class MutableURL(object):
383
+ """Mutable URL object; every component field can be set.
384
+
385
+ Optimised for fast lookup: every mutation of a base field reconstructs the
386
+ internal _URL so that derived properties (netloc, url, …) remain cheap.
387
+
388
+ Mutable fields — urllib3-style (raw/encoded values, no implicit transform):
389
+ scheme, auth, host, port, path, query, fragment
390
+
391
+ Mutable fields — urllib.parse-style (encode/decode transparently):
392
+ hostname unicode ↔ IDNA/punycode; delegates storage to ``host``
393
+ username decoded plain-text; encodes on write, updates ``auth``
394
+ password decoded plain-text; encodes on write, updates ``auth``
395
+
396
+ Read-only computed fields (present in both conventions):
397
+ netloc, request_uri, url
398
+
399
+ Query-parameter dict views (all decode via ``unquote_plus``):
400
+ query_params dict[str, str|None] last-value-wins; writable
401
+ query_params_list list[tuple[str, str|None]] ordered, lossless; writable
402
+ query_params_multi dict[str, list[str|None]] all values per key; read-only
403
+
404
+ Relationship between ``host`` and ``hostname``:
405
+ ``host`` is the storage/wire-format field: ASCII/ACE only, suitable
406
+ for direct URL embedding. ``hostname`` is the presentation-layer
407
+ counterpart: accepts and returns unicode, performs IDNA encode/decode
408
+ via the module-level hooks (IDNA2003 by default; see configure_idna()).
409
+ Setting ``hostname = "münchen.de"`` stores ``"xn--mnchen-3ya.de"`` in
410
+ ``host``; reading ``hostname`` on that URL returns ``"münchen.de"``.
411
+ Use ``host`` when you already have a correctly encoded ASCII hostname;
412
+ use ``hostname`` for everything human-facing.
413
+
414
+ Relationship between ``auth``, ``username``, and ``password``:
415
+ ``auth`` holds the raw percent-encoded userinfo string
416
+ (e.g. ``"user%40corp:s3cr%3At"``). Assign to it directly when you
417
+ already have a correctly encoded string. ``username`` and ``password``
418
+ accept plain unicode, encode it for you, and splice only their half
419
+ into ``auth`` without touching or re-encoding the other half.
420
+ """
421
+
422
+ _FIELDS = ('scheme', 'auth', 'host', 'port', 'path', 'query', 'fragment')
423
+
424
+ def __init__(self, u: str):
425
+ self._u = _parse_url(u)
426
+
427
+ @classmethod
428
+ def from_parts(
429
+ cls,
430
+ *,
431
+ scheme: str | None = None,
432
+ host: str | None = None,
433
+ hostname: str | None = None,
434
+ port: int | None = None,
435
+ auth: str | None = None,
436
+ username: str | None = None,
437
+ password: str | None = None,
438
+ path: str | None = None,
439
+ query: str | None = None,
440
+ query_params: 'QueryParamDict | None' = None,
441
+ query_params_list: 'QueryParamList | None' = None,
442
+ fragment: str | None = None,
443
+ ) -> 'MutableURL':
444
+ """Construct a MutableURL directly from its component parts.
445
+
446
+ All parameters are keyword-only. Omitted parameters default to None
447
+ (absent from the resulting URL). Three groups of parameters are
448
+ mutually exclusive — passing more than one from any group raises
449
+ ValueError:
450
+
451
+ host / hostname
452
+ ``host`` accepts a pre-encoded ASCII/ACE hostname (stored
453
+ as-is). ``hostname`` accepts a unicode hostname and
454
+ IDNA-encodes it via the module-level hook (see
455
+ configure_idna()), exactly as the ``hostname`` setter does.
456
+
457
+ auth / (username, password)
458
+ ``auth`` accepts a raw percent-encoded userinfo string such as
459
+ ``"user%40corp:s3cr%3At"`` (stored as-is).
460
+ ``username`` and ``password`` accept plain-text strings and
461
+ percent-encode them before storage; either may be omitted
462
+ independently. Supplying only ``password`` (with no
463
+ ``username``) produces a ``:token`` userinfo, a form some services
464
+ use for issued tokens supplied as a password with an empty user.
465
+
466
+ query / query_params / query_params_list
467
+ ``query`` accepts a raw query string (no leading ``?``).
468
+ ``query_params`` accepts a ``dict[str, str | None]``; key
469
+ order follows dict iteration order.
470
+ ``query_params_list`` accepts a
471
+ ``list[tuple[str, str | None]]`` for precise ordering or
472
+ multi-value keys.
473
+ ``query_params`` and ``query_params_list`` encode via ``quote_plus``,
474
+ matching the ``query_params*`` properties; ``query`` is stored as-is. An
475
+ empty dict or list produces no query string (same as omitting it).
476
+ """
477
+ # -- mutual exclusion -------------------------------------------------
478
+ if host is not None and hostname is not None:
479
+ raise ValueError("host and hostname are mutually exclusive")
480
+ if auth is not None and (username is not None or password is not None):
481
+ raise ValueError("auth and username/password are mutually exclusive")
482
+ n_query = sum(x is not None for x in (query, query_params, query_params_list))
483
+ if n_query > 1:
484
+ raise ValueError(
485
+ "query, query_params, and query_params_list are mutually exclusive"
486
+ )
487
+
488
+ # -- resolve host -----------------------------------------------------
489
+ resolved_host = host
490
+ if hostname is not None:
491
+ resolved_host = _encode_host(hostname)
492
+
493
+ # -- resolve auth -----------------------------------------------------
494
+ resolved_auth = auth
495
+ if username is not None or password is not None:
496
+ raw_user = (
497
+ _encode_userinfo(username, is_password=False)
498
+ if username is not None
499
+ else ''
500
+ )
501
+ if password is not None:
502
+ resolved_auth = f'{raw_user}:{_encode_userinfo(password, is_password=True)}'
503
+ else:
504
+ resolved_auth = raw_user or None
505
+
506
+ # -- resolve query ----------------------------------------------------
507
+ resolved_query = query
508
+ if query_params is not None:
509
+ resolved_query = _encode_query_params(list(query_params.items()))
510
+ elif query_params_list is not None:
511
+ resolved_query = _encode_query_params(query_params_list)
512
+
513
+ obj = cls.__new__(cls)
514
+ obj._u = _URL(
515
+ scheme=scheme,
516
+ auth=resolved_auth,
517
+ host=resolved_host,
518
+ port=port,
519
+ path=path,
520
+ query=resolved_query,
521
+ fragment=fragment,
522
+ )
523
+ return obj
524
+
525
+ # -- internal helpers ----------------------------------------------------
526
+
527
+ def _setter_for(self, new_field):
528
+ """Return a setter that rebuilds _u with one field replaced.
529
+
530
+ Field values are read from self._u at *call* time (not at the time
531
+ _setter_for is invoked) so that interleaved mutations compose
532
+ correctly.
533
+ """
534
+ other_fields = [f for f in type(self)._FIELDS if f != new_field]
535
+
536
+ def _f(s, new_value):
537
+ params = {f: getattr(s._u, f) for f in other_fields}
538
+ params[new_field] = new_value
539
+ s._u = _URL(**params)
540
+
541
+ return _f
542
+
543
+ def __eq__(self, other: object) -> bool:
544
+ if not isinstance(other, MutableURL):
545
+ return NotImplemented
546
+ return (self._u.scheme, self._u.auth, self._u.host, self._u.port,
547
+ self._u.path, self._u.query, self._u.fragment) == (
548
+ other._u.scheme, other._u.auth, other._u.host, other._u.port,
549
+ other._u.path, other._u.query, other._u.fragment)
550
+
551
+ def __str__(self) -> str:
552
+ return str(self._u)
553
+
554
+ def __repr__(self) -> str:
555
+ return f"MutableURL('{self}')"
556
+
557
+ # -- mutable base fields (urllib3-style) ----------------------------------
558
+
559
+ scheme = property(lambda s: s._u.scheme,
560
+ lambda s, v: s._setter_for('scheme')(s, v))
561
+ # auth: raw percent-encoded userinfo string; prefer username/password for
562
+ # plain-text assignment.
563
+ auth = property(lambda s: s._u.auth,
564
+ lambda s, v: s._setter_for('auth')(s, v))
565
+ # host: ASCII/ACE hostname; prefer hostname for unicode/IDN assignment.
566
+ host = property(lambda s: s._u.host,
567
+ lambda s, v: s._setter_for('host')(s, v))
568
+ port = property(lambda s: s._u.port,
569
+ lambda s, v: s._setter_for('port')(s, v))
570
+ path = property(lambda s: s._u.path,
571
+ lambda s, v: s._setter_for('path')(s, v))
572
+ query = property(lambda s: s._u.query,
573
+ lambda s, v: s._setter_for('query')(s, v))
574
+ fragment = property(lambda s: s._u.fragment,
575
+ lambda s, v: s._setter_for('fragment')(s, v))
576
+
577
+ # -- mutable derived fields (urllib.parse-style) --------------------------
578
+
579
+ def _get_hostname(self) -> str | None:
580
+ """Unicode (IDNA-decoded) hostname; passes through IP literals unchanged.
581
+
582
+ Decodes via the module-level _decode_host hook; see configure_idna().
583
+ """
584
+ return _decode_host(self._u.host)
585
+
586
+ def _set_hostname(self, value: str | None) -> None:
587
+ """Set hostname from a unicode or ACE string; IDNA-encodes before storage.
588
+
589
+ Encodes via the module-level _encode_host hook; see configure_idna().
590
+ Assigning None clears the host entirely.
591
+ """
592
+ self.host = _encode_host(value) if value is not None else None
593
+
594
+ hostname = property(_get_hostname, _set_hostname)
595
+
596
+ def _get_username(self) -> str | None:
597
+ """Percent-decoded username, or None if no userinfo is present."""
598
+ if self._u.auth is None:
599
+ return None
600
+ raw = self._u.auth.split(':', 1)[0]
601
+ return _decode_userinfo(raw) if raw else None
602
+
603
+ def _set_username(self, value: str | None) -> None:
604
+ """Set the username from a plain-text (unicode) string.
605
+
606
+ Percent-encodes the new value. Preserves the existing raw-encoded
607
+ password verbatim so it is never double-encoded. Passing None removes
608
+ the username while retaining any existing password.
609
+ """
610
+ # Preserve the raw-encoded password without touching it.
611
+ raw_pass: str | None = None
612
+ if self._u.auth and ':' in self._u.auth:
613
+ raw_pass = self._u.auth.split(':', 1)[1]
614
+
615
+ if value is None:
616
+ self.auth = (f':{raw_pass}' if raw_pass is not None else None)
617
+ else:
618
+ encoded = _encode_userinfo(value, is_password=False)
619
+ self.auth = (f'{encoded}:{raw_pass}'
620
+ if raw_pass is not None
621
+ else encoded)
622
+
623
+ username = property(_get_username, _set_username)
624
+
625
+ def _get_password(self) -> str | None:
626
+ """Percent-decoded password, or None if not present."""
627
+ if self._u.auth is None or ':' not in self._u.auth:
628
+ return None
629
+ raw = self._u.auth.split(':', 1)[1]
630
+ return _decode_userinfo(raw) if raw else None
631
+
632
+ def _set_password(self, value: str | None) -> None:
633
+ """Set the password from a plain-text (Unicode) string.
634
+
635
+ Percent-encodes the new value. Preserves the existing raw-encoded
636
+ username verbatim so it is never double-encoded. Passing None removes
637
+ the password while retaining any existing username.
638
+ """
639
+ # Preserve the raw-encoded username without touching it.
640
+ raw_user: str = ''
641
+ if self._u.auth:
642
+ raw_user = self._u.auth.split(':', 1)[0]
643
+
644
+ if value is None:
645
+ # Drop the password; clear auth entirely if username is also absent.
646
+ self.auth = raw_user or None
647
+ else:
648
+ encoded = _encode_userinfo(value, is_password=True)
649
+ self.auth = f'{raw_user}:{encoded}'
650
+
651
+ password = property(_get_password, _set_password)
652
+
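The `username` and `password` setters above both follow the same pattern: split the raw `auth` string once on the first `:`, percent-encode only the new half, and splice the other half back in verbatim. A minimal standalone sketch of that pattern, using only `urllib.parse`, might look like the following. The helper names are hypothetical, and `quote(..., safe='')` is an assumption standing in for the package's `_encode_userinfo`, whose exact safe-character set is not shown here.

```python
from urllib.parse import quote, unquote


def split_userinfo(auth):
    """Return (raw_user, raw_pass); raw_pass is None when there is no ':'."""
    if auth is None:
        return None, None
    user, sep, password = auth.partition(':')
    return user, (password if sep else None)


def with_username(auth, new_user):
    """Replace the username, leaving the raw-encoded password untouched."""
    _, raw_pass = split_userinfo(auth)
    encoded = quote(new_user, safe='')  # assumption: encode everything reserved
    return encoded if raw_pass is None else f'{encoded}:{raw_pass}'


auth = with_username('alice:p%40ss', 'bob@corp')
print(auth)                                # bob%40corp:p%40ss
print(unquote(split_userinfo(auth)[1]))    # p@ss
```

Re-encoding the whole string instead would turn `p%40ss` into `p%2540ss`, which is the double-encoding the docstrings warn about.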
653
+ # -- query-parameter dict views -------------------------------------------
654
+
655
+ def _get_query_params(self) -> QueryParamDict:
656
+ """Query parameters as a ``dict``; for repeated keys the **last** value wins.
657
+
658
+ Keys and string values are percent-decoded (``+`` treated as space via
659
+ ``unquote_plus``). A parameter without an ``=`` sign (e.g. ``flag``
660
+ in ``?flag&x=1``) maps to ``None``; a parameter whose value is the
661
+ empty string (e.g. ``x`` in ``?x=``) maps to ``""`` — the two cases
662
+ are distinct.
663
+
664
+ Repeated-key policy: last-value-wins mirrors Python ``dict`` construction
665
+ semantics, so this view is lossless only when every key is unique.
666
+ Use :attr:`query_params_multi` when you need all values.
667
+
668
+ Setting this property replaces the **entire** query string. Key order
669
+ follows iteration order of the source dict (insertion-ordered in Python
670
+ 3.7+). To control key order precisely or preserve multi-values, set
671
+ :attr:`query_params_list` instead.
672
+ """
673
+ return _QueryParamView(self._set_query_params, dict(_parse_query_params(self._u.query)))
674
+
675
+ def _set_query_params(self, params: QueryParamDict) -> None:
676
+ self.query = _encode_query_params(list(params.items()))
677
+
678
+ query_params = property(_get_query_params, _set_query_params)
679
+
680
+ def _get_query_params_list(self) -> QueryParamList:
681
+ """Query parameters as an ordered list of ``(key, value)`` pairs.
682
+
683
+ This is the lossless representation: every parameter appears in its
684
+ original position, repeated keys are preserved, and the ``None`` /
685
+ ``""`` distinction for valueless parameters is maintained. Keys and
686
+ string values are percent-decoded (``+`` treated as space).
687
+
688
+ Setting this property replaces the **entire** query string from the
689
+ supplied list. ``None`` values serialise without an ``=`` sign;
690
+ ``""`` values serialise as ``key=``.
691
+ """
692
+ return _parse_query_params(self._u.query)
693
+
694
+ def _set_query_params_list(self, params: QueryParamList) -> None:
695
+ self.query = _encode_query_params(params)
696
+
697
+ query_params_list = property(_get_query_params_list, _set_query_params_list)
698
+
699
+ @property
700
+ def query_params_multi(self) -> QueryParamMulti:
701
+ """Query parameters as a ``dict`` mapping each key to a list of all its values.
702
+
703
+ All occurrences of a repeated key are collected into a list in the
704
+ order they appear in the query string. The ``None`` / ``""``
705
+ distinction is preserved within each list. Keys and string values are
706
+ percent-decoded (``+`` treated as space).
707
+
708
+ This property is **read-only**. To set multi-value parameters, assign
709
+ to :attr:`query_params_list` with the desired ``(key, value)`` pairs.
710
+
711
+ Example::
712
+
713
+ url = MutableURL("https://example.com/search?color=red&color=blue&lang=en")
714
+ url.query_params_multi
715
+ # {"color": ["red", "blue"], "lang": ["en"]}
716
+ """
717
+ result: QueryParamMulti = {}
718
+ for k, v in _parse_query_params(self._u.query):
719
+ result.setdefault(k, []).append(v)
720
+ return result
721
+
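The three query views above (`query_params`, `query_params_list`, `query_params_multi`) are all derived from one ordered list of `(key, value)` pairs. The sketch below illustrates how such a list could be parsed while keeping the `None` versus `""` distinction, and how the dict and multi views fall out of it. `parse_query_sketch` is a hypothetical stand-in for `_parse_query_params`, whose real implementation is not shown here; skipping empty `&&` segments is likewise an assumption.

```python
from urllib.parse import unquote_plus


def parse_query_sketch(query):
    """Parse a raw query string into ordered (key, value) pairs.

    A bare key maps to None; 'key=' maps to the empty string.
    """
    pairs = []
    if not query:
        return pairs
    for item in query.split('&'):
        if not item:
            continue  # assumption: ignore empty segments such as 'a=1&&b=2'
        key, sep, value = item.partition('=')
        pairs.append((unquote_plus(key), unquote_plus(value) if sep else None))
    return pairs


pairs = parse_query_sketch('flag&x=&color=red&color=blue&q=a+b')

# Dict view: later occurrences of a key overwrite earlier ones.
last_wins = dict(pairs)

# Multi view: every occurrence is collected, in query-string order.
multi = {}
for k, v in pairs:
    multi.setdefault(k, []).append(v)

print(last_wins['color'])  # blue
print(multi['color'])      # ['red', 'blue']
```

Only the list form round-trips without loss, which matches the docstrings' advice to assign to `query_params_list` when order or repeated keys matter.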
722
+ # -- read-only computed fields (both conventions) -------------------------
723
+
724
+ netloc = property(lambda s: s._u.netloc)
725
+ request_uri = property(lambda s: s._u.request_uri)
726
+ url = property(lambda s: s._u.url)
727
+
728
+
729
+ # vim: set sw=4 et :
730
+ # EOF