attestix 0.4.0

package/SPEC.md ADDED
@@ -0,0 +1,259 @@
1
+ # Attestix Cross-Engine Verification Spec
2
+
3
+ This document specifies the exact wire formats and canonicalization rules the
4
+ `attestix` offline verifier must reproduce to verify artifacts
5
+ produced by the Python Attestix engine (`VibeTensor/attestix`). It is the
6
+ authority for closing GitHub issue [VibeTensor/attestix#7](https://github.com/VibeTensor/attestix/issues/7)
7
+ (cross-engine interop). Everything here was derived by reading the Python
8
+ source and probing its actual byte output (not from a published standard), so
9
+ the JS side matches the **engine**, not an idealized RFC.
10
+
11
+ Source files inspected (in `D:\Development\vibetensor-products\Attestix`):
12
+
13
+ - `auth/crypto.py` — canonicalization, signing, did:key codec.
14
+ - `signing/inprocess_signer.py`, `signing/signer.py` — the signer seam (default = byte-identical to v0.3.0).
15
+ - `services/credential_service.py` — VC / VP issuance + proof.
16
+ - `services/delegation_service.py` — UCAN-style JWT delegation chains.
17
+ - `services/did_service.py` — did:key / did:web documents.
18
+ - `auth/token_parser.py` — JWT detection.
19
+
20
+ ---
21
+
22
+ ## 1. Ed25519
23
+
24
+ - Curve: Ed25519 (RFC 8032), via Python `cryptography` `Ed25519PrivateKey` / `Ed25519PublicKey`.
25
+ - Raw public key: 32 bytes. Raw private seed: 32 bytes.
26
+ - Signature: 64 bytes, standard Ed25519 (PureEdDSA).
27
+ - JS implementation uses `@noble/curves/ed25519` `ed25519.verify(sig, msg, pubkey)`.
28
+
29
+ ## 2. did:key codec
30
+
31
+ `public_key_to_did_key` / `did_key_to_public_key` in `auth/crypto.py`:
32
+
33
+ ```
34
+ did:key:z<base58btc(0xed01 || raw_pubkey_32_bytes)>
35
+ ```
36
+
37
+ - Multicodec prefix for ed25519-pub: bytes `0xED 0x01` (`ED25519_MULTICODEC_PREFIX`).
38
+ - Multibase prefix: literal ASCII `z` (base58btc), placed **immediately after**
39
+ `did:key:`. The base58btc alphabet is the Bitcoin alphabet
40
+ (`123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz`).
41
+ - Decode rejects anything whose first two decoded bytes are not `0xED 0x01`.
42
+ - Verification-method fragment (`did_key_fragment`): for `did:key:zXXXX`, the
43
+ fragment is `#zXXXX` (the full multibase string, **including** the `z`). So a
44
+ full verificationMethod id is `did:key:zXXXX#zXXXX`.
45
+
46
+ When verifying, the verifier extracts the issuer DID by splitting the proof's
47
+ `verificationMethod` on `#` and taking the part before it (matching Python's
48
+ `vm.split("#")[0]`). For credentials it falls back to `issuer.id`.
49
+
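The codec above can be sketched in a few lines of standard-library Python. This is an illustrative re-implementation, not the engine's source: the function names mirror those cited from `auth/crypto.py`, but the bodies, the `b58encode` / `b58decode` helpers, and the 32-byte length check are assumptions made for the example.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
ED25519_MULTICODEC_PREFIX = bytes([0xED, 0x01])
DID_KEY_PREFIX = "did:key:z"  # "z" is the base58btc multibase marker


def b58encode(data: bytes) -> str:
    """Base58btc-encode bytes (hypothetical helper, Bitcoin alphabet)."""
    num = int.from_bytes(data, "big")
    out = ""
    while num > 0:
        num, rem = divmod(num, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading "1".
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    """Inverse of b58encode; raises ValueError on a character outside the alphabet."""
    num = 0
    for ch in text:
        num = num * 58 + B58_ALPHABET.index(ch)
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + body


def public_key_to_did_key(raw_public_key: bytes) -> str:
    if len(raw_public_key) != 32:
        raise ValueError("Ed25519 public key must be 32 bytes")
    return DID_KEY_PREFIX + b58encode(ED25519_MULTICODEC_PREFIX + raw_public_key)


def did_key_to_public_key(did: str) -> bytes:
    if not did.startswith(DID_KEY_PREFIX):
        raise ValueError("not a base58btc did:key")
    decoded = b58decode(did[len(DID_KEY_PREFIX):])
    # Reject anything that is not tagged as ed25519-pub.
    if decoded[:2] != ED25519_MULTICODEC_PREFIX:
        raise ValueError("unexpected multicodec prefix")
    key = decoded[2:]
    if len(key) != 32:  # length check is an assumption of this sketch
        raise ValueError("unexpected key length")
    return key


def did_key_fragment(did: str) -> str:
    """Fragment is the full multibase string, including the leading 'z'."""
    return "#" + did[len("did:key:"):]
```

For example, `did + did_key_fragment(did)` yields the full verificationMethod id, and splitting that id on `#` recovers the issuer DID.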
50
+ ## 3. Canonical JSON (the critical interop surface)
51
+
52
+ `canonicalize_json(payload: dict) -> bytes` in `auth/crypto.py`. It is **NOT**
53
+ strict RFC 8785; it is `json.dumps` with specific options plus a pre-pass:
54
+
55
+ ```python
56
+ normalized = _normalize_for_signing(payload)
57
+ canonical = json.dumps(normalized, sort_keys=True,
58
+ separators=(",", ":"), ensure_ascii=False)
59
+ return canonical.encode("utf-8")
60
+ ```
61
+
62
+ Rules the JS canonicalizer (`src/verify/jcs.ts`) reproduces, verified
63
+ byte-for-byte against Python output:
64
+
65
+ 1. **Object keys sorted by Unicode code point** (Python's default string sort).
66
+ This is code-point order, *not* UTF-16 code-unit order. For the BMP these
67
+ coincide; they only diverge for astral (> U+FFFF) keys, which do not occur
68
+ in Attestix payloads. The JS sorter sorts by code point to match Python
69
+ exactly regardless.
70
+ 2. **Compact separators**: `,` between items, `:` between key and value. No
71
+ spaces. (Matches the JS `JSON.stringify` default.)
72
+ 3. **`ensure_ascii=False`**: non-ASCII characters are emitted as their raw UTF-8
73
+ bytes, NOT `\uXXXX` escapes. e.g. `café` -> `caf\xc3\xa9`, `日本語` -> raw
74
+ UTF-8. JS `JSON.stringify` already emits raw characters for printable
75
+ non-ASCII, so the UTF-8 encoding of the resulting string matches.
76
+ 4. **String escaping** matches JSON / JS `JSON.stringify` for the shared set:
77
+ `"` -> `\"`, `\` -> `\\`, U+0008 -> `\b`, U+0009 -> `\t`, U+000A -> `\n`,
78
+ U+000C -> `\f`, U+000D -> `\r`, and other control chars U+0000–U+001F ->
79
+ `\u00xx` (lowercase hex). `/` is NOT escaped. `<`, `>`, `&` are NOT escaped.
80
+ U+007F (DEL) is emitted raw (not escaped) by both engines.
81
+ 5. **NFC normalization** of every string (keys and values) via
82
+ `unicodedata.normalize("NFC", s)`. JS reproduces with
83
+ `String.prototype.normalize("NFC")`.
84
+ 6. **Numbers**:
85
+ - Integers serialize as integers (`1`, `100000000000`).
86
+ - Whole-valued floats are coerced to integers before serialization
87
+ (`1.0` -> `1`, `2.0` -> `2`) by `_normalize_for_signing`.
88
+ - JS `JSON.stringify` already renders `1.0` as `1` and integer-valued
89
+ numbers without a decimal point, so this matches for all integers and
90
+ whole numbers that fit in a JS `number`.
91
+ 7. **`null`, `true`, `false`** literal as in JSON.
92
+ 8. Output is UTF-8 bytes.
93
+
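To make rules 1 to 8 concrete, here is a minimal sketch of the pre-pass plus serialization. The body of `_normalize_for_signing` is an assumption reconstructed from the rules listed above, not a copy of the engine's code; it assumes string keys, and its behavior on exotic floats (exponent forms, signed zero, `NaN`) is not claimed to match the engine.

```python
import json
import unicodedata


def _normalize_for_signing(value):
    """Hypothetical pre-pass: NFC strings and keys, whole floats -> ints."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, bool):
        # bool is checked before any numeric handling so True/False survive.
        return value
    if isinstance(value, float) and value.is_integer():
        return int(value)
    if isinstance(value, dict):
        return {
            unicodedata.normalize("NFC", key): _normalize_for_signing(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [_normalize_for_signing(item) for item in value]
    return value


def canonicalize_json(payload: dict) -> bytes:
    normalized = _normalize_for_signing(payload)
    canonical = json.dumps(
        normalized,
        sort_keys=True,          # code-point order for str keys
        separators=(",", ":"),   # compact, no spaces
        ensure_ascii=False,      # raw UTF-8 instead of \uXXXX for non-ASCII
    )
    return canonical.encode("utf-8")
```

With this sketch, `canonicalize_json({"b": 1.0, "a": "café"})` produces the bytes of `{"a":"café","b":1}` in UTF-8, with the key order and the integer coercion the rules describe.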
94
+ ### Known canonicalization divergences (documented, guarded, NOT silently wrong)
95
+
96
+ These differ between Python `json.dumps` and JS `JSON.stringify`. **None of
97
+ them occur in any Attestix signed payload** (which contain only strings,
98
+ integers, booleans, nulls, arrays, and nested objects). The JS canonicalizer
99
+ **detects and throws** on these inputs rather than emit a byte string that
100
+ would not match Python, so a verifier can never silently accept/reject due to a
101
+ canonicalization mismatch:
102
+
103
+ | Input | Python output | JS `JSON.stringify` | JS verifier behavior |
104
+ |---|---|---|---|
105
+ | `1e21` | `1000000000000000000000` | `1e+21` | throws `JcsUnsupportedValueError` |
106
+ | `1e-7` | `1e-07` | `1e-7` | throws `JcsUnsupportedValueError` |
107
+ | `-0.0` | `-0.0` | `0` | throws `JcsUnsupportedValueError` |
108
+ | `NaN` / `Infinity` | (Python emits `NaN`/`Infinity`, invalid JSON) | `null` | throws |
109
+ | non-integer float (e.g. `1.5`) | `1.5` | `1.5` | allowed (matches) for the common path; large/sci-notation guarded |
110
+
111
+ The verifier accepts finite integers (those exactly representable as a JS `number`),
112
+ non-integer finite floats whose JS string form has no exponent, booleans,
113
+ strings, null, arrays, and plain objects. Numbers requiring exponential
114
+ notation or signed-zero are rejected up front. Real Attestix VCs and JWT
115
+ delegation payloads never trip this guard.
116
+
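The guard described above lives in the JS canonicalizer; the following Python sketch only illustrates the same decision logic so it can be read next to the engine code. `UnsupportedValueError` and `assert_cross_engine_safe` are hypothetical names, and the exponent thresholds (`>= 1e21`, `< 1e-6`) are the points at which JavaScript's number-to-string conversion switches to exponential notation. Python integers beyond the JS safe-integer range are not addressed here.

```python
import math


class UnsupportedValueError(ValueError):
    """Raised for values whose Python and JS serializations may differ."""


def assert_cross_engine_safe(value) -> None:
    """Recursively reject values listed in the divergence table."""
    if value is None or isinstance(value, (bool, str, int)):
        return
    if isinstance(value, float):
        if math.isnan(value) or math.isinf(value):
            raise UnsupportedValueError("NaN/Infinity are not valid JSON")
        if value == 0 and math.copysign(1.0, value) < 0:
            raise UnsupportedValueError("signed zero")
        magnitude = abs(value)
        # JS prints these magnitudes with an exponent (1e+21, 1e-7, ...).
        if magnitude >= 1e21 or 0 < magnitude < 1e-6:
            raise UnsupportedValueError("number needs exponential notation")
        return
    if isinstance(value, dict):
        for key, item in value.items():
            if not isinstance(key, str):
                raise UnsupportedValueError("non-string object key")
            assert_cross_engine_safe(item)
        return
    if isinstance(value, (list, tuple)):
        for item in value:
            assert_cross_engine_safe(item)
        return
    raise UnsupportedValueError(f"unsupported type: {type(value).__name__}")
```

Running such a check before canonicalization turns a potential byte mismatch into an explicit error, which is the design choice the section argues for.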
117
+ ## 4. JSON-payload signature (used by VCs and VPs)
118
+
119
+ `sign_json_payload(private_key, payload) -> str`:
120
+
121
+ ```
122
+ proofValue = base64url( ed25519_sign( canonicalize_json(payload) ) )
123
+ ```
124
+
125
+ - The signature is **base64url WITH padding** (Python `base64.urlsafe_b64encode`
126
+ always emits `=` padding; a 64-byte signature encodes to 88 chars ending in
127
+ `==`). The JS decoder accepts base64url with or without padding and with `+/`
128
+ or `-_` alphabet, but the Python wire form is url-safe + padded.
129
+ - Verification recomputes `canonicalize_json(payload)` and checks the signature
130
+ with the issuer's Ed25519 public key.
131
+
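A tolerant decoder of the kind described for the JS side can be sketched in Python as follows. `decode_signature` is a hypothetical helper name; it normalizes both alphabets to the url-safe one and restores padding before decoding.

```python
import base64


def decode_signature(text: str) -> bytes:
    """Decode base64url or standard base64, padded or unpadded."""
    # Map the standard alphabet onto the url-safe one and drop any padding.
    normalized = text.strip().replace("+", "-").replace("/", "_").rstrip("=")
    # Re-add the padding that urlsafe_b64decode expects.
    padded = normalized + "=" * (-len(normalized) % 4)
    return base64.urlsafe_b64decode(padded)


# The Python wire form for a 64-byte signature: 88 characters, url-safe, padded.
example_signature = b"\xfb\xff" * 32
wire_form = base64.urlsafe_b64encode(example_signature).decode("ascii")
```

Decoding `wire_form`, its unpadded variant, or the standard-alphabet encoding of the same bytes should all return `example_signature`.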
132
+ ## 5. Verifiable Credential (W3C VC Data Model 1.1)
133
+
134
+ Issued by `CredentialService.issue_credential`. Structure:
135
+
136
+ ```jsonc
137
+ {
138
+ "@context": [
139
+ "https://www.w3.org/2018/credentials/v1",
140
+ "https://w3id.org/security/suites/ed25519-2020/v1"
141
+ ],
142
+ "id": "urn:uuid:...",
143
+ "type": ["VerifiableCredential", "<SpecificType>"],
144
+ "issuer": { "id": "did:key:z...", "name": "..." },
145
+ "issuanceDate": "ISO-8601",
146
+ "expirationDate": "ISO-8601",
147
+ "credentialSubject": { "id": "<subject>", ...claims },
148
+ "credentialStatus": { // NOT signed (mutable)
149
+ "id": "<id>#status",
150
+ "type": "RevocationList2021Status",
151
+ "revoked": false,
152
+ "revocation_reason": null,
153
+ "revoked_at": null
154
+ },
155
+ "proof": { // NOT signed (mutable)
156
+ "type": "Ed25519Signature2020",
157
+ "created": "ISO-8601",
158
+ "verificationMethod": "did:key:z...#z...",
159
+ "proofPurpose": "assertionMethod",
160
+ "proofValue": "<base64url sig>"
161
+ }
162
+ }
163
+ ```
164
+
165
+ **Signed-field set (CRITICAL):** the signature covers the credential object with
166
+ the **`proof` and `credentialStatus` fields removed**
167
+ (`MUTABLE_FIELDS = {"proof", "credentialStatus"}`). i.e. the signed payload is:
168
+
169
+ ```
170
+ { @context, id, type, issuer, issuanceDate, expirationDate, credentialSubject }
171
+ ```
172
+
173
+ canonicalized by the rules in §3. The verifier:
174
+
175
+ 1. Strips `proof` and `credentialStatus`.
176
+ 2. Canonicalizes the remainder.
177
+ 3. Resolves issuer DID = `proof.verificationMethod` before `#`, else `issuer.id`.
178
+ 4. Verifies `proof.proofValue` (base64url) over the canonical bytes with that
179
+ did:key's public key.
180
+
181
+ The Python engine also checks structure (`type` contains `VerifiableCredential`), expiry
182
+ (`expirationDate` in the future), and, if present locally, revocation. The
183
+ offline JS verifier checks signature + structure + expiry; revocation is a
184
+ local-storage concern and is reported as "not checkable offline".
185
+
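Steps 1 and 3 of the verifier, plus the structure check, can be sketched as below. The helper names (`vc_signed_payload`, `resolve_issuer_did`, `has_vc_type`) are hypothetical; only `MUTABLE_FIELDS` and the `split("#")[0]` rule come from the text above. Canonicalization and the Ed25519 check themselves are left out.

```python
from typing import Optional

MUTABLE_FIELDS = {"proof", "credentialStatus"}


def vc_signed_payload(credential: dict) -> dict:
    """Return the credential without the mutable (unsigned) fields."""
    return {k: v for k, v in credential.items() if k not in MUTABLE_FIELDS}


def resolve_issuer_did(credential: dict) -> Optional[str]:
    """Prefer proof.verificationMethod before '#', else fall back to issuer.id."""
    proof = credential.get("proof") or {}
    verification_method = proof.get("verificationMethod")
    if verification_method:
        return verification_method.split("#")[0]
    issuer = credential.get("issuer")
    if isinstance(issuer, dict):
        return issuer.get("id")
    return None


def has_vc_type(credential: dict) -> bool:
    """Structure check: `type` must contain 'VerifiableCredential'."""
    return "VerifiableCredential" in credential.get("type", [])
```

The dictionary returned by `vc_signed_payload` is what gets canonicalized by the rules in §3 before the signature is checked.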
186
+ ## 6. Verifiable Presentation (VP)
187
+
188
+ `create_verifiable_presentation`. Signed payload = the VP with **only `proof`
189
+ removed** (note: differs from VC — VP excludes `proof` only, not
190
+ `credentialStatus`). Each embedded credential is verified by §5 rules. The VP
191
+ `proof.proofPurpose` is `authentication`, and `challenge`/`domain` may be
192
+ present both at top level and inside `proof`.
193
+
194
+ ## 7. UCAN-style delegation chain (JWT / EdDSA)
195
+
196
+ `DelegationService.create_delegation`. **This is a JWT, not a JSON-signature
197
+ object.** A delegation token is a compact JWS:
198
+
199
+ ```
200
+ base64url(header) . base64url(payload) . base64url(signature)
201
+ ```
202
+
203
+ - **Header**: `{"typ":"JWT","ucv":"0.9.0","alg":"EdDSA"}` (PyJWT serializes
204
+ header keys; `alg` is `EdDSA`). The exact header bytes are whatever PyJWT
205
+ emits; the verifier does NOT recanonicalize — it verifies over the literal
206
+ `base64url(header) || "." || base64url(payload)` ASCII bytes, exactly as JWS
207
+ requires.
208
+ - **Payload claims**:
209
+ - `iss`: the **server** did:key (the signing identity for the whole chain).
210
+ - `aud`, `sub`: the audience agent id (recipient).
211
+ - `delegator`: the issuer agent id (logical granter).
212
+ - `iat`, `nbf`, `exp`: unix seconds (integers).
213
+ - `jti`: random url-safe id.
214
+ - `att`: list of capability strings (the UCAN attenuation set).
215
+ - `prf`: list of parent JWT strings (the proof chain). `[]` at the root.
216
+ - `attestix_version`: `"0.1.0"`, `typ`: `"ucan/delegation"`.
217
+ - **Signature**: EdDSA (Ed25519) over the ASCII `signing input`
218
+ (`b64url(header).b64url(payload)`), signed by the server key. Verified with
219
+ the `iss` did:key's public key.
220
+
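The point that the verifier works on the literal token bytes, without recanonicalizing, can be illustrated with a small parser. `split_compact_jws` and `b64url_decode` are hypothetical helpers; rejecting any `alg` other than `EdDSA` is an assumption of this sketch, and the Ed25519 check over `signing_input` is deliberately omitted.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode an unpadded (or padded) base64url JWS segment."""
    stripped = segment.rstrip("=")
    return base64.urlsafe_b64decode(stripped + "=" * (-len(stripped) % 4))


def split_compact_jws(token: str):
    """Return (header, payload, signing_input, signature) for a compact JWS."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("compact JWS must have exactly three segments")
    header_b64, payload_b64, signature_b64 = parts
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    if header.get("alg") != "EdDSA":  # assumption: other algorithms are refused
        raise ValueError("unexpected alg")
    # The signature covers these exact ASCII bytes, not a re-serialized header.
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    return header, payload, signing_input, b64url_decode(signature_b64)
```

The returned `signing_input` and `signature` are what would be passed to an Ed25519 verify call together with the public key decoded from the `iss` did:key.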
221
+ ### Chain verification rules (`verifyDelegationChain`)
222
+
223
+ Reproduces `DelegationService.verify_delegation` recursion + the attenuation
224
+ check enforced at creation time in `create_delegation`:
225
+
226
+ 1. **Each link's JWS signature** verifies against the public key derived from
227
+ that link's `iss` did:key.
228
+ 2. **Recursive `prf` verification**: every parent token in `prf` must itself be
229
+ a valid link. Invalid ancestor => whole chain invalid.
230
+ 3. **Cycle detection**: a `jti` seen twice in one verification run => reject
231
+ ("Cycle detected").
232
+ 4. **Capability attenuation**: a child's `att` set MUST be a subset of each of
233
+ its parents' `att` sets. A capability present in the child but not the
234
+ parent is privilege escalation => reject. (Python enforces this at creation;
235
+ the offline verifier re-checks it across the supplied chain because an
236
+ offline verifier cannot trust that creation-time checks ran.)
237
+ 5. **Linkage**: the chain is rooted at the server `iss`; each link's `prf`
238
+ entries are the parent tokens. The verifier walks root -> leaf.
239
+ 6. **Expiry**: `exp` must not have passed (`exp >= now`); `nbf`/`iat` are sanity-checked.
240
+ Expired link => reject (with `expired: true`).
241
+ 7. **Revocation** is a local-storage concern (by `jti`) and is not checkable
242
+ offline; reported as not-checked.
243
+
244
+ The JS verifier accepts either a single leaf JWT string (it walks `prf`
245
+ internally) or an explicit array of JWT strings root..leaf.
246
+
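The structural rules (2, 3, 4 and 6) can be sketched as a recursive walk over `prf`. This is an illustration under stated assumptions, not `verifyDelegationChain` itself: `ChainError`, `decode_claims` and `check_link` are hypothetical names, and signature verification (rule 1) is skipped entirely, so the sketch must not be used to accept tokens.

```python
import base64
import json
import time


class ChainError(Exception):
    """Raised when a delegation chain violates a structural rule."""


def decode_claims(token: str) -> dict:
    """Decode the JWT payload segment WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1].rstrip("=")
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def check_link(token: str, now: int, seen: set) -> set:
    """Check one link and its ancestors; return the link's capability set."""
    claims = decode_claims(token)
    jti = claims["jti"]
    if jti in seen:  # rule 3: a jti seen twice in one run is rejected
        raise ChainError("Cycle detected")
    seen.add(jti)
    if claims["exp"] < now:  # rule 6: exp >= now is required
        raise ChainError("expired")
    capabilities = set(claims["att"])
    for parent_token in claims.get("prf", []):  # rule 2: recurse into parents
        parent_capabilities = check_link(parent_token, now, seen)
        if not capabilities <= parent_capabilities:  # rule 4: attenuation
            raise ChainError("privilege escalation")
    return capabilities


def check_chain(leaf_token: str, now=None) -> set:
    """Walk the chain from a single leaf token via its embedded prf entries."""
    current_time = int(time.time()) if now is None else now
    return check_link(leaf_token, current_time, set())
```

Because `seen` is shared across the whole run, a token that appears twice anywhere in the ancestry is rejected, which follows the wording of rule 3 literally.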
247
+ ---
248
+
249
+ ## 8. Summary of what JS must match byte-for-byte
250
+
251
+ | Concern | Rule |
252
+ |---|---|
253
+ | Canonical JSON | sort keys by code point, `,`/`:` separators, raw UTF-8 (no `\u` for non-ASCII), NFC, whole-float->int, lowercase `\u00xx` control escapes |
254
+ | Signature encoding | base64url (accept padded/unpadded, both alphabets) |
255
+ | Ed25519 | RFC 8032 verify over canonical bytes (VC/VP) or JWS signing input (delegation) |
256
+ | did:key | base58btc multibase `z` + multicodec `0xED01` + 32-byte pubkey |
257
+ | VC signed fields | object minus `proof` and `credentialStatus` |
258
+ | VP signed fields | object minus `proof` |
259
+ | Delegation | JWS EdDSA over `b64url(header).b64url(payload)`; `prf` recursion; `att` subset attenuation |