runinfra 0.1.0__tar.gz

@@ -0,0 +1,8 @@
+ include README.md
+ include pyproject.toml
+ include runinfra/py.typed
+ recursive-include runinfra *.py
+ prune tests
+ prune build
+ prune dist
+ global-exclude __pycache__ *.py[cod]
@@ -0,0 +1,386 @@
+ Metadata-Version: 2.4
+ Name: runinfra
+ Version: 0.1.0
+ Summary: RunInfra SDK for optimized inference deployments
+ Author: RunInfra
+ License-Expression: LicenseRef-Proprietary
+ Project-URL: Documentation, https://runinfra.ai/docs/tools-sdks/runinfra-sdk
+ Project-URL: Homepage, https://runinfra.ai
+ Project-URL: Issues, https://github.com/RightNow-AI/RunPipe/issues
+ Keywords: runinfra,inference,openai,responses,embeddings,tts,asr,image-generation
+ Classifier: Development Status :: 5 - Production/Stable
+ Classifier: Intended Audience :: Developers
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3 :: Only
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Classifier: Programming Language :: Python :: 3.14
+ Classifier: Typing :: Typed
+ Requires-Python: >=3.9
+ Description-Content-Type: text/markdown
24
+
25
+ # RunInfra Python SDK
26
+
27
+ Access optimized RunInfra deployments through the verified public gateway.
28
+
29
+ Requires Python 3.9 or newer.
30
+
31
+ ## Install
32
+
33
+ ```bash
+ pip install runinfra
+ ```
36
+
37
+ ## Create a client
38
+
39
+ Use a workspace-scoped key to reach verified active deployments through the `model` field.
40
+ In RunPipe, open Settings, API Keys, Create key, and keep Scope set to Workspace.
41
+
42
+ The Deploy tab can create a pipeline-scoped key for one optimized pipeline.
43
+ The one-time secret is shown once after creation. Store it as `RUNINFRA_API_KEY`
44
+ for app snippets before leaving the page. For repo live canaries, keep the
45
+ workspace key in `RUNINFRA_API_KEY` and put the pipeline-scoped key for
46
+ `TEST_PIPELINE_ID` in `RUNINFRA_PIPELINE_API_KEY` so flat and pipeline routes
47
+ are verified independently.
48
+
49
+ After a runbook finishes in RunPipe, choose Open Deploy from the runbook handoff.
50
+ Deploy only shows SDK operations that the verified endpoint supports, so copy
51
+ the native or OpenAI-compatible snippet from there instead of guessing a route.
52
+
53
+ ```python
+ import os
+ from runinfra import RunInfra
+
+ api_key = os.environ.get("RUNINFRA_API_KEY")
+ if not api_key:
+     raise RuntimeError("Set RUNINFRA_API_KEY before running this snippet.")
+
+ client = RunInfra(api_key=api_key)
+ ```
63
+
64
+ Use `pipeline_id` when the key or integration should be locked to one optimized pipeline.
65
+
66
+ ```python
+ api_key = os.environ.get("RUNINFRA_API_KEY")
+ if not api_key:
+     raise RuntimeError("Set RUNINFRA_API_KEY before running this snippet.")
+
+ client = RunInfra(
+     api_key=api_key,
+     pipeline_id="pipe_123",
+ )
+ ```
76
+
77
+ The default base URL is `https://api.runinfra.ai/v1`.
78
+ `pipeline_id` is stripped and URL-encoded before it is added to the base URL. Use either `pipeline_id` with the default base URL, or a pipeline-scoped `base_url` such as `https://api.runinfra.ai/v1/pipe_123`. If both point to the same pipeline, the SDK keeps the URL scoped once.
79
+ RunPipe-generated native SDK snippets prefer `pipeline_id` with the root `https://api.runinfra.ai/v1` base URL. OpenAI-compatible snippets use the pipeline-scoped base URL because the OpenAI SDK has no RunInfra pipeline option.
80
+ Custom base URLs must use `http` or `https`. Other schemes and malformed URLs are rejected before a bearer API key can be sent.
81
+ Remote custom base URLs must use `https`. Plain `http` is accepted only for local development hosts: `localhost`, `127.0.0.1`, `0.0.0.0`, and `[::1]`.
82
+ Custom base URLs must not include usernames or passwords.
83
+ Custom base URLs must not include query strings or fragments.
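
The base URL rules above can be sketched with the standard library. This is a minimal illustration, not the SDK's own code: `check_base_url` and `scope_to_pipeline` are hypothetical names, and raising `ValueError` is an assumption about how a rejection could be reported.

```python
from urllib.parse import quote, urlsplit

# Hosts the README lists as allowed over plain http (urlsplit strips the
# brackets from "[::1]").
LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1"}


def check_base_url(base_url: str) -> str:
    """Return the URL without a trailing slash, or raise ValueError."""
    url = base_url.strip()
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("base URL must be a well-formed http or https URL")
    if parts.username is not None or parts.password is not None:
        raise ValueError("base URL must not include a username or password")
    if parts.query or parts.fragment:
        raise ValueError("base URL must not include a query string or fragment")
    if parts.scheme == "http" and parts.hostname not in LOCAL_HOSTS:
        raise ValueError("remote base URLs must use https")
    return url.rstrip("/")


def scope_to_pipeline(base_url: str, pipeline_id: str) -> str:
    """Append the stripped, URL-encoded pipeline ID unless it is already there."""
    root = check_base_url(base_url)
    segment = quote(pipeline_id.strip(), safe="")
    if root.endswith("/" + segment):
        return root  # already pipeline-scoped, keep it scoped once
    return f"{root}/{segment}"


print(scope_to_pipeline("https://api.runinfra.ai/v1", " pipe_123 "))
# prints "https://api.runinfra.ai/v1/pipe_123"
```

Checking the URL before any request is built keeps the API key from ever being attached to a URL that fails these rules.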
84
+
85
+ ## Responses and streaming
86
+
87
+ ```python
+ stream = client.responses.create(
+     model="llama-3.1-8b",
+     input="Hello",
+     max_output_tokens=512,
+     stream=True,
+ )
+
+ for event in stream:
+     if event.get("type") == "response.output_text.delta":
+         print(event.get("delta", ""), end="")
+
+ print(stream.request_id)
+ ```
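
If the application needs the full text rather than incremental printing, the same event filter can feed a small accumulator. The sketch below runs against stand-in dictionaries instead of a live stream. `collect_output_text` is a hypothetical helper, and the non-delta event names in the sample are placeholders rather than a documented list.

```python
from typing import Any, Iterable, Mapping


def collect_output_text(events: Iterable[Mapping[str, Any]]) -> str:
    """Join the text deltas from a stream of event dictionaries."""
    pieces = []
    for event in events:
        if event.get("type") == "response.output_text.delta":
            pieces.append(str(event.get("delta", "")))
    return "".join(pieces)


# Stand-in events with the same shape the streaming loop reads.
sample_events = [
    {"type": "response.created"},  # placeholder event name
    {"type": "response.output_text.delta", "delta": "Hel"},
    {"type": "response.output_text.delta", "delta": "lo"},
    {"type": "response.completed"},  # placeholder event name
]
print(collect_output_text(sample_events))  # prints "Hello"
```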
101
+
102
+ ## Supported public routes
103
+
104
+ - `models.list()`
105
+ - `models.retrieve(model)`
106
+ - `responses.create()`
107
+ - `chat.completions.create()`
108
+ - `embeddings.create()`
109
+ - `audio.speech.create()`
110
+ - `audio.transcriptions.create()`
111
+ - `images.generate()`
112
+ - `voice.pipeline.create()`
113
+
114
+ ## Text to speech
115
+
116
+ TTS deployments can expose named voices or Base/reference-audio voice cloning.
117
+ Use `RUNINFRA_TTS_VOICE` when the deployment lists a voice or speaker. Use
118
+ `RUNINFRA_TTS_REF_AUDIO` and `RUNINFRA_TTS_REF_TEXT` when the deployment expects
119
+ reference-audio input.
120
+
121
+ ```python
+ voice = os.environ.get("RUNINFRA_TTS_VOICE", "").strip()
+ ref_audio = os.environ.get("RUNINFRA_TTS_REF_AUDIO", "").strip()
+ ref_text = os.environ.get("RUNINFRA_TTS_REF_TEXT", "").strip()
+
+ if voice:
+     speech_voice = {"voice": voice}
+ elif ref_audio and ref_text:
+     speech_voice = {
+         "ref_audio": ref_audio,
+         "ref_text": ref_text,
+         "task_type": os.environ.get("RUNINFRA_TTS_TASK_TYPE", "Base").strip() or "Base",
+     }
+ else:
+     raise RuntimeError("Set RUNINFRA_TTS_VOICE, or RUNINFRA_TTS_REF_AUDIO and RUNINFRA_TTS_REF_TEXT.")
+
+ audio = client.audio.speech.create(
+     model="your-tts-model-id",
+     input="Hello from your optimized RunInfra endpoint.",
+     **speech_voice,
+ )
+ ```
143
+
144
+ ## Timeouts and retries
145
+
146
+ ```python
+ import os
+ from runinfra import RunInfra
+
+ api_key = os.environ.get("RUNINFRA_API_KEY")
+ if not api_key:
+     raise RuntimeError("Set RUNINFRA_API_KEY before running this snippet.")
+
+ client = RunInfra(
+     api_key=api_key,
+     timeout_seconds=60,
+     max_retries=2,
+     retry_base_seconds=0.25,
+ )
+ ```
160
+
161
+ The SDK retries transient transport failures and `408`, `409`, `429`, `500`, `502`, `503`, and `504` responses for safe `GET` requests. Charge-bearing `POST` inference requests retry only when you provide `idempotency_key`, and automatic POST retries are limited to non-streaming JSON calls whose gateway responses can be replayed safely. That covers `responses.create()`, non-streaming `chat.completions.create()`, `embeddings.create()`, and `images.generate()`. Streaming calls, binary TTS responses, and multipart ASR uploads are sent once even when you provide an idempotency key. The gateway still binds idempotency keys for TTS and ASR, so a manual retry with the same key will not run or charge a second inference after the first request settles. Automatic retries honor reasonable `Retry-After` values up to 60 seconds when the header is a plain integer second value or HTTP-date, then fall back to bounded exponential backoff. The SDK does not retry authentication errors, insufficient credits, or unsupported operations.
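
The `Retry-After` handling described above could look roughly like the following. This is a sketch under stated assumptions, not the SDK's implementation: the function names are hypothetical, the 8-second backoff ceiling is an invented bound, and no jitter is applied.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

MAX_RETRY_AFTER_SECONDS = 60.0  # cap stated in the README
MAX_BACKOFF_SECONDS = 8.0       # assumed bound, not taken from the SDK


def retry_after_seconds(header: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Parse Retry-After as integer seconds or an HTTP-date; None if unusable."""
    if not header:
        return None
    value = header.strip()
    if value.isascii() and value.isdigit():
        seconds = float(value)
    else:
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return None
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        current = now or datetime.now(timezone.utc)
        seconds = (when - current).total_seconds()
    if 0 <= seconds <= MAX_RETRY_AFTER_SECONDS:
        return seconds
    return None  # out-of-range hints are ignored


def retry_delay(attempt: int, header: Optional[str], base_seconds: float = 0.25) -> float:
    """Prefer a usable Retry-After hint, else bounded exponential backoff."""
    hinted = retry_after_seconds(header)
    if hinted is not None:
        return hinted
    return min(MAX_BACKOFF_SECONDS, base_seconds * (2 ** attempt))


print(retry_delay(2, None))   # prints 1.0
print(retry_delay(2, "5"))    # prints 5.0
print(retry_delay(2, "120"))  # prints 1.0, the hint exceeds the 60-second cap
```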
162
+
163
+ If the gateway successfully finishes a request but the response body is too large to replay from the idempotency cache, later calls with the same `idempotency_key` return `idempotency_replay_unavailable` without running or charging the inference again.
164
+
165
+ `timeout_seconds` must be positive, `max_retries` must be a non-negative integer, and `retry_base_seconds` must be non-negative. Unknown per-request option keys are rejected so typos do not silently disable idempotency, tracing, timeout, or retry behavior. Python request option aliases cannot be mixed; choose either snake_case or camelCase for a given option. Invalid values raise `RunInfraError` with `type == "invalid_request_options"` before any network request is sent.
166
+
167
+ ## Request validation
168
+
169
+ Required request fields are validated before any network request is sent:
+
+ - The model must be a non-blank string.
+ - Chat messages must be a non-empty array, and each chat message must be an object with a non-empty role.
+ - Responses input must be a non-empty string or array, and Responses input array items must be objects.
+ - JSON request bodies must be serializable and contain only finite numbers.
+ - Embedding input must be a non-empty string or array of non-empty strings.
+ - TTS input and image prompts must be non-empty strings.
+ - ASR file must be bytes or bytearray. ASR multipart filenames, content types, and extra form field names and values are validated before the multipart body is built.
+
+ Invalid request values raise `RunInfraError` with `type == "invalid_request_options"` and do not reach the gateway or billing path.
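
As an illustration of the "serializable and contain only finite numbers" rule, the standard `json` module can enforce both conditions in one call. `encode_json_body` is a hypothetical helper, and it raises `ValueError` rather than the SDK's `RunInfraError`.

```python
import json
from typing import Any


def encode_json_body(body: Any) -> bytes:
    """Serialize a request body, rejecting NaN, Infinity, and unsupported types."""
    try:
        # allow_nan=False makes json.dumps raise ValueError on NaN or Infinity.
        return json.dumps(body, allow_nan=False).encode("utf-8")
    except (TypeError, ValueError) as exc:
        raise ValueError(f"request body is not valid JSON: {exc}") from exc


encode_json_body({"model": "llama-3.1-8b", "temperature": 0.2})  # accepted

try:
    encode_json_body({"model": "llama-3.1-8b", "temperature": float("nan")})
except ValueError as exc:
    print("rejected:", exc)
```

Failing at serialization time means a malformed body never reaches the gateway or the billing path.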
170
+
171
+ Use per-request options when a call needs a shorter timeout, a trace ID, or a retry-safe idempotency key.
172
+ Custom headers are for app metadata only. They cannot override SDK-controlled headers such as `Authorization`, `Content-Type`, `X-Client-Request-Id`, `Idempotency-Key`, `X-RunInfra-SDK`, or `X-RunInfra-SDK-Version`, and they cannot set transport or credential headers such as `Host`, `Cookie`, `Content-Length`, `Transfer-Encoding`, `Connection`, `Proxy-Authorization`, `Api-Key`, `X-API-Key`, `X-Auth-Token`, or `X-Access-Token`.
173
+
174
+ ```python
+ import uuid
+
+ client.responses.create(
+     model="llama-3.1-8b",
+     input="Summarize this incident.",
+     request_options={
+         "client_request_id": str(uuid.uuid4()),
+         "idempotency_key": str(uuid.uuid4()),
+         "timeout_seconds": 20,
+         "max_retries": 0,
+     },
+ )
+ ```
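
The custom-header restriction described above amounts to a case-insensitive check against the listed names. The sketch below assumes the check rejects a blocked header outright; the README does not say whether the SDK raises or silently drops it, and `check_custom_headers` is a hypothetical name.

```python
from typing import Dict, Mapping

# Header names the README lists as SDK-controlled, transport, or credential headers.
BLOCKED_HEADERS = frozenset(
    name.lower()
    for name in (
        "Authorization", "Content-Type", "X-Client-Request-Id", "Idempotency-Key",
        "X-RunInfra-SDK", "X-RunInfra-SDK-Version", "Host", "Cookie",
        "Content-Length", "Transfer-Encoding", "Connection", "Proxy-Authorization",
        "Api-Key", "X-API-Key", "X-Auth-Token", "X-Access-Token",
    )
)


def check_custom_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the headers, or raise if any name is reserved."""
    for name in headers:
        if name.strip().lower() in BLOCKED_HEADERS:
            raise ValueError(f"custom header {name!r} is reserved")
    return dict(headers)


print(check_custom_headers({"X-App-Tenant": "acme"}))  # prints {'X-App-Tenant': 'acme'}
```

HTTP header names are case-insensitive, so the comparison lowercases both sides before matching.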
188
+
189
+ ## Typed errors
190
+
191
+ The SDK exposes `AuthenticationError`, `PermissionDeniedError`, `RateLimitError`, `InsufficientCreditsError`, `DeploymentError`, `ModelNotFoundError`, `RunInfraTimeoutError`, `RunInfraConnectionError`, `RunInfraStreamParseError`, and `UnsupportedOperationError`.
192
+ `RateLimitError` includes `retry_after_seconds` when the gateway returns `Retry-After`.
193
+ `RunInfraStreamParseError` includes `request_id` when a malformed SSE frame came from a traced gateway response.
194
+ `RunInfraTimeoutError` also covers stalled streaming reads and default non-streaming body reads after headers arrive, and includes `request_id` when the response was traced.
195
+ `RunInfraConnectionError` also covers streaming body transport failures and default non-streaming body transport failures after headers arrive, and includes `request_id` when the response was traced.
196
+
197
+ ## Traceability and typing
198
+
199
+ Every request includes `X-RunInfra-SDK: python`, `X-RunInfra-SDK-Version`, and `X-Client-Request-Id`. These headers help the support team trace requests without changing billing or routing.
200
+
201
+ When `idempotency_key` is provided, the SDK sends it as `Idempotency-Key`. Use a unique value for each logical retry-safe operation. Idempotency keys must be non-blank, ASCII, 255 characters or fewer, and must not contain secrets or personal data.
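
A caller can apply the same key constraints before handing a key to the SDK. This is a sketch with a hypothetical `check_idempotency_key` helper. It covers only the mechanical rules, since no code can tell whether a key contains secrets or personal data.

```python
import uuid


def check_idempotency_key(key: str) -> str:
    """Enforce the non-blank, ASCII, 255-character limit described above."""
    if not isinstance(key, str) or not key.strip():
        raise ValueError("idempotency key must be a non-blank string")
    if not key.isascii():
        raise ValueError("idempotency key must be ASCII")
    if len(key) > 255:
        raise ValueError("idempotency key must be 255 characters or fewer")
    return key


# One key per logical operation; reuse the same key when retrying that operation.
operation_key = check_idempotency_key(str(uuid.uuid4()))
```

A random UUID satisfies every rule and carries no user data, which is why the request-options example uses one.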
202
+
203
+ Successful JSON object responses include `_request_id` when the gateway returns `x-request-id`. Streaming responses expose the same value as `stream.request_id`, malformed stream frames raise `RunInfraStreamParseError` with that request id, and binary audio responses expose it as `audio.request_id`. Log that value with production errors and customer support reports.
204
+
205
+ The wheel ships `py.typed` so type checkers can inspect the package. Fixed-shape helpers expose `TypedDict` response contracts: `ModelListResponse`, `ModelObject`, `ResponsesCreateResponse`, `ChatCompletionResponse`, `EmbeddingResponse`, `TranscriptionResponse`, and `ImageGenerationResponse`. Stream-capable helpers are typed as either the JSON response contract or `RunInfraStream` when `stream=True`.
206
+
207
+ ## Webhook verification
208
+
209
+ Public webhook delivery routes are not shipped yet, but the SDK includes local verification helpers for signed RunInfra webhook deliveries once you receive them in your own server. Always verify the exact raw body before parsing JSON. The `RunInfra-Signature` timestamp must be a non-negative integer Unix second.
210
+
211
+ ```python
+ import os
+
+ from runinfra import (
+     WebhookVerificationError,
+     construct_webhook_event,
+     verify_webhook_signature,
+ )
+
+ webhook_secret = os.environ.get("RUNINFRA_WEBHOOK_SECRET")
+ if not webhook_secret or not webhook_secret.strip():
+     raise RuntimeError("Set RUNINFRA_WEBHOOK_SECRET before verifying webhook events.")
+
+ # raw_body is the exact, unparsed request body received by your server.
+ # signature_header is the value of the RunInfra-Signature header.
+ event = construct_webhook_event(
+     payload=raw_body,
+     signature_header=signature_header,
+     secret=webhook_secret,
+ )
+ ```
230
+
231
+ `construct_webhook_event` verifies the signature, checks timestamp tolerance, and parses JSON. Use `verify_webhook_signature` when your framework parses JSON separately and you only need to validate the raw delivery. Invalid signatures, stale timestamps, and invalid webhook JSON raise `WebhookVerificationError`.
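
The timestamp part of that check can be sketched on its own. The layout of the `RunInfra-Signature` header is not described here, so the function below takes an already-extracted timestamp string. The name `check_webhook_timestamp` and the 300-second tolerance are assumptions, not SDK values.

```python
import time
from typing import Optional


def check_webhook_timestamp(
    raw: str,
    tolerance_seconds: int = 300,  # assumed tolerance, not an SDK default
    now: Optional[float] = None,
) -> int:
    """Accept only a non-negative integer Unix second within the tolerance."""
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError("timestamp must be a non-negative integer Unix second")
    timestamp = int(raw)
    current = time.time() if now is None else now
    if abs(current - timestamp) > tolerance_seconds:
        raise ValueError("timestamp is outside the allowed tolerance")
    return timestamp


print(check_webhook_timestamp("1700000000", now=1700000100))  # prints 1700000000
```

Rejecting stale timestamps limits how long a captured delivery could be replayed against your server.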
232
+
233
+ ## OpenAI-compatible clients
234
+
235
+ OpenAI-compatible clients can use the same verified base URL:
236
+
237
+ ```python
+ import os
+ from openai import OpenAI
+
+ api_key = os.environ.get("RUNINFRA_API_KEY")
+ if not api_key:
+     raise RuntimeError("Set RUNINFRA_API_KEY before running this snippet.")
+
+ client = OpenAI(
+     api_key=api_key,
+     base_url="https://api.runinfra.ai/v1/pipe_123",
+ )
+ ```
250
+
251
+ ## Production promotion
252
+
253
+ Local package tests prove SDK shape, retry behavior, streaming parsing, and
254
+ typed errors. They do not prove that a newly optimized deployment is ready for
255
+ customers. For production promotion, run the strict live SDK canary gate from
256
+ the RunPipe repo against the same base URL and API key you plan to expose. The gate starts with `test:sdk-live-api-key`, which verifies the plaintext key hashes to an active workspace-scoped `api_keys` row before any paid inference canary runs:
257
+
258
+ ```bash
+ pnpm verify:sdk-release
+ pnpm test:sdk-canary:live -- --print-env-template
+ pnpm test:sdk-canary:live -- --env-file .env.sdk-live.local --print-env-status
+ pnpm sync:sdk-live-env -- --source .env.local --target .env.sdk-live.local
+ pnpm discover:sdk-live-targets -- --env-file .env.sdk-live.local --probe-inference --report artifacts/sdk/live-targets-discovery.json
+ pnpm bootstrap:sdk-live-key -- --env-file .env.sdk-live.local
+ pnpm discover:sdk-live-targets -- --env-file .env.sdk-live.local --probe-inference --report artifacts/sdk/live-targets-discovery.json
+ pnpm prepare:sdk-live-env -- --discovery-report artifacts/sdk/live-targets-discovery.json --output .env.sdk-live.local
+ pnpm discover:sdk-live-targets -- --env-file .env.sdk-live.local --probe-inference --report artifacts/sdk/live-targets-discovery.json
+ pnpm verify:sdk-live-targets -- --env-file .env.sdk-live.local --require-available --discovery-report artifacts/sdk/live-targets-discovery.json
+ pnpm test:sdk-canary:live -- --env-file .env.sdk-live.local --check-env-only
+ pnpm test:sdk-canary:live -- --env-file .env.sdk-live.local --discovery-report artifacts/sdk/live-targets-discovery.json --report artifacts/sdk/live-canary.json
+ pnpm verify:sdk-live-report -- artifacts/sdk/live-canary.json
+ pnpm test:sdk-canary -- --env-file .env.sdk-live.local --report artifacts/sdk/native-focused-smoke.json
+ pnpm test:openai-compat -- --env-file .env.sdk-live.local --report artifacts/sdk/openai-focused-smoke.json
+ pnpm verify:sdk-goal -- --release-report artifacts/sdk/release-verification.json --live-report artifacts/sdk/live-canary.json --live-targets-report artifacts/sdk/live-targets-discovery.json --env-check-report artifacts/sdk/live-canary-env-check.json --focused-smoke-report artifacts/sdk/native-focused-smoke.json --openai-focused-smoke-report artifacts/sdk/openai-focused-smoke.json --report artifacts/sdk/goal-readiness.json
+ pnpm verify:sdk-publish -- --release-report artifacts/sdk/release-verification.json --goal-report artifacts/sdk/goal-readiness.json --live-report artifacts/sdk/live-canary.json --live-targets-report artifacts/sdk/live-targets-discovery.json --env-check-report artifacts/sdk/live-canary-env-check.json --focused-smoke-report artifacts/sdk/native-focused-smoke.json --openai-focused-smoke-report artifacts/sdk/openai-focused-smoke.json --report artifacts/sdk/publish-readiness.json
+ ```
277
+
278
+ Save the printed template as `.env.sdk-live.local`; it is ignored by git and
279
+ should contain the real production gateway, workspace-scoped API key, database
280
+ URL, pipeline-scoped API key for the optimized LLM pipeline, `RUNPOD_API_KEY`,
281
+ deployed model IDs, the optimized LLM `TEST_PIPELINE_ID`, TTS proof inputs, and
282
+ ASR clip path. `RUNPOD_API_KEY` is
283
+ used only by discovery to prove checked RunPod endpoint inventory. A checked
284
+ inventory needs endpointCount greater than zero, and `endpointCount: 0` blocks
285
+ promotion even if old database rows still mention active deployments. Discovery
286
+ also blocks any selected target with `endpoint_not_in_runpod_inventory` and
287
+ emits `runpod_endpoint_inventory_empty` when RunPod returns no endpoints. For
288
+ operator handoffs, set optional `RUNPOD_EXPECTED_ENDPOINT_IDS` to a
289
+ comma-separated list of endpoint IDs. Discovery compares those IDs against the
290
+ same checked inventory and reports only redacted verified or missing endpoint
291
+ IDs, so a wrong RunPod account/scope becomes an explicit blocker.
292
+ For TTS, set either `TEST_TTS_VOICE` or both `TEST_TTS_REF_AUDIO` and
293
+ `TEST_TTS_REF_TEXT` for Base/voice-cloning models.
294
+ `sync:sdk-live-env` copies `RUNPOD_API_KEY` from the source env file when it is
295
+ present. If you keep the RunPod key only in the shell, discovery uses that
296
+ process value instead of the generated placeholder in `.env.sdk-live.local`.
297
+ Use `--print-env-status` before running the canary to see missing, placeholder,
298
+ or invalid fields without printing API keys, database URLs, or file paths. Use `pnpm sync:sdk-live-env -- --source .env.local --target .env.sdk-live.local` to copy whitelisted local values such as `DATABASE_URL` without printing secrets, then use `pnpm discover:sdk-live-targets -- --env-file .env.sdk-live.local --probe-inference --report artifacts/sdk/live-targets-discovery.json` to inspect
299
+ which `active_verified` deployments can satisfy strict modality coverage without
300
+ printing API keys, key hashes, or database credentials. Deployments that are
301
+ close but not promotable appear under redacted `nearEligibleTargets` with
302
+ reasons such as `status_not_active_verified`, `missing_inference_url`, or
303
+ `missing_endpoint_id`. The discovery report also includes `nextActions`, so
304
+ deployment and SDK agents can follow safe commands without scraping error text.
305
+ A skipped probe is diagnostic only. It means discovery intentionally avoided a
306
+ live inference call after an earlier eligibility failure; only `passed` probes can promote `targets_available`.
307
+ After discovery reports eligible `active_verified` targets, `pnpm bootstrap:sdk-live-key -- --env-file .env.sdk-live.local` can create a workspace-scoped key for that workspace, store only its hash in the database, and write the plaintext once into the ignored live env file without printing it. Rerun discovery after bootstrap so the report proves the selected workspace now has an active workspace-scoped key.
308
+ When discovery is complete, use
309
+ `pnpm prepare:sdk-live-env -- --discovery-report artifacts/sdk/live-targets-discovery.json --output .env.sdk-live.local`
310
+ to fill the deployment model IDs and the optimized LLM `TEST_PIPELINE_ID`, then rerun `discover:sdk-live-targets` against the prepared env file.
311
+ `prepare:sdk-live-env` cannot recover a one-time plaintext pipeline secret. Set
312
+ `RUNINFRA_PIPELINE_API_KEY` from the Deploy tab for `TEST_PIPELINE_ID` before
313
+ strict live canaries, while keeping `RUNINFRA_API_KEY` workspace-scoped for
314
+ billing and flat-route verification.
315
+ Before running live canaries, run
316
+ `pnpm verify:sdk-live-targets -- --env-file .env.sdk-live.local --require-available --discovery-report artifacts/sdk/live-targets-discovery.json`
317
+ against the prepared-env discovery report to validate that it is redacted, same-workspace, uses exact live env values, and only promotes callable `active_verified` targets.
318
+ If the output file already has `RUNINFRA_API_KEY`, `RUNINFRA_PIPELINE_API_KEY`,
319
+ `DATABASE_URL`, `TEST_ASR_FILE`, or local TTS reference inputs, the helper
320
+ preserves them and does not print them.
321
+
322
+ That gate must cover LLM, embeddings, image, TTS, and ASR endpoints before the
323
+ deployment is treated as production verified. Those strict modality targets must
324
+ be distinct deployed model IDs in the same workspace, because the promotion
325
+ canary uses one workspace-scoped key and then proves billing for every reported
326
+ model. The generated live report also records the SDK version and source digest,
327
+ so stale canaries cannot promote a newer SDK build.
328
+ Focused `pnpm test:sdk-canary -- --report ...` smoke reports also record the
329
+ same SDK / Docs / Engine source digest and stay redacted. The raw
330
+ OpenAI-compatible focused smoke writes `artifacts/sdk/openai-focused-smoke.json`
331
+ and the native SDK smoke writes `artifacts/sdk/native-focused-smoke.json`.
332
+ `verify:sdk-goal` rejects either focused smoke report when it was generated from older source,
333
+ so focused LLM debugging evidence cannot be reused as fresh promotion evidence.
334
+ Each canary result also records redacted `checks` for the required proof checks
335
+ it emitted. If a canary exits successfully but misses a required proof line, the
336
+ strict report records `missingChecks` and stays blocked. The runner only counts
337
+ a proof line when the child canary prints `[ OK ] <required check>`; `[FAIL]` lines do not satisfy promotion evidence. The proof set covers model discovery
338
+ and retrieval, LLM responses and streaming,
339
+ pipeline-scoped native SDK responses, OpenAI-compatible pipeline-scoped `/v1/responses`, embeddings vectors, image data, TTS audio bytes, ASR transcription text,
340
+ OpenAI-compatible auth and error paths, native SDK typed `AuthenticationError`
341
+ mapping, request ID propagation, unsupported webhook guards, API key scope, and
342
+ billing usage verification. OpenAI-compatible security checks also prove request tracing, HSTS, `nosniff`, path traversal blocking, invalid model 404, missing model 400, and auth failures before publish promotion can pass.
343
+ The live-target gate also requires checked RunPod endpoint inventory before
344
+ promotion. `selectedTargets.*.runpodEndpointVerified` must be true for every
345
+ strict modality.
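
The proof-line rule, where only `[ OK ] <required check>` lines count and `[FAIL]` lines do not, can be sketched as a small parser. This is an illustration rather than the RunPipe runner: the check names are made up, and matching the text after the marker exactly is an assumption.

```python
from typing import Iterable, List

OK_PREFIX = "[ OK ] "


def missing_checks(output: str, required: Iterable[str]) -> List[str]:
    """Return the required checks that never appeared on an `[ OK ]` line."""
    passed = set()
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(OK_PREFIX):
            passed.add(line[len(OK_PREFIX):].strip())
    return [name for name in required if name not in passed]


# Hypothetical canary output and check names, for illustration only.
canary_output = """\
[ OK ] model discovery
[FAIL] embeddings vectors
[ OK ] request id propagation
"""
required = ["model discovery", "embeddings vectors", "request id propagation"]
print(missing_checks(canary_output, required))  # prints ['embeddings vectors']
```

Under this rule a canary that exits with status zero but prints no `[ OK ]` line for a required check still leaves that check in the missing list, which matches the "stays blocked" behavior described above.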
346
+
347
+ Publish only through registry trusted publishing from GitHub. Do not provide npm or PyPI publish tokens;
+ registry tokens are not used. If trusted publishing is unavailable, do not publish until the registry
+ identity is fixed.
350
+ The publish-readiness report ties the local release verification,
351
+ goal-readiness report, live-target discovery report, and strict live canary
352
+ report, plus both focused smoke reports, to the same source digest.
353
+ PyPI publishing should use the same verified release flow; do not upload the
354
+ wheel or sdist from a local shell until the publish-readiness report passes.
355
+ Use `pnpm prepare:sdk-publish` to build the npm package, Python wheel, and
356
+ Python sdist only after publish readiness passes; the command writes a
357
+ source-digest-tied manifest with artifact SHA-256 hashes, byte counts, and
358
+ checksummed release / goal / live-target / live-canary proof reports at
359
+ `artifacts/sdk/publish/publish-artifacts.json`.
360
+ Use `pnpm publish:sdk -- --dry-run` to validate that manifest and print the npm
361
+ trusted-publishing and PyPI action handoff without sending packages. The guarded publish wrapper
362
+ refuses `--execute` outside CI, validates artifact checksums again, and supports
363
+ npm trusted publishing through GitHub OIDC on the Node 22.14.0 publish workflow;
364
+ PyPI should be uploaded by `pypa/gh-action-pypi-publish` in the `SDK Publish`
365
+ workflow.
366
+ GitHub Actions has two SDK-only workflows for the same path. `SDK Release Gate`
367
+ runs `pnpm verify:sdk-release` and requires `RUNINFRA_SDK_CI_TOKEN` with
368
+ read-only access to the docs and Engine contract repositories. `SDK Publish` is
369
+ manual only, defaults to `publish: false`, runs `verify:sdk-release`, creates
370
+ the strict live SDK env from GitHub secrets, discovers `active_verified` targets,
371
+ prepares the modality env, runs strict live canaries, verifies goal and publish
372
+ readiness, prepares artifacts, runs `publish:sdk -- --dry-run`, then uploads the
373
+ proof reports and package artifacts. The verification job has no registry OIDC permission;
374
+ the npm and PyPI publish jobs download the checked artifacts and are the only
375
+ jobs with registry trusted publishing identity. The npm publish job uses GitHub environment `npm`;
376
+ the PyPI publish job revalidates the publish manifest before upload and uses GitHub environment `pypi`. The live proof secrets are
377
+ `RUNINFRA_SDK_LIVE_API_KEY`, `RUNINFRA_SDK_LIVE_DATABASE_URL`, `RUNPOD_API_KEY`, and
378
+ `RUNINFRA_SDK_LIVE_ASR_FILE_BASE64`; voice-less Base TTS canaries can also set
379
+ `RUNINFRA_SDK_LIVE_TTS_REF_AUDIO` and `RUNINFRA_SDK_LIVE_TTS_REF_TEXT`. Only
380
+ after those gates pass should a maintainer rerun it with `publish: true`; npm
381
+ and PyPI must use trusted publishers.
382
+ `pnpm verify:sdk-release` also runs SDK secret hygiene and fails if non-test
383
+ handoff docs or release files contain full-shaped RunInfra keys, service tokens,
384
+ or package publish tokens.
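
The artifact checksums and byte counts mentioned above can be reproduced with `hashlib`. The sketch below does not follow the real `publish-artifacts.json` schema: the `name`, `sha256`, and `bytes` fields are hypothetical, as are both function names.

```python
import hashlib
from pathlib import Path
from typing import Dict, Union


def describe_artifact(path: Path) -> Dict[str, Union[str, int]]:
    """Compute a manifest-style entry (hypothetical field names) for one file."""
    data = path.read_bytes()
    return {
        "name": path.name,
        "sha256": hashlib.sha256(data).hexdigest(),
        "bytes": len(data),
    }


def artifact_matches(path: Path, entry: Dict[str, Union[str, int]]) -> bool:
    """Recompute the entry and compare it with the recorded one."""
    return describe_artifact(path) == entry
```

Recomputing the hash immediately before upload is what lets a publish job detect an artifact that changed after the verification job produced the manifest.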
385
+
386
+ Co-located voice pipelines are available through the native `client.voice.pipeline.create()` helper on pipeline-scoped keys. The helper posts binary audio to the verified `/pipeline` route and returns the JSON transcript / response envelope. Public webhook create/list calls are intentionally unavailable until their gateway routes are verified, and the SDK raises `UnsupportedOperationError` locally for those webhook capabilities without making a request.