@jterrats/open-orchestra 1.0.11 → 1.0.12

@@ -1,58 +1,454 @@
1
- # Audio/Video Transcription Skill
1
+ # Audio/Video Transcription Skill Spike
2
2
 
3
- Open Orchestra can treat recordings as workflow evidence through the
4
- `audio-video-transcription` skill. The skill is loaded only when task text,
5
- risks, evidence, or runtime prompts mention transcription, audio, video,
6
- recordings, interviews, sprint reviews, support calls, subtitles, VTT, or SRT.
3
+ Task: `GH-367-TRANSCRIPTION-SKILL-SPIKE`
4
+ Issue: `GH-367` - Audio and video transcription evidence skill
5
+ Status: proposed architecture spike
6
+ Lead role: Architect
7
+ Review roles covered: Security/Privacy, QA, Developer
7
8
 
8
- ## Privacy Defaults
9
+ ## Goal
9
10
 
10
- - Media is sensitive by default.
11
- - Local/offline transcription is the default path when an approved local engine
12
- is available.
13
- - External transcription providers require explicit policy opt-in before media
14
- or raw transcript text leaves the workspace.
15
- - Transcript artifacts must redact secrets, tokens, credentials, configured PII,
16
- and regulated data markers before persistence.
17
- - Regulated contexts should record consent and retention notes before release.
11
+ Define a local-first transcription skill that turns workflow-local audio and
12
+ video artifacts into searchable, reviewable evidence without uploading media or
13
+ raw transcript text by default.
18
14
 
19
- ## Evidence Contract
15
+ The skill should support interviews, demos, sprint reviews, QA recordings,
16
+ support sessions, discovery calls, and voice notes as task evidence. The first
17
+ release should be an on-demand CLI/service workflow, not a real-time meeting
18
+ bot.
20
19
 
21
- Every transcript evidence artifact should include:
20
+ ## Acceptance Criteria From GH-367
22
21
 
23
- - task id and actor;
24
- - workflow-local source artifact or approved evidence reference;
25
- - source hash;
26
- - duration and language when available;
27
- - engine/provider/model;
28
- - timestamp;
29
- - consent and retention notes when provided;
30
- - redaction policy;
31
- - acceptance-criteria mapping with timestamp ranges;
32
- - decisions, risks, defects, action items, unresolved questions, and lesson
33
- candidates.
22
+ - The skill activates on demand from transcription, audio, video, demo
23
+ recording, sprint review, interview, QA evidence, support call, or discovery
24
+ session signals.
25
+ - Local/offline transcription is the default path when available.
26
+ - External provider transcription requires explicit policy opt-in.
27
+ - Provenance records source file path or artifact id, hash, duration, language,
28
+ engine/provider/model, timestamp, actor, task id, and consent/retention notes
29
+ when provided.
30
+ - Outputs include human-readable Markdown and structured JSON; VTT/SRT are
31
+ available when timestamps are reliable.
32
+ - The skill extracts speakers, timestamps, decisions, risks, action items,
33
+ acceptance-criteria candidates, defects, and lesson-learned candidates.
34
+ - Sensitive values are redacted before persistence.
35
+ - QA evidence can reference transcripts and timestamp ranges.
36
+ - Failure modes are explicit and non-destructive.
37
+ - Tests cover activation, policy gating, redaction, provenance, output formats,
38
+ artifact path safety, and degraded local/offline behavior.
39
+ - Documentation explains local-first setup, privacy defaults, provider opt-in,
40
+ retention, and evidence usage.
34
41
 
35
- Preferred outputs are Markdown for human review and JSON for tooling. VTT and
36
- SRT are appropriate only when timestamp confidence is good enough for reviewers
37
- to navigate the media.
42
+ ## Architecture Decision
38
43
 
39
- ## Failure Handling
44
+ Status: proposed
40
45
 
41
- The skill should fail closed or produce degraded evidence for missing local
42
- engines, unsupported codecs, oversized media, unreadable paths, policy-blocked
43
- providers, redaction failures, partial transcripts, and missing regulated
44
- consent data.
46
+ Decision: build a provider-neutral transcription pipeline with a local engine as
47
+ the default adapter, an external-provider adapter family behind explicit policy
48
+ gates, and a single evidence artifact contract shared by Markdown, JSON, VTT,
49
+ and SRT exporters.
45
50
 
46
- ## Example Task Signals
51
+ Rationale:
52
+
53
+ - The product need is governed evidence, not generic media processing.
54
+ - Local-first execution minimizes default privacy and compliance risk.
55
+ - Provider-neutral interfaces let the product add `whisper.cpp`,
56
+ `faster-whisper`, OpenAI, Google Cloud Speech-to-Text, or future adapters
57
+ without changing the evidence contract.
58
+ - A single provenance envelope makes transcripts auditable across local and
59
+ external execution.
60
+ - Redaction must happen before persisted summaries, handoffs, and evidence
61
+ links are written.
62
+
63
+ Alternatives considered:
64
+
65
+ | Option | Benefits | Costs / Risks | Decision |
66
+ | --- | --- | --- | --- |
67
+ | Local-only skill | Strong privacy posture, works offline after setup, lower vendor risk | Hardware variability, slower on large files, local engine install friction | Use as default path |
68
+ | External-only provider | Better managed scaling and language/diarization options | Media leaves workspace, policy/compliance burden, cost controls required | Reject as default |
69
+ | Provider-specific OpenAI or Google command | Fast MVP if one provider is already approved | Locks product contract to one vendor shape | Reject for core architecture |
70
+ | Manual transcript attachment only | Minimal implementation | Does not satisfy evidence, provenance, or redaction requirements | Reject |
71
+
72
+ ## Scope
73
+
74
+ In scope for the first implementation:
75
+
76
+ - Skill activation metadata and on-demand command/service contract.
77
+ - Workflow-local source validation for files and evidence artifact references.
78
+ - Audio extraction and media metadata probing through a bounded media adapter.
79
+ - Local transcription adapter contract with one supported local engine.
80
+ - Explicit opt-in policy contract for external providers.
81
+ - Transcript normalization, redaction, provenance, and artifact writing.
82
+ - Markdown and JSON transcript evidence output.
83
+ - VTT/SRT export when segment timestamps are available.
84
+ - QA evidence mapping from acceptance criteria to timestamp ranges.
85
+
86
+ Out of scope:
87
+
88
+ - Real-time meeting attendance or live call capture.
89
+ - Automatic external uploads.
90
+ - Medical, legal, or financial interpretation as final advice.
91
+ - Long-term media archive management beyond retention metadata.
92
+ - Speaker identification as identity proof. Diarization is a best-effort label.
93
+
94
+ ## Proposed Component Boundaries
95
+
96
+ ```text
97
+ Task / CLI / API request
98
+ -> Skill activation planner
99
+ -> Transcription request validator
100
+ -> Media probe and extraction adapter
101
+ -> Policy gate
102
+ -> Engine adapter
103
+ -> Transcript normalizer
104
+ -> Redaction service
105
+ -> Finding extractor
106
+ -> Evidence artifact writer
107
+ ```
108
+
109
+ Recommended modules for implementation:
110
+
111
+ | Boundary | Responsibility | Notes |
112
+ | --- | --- | --- |
113
+ | `transcription-request` domain | Narrow request, policy, output, and failure types | No provider-specific fields in public APIs except through typed adapter config |
114
+ | `transcription-service` | Orchestrates validation, media probing, engine execution, redaction, extraction, and artifact writing | No shell string interpolation; adapters receive args arrays |
115
+ | `media-probe-adapter` | Reads duration, codec, stream info, and extracts audio when needed | Wrap `ffprobe`/`ffmpeg` with `spawn`/`execFile` args arrays |
116
+ | `transcription-engine-adapter` | Runs local or external provider and returns normalized segments | Provider-specific parsing stays here |
117
+ | `transcript-redaction-service` | Applies secret, credential, configured PII, and regulated marker redaction | Must fail closed when redaction cannot run |
118
+ | `transcript-evidence-writer` | Writes Markdown, JSON, VTT, SRT, and evidence metadata | Atomic writes; no raw unredacted transcript persistence |
119
+ | `workflow-evidence-integration` | Links transcript artifact ids to task evidence and acceptance criteria | Keeps QA evidence references stable |
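The `media-probe-adapter` row above can be illustrated with a minimal sketch that assumes Node's `child_process.execFile` and an `ffprobe` binary on `PATH`. The names `buildProbeArgs`, `probeMedia`, and `ProbeResult` are hypothetical and do not come from the package; the point is only that the media path travels as one element of an args array and is never interpolated into a shell string.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical result shape for this sketch only.
type ProbeResult =
  | { ok: true; durationMs: number | null; formatName: string | null }
  | { ok: false; reason: "missing_binary" | "probe_failed" };

// Builds ffprobe arguments as an array. Passing the path after `-i` keeps a
// file name that starts with "-" from being read as an option.
function buildProbeArgs(mediaPath: string): string[] {
  return ["-v", "error", "-print_format", "json", "-show_format", "-show_streams", "-i", mediaPath];
}

async function probeMedia(mediaPath: string, ffprobeBin = "ffprobe"): Promise<ProbeResult> {
  try {
    const { stdout } = await execFileAsync(ffprobeBin, buildProbeArgs(mediaPath), {
      timeout: 30000,
      maxBuffer: 4 * 1024 * 1024,
    });
    const parsed = JSON.parse(stdout) as {
      format?: { duration?: string; format_name?: string };
    };
    // ffprobe reports duration in seconds as a string; it may be absent.
    const seconds = Number(parsed.format?.duration);
    return {
      ok: true,
      durationMs: Number.isFinite(seconds) ? Math.round(seconds * 1000) : null,
      formatName: parsed.format?.format_name ?? null,
    };
  } catch (error) {
    // ENOENT means the binary was not found; anything else is a probe failure.
    const code = (error as { code?: unknown }).code;
    return { ok: false, reason: code === "ENOENT" ? "missing_binary" : "probe_failed" };
  }
}
```

A caller could map `missing_binary` to the degraded result with setup guidance and `probe_failed` to the unsupported-codec path, without touching the source file in either case.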
120
+
121
+ ## Local-First Stack Options
122
+
123
+ Use `ffmpeg`/`ffprobe` as the media utility layer when present. The
124
+ implementation should detect missing binaries and return a degraded evidence
125
+ result rather than requiring installation during command execution.
126
+
127
+ | Stack | Fit | Strengths | Constraints | Recommended use |
128
+ | --- | --- | --- | --- | --- |
129
+ | `whisper.cpp` | Local CPU/GPU transcription | Portable C/C++ runtime, offline operation, simple deployment model | Model download and hardware-dependent speed; diarization is not the core strength | Default MVP local engine candidate |
130
+ | `faster-whisper` | Local or self-hosted accelerated Whisper inference | CTranslate2-backed performance, useful with GPU or optimized CPU | Python/runtime packaging and model cache management | Phase 2 local high-throughput adapter |
131
+ | OS-native speech APIs | Local-ish desktop support where available | Low setup on specific platforms | Platform variance, privacy terms differ, not CI-friendly | Defer unless a product platform needs it |
132
+ | Self-hosted transcription service | Team-managed private endpoint | Centralized resource control, can use GPUs | Becomes infrastructure and access-control work | Later enterprise option |
133
+
134
+ Local setup principles:
135
+
136
+ - Never auto-download models without explicit user action or config.
137
+ - Store model cache locations in typed configuration.
138
+ - Record model name, model path or digest when available, and runtime version.
139
+ - Enforce default size and duration limits before invoking engines.
140
+ - Keep temp audio files inside a workflow-owned temp directory and delete them
141
+ after successful artifact generation unless retention policy says otherwise.
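As a rough sketch of the size and duration principle above, the check below runs before any engine is invoked. `MediaLimits`, `PreflightResult`, and `checkMediaLimits` are hypothetical names, and rejecting an unknown duration is an assumption made here so that an unprobed file cannot bypass the duration cap.

```typescript
// Hypothetical limit shape; real defaults would come from typed configuration.
interface MediaLimits {
  maxBytes: number;
  maxDurationMs: number;
}

type PreflightResult =
  | { ok: true }
  | {
      ok: false;
      failure: "oversized_file" | "oversized_duration" | "unknown_duration";
      limit: number; // the configured limit, recorded in the failure report
    };

function checkMediaLimits(
  sizeBytes: number,
  durationMs: number | null,
  limits: MediaLimits,
): PreflightResult {
  if (sizeBytes > limits.maxBytes) {
    return { ok: false, failure: "oversized_file", limit: limits.maxBytes };
  }
  if (durationMs === null) {
    // Assumption: treat a missing duration as a failure rather than a pass.
    return { ok: false, failure: "unknown_duration", limit: limits.maxDurationMs };
  }
  if (durationMs > limits.maxDurationMs) {
    return { ok: false, failure: "oversized_duration", limit: limits.maxDurationMs };
  }
  return { ok: true };
}
```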
142
+
143
+ ## External Provider Opt-In
144
+
145
+ External transcription is allowed only when all of these are true:
146
+
147
+ - The task or workspace policy explicitly enables the provider.
148
+ - The source artifact is classified as allowed for external processing.
149
+ - Required consent and retention notes are present for regulated or user
150
+ research recordings.
151
+ - Secrets and obvious credentials are screened before upload where technically
152
+ possible.
153
+ - Cost, size, duration, and rate-limit budgets are configured.
154
+ - The evidence artifact records provider, model, endpoint family, request id or
155
+ equivalent correlation id, timestamp, actor, and policy id.
156
+
157
+ Provider candidates:
158
+
159
+ | Provider family | Fit | Required controls |
160
+ | --- | --- | --- |
161
+ | OpenAI Audio transcription | Optional hosted provider for approved workspaces | Server-side API key only, HTTPS, request/correlation id logging, pinned model config, no client-side key exposure |
162
+ | Google Cloud Speech-to-Text | Optional hosted provider for enterprise Google Cloud tenants | IAM, regional endpoint review, audit logging, quota/budget controls, encryption and retention review |
163
+ | Other providers | Future adapters | Same policy gate and provenance contract before enablement |
164
+
165
+ External-provider adapters must not be referenced from product logic directly.
166
+ The service should receive a `TranscriptionEngineAdapter` selected by policy and
167
+ configuration.
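One way to picture adapter selection by policy is sketched below. Only the name `TranscriptionEngineAdapter` appears in the document; its members, along with `WorkspacePolicy`, `Selection`, and `selectAdapter`, are assumptions made for illustration. The sketch keeps two properties from the text: provider credentials play no part in the decision, and a missing local engine never falls back to an external provider.

```typescript
// Hypothetical shapes; the source names only `TranscriptionEngineAdapter`.
interface Segment {
  startMs: number;
  endMs: number;
  text: string;
}

interface TranscriptionEngineAdapter {
  readonly executionMode: "local" | "external";
  readonly provider: string;
  transcribe(audioPath: string): Promise<Segment[]>;
}

interface WorkspacePolicy {
  policyId: string;
  // Providers explicitly enabled by policy; credentials alone never count.
  enabledExternalProviders: readonly string[];
}

type Selection =
  | { ok: true; adapter: TranscriptionEngineAdapter }
  | {
      ok: false;
      reason: "provider_not_opted_in" | "provider_unavailable" | "no_local_engine";
      policyId: string;
    };

function selectAdapter(
  policy: WorkspacePolicy,
  local: TranscriptionEngineAdapter | undefined,
  externals: readonly TranscriptionEngineAdapter[],
  requestedProvider?: string,
): Selection {
  if (requestedProvider !== undefined) {
    if (!policy.enabledExternalProviders.includes(requestedProvider)) {
      return { ok: false, reason: "provider_not_opted_in", policyId: policy.policyId };
    }
    const adapter = externals.find((a) => a.provider === requestedProvider);
    if (adapter === undefined) {
      return { ok: false, reason: "provider_unavailable", policyId: policy.policyId };
    }
    return { ok: true, adapter };
  }
  // Local is the default; there is deliberately no external fallback here.
  if (local === undefined) {
    return { ok: false, reason: "no_local_engine", policyId: policy.policyId };
  }
  return { ok: true, adapter: local };
}
```

Product logic would then depend only on the returned adapter, and the failure branch carries the policy id that the evidence artifact is expected to record.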
168
+
169
+ ## Privacy, Security, And Compliance Controls
170
+
171
+ Data classification:
172
+
173
+ | Data | Default classification | Handling |
174
+ | --- | --- | --- |
175
+ | Source audio/video | Sensitive evidence | Workflow-local only by default; external provider requires opt-in |
176
+ | Raw transcript before redaction | Sensitive derived data | In memory or temp-only; do not persist unless explicitly configured for debugging in a secure path |
177
+ | Redacted transcript | Workflow evidence | Persist as Markdown/JSON under evidence output path |
178
+ | Metadata/provenance | Audit data | Persist with artifact; avoid leaking local absolute paths in user-facing output when not needed |
179
+ | Consent/retention notes | Compliance metadata | Required for regulated/user research contexts before release |
180
+
181
+ Security requirements:
182
+
183
+ - Validate that paths are inside the workspace or approved evidence directories.
184
+ - Reject traversal, symlinks that resolve outside allowed roots, unreadable
185
+ files, and unsupported artifact references.
186
+ - Use `spawn` or `execFile` with args arrays for `ffmpeg`, `ffprobe`, local
187
+ engines, and helper commands.
188
+ - Validate that external URLs use `https://` and target hosts on the configured allowlist.
189
+ - Load provider secrets only from server-side environment or secret manager
190
+ integration; never write them to transcript artifacts or logs.
191
+ - Redact before persistence and before summary generation.
192
+ - Fail closed on redaction failure, provider policy denial, missing consent
193
+ where required, or unsupported file path.
194
+ - Cap file size, duration, segment count, output size, and provider cost.
195
+ - Store failure reports without copying raw transcript or media bytes.
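The path rules in the list above (workspace containment, traversal, and symlink escape) could be approximated as follows. This is a sketch under assumptions: `resolveInsideRoots` is a hypothetical helper, it relies on Node's `fs.realpathSync` to resolve symlinks before comparing, and it does not check readability or file type, which a real validator would also need.

```typescript
import { realpathSync } from "node:fs";
import * as path from "node:path";

// Returns the resolved real path when `candidate` stays inside one of the
// allowed roots after symlink resolution; returns null otherwise.
function resolveInsideRoots(candidate: string, allowedRoots: readonly string[]): string | null {
  let real: string;
  try {
    real = realpathSync(candidate); // throws when the path does not exist
  } catch {
    return null;
  }
  for (const root of allowedRoots) {
    let realRoot: string;
    try {
      realRoot = realpathSync(root);
    } catch {
      continue; // a root that does not exist cannot contain anything
    }
    const rel = path.relative(realRoot, real);
    const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    // An empty relative path means the candidate is the root itself.
    if (rel !== "" && !escapes) {
      return real;
    }
  }
  return null;
}
```

Comparing real paths rather than the strings the caller supplied is what catches a symlink that sits inside the workspace but points outside it.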
196
+
197
+ Compliance requirements:
198
+
199
+ - Record consent status as one of `not_required`, `provided`, `missing`, or
200
+ `unknown`.
201
+ - Record retention class and delete/review date when provided.
202
+ - Mark regulated domain signals: health, finance, legal, government id,
203
+ children/minors, biometric voice data, and customer support PII.
204
+ - Require human review before using transcript-derived medical, legal,
205
+ financial, hiring, or support-account decisions as final determinations.
206
+ - Keep tenant policy explicit; do not infer permission from provider credentials
207
+ alone.
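The four consent statuses above lend themselves to a small release gate. The sketch below is illustrative only: `checkReleaseConsent` and `ReleaseDecision` are hypothetical names, treating `unknown` like `missing` in a regulated context is an assumption, and so is trusting a recorded `not_required` status as-is.

```typescript
type ConsentStatus = "not_required" | "provided" | "missing" | "unknown";

type ReleaseDecision =
  | { allowed: true }
  | { allowed: false; reason: "consent_missing" | "consent_unknown" };

// Blocks release when a regulated context lacks consent, unless the product
// owner has explicitly accepted the risk.
function checkReleaseConsent(
  status: ConsentStatus,
  regulatedContext: boolean,
  riskAcceptedByPo = false,
): ReleaseDecision {
  if (!regulatedContext || status === "provided" || status === "not_required") {
    return { allowed: true };
  }
  if (riskAcceptedByPo) {
    return { allowed: true };
  }
  return {
    allowed: false,
    reason: status === "missing" ? "consent_missing" : "consent_unknown",
  };
}
```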
208
+
209
+ ## Evidence Artifact Contract
210
+
211
+ Each transcript run should write one artifact set with a stable id:
212
+
213
+ ```text
214
+ .agent-workflow/evidence/transcripts/<task-id>/<artifact-id>.md
215
+ .agent-workflow/evidence/transcripts/<task-id>/<artifact-id>.json
216
+ .agent-workflow/evidence/transcripts/<task-id>/<artifact-id>.vtt
217
+ .agent-workflow/evidence/transcripts/<task-id>/<artifact-id>.srt
218
+ ```
219
+
220
+ VTT/SRT files are optional and should be omitted when timestamp confidence is
221
+ too low.
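A minimal sketch of the optional subtitle export, assuming segments carry `startMs`/`endMs` as in the JSON shape in this section. WebVTT timestamps use a `.` before the milliseconds and SRT uses a `,`. The confidence levels other than `high`, and the rule that only `low` suppresses subtitles, are assumptions; all function names are hypothetical.

```typescript
interface Segment {
  startMs: number;
  endMs: number;
  text: string;
}

// Formats milliseconds as HH:MM:SS<sep>mmm.
function formatTimestamp(ms: number, sep: "." | ","): string {
  const total = Math.max(0, Math.round(ms));
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  const h = Math.floor(total / 3600000);
  const m = Math.floor((total % 3600000) / 60000);
  const s = Math.floor((total % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${sep}${pad(total % 1000, 3)}`;
}

function toVtt(segments: readonly Segment[]): string {
  const cues = segments.map(
    (seg) => `${formatTimestamp(seg.startMs, ".")} --> ${formatTimestamp(seg.endMs, ".")}\n${seg.text}`,
  );
  return `WEBVTT\n\n${cues.join("\n\n")}\n`;
}

function toSrt(segments: readonly Segment[]): string {
  const cues = segments.map(
    (seg, i) =>
      `${i + 1}\n${formatTimestamp(seg.startMs, ",")} --> ${formatTimestamp(seg.endMs, ",")}\n${seg.text}`,
  );
  return `${cues.join("\n\n")}\n`;
}

// Returns null when timestamps are not reliable enough for navigation, so the
// writer can skip the .vtt and .srt files entirely.
function exportSubtitles(
  segments: readonly Segment[],
  timestampConfidence: "high" | "medium" | "low",
): { vtt: string; srt: string } | null {
  if (timestampConfidence === "low" || segments.length === 0) {
    return null;
  }
  return { vtt: toVtt(segments), srt: toSrt(segments) };
}
```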
222
+
223
+ Minimum JSON shape:
224
+
225
+ ```json
226
+ {
227
+   "schemaVersion": 1,
228
+   "taskId": "GH-367-TRANSCRIPTION-SKILL-SPIKE",
229
+   "artifactId": "transcript_20260522_000001",
230
+   "source": {
231
+     "kind": "workflow-file",
232
+     "path": ".agent-workflow/evidence/demo.mp4",
233
+     "sha256": "hex",
234
+     "durationMs": 120000,
235
+     "mimeType": "video/mp4",
236
+     "language": "en"
237
+   },
238
+   "engine": {
239
+     "executionMode": "local",
240
+     "provider": "whisper.cpp",
241
+     "model": "base.en",
242
+     "version": "detected-version"
243
+   },
244
+   "policy": {
245
+     "policyId": "workspace-default",
246
+     "externalProviderAllowed": false,
247
+     "redactionPolicyId": "default",
248
+     "consentStatus": "provided",
249
+     "retentionClass": "task-evidence"
250
+   },
251
+   "provenance": {
252
+     "actor": "qa",
253
+     "generatedAt": "2026-05-22T00:00:00.000Z",
254
+     "command": "orchestra transcript run",
255
+     "requestId": "local-run-id"
256
+   },
257
+   "segments": [
258
+     {
259
+       "startMs": 1000,
260
+       "endMs": 3500,
261
+       "speaker": "speaker_1",
262
+       "text": "Redacted transcript segment.",
263
+       "confidence": 0.92
264
+     }
265
+   ],
266
+   "findings": {
267
+     "decisions": [],
268
+     "risks": [],
269
+     "actionItems": [],
270
+     "acceptanceCriteriaCandidates": [],
271
+     "defects": [],
272
+     "lessonCandidates": [],
273
+     "unresolvedQuestions": []
274
+   },
275
+   "redactions": {
276
+     "applied": true,
277
+     "countsByType": {
278
+       "secret": 0,
279
+       "email": 2,
280
+       "phone": 1,
281
+       "regulatedMarker": 0
282
+     }
283
+   },
284
+   "quality": {
285
+     "isPartial": false,
286
+     "timestampConfidence": "high",
287
+     "warnings": []
288
+   }
289
+ }
290
+ ```
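The `redactions.countsByType` field in the shape above suggests a redaction pass that counts replacements per rule type. The sketch below shows one possible form. The rule shape, the placeholder format, and both example patterns are assumptions for illustration; a real policy would be configured and considerably broader.

```typescript
interface RedactionRule {
  type: string;
  pattern: RegExp; // must carry the global flag so every match is replaced
}

interface RedactionResult {
  text: string;
  countsByType: Record<string, number>;
}

// Illustrative patterns only, not a complete secret or PII policy.
const exampleRules: readonly RedactionRule[] = [
  { type: "email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { type: "secret", pattern: /\b(?:api[_-]?key|token|password)\s*[:=]\s*\S+/gi },
];

function redact(text: string, rules: readonly RedactionRule[]): RedactionResult {
  const countsByType: Record<string, number> = {};
  let out = text;
  for (const rule of rules) {
    if (!rule.pattern.global) {
      // Fail closed: a non-global pattern would silently miss later matches.
      throw new Error(`redaction rule "${rule.type}" must use the g flag`);
    }
    let count = 0;
    out = out.replace(rule.pattern, () => {
      count += 1;
      return `[REDACTED:${rule.type}]`;
    });
    countsByType[rule.type] = count;
  }
  return { text: out, countsByType };
}
```

Throwing on a misconfigured rule is one way to honor the fail-closed requirement: the caller would catch the error and skip persisting any transcript text.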
291
+
292
+ Markdown report sections:
293
+
294
+ - Summary and provenance.
295
+ - Source, policy, consent, retention, and redaction status.
296
+ - Acceptance-criteria mapping table with timestamp ranges.
297
+ - Decisions, risks, defects, action items, lesson candidates, and unresolved
298
+ questions.
299
+ - Gaps and degraded-mode warnings.
300
+
301
+ ## Failure Model
302
+
303
+ All failures must be non-destructive and must leave a recordable evidence result.
304
+
305
+ | Failure | Expected behavior |
306
+ | --- | --- |
307
+ | Missing `ffmpeg`/`ffprobe` | Return degraded result with setup guidance; do not mutate source |
308
+ | Missing local engine | Return policy-safe degraded evidence; do not fall back to external provider automatically |
309
+ | Unsupported codec | Record media probe failure and suggested conversion path |
310
+ | Oversized file or duration | Stop before engine invocation and record configured limit |
311
+ | Path outside allowed roots | Block as security failure |
312
+ | Provider not opted in | Block external execution and record policy id |
313
+ | Provider timeout/rate limit | Record partial/deferred result with retry-safe metadata |
314
+ | Redaction failure | Fail closed; do not persist transcript text |
315
+ | Low timestamp confidence | Write Markdown/JSON with warning; omit VTT/SRT |
316
+ | Partial transcript | Mark `quality.isPartial=true` and require QA review before release |
317
+ | Missing consent in regulated context | Block release approval or require PO risk acceptance |
318
+
319
+ ## QA Evidence And Test Strategy
320
+
321
+ QA must map every GH-367 acceptance criterion to an observable check. The first
322
+ implementation should prefer deterministic fixtures: tiny generated WAV/video
323
+ fixtures, mocked engine adapters, and provider contract fakes.
324
+
325
+ | Acceptance area | Test type | Fixture/setup | Expected evidence |
326
+ | --- | --- | --- | --- |
327
+ | Skill activation | Unit/CLI | Task text with transcription/audio/video signals and unrelated control task | Skill selected only for matching signals |
328
+ | Local default | Unit/service | Local engine available; no provider opt-in | Adapter selection uses local engine |
329
+ | Provider opt-in | Unit/contract | Provider configured with policy denied/allowed variants | Denied blocks; allowed records provider provenance |
330
+ | Provenance | Unit/snapshot | Mock media metadata and engine result | JSON includes source hash, duration, language, model, actor, task id, timestamp |
331
+ | Markdown/JSON output | Unit/golden file | Mock segments and findings | Markdown and JSON match schema; no unredacted secret |
332
+ | VTT/SRT output | Unit/golden file | Timestamped and untimestamped segment variants | Subtitles emitted only with adequate timestamps |
333
+ | Finding extraction | Unit | Transcript with decisions, risks, action items, defects | Structured findings populate expected arrays |
334
+ | Redaction | Unit/security | Secrets, tokens, emails, phones, regulated markers | Redacted before persistence; counts recorded |
335
+ | Path safety | Unit/security | Traversal, symlink escape, valid evidence path | Unsafe paths blocked |
336
+ | Missing tools | Unit/integration | `ffmpeg` or engine unavailable | Degraded evidence result, no provider fallback |
337
+ | Oversized media | Unit | File size/duration over limits | Engine not invoked; clear failure |
338
+ | QA timestamp references | Integration | Transcript artifact linked to AC table | Evidence can reference timestamp ranges |
339
+
340
+ Recommended commands once implemented:
47
341
 
48
342
  ```bash
49
- orchestra task add \
50
- --id QA-TRANSCRIPT-001 \
51
- --title "Transcribe sprint review demo recording into QA evidence" \
52
- --owner qa \
53
- --paths ".agent-workflow/evidence/demo.mp4,docs/evidence/transcripts" \
54
- --acceptance "recording transcript maps acceptance criteria to timestamps" \
55
- --risks "privacy,security,governance,release"
56
-
57
- orchestra skills plan --task QA-TRANSCRIPT-001
343
+ node --test test/transcription-*.test.js
344
+ npm run build
345
+ npm run precommit
58
346
  ```
347
+
348
+ Manual QA for the first release:
349
+
350
+ - Run a small local audio fixture through the command.
351
+ - Verify generated Markdown and JSON artifact paths.
352
+ - Verify no raw unredacted transcript is persisted when fixture contains
353
+ secret-like and PII-like text.
354
+ - Verify provider-denied policy cannot upload.
355
+ - Verify provider-allowed policy records the opt-in decision and correlation
356
+ metadata.
357
+
358
+ ## Developer Implementation Slices
359
+
360
+ 1. `GH-367A` - Skill activation and request contract
361
+ - Add skill manifest/catalog wiring if needed.
362
+ - Define request, policy, output, segment, finding, and failure types.
363
+ - Tests: activation signals, narrow public types, invalid request handling.
364
+
365
+ 2. `GH-367B` - Media validation and local probe adapter
366
+ - Validate workspace/evidence paths.
367
+ - Add `ffprobe` metadata probe and `ffmpeg` audio extraction wrapper using
368
+ args arrays.
369
+ - Tests: path traversal, symlink escape, missing binary, unsupported codec,
370
+ size/duration limits.
371
+
372
+ 3. `GH-367C` - Local engine adapter MVP
373
+ - Add one local engine adapter, preferably `whisper.cpp`, behind the generic
374
+ `TranscriptionEngineAdapter`.
375
+ - Normalize output into segments.
376
+ - Tests: adapter command construction, missing engine, partial transcript,
377
+ timestamp confidence.
378
+
379
+ 4. `GH-367D` - Evidence writer and provenance artifacts
380
+ - Write Markdown/JSON and optional VTT/SRT from normalized transcript.
381
+ - Add atomic artifact writes and evidence registration.
382
+ - Tests: schema, golden Markdown, subtitle gating, artifact id stability.
383
+
384
+ 5. `GH-367E` - Redaction and privacy policy gates
385
+ - Add redaction pipeline before persistence.
386
+ - Add provider opt-in policy and consent/retention checks.
387
+ - Tests: secrets, configured PII, regulated markers, redaction failure,
388
+ missing consent, provider denial.
389
+
390
+ 6. `GH-367F` - External provider adapter behind opt-in
391
+ - Add first hosted provider adapter only after policy gates are tested.
392
+ - Record request/correlation id, provider/model, region/endpoint family when
393
+ available, and cost/duration guardrail metadata.
394
+ - Tests: mocked provider success, timeout, rate limit, malformed response,
395
+ no secret leakage.
396
+
397
+ 7. `GH-367G` - QA evidence mapping and docs
398
+ - Add acceptance criteria timestamp mapping support.
399
+ - Document setup, local engine install expectations, provider opt-in,
400
+ retention, and troubleshooting.
401
+ - Tests: evidence report links transcript timestamp ranges to AC rows.
402
+
403
+ ## Risks And Mitigations
404
+
405
+ | Risk | Severity | Mitigation |
406
+ | --- | --- | --- |
407
+ | Secret or PII leakage from recordings | High | Local default, explicit external opt-in, redaction before persistence, security tests |
408
+ | Raw transcript persistence before redaction | High | In-memory/temp-only handling; fail closed on redaction errors |
409
+ | Configured provider credentials are mistaken for opt-in | High | Separate credentials from policy; require explicit provider policy id |
410
+ | Transcript hallucination or speaker misattribution | Medium | Mark confidence/quality, require human review, avoid identity claims |
411
+ | Large media files exhaust local resources | Medium | Size/duration/segment limits and preflight probing |
412
+ | Local engine availability differs by machine/CI | Medium | Degraded failure model and mocked deterministic tests |
413
+ | Codec variance causes flaky tests | Medium | Tiny generated fixtures and mocked media probe for most tests |
414
+ | Regulated retention/consent varies by tenant | High | Required metadata fields and release blocking for missing regulated consent |
415
+ | External provider terms or model behavior changes | Medium | Adapter isolation, pinned model config, provider-specific docs, eval fixtures |
416
+
417
+ ## Open Questions
418
+
419
+ - Where should transcript artifacts live long term: `.agent-workflow/evidence`
420
+ only, or a configurable workspace evidence root?
421
+ - Should raw transcripts ever be retained for debugging, and if so under what
422
+ encrypted local policy?
423
+ - Which local engine should be the MVP implementation target for supported
424
+ platforms?
425
+ - Which provider, if any, has an approved tenant policy for the first hosted
426
+ adapter?
427
+ - Should diarization be included in MVP or treated as a best-effort optional
428
+ post-processor?
429
+
430
+ ## Recommended Next Stories
431
+
432
+ - `GH-367A-SKILL-CONTRACT`: Define transcription request/output/policy types and
433
+ skill activation behavior.
434
+ - `GH-367B-MEDIA-PREFLIGHT`: Implement workflow-local media validation,
435
+ `ffprobe` metadata extraction, and safe `ffmpeg` audio extraction.
436
+ - `GH-367C-LOCAL-ENGINE`: Add the first local transcription adapter with
437
+ degraded-mode behavior.
438
+ - `GH-367D-EVIDENCE-ARTIFACTS`: Generate Markdown/JSON/VTT/SRT transcript
439
+ artifacts with provenance.
440
+ - `GH-367E-PRIVACY-REDACTION`: Implement redaction, consent, retention, and
441
+ provider policy gates.
442
+ - `GH-367F-PROVIDER-ADAPTER`: Add one external provider adapter behind explicit
443
+ opt-in and mocked contract tests.
444
+ - `GH-367G-QA-EVIDENCE`: Add AC-to-timestamp evidence mapping and QA report
445
+ workflow.
446
+
447
+ ## Reference Notes
448
+
449
+ - `whisper.cpp` is an actively maintained C/C++ port of Whisper, suitable for
450
+ offline transcription experiments.
451
+ - `faster-whisper` uses CTranslate2 and is a candidate for faster local or
452
+ self-hosted inference.
453
+ - OpenAI and Google Cloud Speech-to-Text are viable hosted transcription
454
+ adapter families only behind explicit workspace policy opt-in.
@@ -25,6 +25,44 @@ The run state, gate artifacts, handoffs, evidence, reviews, decisions, and
25
25
  clarifications are persisted under `.agent-workflow/` so the delivery story can
26
26
  be audited after the fact.
27
27
 
28
+ ## Task-Scoped Roles
29
+
30
+ Tasks can declare the roles that are required or optional for the workflow. When
31
+ a task is created with only an owner role, Open Orchestra treats that owner as
32
+ the implicit required role instead of falling back to the default delivery
33
+ workflow.
34
+
35
+ ```bash
36
+ orchestra task add --id BUG-001 --title "Fix CLI bug" --owner developer
37
+ orchestra workflow run --task BUG-001 --gates phase
38
+ ```
39
+
40
+ Use explicit roles when the task needs a broader lifecycle:
41
+
42
+ ```bash
43
+ orchestra task add --id STORY-001 --title "Ship user-facing workflow" \
44
+ --owner product_owner --required-roles product_owner,architect,developer,qa,release_manager
45
+ ```
46
+
47
+ This keeps small developer or QA tasks scoped while still allowing full
48
+ PO-to-release delivery for stories that need it.
49
+
50
+ ## Workspace Isolation
51
+
52
+ Workflow commands that operate on run state support `--target-dir` so callers
53
+ can launch work from another directory without writing `.agent-workflow/` state
54
+ to the wrong repo.
55
+
56
+ ```bash
57
+ orchestra workflow run --task STORY-001 --target-dir /path/to/project
58
+ orchestra workflow list --target-dir /path/to/project
59
+ orchestra workflow gate-approve --run <run-id> --target-dir /path/to/project
60
+ ```
61
+
62
+ Use `--target-dir` for temporary E2E projects, editor integrations, local web
63
+ console actions, and any parent process that coordinates more than one
64
+ workspace.
65
+
28
66
  ## Phase Graph
29
67
 
30
68
  ```