@jterrats/open-orchestra 1.0.3 → 1.0.5

Files changed (86)

package/dist/autonomous-phase-lifecycle.js +19 -0
package/dist/autonomous-phase-lifecycle.js.map +1 -1
package/dist/autonomous-run-store.d.ts +2 -1
package/dist/autonomous-run-store.js +4 -0
package/dist/autonomous-run-store.js.map +1 -1
package/dist/autonomous-workflow-constants.d.ts +1 -6
package/dist/autonomous-workflow-constants.js +1 -33
package/dist/autonomous-workflow-constants.js.map +1 -1
package/dist/command-manifest.js +1 -1
package/dist/command-manifest.js.map +1 -1
package/dist/constants.d.ts +2 -4
package/dist/constants.js +2 -21
package/dist/constants.js.map +1 -1
package/dist/defaults.d.ts +1 -0
package/dist/defaults.js +1 -0
package/dist/defaults.js.map +1 -1
package/dist/delegation-decision.js +4 -5
package/dist/delegation-decision.js.map +1 -1
package/dist/delivery-dashboard.js +2 -1
package/dist/delivery-dashboard.js.map +1 -1
package/dist/phase-playbooks.js +32 -28
package/dist/phase-playbooks.js.map +1 -1
package/dist/qa-readiness.js +2 -2
package/dist/qa-readiness.js.map +1 -1
package/dist/release-readiness.js +3 -6
package/dist/release-readiness.js.map +1 -1
package/dist/runtime-execution.d.ts +10 -1
package/dist/runtime-execution.js +118 -0
package/dist/runtime-execution.js.map +1 -1
package/dist/runtime-guardrails.js +1 -0
package/dist/runtime-guardrails.js.map +1 -1
package/dist/skills-catalog.js +135 -0
package/dist/skills-catalog.js.map +1 -1
package/dist/subagent-protocol.js +2 -1
package/dist/subagent-protocol.js.map +1 -1
package/dist/task-graph-commands.js +3 -12
package/dist/task-graph-commands.js.map +1 -1
package/dist/task-split-assessment.d.ts +19 -0
package/dist/task-split-assessment.js +190 -0
package/dist/task-split-assessment.js.map +1 -0
package/dist/task-status.d.ts +22 -0
package/dist/task-status.js +83 -0
package/dist/task-status.js.map +1 -0
package/dist/telemetry-records.js +2 -1
package/dist/telemetry-records.js.map +1 -1
package/dist/tracker-commands.js +2 -2
package/dist/tracker-commands.js.map +1 -1
package/dist/types/model-config.d.ts +2 -0
package/dist/types/runtime.d.ts +1 -1
package/dist/types/tasks.d.ts +1 -0
package/dist/types/workflow-run.d.ts +15 -0
package/dist/types.d.ts +1 -1
package/dist/web-api.js +3 -2
package/dist/web-api.js.map +1 -1
package/dist/web-roles.js +2 -1
package/dist/web-roles.js.map +1 -1
package/dist/workflow-phase-planner.d.ts +4 -2
package/dist/workflow-phase-planner.js +57 -38
package/dist/workflow-phase-planner.js.map +1 -1
package/dist/workflow-phases.d.ts +15 -0
package/dist/workflow-phases.js +86 -0
package/dist/workflow-phases.js.map +1 -0
package/dist/workflow-run-commands.js +88 -2
package/dist/workflow-run-commands.js.map +1 -1
package/dist/workflow-services.js +4 -2
package/dist/workflow-services.js.map +1 -1
package/dist/workflow-task-service.js +2 -4
package/dist/workflow-task-service.js.map +1 -1
package/docs/autonomous-workflow.md +34 -0
package/docs/backlog/chaos-testing-stack-strategy.md +146 -0
package/docs/backlog/project-persona-registry-epic.md +350 -0
package/docs/duplicate-code-enforcement.md +60 -0
package/docs/release-test-matrix.md +14 -0
package/docs/reports/duplicate-code-baseline-20260518.md +41 -0
package/docs/runtime-adapters.md +44 -0
package/docs/runtime-llm-flow.md +4 -2
package/docs/secret-scanning-gitleaks.md +53 -0
package/docs/site-manifest.json +5 -0
package/docs/sonar-architecture-model.md +178 -0
package/docs/sonar-quality-gates.md +178 -0
package/docs/task-split-assessment.md +34 -0
package/package.json +5 -1
package/skills/chaos-resilience-testing/SKILL.md +127 -0
package/skills/chaos-resilience-testing/manifest.json +61 -0
package/skills/oclif-plugin-development/SKILL.md +118 -0
package/skills/oclif-plugin-development/manifest.json +58 -0

package/docs/backlog/chaos-testing-stack-strategy.md ADDED

@@ -0,0 +1,146 @@
+# Story: Chaos testing stack strategy and deterministic fault harness
+Backlog Item ID: CHAOS-TESTING-STACK-STRATEGY-20260517
+## User Story
+As an architect, QA, SRE, security reviewer, or developer, I want a progressive
+chaos testing stack so that Open Orchestra can validate failure behavior locally
+first and adopt heavier tooling only when the runtime or SaaS architecture needs
+it.
+## Problem
+The `chaos-resilience-testing` skill defines how to reason about deterministic
+failure scenarios, but the project also needs a clear tooling strategy. Without
+that strategy, teams may either under-test provider/API failure modes or add
+heavy infrastructure dependencies too early for the npm package.
+## Stack Strategy
+### Layer 1: MVP Local Deterministic Fault Harness
+Use this layer first. It must work offline and should not add heavyweight
+runtime dependencies.
+Recommended tools:
+- Node test runner for domain, service, CLI, and workflow tests.
+- Fake model providers for timeout, unavailable provider, malformed response,
+  fallback, and budget exhaustion.
+- Fake storage/repository adapters for partial writes, corrupted records,
+  stale reads, and audit write failures.
+- Controlled timers, `AbortController`, and injected clocks for bounded timeout
+  behavior.
+- Local fixtures for malformed imports, stale registry data, missing approvals,
+  and policy-denied activation.
+- Orchestra evidence reports for scenario matrix, expected behavior, actual
+  behavior, final state, events, and recovery path.
+Use for:
+- persona registry and activation;
+- provider routing and model fallbacks;
+- approval and gate behavior;
+- budget envelope enforcement;
+- policy fail-closed behavior;
+- local/offline mode.
+### Layer 2: Web/API Resilience
+Use this when user-facing web or API flows need visible degraded states.
+Recommended tools:
+- Playwright route stubs for API timeout, 500, stale response, empty response,
+  slow response, and malformed payload.
+- Existing web console E2E patterns for screenshots, traces, and visible state.
+- Optional mock handlers such as MSW only if repeated browser/API mocking
+  becomes noisy with native Playwright stubs.
+Use for:
+- web console personas;
+- dashboard/runtime status;
+- approval and gate UX;
+- evidence browsing;
+- settings/provider/budget panels.
+### Layer 3: Integration And SaaS Simulation
+Use this when tests need real network behavior or service boundaries.
+Recommended tools:
+- Docker Compose profiles for optional integration test stacks.
+- Toxiproxy for latency, disconnects, resets, bandwidth constraints, and
+  intermittent network behavior.
+- WireMock or equivalent HTTP stub server for external APIs when receiver-side
+  behavior matters.
+- Pact only when consumer/provider contract testing is part of the product
+  contract.
+- k6 for API load and resilience checks once SaaS endpoints exist.
+- OpenTelemetry for correlation IDs, spans, metrics, and failure traceability.
+Use for:
+- SaaS supervisor or registry APIs;
+- external prompt/persona registry;
+- GitHub worker execution;
+- tenant-scoped APIs;
+- long-running runtime scheduler behavior.
+### Layer 4: Production-Grade Infrastructure Chaos
+Do not include this in the npm package MVP. Use only if Open Orchestra operates
+managed services on orchestration platforms.
+Recommended tools:
+- Chaos Mesh or LitmusChaos for Kubernetes.
+- Cloud provider failure testing where permitted.
+- Synthetic monitoring and SLO/error-budget checks.
+Use for:
+- SaaS production control plane;
+- multi-tenant worker infrastructure;
+- managed model gateway or registry services.
+## First Implementation Slices
+1. Provider timeout and fallback harness.
+2. Registry corruption and import failure harness.
+3. Approval race and missing approver policy harness.
+4. Policy engine fail-closed harness.
+5. Audit/event write failure harness.
+6. Budget exhaustion harness.
+7. Offline mode with optional sources unavailable.
+8. Web console stale/error/timeout state harness.
+## Acceptance Criteria
+- MVP local chaos validation uses deterministic in-process faults and offline
+  tests first.
+- Heavy tools such as Toxiproxy, Docker Compose, k6, OpenTelemetry, Chaos Mesh,
+  and LitmusChaos are optional progressive layers, not package MVP
+  dependencies.
+- Every fault scenario maps to acceptance criteria, expected behavior, final
+  state, evidence, and recovery path.
+- Security, compliance, tenant isolation, regulated authority, payment, secrets,
+  policy, and approval failures fail closed by default.
+- Optional enrichment and advisory features may degrade only with clear user or
+  operator rationale.
+## Non-Goals
+- Adding Toxiproxy, k6, OpenTelemetry, Chaos Mesh, or LitmusChaos dependencies
+  in the first implementation.
+- Randomized failure injection in local unit tests.
+- Running production chaos experiments from the npm package.
+## Suggested Size
+S for strategy.
+Future implementation slices should be estimated independently.
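
Reviewer sketch (not part of the package diff): the Layer 1 idea of fake model providers with scripted timeout, unavailable, malformed, fallback, and budget-exhaustion outcomes could look roughly like the following. Every name here (`FakeProvider`, `routeWithFallback`, the event strings) is hypothetical and chosen for illustration; the doc does not specify an API. Outcomes are scripted rather than timed, so the run is deterministic and works offline.

```typescript
// Hypothetical types for illustration; none of these names come from the package.
type FakeOutcome =
  | { kind: "ok"; text: string; cost: number }
  | { kind: "timeout" }
  | { kind: "unavailable" }
  | { kind: "malformed" };

interface FakeProvider {
  id: string;
  outcome: FakeOutcome; // scripted, so every run behaves identically
}

interface RouteResult {
  finalState: "completed" | "failed_closed";
  text?: string;
  events: string[]; // evidence trail: one entry per attempt
}

// Tries providers in order and records what happened to each one.
function routeWithFallback(providers: FakeProvider[], budget: number): RouteResult {
  const events: string[] = [];
  for (const provider of providers) {
    const outcome = provider.outcome;
    if (outcome.kind !== "ok") {
      events.push(`${provider.id}:${outcome.kind}`);
      continue; // move to the next provider in the fallback chain
    }
    if (outcome.cost > budget) {
      events.push(`${provider.id}:budget_exhausted`);
      continue;
    }
    events.push(`${provider.id}:ok`);
    return { finalState: "completed", text: outcome.text, events };
  }
  // No provider succeeded: fail closed instead of inventing a response.
  return { finalState: "failed_closed", events };
}

const fallbackRun = routeWithFallback(
  [
    { id: "primary", outcome: { kind: "timeout" } },
    { id: "secondary", outcome: { kind: "ok", text: "draft", cost: 2 } },
  ],
  5,
);
```

The `events` array stands in for the "expected behavior, actual behavior, final state, events" evidence the story asks for; a real harness would presumably write it to an Orchestra evidence report.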

package/docs/backlog/project-persona-registry-epic.md ADDED

@@ -0,0 +1,350 @@
+# Epic: Project persona registry and generated domain personas
+Backlog Item ID: PERSONA-REGISTRY-EPIC-20260517
+## Epic Story
+As a project owner, I want Open Orchestra to register, govern, and activate
+project-specific personas through APIs and UI instead of manual markdown edits,
+so that each project can orchestrate the right domain experts, reviewers, and
+delivery roles for its own context.
+## Problem
+Open Orchestra ships with software delivery roles, but real projects may need
+domain personas that do not fit a fixed role catalog. A music production project
+may need producers, mastering QA, and rights reviewers. A neural network project
+may need ML researchers, MLOps, data stewards, and AI safety reviewers. A medical
+project may need clinical SMEs, medical reviewers, privacy, compliance, and
+regulatory reviewers.
+Those personas should not require users to open or edit markdown files manually.
+They also should not be blindly activated because a generated profile sounds
+plausible. Projects need a registry, generation workflow, governance model, and
+runtime activation rules.
+## MVP Outcome
+- Local persona registry stored in project workflow state and exportable to
+  version-controlled artifacts.
+- CLI/API operations to list, show, create, update, generate, approve,
+  deactivate, export, and import personas.
+- Generated personas start as drafts and require human approval before runtime
+  activation.
+- Runtime activation uses task signals, phase, required roles, risk level, and
+  persona triggers.
+- Regulated or high-risk personas can advise and review, but cannot approve
+  regulated outcomes unless a human approver is configured.
+- Web console can manage project personas without editing markdown files.
+- Offline-first operation works from packaged baselines and local project
+  context; internet or SaaS enrichment is optional.
+## To-Be Outcome
+- SaaS-backed persona registry with tenant isolation, versioning, approval
+  workflows, audit trails, quality scores, and reusable persona packs.
+- Domain packs for software, music/audio, ML/AI, healthcare, data, security,
+  operations, education, legal/compliance, and other project types.
+- Optional state-of-the-art refresh pipeline that proposes persona updates from
+  curated references, prompt bank entries, lessons learned, and approved domain
+  packs.
+- Guarded supervisor can detect missing personas from project activity and
+  propose draft personas or capability changes.
+- Runtime scheduler and budget policies constrain generation, enrichment, and
+  multi-agent activation.
+- Chaos and resilience validation proves persona workflows degrade safely when
+  providers, registries, approvals, budgets, or context sources fail.
+## Persona Model
+Minimum fields:
+- `id`
+- `displayName`
+- `roleFamily`
+- `domain`
+- `purpose`
+- `responsibilities`
+- `activationTriggers`
+- `workflowPhases`
+- `requiredCapabilities`
+- `optionalCapabilities`
+- `requiredSkills`
+- `authorityLevel`: advisory, reviewer, approver, executor
+- `approvalConstraints`
+- `evidenceExpectations`
+- `securityConstraints`
+- `regulatoryScope`
+- `riskLevel`
+- `sourcePolicy`: packaged, project, generated, imported, external
+- `version`
+- `status`: draft, active, deprecated, rejected
+- `createdBy`
+- `approvedBy`
+- `provenance`
+## Proposed Child Stories
+### Story 1: Local Persona Registry API
+As a project owner, I want a typed local persona registry API so that personas
+can be managed without editing markdown files manually.
+Acceptance criteria:
+- Provides CLI/API operations: list, show, create, update, deactivate, approve,
+  reject, export, and import.
+- Persists project personas in workflow state with schema validation and
+  deterministic IDs.
+- Supports versioning and immutable approval provenance.
+- Prevents duplicate active personas with the same project ID and role family.
+- Exports approved personas to a portable version-controlled artifact.
+- Rejects malformed, oversized, conflicting, or partially imported persona data
+  without corrupting the registry.
+- Includes unit tests for create, update, approval, deactivation, import/export,
+  invalid schema, and duplicate prevention.
+Suggested size: M.
+Suggested owners: Architect, Developer, QA.
+### Story 2: Persona Generation Assistant
+As a project owner, I want Open Orchestra to generate draft personas from project
+needs and context so that I can define domain roles quickly without starting from
+a blank file.
+Acceptance criteria:
+- Generates draft personas from a requested role, domain, task context,
+  workflow phase, existing roles, local standards, lessons learned, and approved
+  prompt bank entries.
+- Uses offline packaged baselines first and optional external enrichment only
+  when enabled.
+- Marks generated personas as draft and inactive by default.
+- Records provenance: prompt id, context sources, model/provider when available,
+  generated timestamp, and reviewer requirements.
+- Applies prompt-injection and secret redaction checks to context before
+  generation.
+- Includes chaos/resilience tests for provider timeout, provider unavailable,
+  unsafe context, prompt registry unavailable, and budget exhaustion.
+- Includes tests for draft status, provenance, source filtering, offline mode,
+  and blocked unsafe context.
+Suggested size: L.
+Suggested owners: Product Owner, Architect, Security, Developer, QA.
+### Story 3: Domain Persona Packs
+As a project owner, I want reusable domain persona packs so that non-software
+projects can bootstrap relevant expert profiles.
+Acceptance criteria:
+- Defines packaged baseline packs for software delivery, music/audio, ML/AI,
+  healthcare, data/analytics, security, operations, and compliance.
+- Packs are compact metadata, not long prompt bodies.
+- Each pack declares default personas, capabilities, activation triggers,
+  evidence expectations, and risk constraints.
+- Healthcare and other regulated packs mark expert personas as human-approval
+  required for regulated decisions.
+- Includes tests for pack loading, filtering by domain, and high-risk activation
+  constraints.
+Suggested size: M.
+Suggested owners: Product Manager, Architect, Compliance/Privacy, QA.
+### Story 4: Runtime Persona Activation
+As a workflow runtime, I want to activate personas only when a task, phase, risk,
+or required role needs them so that context stays bounded and relevant.
+Acceptance criteria:
+- Resolves active personas from task signals, required roles, optional roles,
+  phase, paths, domain, risk level, and explicit user request.
+- Does not load every persona for every task.
+- Returns activation rationale and skipped-persona rationale.
+- Integrates with capability and skill resolution without duplicating
+  hardcoded role maps.
+- Blocks unapproved, rejected, deprecated, or policy-ineligible personas.
+- Includes chaos/resilience tests for corrupted registry state, missing
+  approvals, disabled persona packs, and provider-independent offline
+  activation.
+- Includes tests for phase-scoped activation, domain triggers, no-match cases,
+  high-risk human approval, and capability resolution.
+Suggested size: M.
+Suggested owners: Architect, Developer, QA.
+### Story 5: Persona Governance and Safety Gates
+As a security and compliance reviewer, I want persona creation and activation to
+respect authority boundaries so that generated personas cannot approve unsafe or
+regulated outcomes.
+Acceptance criteria:
+- Requires human approval before any generated persona becomes active.
+- Requires explicit human approver mapping for regulated authority levels.
+- Prevents generated medical, legal, financial, privacy, or regulatory personas
+  from issuing final approval unless policy allows a configured human approver.
+- Records audit events for generation, approval, rejection, activation, and
+  deactivation.
+- Adds security review when personas touch secrets, PII, PHI, financial data,
+  regulated decisions, network calls, or external references.
+- Includes chaos/resilience tests for approval race conditions, missing approver
+  mappings, policy engine failure, audit write failure, and denied regulated
+  authority escalation.
+- Includes tests for approval policy, audit events, blocked activation, and
+  unsafe authority escalation.
+Suggested size: M.
+Suggested owners: Security, Compliance/Privacy, Architect, QA.
+### Story 6: Web Console Persona Management
+As a human user, I want to manage project personas in the web console so that I
+can review and approve personas without editing local files.
+Acceptance criteria:
+- Adds a Personas management page or section with list, detail, create,
+  generate, edit draft, approve, reject, deactivate, export, and import flows.
+- Shows persona status, authority level, domain, risk, triggers, capabilities,
+  evidence expectations, and approval state.
+- Provides empty, loading, error, success, and recovery states.
+- Uses tooltips for authority level, regulated scope, activation triggers, and
+  evidence expectations.
+- Includes resilience coverage for API timeout, failed generation, rejected
+  approval, import validation failure, and stale registry data.
+- Includes Playwright coverage for list, create draft, approve, reject, and
+  high-risk warning flows.
+Suggested size: L.
+Suggested owners: UX/UI Designer, Developer, QA, Technical Writer.
+### Story 7: Persona Documentation and User Guide
+As a user, I want clear persona registry documentation so that I understand how
+to define domain experts, generated personas, and approval constraints.
+Acceptance criteria:
+- Documents CLI and API usage for persona registry operations.
+- Explains packaged personas, generated drafts, approval, activation, and
+  regulated-domain limits.
+- Includes examples for software, music/audio, ML/AI, and healthcare projects.
+- Documents offline-first behavior and optional SaaS/external enrichment.
+- Updates the public site docs if persona management becomes user-facing.
+Suggested size: S.
+Suggested owners: Technical Writer, Product Owner, QA.
+## Epic Acceptance Criteria
+- Persona registry avoids manual markdown editing for normal persona management.
+- Persona generation produces draft, reviewable personas, not automatically
+  active authorities.
+- Runtime activation is bounded by task need, phase, role, risk, and project
+  domain.
+- Regulated domains require explicit human approval policies.
+- Offline-first mode works without internet access.
+- SaaS/external enrichment is optional and tenant-scoped when enabled.
+- Child stories are independently estimable and testable.
+- Chaos testing validates safe degradation for persona registry, runtime
+  activation, generation, governance, and web console flows.
+## Non-Goals
+- Replacing built-in delivery roles in the first release.
+- Building a full SaaS registry in the MVP.
+- Allowing generated personas to approve regulated clinical, legal, financial,
+  or compliance outcomes without a configured human approver.
+- Making internet access mandatory for persona generation.
+- Storing raw prompts or sensitive project context by default.
+## Risks
+- Generated personas may sound authoritative without enough domain validation.
+- Prompt injection from lessons learned, issues, docs, or external references.
+- Secret or sensitive data leakage during persona generation.
+- Context bloat if runtime activation loads too many personas.
+- Tenant and regulatory boundary violations in SaaS mode.
+- Duplicate or conflicting personas if registry ownership is unclear.
+- Workflow degradation if multiple agents generate or activate personas while
+  providers, approval stores, or prompt/context registries are unavailable.
+## Chaos And Resilience Testing
+Persona features should load the `chaos-resilience-testing` skill for controlled
+failure scenarios where they affect runtime behavior or authority. The goal is
+not random fault injection in the first MVP; it is deterministic chaos-style
+validation that proves the product fails closed, preserves auditability, and
+keeps offline operation usable.
+Minimum scenarios:
+- provider timeout or unavailable model during persona generation;
+- prompt bank, lessons, or external enrichment source unavailable;
+- malformed, oversized, or partially imported persona registry data;
+- concurrent approval/update attempts for the same persona;
+- missing human approver mapping for regulated authority;
+- budget envelope exhausted before or during generation;
+- policy engine denies activation or fails closed;
+- audit/event write failure during approval or activation;
+- web console API timeout or stale registry response;
+- offline mode with no external network access.
+Expected behavior:
+- generated personas remain `draft` unless approval succeeds;
+- regulated personas cannot approve final outcomes without configured human
+  authority;
+- runtime activation skips unsafe personas and records the rationale;
+- unavailable optional sources degrade with explicit evidence instead of
+  blocking local workflows;
+- failures produce QA evidence that maps back to the story acceptance criteria.
+## Suggested Epic Size
+XL for full epic.
+MVP slice: L.
+Recommended first implementation order:
+1. Dynamic UX phase routing for user-facing work
+   (`UX-DYNAMIC-PHASE-ROUTING-20260517`) before the persona web console story.
+2. Local Persona Registry API.
+3. Runtime Persona Activation.
+4. Persona Governance and Safety Gates.
+5. Persona Generation Assistant.
+6. Web Console Persona Management.
+7. Domain Persona Packs.
+8. Documentation and user guide.
+## Related Workflow Story
+`UX-DYNAMIC-PHASE-ROUTING-20260517` should add dynamic UX routing for
+user-facing tasks before persona management UI is implemented. The current
+workflow can recommend `ux_review` after implementation, but it does not yet
+support a pre-architecture `ux_design` phase or BA/PO validation of UX artifacts
+before architecture.
+Expected target sequence for user-facing work:
+```txt
+po/ba -> ux_design -> po_validation -> architect -> developer -> qa -> ux_review -> release
+```
+The first implementation may keep this advisory when a task or global
+`workflow.phaseSequence` is manually configured, but the phase planner should
+make the recommendation explicit and explain why UX must provide design inputs
+before architecture for UI-heavy work.
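
Reviewer sketch (not part of the package diff): the Persona Model fields and the activation rules in this epic (drafts stay inactive, unapproved or deprecated personas are blocked, activation returns a rationale) might be modelled as below. Only the field names and the enumerated values for `authorityLevel`, `sourcePolicy`, and `status` come from the epic; the field types, the subset of fields shown, and the `resolveActivation` function are assumptions made for illustration.

```typescript
// Illustrative subset of the Persona Model; the exact types are assumptions.
type AuthorityLevel = "advisory" | "reviewer" | "approver" | "executor";
type PersonaStatus = "draft" | "active" | "deprecated" | "rejected";
type SourcePolicy = "packaged" | "project" | "generated" | "imported" | "external";

interface Persona {
  id: string;
  displayName: string;
  roleFamily: string;
  domain: string;
  workflowPhases: string[];
  authorityLevel: AuthorityLevel;
  sourcePolicy: SourcePolicy;
  status: PersonaStatus;
  approvedBy?: string;
}

interface ActivationDecision {
  personaId: string;
  activated: boolean;
  rationale: string; // recorded for both activated and skipped personas
}

// Hypothetical resolver: activates only approved personas scoped to the phase.
function resolveActivation(personas: Persona[], phase: string): ActivationDecision[] {
  return personas.map((p) => {
    if (p.status !== "active") {
      return { personaId: p.id, activated: false, rationale: `status is ${p.status}` };
    }
    if (p.sourcePolicy === "generated" && !p.approvedBy) {
      return { personaId: p.id, activated: false, rationale: "generated persona lacks human approval" };
    }
    if (!p.workflowPhases.some((candidate) => candidate === phase)) {
      return { personaId: p.id, activated: false, rationale: `not scoped to phase ${phase}` };
    }
    return { personaId: p.id, activated: true, rationale: `matches phase ${phase}` };
  });
}

const examplePersonas: Persona[] = [
  {
    id: "mastering-qa", displayName: "Mastering QA", roleFamily: "qa", domain: "music/audio",
    workflowPhases: ["qa"], authorityLevel: "reviewer", sourcePolicy: "project",
    status: "active", approvedBy: "project-owner",
  },
  {
    id: "clinical-sme", displayName: "Clinical SME", roleFamily: "sme", domain: "healthcare",
    workflowPhases: ["qa"], authorityLevel: "advisory", sourcePolicy: "generated",
    status: "draft",
  },
];

const qaDecisions = resolveActivation(examplePersonas, "qa");
```

Returning a decision for every persona, including skipped ones, mirrors the "activation rationale and skipped-persona rationale" criterion in Story 4.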

package/docs/duplicate-code-enforcement.md ADDED

@@ -0,0 +1,60 @@
+# Duplicate-Code Enforcement
+Open Orchestra uses `jscpd` to make the existing DRY and
+`collection-standards` rules enforceable in CI.
+## Command
+```bash
+npm run duplicates
+```
+The command writes machine-readable reports under
+`.agent-workflow/reports/jscpd/` and prints a console summary.
+## Scope
+The detector scans source, scripts, web clients, extension code, rules, skills,
+and selected docs. It excludes generated output, package locks, workflow runtime
+state, diagrams, historical reports, and large backlog documents that are not
+maintained as product code.
+The initial threshold is intentionally conservative:
+- `threshold`: 5%
+- `minLines`: 10
+- `minTokens`: 80
+This keeps the gate focused on meaningful copy-paste while avoiding noisy
+failures from small examples or test setup.
+## Relationship To Collection Standards
+`jscpd` detects textual duplication. It does not replace semantic review.
+When the duplicate block is a repeated list, map, command matrix, provider list,
+role/status list, selector set, fixture set, or validator set, remediation must
+load `collection-standards` and move the values into a typed source of truth.
+Examples:
+- command lists should derive from the command manifest or a command catalog;
+- workflow phases should derive from the workflow phase registry;
+- task statuses should derive from task status helpers;
+- UI dropdown options should derive from shared metadata or API contracts;
+- test fixtures should use builders/factories when reused across files.
+## CI Behavior
+The CI quality job runs `npm run duplicates` after the normal precommit gate.
+This keeps duplicate detection visible in pull requests and release evidence.
+If a finding is intentional, prefer one of these responses:
+1. Extract a shared source of truth.
+2. Narrow the scanned scope if the file is generated or historical.
+3. Add a documented exclusion only when duplication is unavoidable.
+4. Create a follow-up issue when the remediation is larger than the current
+   story.
+Do not suppress findings silently.
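
Reviewer sketch (not part of the package diff): the "typed source of truth" remediation described under Relationship To Collection Standards could take a shape like the one below. The status values and helper names are hypothetical; the package's actual list lives in its own task status helpers and is not shown in this diff.

```typescript
// Hypothetical status values, declared once as the single source of truth.
const TASK_STATUSES = ["todo", "in_progress", "blocked", "done"] as const;

// The union type is derived from the list, so the two cannot drift apart.
type TaskStatus = (typeof TASK_STATUSES)[number];

// Runtime validator derived from the same list.
function isTaskStatus(value: string): value is TaskStatus {
  return (TASK_STATUSES as readonly string[]).some((status) => status === value);
}

// UI options are derived instead of being copied into a second literal list.
const statusDropdownOptions = TASK_STATUSES.map((status) => ({
  value: status,
  label: status.replace("_", " "),
}));
```

With this layout, adding a status means editing one array; the type, the validator, and the dropdown follow automatically, which is the kind of duplication `jscpd` would otherwise flag as repeated lists.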

package/docs/release-test-matrix.md CHANGED

@@ -40,6 +40,10 @@ manual intervention is required.
 | Flow                   | Command                               | Evidence                                                                                                                                                               |
 | ---------------------- | ------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | Source quality gate    | `npm run precommit`                   | lint, typecheck, secret scan, security audit, build, unit tests, workflow validation                                                                                   |
+| Secret scanning gate   | `npm run secret-scan`                 | Gitleaks scan with `.gitleaks.toml` when the binary is installed; lightweight fallback for offline local development                                                   |
+| Duplicate-code gate    | `npm run duplicates`                  | jscpd duplicate-code report with generated/runtime outputs excluded and collection-standards follow-up for duplicated domain lists                                     |
+| Task split guard       | `node --test test/task-split-assessment.test.js` | PO/BA functional oversize, Architect technical complexity, routine small-task non-blocking behavior, and markdown evidence rendering                                    |
+| Sonar quality gate     | GitHub Actions: `Sonar`               | conditional quality gate for duplication, bugs, code smells, maintainability, coverage readiness, and security hotspots when `SONAR_TOKEN` is configured              |
 | Browser E2E            | `npm run test:e2e`                    | Playwright checks map scenario acceptance criteria to visible UI state, API persistence, artifact attachment, responsive layout, and recovery behavior                 |
 | Installed package init | `npm run test:e2e:init`               | Installed CLI checks map scenario acceptance criteria to stdout, stderr, exit code, filesystem state, JSON contracts, evidence records, and release-readiness outcomes |
 | Public site build      | `npm run site:build`                  | production site build                                                                                                                                                  |
@@ -52,3 +56,13 @@ manual intervention is required.
 The default release matrix is offline-friendly. Provider and tracker tests that
 need network access must honor `SKIP_NETWORK_TESTS` and report skipped status
 instead of failing offline CI.
+Sonar is conditional because it requires `SONAR_TOKEN`. When configured, a
+failing Sonar quality gate blocks release on new-code quality. When unavailable
+or offline, release evidence must state that Sonar was skipped and attach the
+local quality gates that ran instead.
+The duplicate-code gate is local and CI-friendly after dependencies are
+installed. When it reports copied domain lists, command matrices, providers,
+roles, statuses, fixtures, selectors, or validators, remediation should use the
+`collection-standards` skill and extract a typed source of truth.
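
Reviewer sketch (not part of the package diff): the conditional Sonar rule above (run when `SONAR_TOKEN` is configured, otherwise record a skip and list the local gates that ran) can be expressed as a small pure function. The `GateEvidence` shape and `sonarGateEvidence` name are assumptions for illustration; only the `SONAR_TOKEN` variable and the gate names come from the doc.

```typescript
// Hypothetical evidence record; the package's real evidence format is not shown here.
interface GateEvidence {
  gate: string;
  status: "ran" | "skipped";
  note: string;
}

// Takes the environment as a plain record so the rule can be tested offline.
function sonarGateEvidence(
  env: Record<string, string | undefined>,
  localGates: string[],
): GateEvidence {
  if (env["SONAR_TOKEN"]) {
    return {
      gate: "sonar",
      status: "ran",
      note: "a failing quality gate blocks release on new-code quality",
    };
  }
  // Skips must be stated explicitly, together with the gates that ran instead.
  return {
    gate: "sonar",
    status: "skipped",
    note: `SONAR_TOKEN not configured; local gates that ran: ${localGates.join(", ")}`,
  };
}

const offlineEvidence = sonarGateEvidence({}, ["npm run precommit", "npm run duplicates"]);
```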

package/docs/reports/duplicate-code-baseline-20260518.md ADDED

@@ -0,0 +1,41 @@
+# Duplicate-Code Baseline 2026-05-18
+Task: `DUPLICATE-CODE-ENFORCEMENT-20260518`
+Command:
+```bash
+npm run duplicates
+```
+Result: pass. The configured threshold is 5%; current total duplicated lines
+are 0.24%.
+## Summary
+| Format | Files analyzed | Clones | Duplicated lines |
+| ------ | -------------- | ------ | ---------------- |
+| TypeScript | 160 | 8 | 119 (0.34%) |
+| Total | 275 | 8 | 119 (0.24%) |
+## Findings
+- `src/workflow-task-service.ts`: repeated mutation/event shape for task removal
+  paths.
+- `src/workflow-approval-service.ts`: repeated approval record construction.
+- `src/telemetry-export.ts`: repeated telemetry export handling.
+- `src/skills-planning.ts`: repeated skill-source grouping logic.
+- `src/phase-executor.ts` and `src/qa-coverage.ts`: repeated text/classification
+  helpers.
+- `src/phase-executor.ts` and `src/planning-commands.ts`: repeated rendering or
+  summary logic.
+- `src/cli.ts` and `src/upgrade-commands.ts`: repeated command/option parsing
+  shape.
+- `src/autonomous-run-state.ts`: repeated phase status transition shape.
+## Decision
+The baseline is accepted for this enforcement task because the duplication is
+below threshold and no finding blocks CI. Future work should address repeated
+domain lists or command matrices using `collection-standards`; repeated control
+flow should be refactored only when it reduces meaningful complexity.

package/docs/runtime-adapters.md CHANGED

@@ -151,6 +151,50 @@ external provider processes directly; they record auditable suspend, resume,
 cancel, or close events so the parent runtime can reconcile claimed work,
 stale sessions, and handoff state without inventing a second source of truth.
+## Workflow Phase Executors
+`workflow run` can plan how each phase should be executed without confusing the
+role/profile with the runtime executor:
+- **Role/profile**: PM, PO, Architect, Developer, QA, Release, or another phase
+  owner. This controls responsibilities, playbooks, expected evidence, and gate
+  authority.
+- **Runtime executor**: `codex-cli`, `claude-cli`, `cursor-cli`,
+  `opencode-cli`, `vscode-agent`, `windsurf-agent`, or `generic-runtime`.
+  This controls where the brief or delegation packet is intended to run.
+- **Subagent**: a runtime-native role-scoped execution unit, only available
+  when the selected runtime adapter declares `subagents.runtimeNative: true`.
+- **Provider**: a direct model/provider route used by provider-backed phase
+  prompts. Provider APIs are separate from runtime-native subagents and are
+  never used as a silent fallback for runtime delegation.
+The workflow phase execution mode can be selected per run:
+```bash
+orchestra workflow run --task STORY-001 --phase-execution auto
+orchestra workflow run --task STORY-001 --phase-execution subagents
+orchestra workflow run --task STORY-001 --phase-execution single-agent
+```
+`auto` uses runtime-native subagent packets when the selected runtime supports
+them and delegation guardrails allow the spawn; otherwise it records a
+parent-agent fallback reason. `subagents` requires runtime-native support and
+fails fast if the runtime cannot satisfy it. `single-agent` forces the parent
+agent path and records that choice in phase provenance.
+Subagent spawning is bounded by `runtimePolicy.delegation.guardrails`.
+`maxConcurrentDelegates` is the threshold for simultaneously running delegated
+sessions, `maxSpawnsPerTask` limits fan-out for one task, and `limitAction`
+controls whether pressure should `queue` or `reject`. With the default `queue`
+policy, a phase that cannot acquire capacity is paused as a queued runtime
+subagent instead of silently falling back to the parent agent. Resume the
+workflow after capacity is released.
+Each phase stores executor provenance in the workflow run and handoff:
+execution mode, executor type, phase, role, runtime id, delegation packet path
+when one was rendered, session id when available, fallback reason, and
+`directProviderApiAllowed=false`.
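
Reviewer sketch (not part of the package diff): the guardrail behavior described above can be read as a small decision function. The three field names (`maxConcurrentDelegates`, `maxSpawnsPerTask`, `limitAction`) come from the doc; the surrounding types, the `decideSpawn` name, the `>=` comparison, and applying `limitAction` to both limits are assumptions, since the doc does not spell them out.

```typescript
// Field names follow the doc; everything else here is a hypothetical illustration.
interface DelegationGuardrails {
  maxConcurrentDelegates: number;
  maxSpawnsPerTask: number;
  limitAction: "queue" | "reject";
}

interface SpawnDecision {
  action: "spawn" | "queue" | "reject";
  reason?: string; // present only when a limit was hit
}

function decideSpawn(
  guardrails: DelegationGuardrails,
  runningDelegates: number,
  spawnsForTask: number,
): SpawnDecision {
  let reason: string | undefined;
  if (runningDelegates >= guardrails.maxConcurrentDelegates) {
    reason = "maxConcurrentDelegates reached";
  } else if (spawnsForTask >= guardrails.maxSpawnsPerTask) {
    reason = "maxSpawnsPerTask reached";
  }
  if (reason === undefined) {
    return { action: "spawn" };
  }
  // Under pressure the phase is queued or rejected; it never falls back silently.
  return { action: guardrails.limitAction, reason };
}

const queuePolicy: DelegationGuardrails = {
  maxConcurrentDelegates: 2,
  maxSpawnsPerTask: 3,
  limitAction: "queue",
};

const underPressure = decideSpawn(queuePolicy, 2, 0);
```

Note that no branch returns a parent-agent fallback: per the doc, a queued phase waits for capacity and the workflow is resumed later.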
 Cursor canvas sync is intentionally runtime-specific:
 ```bash

package/docs/runtime-llm-flow.md CHANGED

@@ -223,8 +223,10 @@ budget fallback.
   approval.
 - Runtime execution adapters render briefs and delegation packets, but they do
   not yet launch external CLI/IDE processes non-interactively.
-- It records delegation decisions, but it does not automatically spawn
-  subagents yet.
+- `workflow run --phase-execution auto|subagents|single-agent` records phase
+  executor provenance and can render role-scoped runtime-native delegation
+  packets, but the parent runtime still owns any real spawn/interaction with
+  external CLI or IDE subagents.
 - Parallel independent CLI commands are expected to work, but dependent commands
   still need parent-agent ordering or future DAG semantics.
 - Workflow files are local state. Promote durable lessons into docs, skills, or