scene-capability-engine 3.6.7 → 3.6.9

Files changed (9)

package/CHANGELOG.md +26 -0
package/README.md +4 -1
package/README.zh.md +4 -1
package/docs/agent-runtime/orchestrator-rate-limit-profiles.md +4 -0
package/docs/command-reference.md +12 -0
package/lib/commands/state.js +90 -0
package/lib/orchestrator/orchestration-engine.js +234 -4
package/lib/orchestrator/orchestrator-config.js +15 -0
package/package.json +1 -1

package/CHANGELOG.md

@@ -7,6 +7,32 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [3.6.9] - 2026-03-05
+### Added
+- Orchestration runtime now emits machine-readable `rate-limit:decision` telemetry events for retry/throttle/hold/recovery transitions.
+- New anti-429 runtime config knobs in `.sce/config/orchestrator.json`:
+  - `rateLimitRetrySpreadMs`
+  - `rateLimitLaunchHoldPollMs`
+  - `rateLimitDecisionEventThrottleMs`
+### Changed
+- Rate-limit retries now apply deterministic per-spec retry spread to reduce synchronized 429 bursts under high parallelism.
+- Launch-hold polling interval is now configurable, so anti-429 pause loops can be tuned for responsiveness vs overhead.
+## [3.6.8] - 2026-03-05
+### Added
+- New one-shot state command:
+  - `sce state reconcile --all --apply --json`
+  - runs `doctor -> migrate -> doctor` with before/after drift summary in one payload
+- Release workflow now runs state reconcile before reconciliation gate and uploads:
+  - `state-reconcile-<tag>.json`
+### Changed
+- Release workflow state reconciliation gate is now enforce-by-default (`KSE_RELEASE_STATE_MIGRATION_ENFORCE` defaults to `true`).
+- Release workflow Node runtime updated to `22.x` so sqlite-backed reconciliation can run in release pipeline.
 ## [3.6.7] - 2026-03-05
 ### Fixed

package/README.md

@@ -121,6 +121,7 @@ SCE is tool-agnostic and works with Codex, Claude Code, Cursor, Windsurf, VS Cod
 - Session governance is scene-first: `1 scene = 1 primary session`.
 - Spec work is attached as child sessions and auto-archived.
 - Startup now auto-detects adopted projects and aligns takeover baseline defaults automatically.
+- Multi-agent anti-429 runtime now supports deterministic retry spread and machine-readable `rate-limit:decision` telemetry (`rateLimitRetrySpreadMs`, `rateLimitLaunchHoldPollMs`, `rateLimitDecisionEventThrottleMs`).
 - Problem evaluation policy is enabled by default (`.sce/config/problem-eval-policy.json`) and evaluates every Studio stage.
 - Problem closure policy is enabled by default (`.sce/config/problem-closure-policy.json`) and blocks verify/release bypass when required domain/problem evidence is missing.
 - Error handling now follows a full incident loop by default: every record attempt is staged first and auto-closed on verified/promoted outcomes.
@@ -149,8 +150,10 @@ Studio task-stream output contract (default):
   - `sce state plan --json`
   - `sce state doctor --json`
   - `sce state migrate --all --apply --json`
+  - `sce state reconcile --all --apply --json` (doctor -> migrate -> doctor one-shot)
   - `sce state export --out .sce/reports/state-migration/state-export.latest.json --json`
   - reconciliation gate: `npm run gate:state-migration-reconciliation`
+  - release workflow defaults to enforce mode for state reconciliation gate and runs reconcile before publish
   - runtime reads now prefer sqlite indexes for timeline/scene-session views when indexed data exists
   - `state doctor` now emits `summary` and runtime diagnostics (`runtime.timeline`, `runtime.scene_session`) with read-source and consistency status
   - migratable components now include runtime + errorbook + spec-governance + release evidence indexes (`errorbook.entry-index`, `errorbook.incident-index`, `governance.spec-scene-overrides`, `governance.scene-index`, `release.evidence-runs-index`, `release.gate-history-index`)
@@ -215,5 +218,5 @@ MIT. See [LICENSE](LICENSE).
 ---
-**Version**: 3.6.3
+**Version**: 3.6.9
 **Last Updated**: 2026-03-05

package/README.zh.md

@@ -121,6 +121,7 @@ SCE 对工具无锁定，可接入 Codex、Claude Code、Cursor、Windsurf、VS
 - 会话治理默认场景优先：`1 scene = 1 primary session`。
 - Spec 执行作为子会话自动归档，支持跨轮次追踪。
 - 启动时会自动识别已接管项目并对齐接管基线默认配置。
+- 多 Agent 抗 429 运行时新增“确定性重试错峰 + 机器可读 `rate-limit:decision` 事件”，可通过 `rateLimitRetrySpreadMs`、`rateLimitLaunchHoldPollMs`、`rateLimitDecisionEventThrottleMs` 调优。
 - 问题评估策略默认启用（`.sce/config/problem-eval-policy.json`），Studio 各阶段都会执行评估。
 - 问题闭环策略默认启用（`.sce/config/problem-closure-policy.json`），缺失必要问题/领域证据时会在 verify/release 阶段阻断。
 - 错误处理默认进入完整 incident 闭环：每次记录先落到 staging 试错链路，verified/promoted 后自动收束归档。
@@ -149,8 +150,10 @@ Studio 任务流输出契约（默认）：
   - `sce state plan --json`
   - `sce state doctor --json`
   - `sce state migrate --all --apply --json`
+  - `sce state reconcile --all --apply --json`（一键执行 doctor -> migrate -> doctor）
   - `sce state export --out .sce/reports/state-migration/state-export.latest.json --json`
   - 对账门禁：`npm run gate:state-migration-reconciliation`
+  - 发布工作流默认对 state reconciliation 启用 enforce，并在发布前执行 reconcile
   - 运行时读取在存在索引数据时优先使用 SQLite（timeline/scene-session 视图）
   - `state doctor` 新增 `summary` 与运行时诊断（`runtime.timeline`、`runtime.scene_session`），可直接读取读源与一致性状态
   - 可迁移组件扩展到 runtime + errorbook + spec-governance + release evidence 索引（`errorbook.entry-index`、`errorbook.incident-index`、`governance.spec-scene-overrides`、`governance.scene-index`、`release.evidence-runs-index`、`release.gate-history-index`）
@@ -215,5 +218,5 @@ MIT，见 [LICENSE](LICENSE)。
 ---
-**版本**：3.6.3
+**版本**：3.6.9
 **最后更新**：2026-03-05

package/docs/agent-runtime/orchestrator-rate-limit-profiles.md

@@ -23,6 +23,9 @@ This document defines the default anti-429 presets used by SCE multi-agent orche
 | `rateLimitSignalThreshold` | 2 | 3 | 4 |
 | `rateLimitSignalExtraHoldMs` | 5000 | 3000 | 2000 |
 | `rateLimitDynamicBudgetFloor` | 1 | 1 | 2 |
+| `rateLimitRetrySpreadMs` | 1200 | 600 | 250 |
+| `rateLimitLaunchHoldPollMs` | 1000 | 1000 | 1000 |
+| `rateLimitDecisionEventThrottleMs` | 1000 | 1000 | 1000 |
 ## Usage
@@ -64,3 +67,4 @@ Release readiness criteria:
 1. No failing test in orchestrator/rate-limit scope.
 2. `orchestrate profile show --json` returns expected profile and effective values.
 3. Multi-agent run no longer stalls under sustained `429`; launch budget and hold telemetry progress over time.
+4. `rate-limit:decision` events are emitted as machine-readable telemetry for retry/throttle/recovery transitions.

package/docs/command-reference.md

@@ -719,6 +719,9 @@ sce state migrate --all --json
 # apply migration writes into sqlite index tables
 sce state migrate --all --apply --json
+# reconcile in one flow (doctor -> migrate -> doctor)
+sce state reconcile --all --apply --json
 # migrate specific components
 sce state migrate --component collab.agent-registry --component runtime.timeline-index --apply --json
@@ -756,6 +759,7 @@ SQLite index tables introduced for gradual migration:
 Runtime read preference:
 - timeline/session runtime views now prefer SQLite indexes when indexed rows exist.
 - file artifacts remain source-of-truth for content payload and recovery operations.
+- release workflow runs `state reconcile --all --apply` before reconciliation gate.
 - `sce state doctor --json` now includes:
   - `summary` aggregate (`pending_components`, `total_record_drift`, `blocking_count`, `alert_count`)
   - runtime read diagnostics (`runtime.timeline`, `runtime.scene_session`) with read-source/read-preference and consistency status
@@ -1792,6 +1796,9 @@ Recommended `.sce/config/orchestrator.json`:
   "rateLimitSignalThreshold": 3,
   "rateLimitSignalExtraHoldMs": 3000,
   "rateLimitDynamicBudgetFloor": 1,
+  "rateLimitRetrySpreadMs": 600,
+  "rateLimitLaunchHoldPollMs": 1000,
+  "rateLimitDecisionEventThrottleMs": 1000,
   "apiKeyEnvVar": "CODEX_API_KEY",
   "codexArgs": ["--skip-git-repo-check"],
   "codexCommand": "npx @openai/codex"
@@ -1805,9 +1812,14 @@ Recommended `.sce/config/orchestrator.json`:
 - `rateLimitSignalThreshold`: signals required inside window before escalation
 - `rateLimitSignalExtraHoldMs`: extra launch hold per escalation unit
 - `rateLimitDynamicBudgetFloor`: lowest dynamic launch budget allowed during sustained pressure
+- `rateLimitRetrySpreadMs`: deterministic retry spread (per spec/retry round) to reduce synchronized retry bursts
+- `rateLimitLaunchHoldPollMs`: polling interval while launch hold is active (lower values react faster, higher values reduce loop overhead)
+- `rateLimitDecisionEventThrottleMs`: de-dup interval for repeated `rate-limit:decision` telemetry events
 `orchestrate stop` interrupts pending retry waits immediately so long backoff windows do not look like deadlocks.
+Runtime emits machine-readable `rate-limit:decision` events for retry/throttle/hold/recovery transitions, so UI or controller layers can surface anti-429 actions directly.
 Codex sub-agent permission defaults:
 - `--sandbox danger-full-access` is always injected by orchestrator runtime.
 - `--ask-for-approval never` is injected by default when `codexArgs` does not explicitly set approval mode.
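
The `rateLimitDecisionEventThrottleMs` description and the `rate-limit:decision` note in this hunk can be illustrated with a small standalone sketch. It is not the package's code: `createDecisionEmitter` is a hypothetical helper, the plain `EventEmitter` stands in for the orchestration engine, and the injected fake clock exists only to keep the example deterministic. The de-dup rule follows what the engine diff shows (same `decision:reason` key as the previous event, inside the throttle interval).

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical helper (not part of the package): wraps an emitter with a
// "same key as last event + minimum interval" de-dup rule.
function createDecisionEmitter(emitter, throttleMs, now = Date.now) {
  let lastKey = '';
  let lastAt = 0;
  return function emitDecision(decision, payload = {}) {
    const reason = typeof payload.reason === 'string' ? payload.reason.trim() : '';
    const key = `${decision}:${reason}`;
    const at = now();
    // Only an immediate repeat of the same decision/reason pair is suppressed.
    if (throttleMs > 0 && key === lastKey && (at - lastAt) < throttleMs) {
      return false;
    }
    lastKey = key;
    lastAt = at;
    emitter.emit('rate-limit:decision', {
      decision,
      at: new Date(at).toISOString(),
      ...payload,
    });
    return true;
  };
}

// Usage: a controller layer collecting decisions for display.
const engine = new EventEmitter(); // stand-in for the orchestration engine
const seen = [];
engine.on('rate-limit:decision', (event) => seen.push(event));

let clock = 10000; // fake clock (ms) so the run is repeatable
const emitDecision = createDecisionEmitter(engine, 1000, () => clock);

emitDecision('launch-hold', { reason: 'launch-budget', holdMs: 4000 }); // emitted
clock += 200;
emitDecision('launch-hold', { reason: 'launch-budget', holdMs: 3800 }); // suppressed
clock += 1000;
emitDecision('launch-hold', { reason: 'launch-budget', holdMs: 2800 }); // emitted
```

Because only the previous key is remembered, two alternating decision types would each pass through; a consumer that needs stricter limits would have to throttle per key on its own side.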

package/lib/commands/state.js

@@ -8,6 +8,7 @@ const {
   runStateDoctor,
   runStateExport
 } = require('../state/state-migration-manager');
+const { getSceStateStore } = require('../state/sce-state-store');
 function normalizeString(value) {
   if (typeof value !== 'string') {
@@ -147,6 +148,85 @@ async function runStateExportCommand(options = {}, dependencies = {}) {
   return payload;
 }
+async function runStateReconcileCommand(options = {}, dependencies = {}) {
+  const projectPath = dependencies.projectPath || process.cwd();
+  const fileSystem = dependencies.fileSystem || fs;
+  const env = dependencies.env || process.env;
+  const components = normalizeComponentInput(options.component);
+  const componentIds = options.all === true ? [] : components;
+  const apply = options.apply === true;
+  const stateStore = dependencies.stateStore || getSceStateStore(projectPath, {
+    fileSystem,
+    env
+  });
+  const before = await runStateDoctor({}, {
+    projectPath,
+    fileSystem,
+    env,
+    stateStore
+  });
+  const migration = await runStateMigration({
+    apply,
+    all: options.all === true,
+    componentIds
+  }, {
+    projectPath,
+    fileSystem,
+    env,
+    stateStore
+  });
+  const after = await runStateDoctor({}, {
+    projectPath,
+    fileSystem,
+    env,
+    stateStore
+  });
+  const beforePending = Number(before && before.summary && before.summary.pending_components) || 0;
+  const afterPending = Number(after && after.summary && after.summary.pending_components) || 0;
+  const beforeBlocking = Number(before && before.summary && before.summary.blocking_count) || 0;
+  const afterBlocking = Number(after && after.summary && after.summary.blocking_count) || 0;
+  const pendingReduced = Math.max(0, beforePending - afterPending);
+  const blockingReduced = Math.max(0, beforeBlocking - afterBlocking);
+  const payload = {
+    mode: 'state-reconcile',
+    success: Boolean(migration && migration.success) && afterBlocking === 0,
+    apply,
+    generated_at: new Date().toISOString(),
+    store_path: after && after.store_path ? after.store_path : null,
+    sqlite: after && after.sqlite ? after.sqlite : null,
+    migration,
+    before: {
+      summary: before && before.summary ? before.summary : null,
+      blocking: Array.isArray(before && before.blocking) ? before.blocking : [],
+      alerts: Array.isArray(before && before.alerts) ? before.alerts : []
+    },
+    after: {
+      summary: after && after.summary ? after.summary : null,
+      blocking: Array.isArray(after && after.blocking) ? after.blocking : [],
+      alerts: Array.isArray(after && after.alerts) ? after.alerts : []
+    },
+    summary: {
+      apply,
+      migrated_components: Number(migration && migration.summary && migration.summary.migrated_components) || 0,
+      migrated_records: Number(migration && migration.summary && migration.summary.migrated_records) || 0,
+      before_pending_components: beforePending,
+      after_pending_components: afterPending,
+      pending_components_reduced: pendingReduced,
+      before_blocking_count: beforeBlocking,
+      after_blocking_count: afterBlocking,
+      blocking_reduced: blockingReduced
+    }
+  };
+  printPayload(payload, options, 'State Reconcile');
+  return payload;
+}
 async function safeRun(handler, options = {}, dependencies = {}, title = 'state command') {
   try {
     await handler(options, dependencies);
@@ -199,6 +279,15 @@ function registerStateCommands(program) {
     .option('--out <path>', 'Output file path', '.sce/reports/state-migration/state-export.latest.json')
     .option('--json', 'Print machine-readable JSON output')
     .action(async (options) => safeRun(runStateExportCommand, options, {}, 'state export'));
+  state
+    .command('reconcile')
+    .description('Run doctor + migrate + doctor in one flow (dry-run by default)')
+    .option('--component <id>', `Component id (repeatable): ${knownIds}`, collectOptionValue, [])
+    .option('--all', 'Reconcile all known components')
+    .option('--apply', 'Apply migration writes (default is dry-run)')
+    .option('--json', 'Print machine-readable JSON output')
+    .action(async (options) => safeRun(runStateReconcileCommand, options, {}, 'state reconcile'));
 }
 module.exports = {
@@ -206,5 +295,6 @@ module.exports = {
   runStateDoctorCommand,
   runStateMigrateCommand,
   runStateExportCommand,
+  runStateReconcileCommand,
   registerStateCommands
 };
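
A caller of `sce state reconcile --json` will probably want to turn the payload into a pass/fail decision. The sketch below shows one way to do that using only field names visible in the diff above (`mode`, `success`, `summary.apply`, `summary.after_blocking_count`, `summary.pending_components_reduced`). The helper name `evaluateReconcilePayload` and the sample numbers are hypothetical, and the real release workflow may gate differently.

```javascript
// Hypothetical gate helper: decides whether a reconcile payload should block a pipeline.
function evaluateReconcilePayload(payload) {
  const summary = (payload && payload.summary) || {};
  const reasons = [];
  if (!payload || payload.mode !== 'state-reconcile') {
    reasons.push('unexpected payload mode');
  }
  if (!payload || payload.success !== true) {
    reasons.push('reconcile reported success=false');
  }
  const afterBlocking = Number(summary.after_blocking_count) || 0;
  if (afterBlocking > 0) {
    reasons.push(`${afterBlocking} blocking item(s) remain after migration`);
  }
  return {
    ok: reasons.length === 0,
    dryRun: summary.apply !== true, // without --apply the command is a dry run
    pendingReduced: Number(summary.pending_components_reduced) || 0,
    reasons,
  };
}

// Sample payload with made-up numbers, shaped like the fields in the diff.
const samplePayload = {
  mode: 'state-reconcile',
  success: true,
  apply: true,
  summary: {
    apply: true,
    before_pending_components: 3,
    after_pending_components: 0,
    pending_components_reduced: 3,
    before_blocking_count: 1,
    after_blocking_count: 0,
    blocking_reduced: 1,
  },
};

const verdict = evaluateReconcilePayload(samplePayload);
console.log(verdict);
```

Note that in the command itself `success` already requires `afterBlocking === 0`, so the explicit blocking check here is redundant by design: it only produces a more specific failure message.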

package/lib/orchestrator/orchestration-engine.js

@@ -28,6 +28,10 @@ const DEFAULT_RATE_LIMIT_SIGNAL_WINDOW_MS = 30000;
 const DEFAULT_RATE_LIMIT_SIGNAL_THRESHOLD = 3;
 const DEFAULT_RATE_LIMIT_SIGNAL_EXTRA_HOLD_MS = 3000;
 const DEFAULT_RATE_LIMIT_DYNAMIC_BUDGET_FLOOR = 1;
+const DEFAULT_RATE_LIMIT_RETRY_SPREAD_MS = 600;
+const DEFAULT_RATE_LIMIT_LAUNCH_HOLD_POLL_MS = 1000;
+const DEFAULT_RATE_LIMIT_DECISION_EVENT_THROTTLE_MS = 1000;
+const MAX_RATE_LIMIT_RETRY_SPREAD_MS = 60000;
 const DEFAULT_AGENT_WAIT_TIMEOUT_SECONDS = 600;
 const AGENT_WAIT_TIMEOUT_GRACE_MS = 30000;
 const RATE_LIMIT_BACKOFF_JITTER_RATIO = 0.5;
@@ -136,12 +140,22 @@ class OrchestrationEngine extends EventEmitter {
     this._rateLimitSignalExtraHoldMs = DEFAULT_RATE_LIMIT_SIGNAL_EXTRA_HOLD_MS;
     /** @type {number} minimum dynamic launch budget floor under sustained pressure */
     this._rateLimitDynamicBudgetFloor = DEFAULT_RATE_LIMIT_DYNAMIC_BUDGET_FLOOR;
+    /** @type {number} deterministic per-spec retry spread to prevent synchronized retry bursts */
+    this._rateLimitRetrySpreadMs = DEFAULT_RATE_LIMIT_RETRY_SPREAD_MS;
+    /** @type {number} polling interval while launch hold is active */
+    this._rateLimitLaunchHoldPollMs = DEFAULT_RATE_LIMIT_LAUNCH_HOLD_POLL_MS;
+    /** @type {number} minimum interval between repeated rate-limit decision events */
+    this._rateLimitDecisionEventThrottleMs = DEFAULT_RATE_LIMIT_DECISION_EVENT_THROTTLE_MS;
     /** @type {number[]} timestamps (ms) of recent spec launches for rolling budget accounting */
     this._rateLimitLaunchTimestamps = [];
     /** @type {number} last launch-budget hold telemetry emission timestamp (ms) */
     this._launchBudgetLastHoldSignalAt = 0;
     /** @type {number} last launch-budget hold duration emitted to telemetry (ms) */
     this._launchBudgetLastHoldMs = 0;
+    /** @type {number} last rate-limit decision event emission timestamp (ms) */
+    this._lastRateLimitDecisionAt = 0;
+    /** @type {string} dedupe key for last rate-limit decision event */
+    this._lastRateLimitDecisionKey = '';
     /** @type {Set<{timer: NodeJS.Timeout|null, resolve: (() => void)|null}>} cancellable sleep waiters */
     this._pendingSleeps = new Set();
     /** @type {number} fallback wait timeout to avoid indefinite hangs when lifecycle events are missing */
@@ -373,10 +387,29 @@ class OrchestrationEngine extends EventEmitter {
         const launchHoldMs = Math.max(rateLimitHoldMs, launchBudgetHoldMs);
         if (launchHoldMs > 0) {
           // Pause new launches when provider asks us to retry later or launch budget is exhausted.
+          const holdReason = launchBudgetHoldMs >= rateLimitHoldMs
+            ? 'launch-budget'
+            : 'rate-limit-retry-hold';
           if (launchBudgetHoldMs > 0) {
             this._onLaunchBudgetHold(launchBudgetHoldMs);
           }
-          await this._sleep(Math.min(launchHoldMs, 1000));
+          const launchHoldPollMs = this._toPositiveInteger(
+            this._rateLimitLaunchHoldPollMs,
+            DEFAULT_RATE_LIMIT_LAUNCH_HOLD_POLL_MS
+          );
+          const holdSleepMs = Math.max(1, Math.min(launchHoldMs, launchHoldPollMs));
+          this._emitRateLimitDecision('launch-hold', {
+            reason: holdReason,
+            holdMs: launchHoldMs,
+            sleepMs: holdSleepMs,
+            pendingSpecs: pending.length,
+            inFlightSpecs: inFlight.size,
+            effectiveMaxParallel: this._toPositiveInteger(
+              this._effectiveMaxParallel,
+              this._toPositiveInteger(maxParallel, 1)
+            ),
+          });
+          await this._sleep(holdSleepMs);
           continue;
         }
@@ -606,9 +639,10 @@ class OrchestrationEngine extends EventEmitter {
       this._statusMonitor.incrementRetry(specName);
       this._statusMonitor.updateSpecStatus(specName, 'pending', null, resolvedError);
-      const retryDelayMs = isRateLimitError
-        ? this._resolveRateLimitRetryDelayMs(resolvedError, retryCount)
-        : 0;
+      const retryPlan = isRateLimitError
+        ? this._buildRateLimitRetryPlan(specName, retryCount, resolvedError)
+        : null;
+      const retryDelayMs = retryPlan ? retryPlan.totalDelayMs : 0;
       if (retryDelayMs > 0) {
         this._onRateLimitSignal(retryDelayMs);
         const launchHoldMs = this._getRateLimitLaunchHoldRemainingMs();
@@ -616,13 +650,33 @@ class OrchestrationEngine extends EventEmitter {
           specName,
           retryCount,
           retryDelayMs,
+          retryBaseDelayMs: retryPlan ? retryPlan.baseDelayMs : retryDelayMs,
+          retryHintMs: retryPlan ? retryPlan.retryAfterHintMs : 0,
+          retryBackoffMs: retryPlan ? retryPlan.computedBackoffMs : retryDelayMs,
+          retrySpreadMs: retryPlan ? retryPlan.spreadDelayMs : 0,
           launchHoldMs,
           error: resolvedError,
         });
+        this._emitRateLimitDecision('retry', {
+          reason: 'rate-limit-retry',
+          specName,
+          retryCount,
+          retryDelayMs,
+          retryBaseDelayMs: retryPlan ? retryPlan.baseDelayMs : retryDelayMs,
+          retryHintMs: retryPlan ? retryPlan.retryAfterHintMs : 0,
+          retryBackoffMs: retryPlan ? retryPlan.computedBackoffMs : retryDelayMs,
+          retrySpreadMs: retryPlan ? retryPlan.spreadDelayMs : 0,
+          launchHoldMs,
+          pendingRetryCount: this._retryCounts.get(specName) || 0,
+        });
         this.emit('spec:rate-limited', {
           specName,
           retryCount,
           retryDelayMs,
+          retryBaseDelayMs: retryPlan ? retryPlan.baseDelayMs : retryDelayMs,
+          retryHintMs: retryPlan ? retryPlan.retryAfterHintMs : 0,
+          retryBackoffMs: retryPlan ? retryPlan.computedBackoffMs : retryDelayMs,
+          retrySpreadMs: retryPlan ? retryPlan.spreadDelayMs : 0,
           launchHoldMs,
           error: resolvedError,
         });
@@ -638,6 +692,21 @@ class OrchestrationEngine extends EventEmitter {
       // Final failure (Req 5.3)
       this._failedSpecs.add(specName);
       this._statusMonitor.updateSpecStatus(specName, 'failed', agentId, resolvedError);
+      if (isRateLimitError) {
+        this._emitRateLimitDecision('retry-exhausted', {
+          reason: 'rate-limit-retry-budget-exhausted',
+          specName,
+          retryCount,
+          retryLimit,
+          error: resolvedError,
+        });
+        this.emit('spec:rate-limit-exhausted', {
+          specName,
+          retryCount,
+          retryLimit,
+          error: resolvedError,
+        });
+      }
       // Sync external status
       await this._syncExternalSafe(specName, 'failed');
@@ -1091,6 +1160,21 @@ class OrchestrationEngine extends EventEmitter {
       config && config.rateLimitDynamicBudgetFloor,
       DEFAULT_RATE_LIMIT_DYNAMIC_BUDGET_FLOOR
     );
+    this._rateLimitRetrySpreadMs = Math.min(
+      MAX_RATE_LIMIT_RETRY_SPREAD_MS,
+      this._toNonNegativeInteger(
+        config && config.rateLimitRetrySpreadMs,
+        DEFAULT_RATE_LIMIT_RETRY_SPREAD_MS
+      )
+    );
+    this._rateLimitLaunchHoldPollMs = this._toPositiveInteger(
+      config && config.rateLimitLaunchHoldPollMs,
+      DEFAULT_RATE_LIMIT_LAUNCH_HOLD_POLL_MS
+    );
+    this._rateLimitDecisionEventThrottleMs = this._toNonNegativeInteger(
+      config && config.rateLimitDecisionEventThrottleMs,
+      DEFAULT_RATE_LIMIT_DECISION_EVENT_THROTTLE_MS
+    );
   }
   /**
@@ -1225,6 +1309,12 @@ class OrchestrationEngine extends EventEmitter {
         effectiveMaxParallel: next,
         floor,
       });
+      this._emitRateLimitDecision('parallel-throttled', {
+        reason: 'rate-limit',
+        previousMaxParallel: current,
+        effectiveMaxParallel: next,
+        floor,
+      });
     } else {
       this._effectiveMaxParallel = current;
     }
@@ -1269,6 +1359,12 @@ class OrchestrationEngine extends EventEmitter {
         effectiveMaxParallel: next,
         maxParallel: boundedMax,
       });
+      this._emitRateLimitDecision('parallel-recovered', {
+        reason: 'rate-limit-cooldown',
+        previousMaxParallel: current,
+        effectiveMaxParallel: next,
+        maxParallel: boundedMax,
+      });
     }
   }
@@ -1415,6 +1511,13 @@ class OrchestrationEngine extends EventEmitter {
       windowMs,
       used: this._rateLimitLaunchTimestamps.length,
     });
+    this._emitRateLimitDecision('launch-budget-hold', {
+      reason: 'rate-limit-launch-budget',
+      holdMs,
+      budgetPerMinute,
+      windowMs,
+      used: this._rateLimitLaunchTimestamps.length,
+    });
   }
   /**
@@ -1606,6 +1709,11 @@ class OrchestrationEngine extends EventEmitter {
     if (extraHoldMs > 0) {
       const currentHoldUntil = this._toNonNegativeInteger(this._rateLimitLaunchHoldUntil, 0);
       this._rateLimitLaunchHoldUntil = Math.max(currentHoldUntil, now + extraHoldMs);
+      this._emitRateLimitDecision('launch-hold-escalated', {
+        reason: 'rate-limit-spike-hold',
+        signalCount,
+        extraHoldMs,
+      });
     }
     const configuredBudget = this._toNonNegativeInteger(
@@ -1654,6 +1762,13 @@ class OrchestrationEngine extends EventEmitter {
       windowMs: launchBudgetConfig.windowMs,
       holdMs,
     });
+    this._emitRateLimitDecision('launch-budget-throttled', {
+      reason: 'rate-limit-spike',
+      signalCount,
+      budgetPerMinute: launchBudgetConfig.budgetPerMinute,
+      windowMs: launchBudgetConfig.windowMs,
+      holdMs,
+    });
   }
   /**
@@ -1708,6 +1823,12 @@ class OrchestrationEngine extends EventEmitter {
       windowMs: launchBudgetConfig.windowMs,
       holdMs,
     });
+    this._emitRateLimitDecision('launch-budget-recovered', {
+      reason: 'rate-limit-cooldown',
+      budgetPerMinute: launchBudgetConfig.budgetPerMinute,
+      windowMs: launchBudgetConfig.windowMs,
+      holdMs,
+    });
   }
   /**
@@ -1730,6 +1851,113 @@ class OrchestrationEngine extends EventEmitter {
     return Math.max(1, Math.min(candidateDelayMs, maxDelayMs));
   }
+  /**
+   * Build retry delay details for a rate-limit failure.
+   * Keeps backoff compliant with provider hint while spreading retries across specs.
+   *
+   * @param {string} specName
+   * @param {number} retryCount
+   * @param {string} error
+   * @returns {{computedBackoffMs: number, retryAfterHintMs: number, baseDelayMs: number, spreadDelayMs: number, totalDelayMs: number}}
+   * @private
+   */
+  _buildRateLimitRetryPlan(specName, retryCount, error) {
+    const computedBackoffMs = this._calculateRateLimitBackoffMs(retryCount);
+    const retryAfterHintMs = this._extractRateLimitRetryAfterMs(error);
+    const baseDelayMs = this._resolveRateLimitRetryDelayMs(error, retryCount);
+    const spreadDelayMs = this._calculateRateLimitRetrySpreadMs(specName, retryCount);
+    return {
+      computedBackoffMs,
+      retryAfterHintMs,
+      baseDelayMs,
+      spreadDelayMs,
+      totalDelayMs: baseDelayMs + spreadDelayMs,
+    };
+  }
+  /**
+   * Spread same-round retries across specs to avoid synchronized 429 bursts.
+   *
+   * @param {string} specName
+   * @param {number} retryCount
+   * @returns {number}
+   * @private
+   */
+  _calculateRateLimitRetrySpreadMs(specName, retryCount) {
+    const spreadCapMs = Math.min(
+      MAX_RATE_LIMIT_RETRY_SPREAD_MS,
+      this._toNonNegativeInteger(
+        this._rateLimitRetrySpreadMs,
+        DEFAULT_RATE_LIMIT_RETRY_SPREAD_MS
+      )
+    );
+    if (spreadCapMs <= 0) {
+      return 0;
+    }
+    const normalizedSpecName = `${specName || ''}`.trim() || 'unknown-spec';
+    const retryOrdinal = this._toNonNegativeInteger(retryCount, 0) + 1;
+    const seed = `${normalizedSpecName}#${retryOrdinal}`;
+    const hash = this._hashString(seed);
+    return hash % (spreadCapMs + 1);
+  }
+  /**
+   * Lightweight deterministic hash for retry spread.
+   *
+   * @param {string} value
+   * @returns {number}
+   * @private
+   */
+  _hashString(value) {
+    let hash = 0;
+    const input = `${value || ''}`;
+    for (let idx = 0; idx < input.length; idx++) {
+      hash = ((hash * 31) + input.charCodeAt(idx)) >>> 0;
+    }
+    return hash;
+  }
+  /**
+   * Emit machine-readable rate-limit decision telemetry with simple de-dup throttling.
+   *
+   * @param {string} decision
+   * @param {object} payload
+   * @private
+   */
+  _emitRateLimitDecision(decision, payload = {}) {
+    const normalizedDecision = `${decision || ''}`.trim();
+    if (!normalizedDecision) {
+      return;
+    }
+    const now = this._getNow();
+    const reason = payload && typeof payload.reason === 'string'
+      ? payload.reason.trim()
+      : '';
+    const dedupeKey = `${normalizedDecision}:${reason}`;
+    const throttleMs = this._toNonNegativeInteger(
+      this._rateLimitDecisionEventThrottleMs,
+      DEFAULT_RATE_LIMIT_DECISION_EVENT_THROTTLE_MS
+    );
+    if (
+      throttleMs > 0
+      && dedupeKey === this._lastRateLimitDecisionKey
+      && (now - this._lastRateLimitDecisionAt) < throttleMs
+    ) {
+      return;
+    }
+    this._lastRateLimitDecisionAt = now;
+    this._lastRateLimitDecisionKey = dedupeKey;
+    this.emit('rate-limit:decision', {
+      decision: normalizedDecision,
+      at: new Date(now).toISOString(),
+      ...(payload && typeof payload === 'object' ? payload : {}),
+    });
+  }
   /**
    * @param {number} ms
    * @returns {Promise<void>}
@@ -1941,6 +2169,8 @@ class OrchestrationEngine extends EventEmitter {
     this._dynamicLaunchBudgetPerMinute = null;
     this._launchBudgetLastHoldSignalAt = 0;
     this._launchBudgetLastHoldMs = 0;
+    this._lastRateLimitDecisionAt = 0;
+    this._lastRateLimitDecisionKey = '';
   }
 }
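
The deterministic retry spread added in this file can be tried in isolation. The sketch below lifts `_hashString` and `_calculateRateLimitRetrySpreadMs` out of the class as plain functions. The hash and the `name#ordinal` seed follow the diff, while the integer coercion is a simplified stand-in for the engine's `_toNonNegativeInteger`, whose body is not shown here. The spec names and the 2000 ms base delay are made up for illustration.

```javascript
const MAX_RATE_LIMIT_RETRY_SPREAD_MS = 60000;

// Same polynomial hash as the diff: base 31, wrapped to unsigned 32 bits.
function hashString(value) {
  let hash = 0;
  const input = `${value || ''}`;
  for (let idx = 0; idx < input.length; idx++) {
    hash = ((hash * 31) + input.charCodeAt(idx)) >>> 0;
  }
  return hash;
}

// Offset in [0, cap] derived only from the spec name and retry round,
// so the same spec always lands on the same offset for a given round.
function calculateRetrySpreadMs(specName, retryCount, spreadMs) {
  // Simplified coercion (assumption): non-numeric input falls back to 0.
  const cap = Math.min(MAX_RATE_LIMIT_RETRY_SPREAD_MS, Math.max(0, Math.floor(spreadMs) || 0));
  if (cap <= 0) {
    return 0;
  }
  const name = `${specName || ''}`.trim() || 'unknown-spec';
  const retryOrdinal = Math.max(0, Math.floor(retryCount) || 0) + 1;
  return hashString(`${name}#${retryOrdinal}`) % (cap + 1);
}

// Spread is added on top of the base delay, never subtracted from it.
function buildRetryPlan(specName, retryCount, baseDelayMs, spreadMs) {
  const spreadDelayMs = calculateRetrySpreadMs(specName, retryCount, spreadMs);
  return { baseDelayMs, spreadDelayMs, totalDelayMs: baseDelayMs + spreadDelayMs };
}

// Hypothetical spec names retrying in the same round with a 600 ms spread cap.
for (const spec of ['auth-service', 'billing-api', 'search-index']) {
  console.log(spec, buildRetryPlan(spec, 0, 2000, 600));
}
```

Unlike random jitter, the offsets are reproducible across runs, which makes retry timing easier to reason about in logs. The trade-off is that two specs whose seeds hash to the same offset will still retry together.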

package/lib/orchestrator/orchestrator-config.js

@@ -37,6 +37,9 @@ const KNOWN_KEYS = new Set([
   'rateLimitSignalThreshold',
   'rateLimitSignalExtraHoldMs',
   'rateLimitDynamicBudgetFloor',
+  'rateLimitRetrySpreadMs',
+  'rateLimitLaunchHoldPollMs',
+  'rateLimitDecisionEventThrottleMs',
   'apiKeyEnvVar',
   'bootstrapTemplate',
   'codexArgs',
@@ -57,6 +60,9 @@ const RATE_LIMIT_PROFILE_PRESETS = Object.freeze({
     rateLimitSignalThreshold: 2,
     rateLimitSignalExtraHoldMs: 5000,
     rateLimitDynamicBudgetFloor: 1,
+    rateLimitRetrySpreadMs: 1200,
+    rateLimitLaunchHoldPollMs: 1000,
+    rateLimitDecisionEventThrottleMs: 1000,
   }),
   balanced: Object.freeze({
     rateLimitMaxRetries: 8,
@@ -71,6 +77,9 @@ const RATE_LIMIT_PROFILE_PRESETS = Object.freeze({
     rateLimitSignalThreshold: 3,
     rateLimitSignalExtraHoldMs: 3000,
     rateLimitDynamicBudgetFloor: 1,
+    rateLimitRetrySpreadMs: 600,
+    rateLimitLaunchHoldPollMs: 1000,
+    rateLimitDecisionEventThrottleMs: 1000,
   }),
   aggressive: Object.freeze({
     rateLimitMaxRetries: 6,
@@ -85,6 +94,9 @@ const RATE_LIMIT_PROFILE_PRESETS = Object.freeze({
     rateLimitSignalThreshold: 4,
     rateLimitSignalExtraHoldMs: 2000,
     rateLimitDynamicBudgetFloor: 2,
+    rateLimitRetrySpreadMs: 250,
+    rateLimitLaunchHoldPollMs: 1000,
+    rateLimitDecisionEventThrottleMs: 1000,
   }),
 });
@@ -124,6 +136,9 @@ const DEFAULT_CONFIG = Object.freeze({
   rateLimitSignalThreshold: 3,
   rateLimitSignalExtraHoldMs: 3000,
   rateLimitDynamicBudgetFloor: 1,
+  rateLimitRetrySpreadMs: 600,
+  rateLimitLaunchHoldPollMs: 1000,
+  rateLimitDecisionEventThrottleMs: 1000,
   apiKeyEnvVar: 'CODEX_API_KEY',
   bootstrapTemplate: null,
   codexArgs: [],
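
How the three new knobs are bounded at runtime can be sketched as follows. The defaults, the 60000 ms spread ceiling, and the `Math.max(1, Math.min(hold, poll))` sleep rule come from the engine diff. The `toInteger` helper is an assumption standing in for `_toNonNegativeInteger` / `_toPositiveInteger`, which are not shown, so edge-case behavior (strings, fractions) may differ from the real package.

```javascript
const MAX_RATE_LIMIT_RETRY_SPREAD_MS = 60000;
const DEFAULTS = Object.freeze({
  rateLimitRetrySpreadMs: 600,
  rateLimitLaunchHoldPollMs: 1000,
  rateLimitDecisionEventThrottleMs: 1000,
});

// Assumed coercion: finite numbers >= min are floored, anything else falls back.
function toInteger(value, fallback, min) {
  const n = typeof value === 'number' ? value : Number.parseFloat(value);
  return Number.isFinite(n) && n >= min ? Math.floor(n) : fallback;
}

function normalizeRateLimitKnobs(config) {
  const source = config || {};
  return {
    // 0 disables the spread; values above the ceiling are clamped.
    rateLimitRetrySpreadMs: Math.min(
      MAX_RATE_LIMIT_RETRY_SPREAD_MS,
      toInteger(source.rateLimitRetrySpreadMs, DEFAULTS.rateLimitRetrySpreadMs, 0)
    ),
    // Must be positive, otherwise the hold loop would spin.
    rateLimitLaunchHoldPollMs: toInteger(
      source.rateLimitLaunchHoldPollMs, DEFAULTS.rateLimitLaunchHoldPollMs, 1
    ),
    // 0 disables event de-dup entirely.
    rateLimitDecisionEventThrottleMs: toInteger(
      source.rateLimitDecisionEventThrottleMs, DEFAULTS.rateLimitDecisionEventThrottleMs, 0
    ),
  };
}

// One iteration of the launch-hold loop sleeps for at most one poll interval.
function resolveHoldSleepMs(launchHoldMs, launchHoldPollMs) {
  return Math.max(1, Math.min(launchHoldMs, launchHoldPollMs));
}

const knobs = normalizeRateLimitKnobs({ rateLimitRetrySpreadMs: 999999, rateLimitLaunchHoldPollMs: 250 });
console.log(knobs, resolveHoldSleepMs(5000, knobs.rateLimitLaunchHoldPollMs));
```

A shorter poll interval lets the loop notice an expired hold sooner at the cost of more wake-ups, which is the responsiveness versus overhead trade-off the changelog mentions.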

package/package.json

@@ -1,6 +1,6 @@
 {
   "name": "scene-capability-engine",
-  "version": "3.6.7",
+  "version": "3.6.9",
   "description": "SCE (Scene Capability Engine) - A CLI tool and npm package for spec-driven development with AI coding assistants.",
   "main": "index.js",
   "bin": {