scene-capability-engine 3.6.7 → 3.6.9

package/CHANGELOG.md CHANGED
@@ -7,6 +7,32 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
7
7
 
8
8
  ## [Unreleased]
9
9
 
10
+ ## [3.6.9] - 2026-03-05
11
+
12
+ ### Added
13
+ - Orchestration runtime now emits machine-readable `rate-limit:decision` telemetry events for retry/throttle/hold/recovery transitions.
14
+ - New anti-429 runtime config knobs in `.sce/config/orchestrator.json`:
15
+ - `rateLimitRetrySpreadMs`
16
+ - `rateLimitLaunchHoldPollMs`
17
+ - `rateLimitDecisionEventThrottleMs`
18
+
19
+ ### Changed
20
+ - Rate-limit retries now apply deterministic per-spec retry spread to reduce synchronized 429 bursts under high parallelism.
21
+ - Launch-hold polling interval is now configurable, so anti-429 pause loops can be tuned for responsiveness vs overhead.
22
+
23
+ ## [3.6.8] - 2026-03-05
24
+
25
+ ### Added
26
+ - New one-shot state command:
27
+ - `sce state reconcile --all --apply --json`
28
+ - runs `doctor -> migrate -> doctor` with before/after drift summary in one payload
29
+ - Release workflow now runs state reconcile before reconciliation gate and uploads:
30
+ - `state-reconcile-<tag>.json`
31
+
32
+ ### Changed
33
+ - Release workflow state reconciliation gate is now enforce-by-default (`KSE_RELEASE_STATE_MIGRATION_ENFORCE` defaults to `true`).
34
+ - Release workflow Node runtime updated to `22.x` so sqlite-backed reconciliation can run in release pipeline.
35
+
10
36
  ## [3.6.7] - 2026-03-05
11
37
 
12
38
  ### Fixed
package/README.md CHANGED
@@ -121,6 +121,7 @@ SCE is tool-agnostic and works with Codex, Claude Code, Cursor, Windsurf, VS Cod
121
121
  - Session governance is scene-first: `1 scene = 1 primary session`.
122
122
  - Spec work is attached as child sessions and auto-archived.
123
123
  - Startup now auto-detects adopted projects and aligns takeover baseline defaults automatically.
124
+ - Multi-agent anti-429 runtime now supports deterministic retry spread and machine-readable `rate-limit:decision` telemetry (`rateLimitRetrySpreadMs`, `rateLimitLaunchHoldPollMs`, `rateLimitDecisionEventThrottleMs`).
124
125
  - Problem evaluation policy is enabled by default (`.sce/config/problem-eval-policy.json`) and evaluates every Studio stage.
125
126
  - Problem closure policy is enabled by default (`.sce/config/problem-closure-policy.json`) and blocks verify/release bypass when required domain/problem evidence is missing.
126
127
  - Error handling now follows a full incident loop by default: every record attempt is staged first and auto-closed on verified/promoted outcomes.
@@ -149,8 +150,10 @@ Studio task-stream output contract (default):
149
150
  - `sce state plan --json`
150
151
  - `sce state doctor --json`
151
152
  - `sce state migrate --all --apply --json`
153
+ - `sce state reconcile --all --apply --json` (doctor -> migrate -> doctor one-shot)
152
154
  - `sce state export --out .sce/reports/state-migration/state-export.latest.json --json`
153
155
  - reconciliation gate: `npm run gate:state-migration-reconciliation`
156
+ - release workflow defaults to enforce mode for state reconciliation gate and runs reconcile before publish
154
157
  - runtime reads now prefer sqlite indexes for timeline/scene-session views when indexed data exists
155
158
  - `state doctor` now emits `summary` and runtime diagnostics (`runtime.timeline`, `runtime.scene_session`) with read-source and consistency status
156
159
  - migratable components now include runtime + errorbook + spec-governance + release evidence indexes (`errorbook.entry-index`, `errorbook.incident-index`, `governance.spec-scene-overrides`, `governance.scene-index`, `release.evidence-runs-index`, `release.gate-history-index`)
@@ -215,5 +218,5 @@ MIT. See [LICENSE](LICENSE).
215
218
 
216
219
  ---
217
220
 
218
- **Version**: 3.6.3
221
+ **Version**: 3.6.9
219
222
  **Last Updated**: 2026-03-05
package/README.zh.md CHANGED
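@@ -121,6 +121,7 @@ SCE has no tool lock-in and works with Codex, Claude Code, Cursor, Windsurf, VS
121
121
  - Session governance is scene-first by default: `1 scene = 1 primary session`.
122
122
  - Spec execution is auto-archived as child sessions, with cross-round tracking.
123
123
  - Startup auto-detects adopted projects and aligns the takeover baseline defaults.
124
+ - The multi-agent anti-429 runtime adds deterministic retry spread plus machine-readable `rate-limit:decision` events, tunable via `rateLimitRetrySpreadMs`, `rateLimitLaunchHoldPollMs`, and `rateLimitDecisionEventThrottleMs`.
124
125
  - Problem evaluation policy is enabled by default (`.sce/config/problem-eval-policy.json`); every Studio stage runs the evaluation.
125
126
  - Problem closure policy is enabled by default (`.sce/config/problem-closure-policy.json`) and blocks at the verify/release stage when required problem/domain evidence is missing.
126
127
  - Error handling enters a full incident loop by default: every record lands in the staging trial-and-error chain first and is auto-closed and archived once verified/promoted.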
@@ -149,8 +150,10 @@ Studio task-stream output contract (default):
149
150
  - `sce state plan --json`
150
151
  - `sce state doctor --json`
151
152
  - `sce state migrate --all --apply --json`
153
+ - `sce state reconcile --all --apply --json` (runs doctor -> migrate -> doctor in one step)
152
154
  - `sce state export --out .sce/reports/state-migration/state-export.latest.json --json`
153
155
  - reconciliation gate: `npm run gate:state-migration-reconciliation`
156
+ - release workflow enables enforce mode for state reconciliation by default and runs reconcile before publish
154
157
  - runtime reads prefer SQLite when indexed data exists (timeline/scene-session views)
155
158
  - `state doctor` adds `summary` and runtime diagnostics (`runtime.timeline`, `runtime.scene_session`), exposing read-source and consistency status directly
156
159
  - migratable components now extend to runtime + errorbook + spec-governance + release evidence indexes (`errorbook.entry-index`, `errorbook.incident-index`, `governance.spec-scene-overrides`, `governance.scene-index`, `release.evidence-runs-index`, `release.gate-history-index`)
@@ -215,5 +218,5 @@ MIT. See [LICENSE](LICENSE).
215
218
 
216
219
  ---
217
220
 
218
- **Version**: 3.6.3
221
+ **Version**: 3.6.9
219
222
  **Last Updated**: 2026-03-05
@@ -23,6 +23,9 @@ This document defines the default anti-429 presets used by SCE multi-agent orche
23
23
  | `rateLimitSignalThreshold` | 2 | 3 | 4 |
24
24
  | `rateLimitSignalExtraHoldMs` | 5000 | 3000 | 2000 |
25
25
  | `rateLimitDynamicBudgetFloor` | 1 | 1 | 2 |
26
+ | `rateLimitRetrySpreadMs` | 1200 | 600 | 250 |
27
+ | `rateLimitLaunchHoldPollMs` | 1000 | 1000 | 1000 |
28
+ | `rateLimitDecisionEventThrottleMs` | 1000 | 1000 | 1000 |
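
To illustrate how the three new rows might combine with per-project overrides, here is a minimal sketch. The knob values for `balanced` and `aggressive` are copied from the presets in this diff; `resolveKnobs` and its preset-then-override merge order are hypothetical and are not taken from SCE's config loader.

```javascript
// Values for the three new knobs, copied from the presets shown in this diff.
const NEW_KNOB_PRESETS = Object.freeze({
  balanced: Object.freeze({
    rateLimitRetrySpreadMs: 600,
    rateLimitLaunchHoldPollMs: 1000,
    rateLimitDecisionEventThrottleMs: 1000,
  }),
  aggressive: Object.freeze({
    rateLimitRetrySpreadMs: 250,
    rateLimitLaunchHoldPollMs: 1000,
    rateLimitDecisionEventThrottleMs: 1000,
  }),
});

// Hypothetical helper: start from the preset, then let explicit overrides win.
function resolveKnobs(profile, overrides = {}) {
  const preset = NEW_KNOB_PRESETS[profile];
  if (!preset) {
    throw new Error(`Unknown profile: ${profile}`);
  }
  return { ...preset, ...overrides };
}

// Example: aggressive profile, but poll the launch hold four times per second.
const knobs = resolveKnobs('aggressive', { rateLimitLaunchHoldPollMs: 250 });
console.log(knobs);
```

Only the retry spread differs between profiles; the poll and throttle intervals are 1000 ms in every preset, so they change only through an explicit override.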
26
29
 
27
30
  ## Usage
28
31
 
@@ -64,3 +67,4 @@ Release readiness criteria:
64
67
  1. No failing test in orchestrator/rate-limit scope.
65
68
  2. `orchestrate profile show --json` returns expected profile and effective values.
66
69
  3. Multi-agent run no longer stalls under sustained `429`; launch budget and hold telemetry progress over time.
70
+ 4. `rate-limit:decision` events are emitted as machine-readable telemetry for retry/throttle/recovery transitions.
@@ -719,6 +719,9 @@ sce state migrate --all --json
719
719
  # apply migration writes into sqlite index tables
720
720
  sce state migrate --all --apply --json
721
721
 
722
+ # reconcile in one flow (doctor -> migrate -> doctor)
723
+ sce state reconcile --all --apply --json
724
+
722
725
  # migrate specific components
723
726
  sce state migrate --component collab.agent-registry --component runtime.timeline-index --apply --json
724
727
 
@@ -756,6 +759,7 @@ SQLite index tables introduced for gradual migration:
756
759
  Runtime read preference:
757
760
  - timeline/session runtime views now prefer SQLite indexes when indexed rows exist.
758
761
  - file artifacts remain source-of-truth for content payload and recovery operations.
762
+ - release workflow runs `state reconcile --all --apply` before reconciliation gate.
759
763
  - `sce state doctor --json` now includes:
760
764
  - `summary` aggregate (`pending_components`, `total_record_drift`, `blocking_count`, `alert_count`)
761
765
  - runtime read diagnostics (`runtime.timeline`, `runtime.scene_session`) with read-source/read-preference and consistency status
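
A consumer of the reconcile output could evaluate it roughly as follows. This is a sketch under assumptions: the field names (`success`, `summary.after_blocking_count`, `summary.after_pending_components`) match the payload built by `runStateReconcileCommand` in this diff, but `evaluateReconcilePayload`, its `enforce` option, and the pass/fail policy are hypothetical and do not describe the actual release gate.

```javascript
// Hypothetical gate check over a `sce state reconcile --json` payload.
// The pass/fail policy here is an assumption, not the real release gate.
function evaluateReconcilePayload(payload, { enforce = true } = {}) {
  const summary = (payload && payload.summary) || {};
  const afterBlocking = Number(summary.after_blocking_count) || 0;
  const afterPending = Number(summary.after_pending_components) || 0;

  const violations = [];
  if (!payload || payload.success !== true) {
    violations.push('reconcile-not-successful');
  }
  if (afterBlocking > 0) {
    violations.push(`blocking-remaining:${afterBlocking}`);
  }
  if (afterPending > 0) {
    violations.push(`pending-remaining:${afterPending}`);
  }

  // In advisory mode (enforce = false) violations are reported but do not fail.
  return { passed: violations.length === 0 || !enforce, enforce, violations };
}

const cleanPayload = {
  mode: 'state-reconcile',
  success: true,
  summary: { after_blocking_count: 0, after_pending_components: 0 },
};
const driftPayload = {
  mode: 'state-reconcile',
  success: false,
  summary: { after_blocking_count: 2, after_pending_components: 1 },
};

console.log(evaluateReconcilePayload(cleanPayload));
console.log(evaluateReconcilePayload(driftPayload));
```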
@@ -1792,6 +1796,9 @@ Recommended `.sce/config/orchestrator.json`:
1792
1796
  "rateLimitSignalThreshold": 3,
1793
1797
  "rateLimitSignalExtraHoldMs": 3000,
1794
1798
  "rateLimitDynamicBudgetFloor": 1,
1799
+ "rateLimitRetrySpreadMs": 600,
1800
+ "rateLimitLaunchHoldPollMs": 1000,
1801
+ "rateLimitDecisionEventThrottleMs": 1000,
1795
1802
  "apiKeyEnvVar": "CODEX_API_KEY",
1796
1803
  "codexArgs": ["--skip-git-repo-check"],
1797
1804
  "codexCommand": "npx @openai/codex"
@@ -1805,9 +1812,14 @@ Recommended `.sce/config/orchestrator.json`:
1805
1812
  - `rateLimitSignalThreshold`: signals required inside window before escalation
1806
1813
  - `rateLimitSignalExtraHoldMs`: extra launch hold per escalation unit
1807
1814
  - `rateLimitDynamicBudgetFloor`: lowest dynamic launch budget allowed during sustained pressure
1815
+ - `rateLimitRetrySpreadMs`: deterministic retry spread (per spec/retry round) to reduce synchronized retry bursts
1816
+ - `rateLimitLaunchHoldPollMs`: polling interval while launch hold is active (lower values react faster, higher values reduce loop overhead)
1817
+ - `rateLimitDecisionEventThrottleMs`: de-dup interval for repeated `rate-limit:decision` telemetry events
1808
1818
 
1809
1819
  `orchestrate stop` interrupts pending retry waits immediately so long backoff windows do not look like deadlocks.
1810
1820
 
1821
+ Runtime emits machine-readable `rate-limit:decision` events for retry/throttle/hold/recovery transitions, so UI or controller layers can surface anti-429 actions directly.
1822
+
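
As a rough illustration of how a UI or controller layer might consume these events, the sketch below subscribes to `rate-limit:decision` on a plain Node.js `EventEmitter`. The `engine` object is a stand-in (the real engine extends `EventEmitter`, but its import path is not shown here), and the emitted payloads are simulated samples shaped like the ones in this release.

```javascript
const { EventEmitter } = require('node:events');

// Stand-in for the orchestration engine; only the event name is taken from the docs.
const engine = new EventEmitter();

// Tally decisions by type so a dashboard could render them.
const countsByDecision = new Map();
engine.on('rate-limit:decision', (event) => {
  const previous = countsByDecision.get(event.decision) || 0;
  countsByDecision.set(event.decision, previous + 1);
});

// Simulated events (sample values, not captured telemetry).
engine.emit('rate-limit:decision', {
  decision: 'retry',
  at: new Date().toISOString(),
  reason: 'rate-limit-retry',
  specName: 'spec-a',
  retryDelayMs: 1500,
});
engine.emit('rate-limit:decision', {
  decision: 'launch-hold',
  at: new Date().toISOString(),
  reason: 'launch-budget',
  holdMs: 3000,
  sleepMs: 1000,
});
engine.emit('rate-limit:decision', {
  decision: 'retry',
  at: new Date().toISOString(),
  reason: 'rate-limit-retry',
  specName: 'spec-b',
  retryDelayMs: 1720,
});

console.log(Object.fromEntries(countsByDecision));
```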
1811
1823
  Codex sub-agent permission defaults:
1812
1824
  - `--sandbox danger-full-access` is always injected by orchestrator runtime.
1813
1825
  - `--ask-for-approval never` is injected by default when `codexArgs` does not explicitly set approval mode.
@@ -8,6 +8,7 @@ const {
8
8
  runStateDoctor,
9
9
  runStateExport
10
10
  } = require('../state/state-migration-manager');
11
+ const { getSceStateStore } = require('../state/sce-state-store');
11
12
 
12
13
  function normalizeString(value) {
13
14
  if (typeof value !== 'string') {
@@ -147,6 +148,85 @@ async function runStateExportCommand(options = {}, dependencies = {}) {
147
148
  return payload;
148
149
  }
149
150
 
151
+ async function runStateReconcileCommand(options = {}, dependencies = {}) {
152
+ const projectPath = dependencies.projectPath || process.cwd();
153
+ const fileSystem = dependencies.fileSystem || fs;
154
+ const env = dependencies.env || process.env;
155
+ const components = normalizeComponentInput(options.component);
156
+ const componentIds = options.all === true ? [] : components;
157
+ const apply = options.apply === true;
158
+ const stateStore = dependencies.stateStore || getSceStateStore(projectPath, {
159
+ fileSystem,
160
+ env
161
+ });
162
+
163
+ const before = await runStateDoctor({}, {
164
+ projectPath,
165
+ fileSystem,
166
+ env,
167
+ stateStore
168
+ });
169
+
170
+ const migration = await runStateMigration({
171
+ apply,
172
+ all: options.all === true,
173
+ componentIds
174
+ }, {
175
+ projectPath,
176
+ fileSystem,
177
+ env,
178
+ stateStore
179
+ });
180
+
181
+ const after = await runStateDoctor({}, {
182
+ projectPath,
183
+ fileSystem,
184
+ env,
185
+ stateStore
186
+ });
187
+
188
+ const beforePending = Number(before && before.summary && before.summary.pending_components) || 0;
189
+ const afterPending = Number(after && after.summary && after.summary.pending_components) || 0;
190
+ const beforeBlocking = Number(before && before.summary && before.summary.blocking_count) || 0;
191
+ const afterBlocking = Number(after && after.summary && after.summary.blocking_count) || 0;
192
+ const pendingReduced = Math.max(0, beforePending - afterPending);
193
+ const blockingReduced = Math.max(0, beforeBlocking - afterBlocking);
194
+
195
+ const payload = {
196
+ mode: 'state-reconcile',
197
+ success: Boolean(migration && migration.success) && afterBlocking === 0,
198
+ apply,
199
+ generated_at: new Date().toISOString(),
200
+ store_path: after && after.store_path ? after.store_path : null,
201
+ sqlite: after && after.sqlite ? after.sqlite : null,
202
+ migration,
203
+ before: {
204
+ summary: before && before.summary ? before.summary : null,
205
+ blocking: Array.isArray(before && before.blocking) ? before.blocking : [],
206
+ alerts: Array.isArray(before && before.alerts) ? before.alerts : []
207
+ },
208
+ after: {
209
+ summary: after && after.summary ? after.summary : null,
210
+ blocking: Array.isArray(after && after.blocking) ? after.blocking : [],
211
+ alerts: Array.isArray(after && after.alerts) ? after.alerts : []
212
+ },
213
+ summary: {
214
+ apply,
215
+ migrated_components: Number(migration && migration.summary && migration.summary.migrated_components) || 0,
216
+ migrated_records: Number(migration && migration.summary && migration.summary.migrated_records) || 0,
217
+ before_pending_components: beforePending,
218
+ after_pending_components: afterPending,
219
+ pending_components_reduced: pendingReduced,
220
+ before_blocking_count: beforeBlocking,
221
+ after_blocking_count: afterBlocking,
222
+ blocking_reduced: blockingReduced
223
+ }
224
+ };
225
+
226
+ printPayload(payload, options, 'State Reconcile');
227
+ return payload;
228
+ }
229
+
150
230
  async function safeRun(handler, options = {}, dependencies = {}, title = 'state command') {
151
231
  try {
152
232
  await handler(options, dependencies);
@@ -199,6 +279,15 @@ function registerStateCommands(program) {
199
279
  .option('--out <path>', 'Output file path', '.sce/reports/state-migration/state-export.latest.json')
200
280
  .option('--json', 'Print machine-readable JSON output')
201
281
  .action(async (options) => safeRun(runStateExportCommand, options, {}, 'state export'));
282
+
283
+ state
284
+ .command('reconcile')
285
+ .description('Run doctor + migrate + doctor in one flow (dry-run by default)')
286
+ .option('--component <id>', `Component id (repeatable): ${knownIds}`, collectOptionValue, [])
287
+ .option('--all', 'Reconcile all known components')
288
+ .option('--apply', 'Apply migration writes (default is dry-run)')
289
+ .option('--json', 'Print machine-readable JSON output')
290
+ .action(async (options) => safeRun(runStateReconcileCommand, options, {}, 'state reconcile'));
202
291
  }
203
292
 
204
293
  module.exports = {
@@ -206,5 +295,6 @@ module.exports = {
206
295
  runStateDoctorCommand,
207
296
  runStateMigrateCommand,
208
297
  runStateExportCommand,
298
+ runStateReconcileCommand,
209
299
  registerStateCommands
210
300
  };
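
The before/after arithmetic in `runStateReconcileCommand` can be isolated as a small pure function. The sketch below mirrors the logic above (missing counts fall back to 0, reductions are clamped at 0, and success requires both a successful migration and no remaining blocking items); `summarizeReconcile` and `toCount` are illustrative names, not exports of the package.

```javascript
// Coerce a possibly missing or non-numeric count to a number, defaulting to 0.
function toCount(value) {
  return Number(value) || 0;
}

// Illustrative re-statement of the drift summary computed by the reconcile command.
function summarizeReconcile(before, migration, after) {
  const beforeSummary = (before && before.summary) || {};
  const afterSummary = (after && after.summary) || {};
  const beforePending = toCount(beforeSummary.pending_components);
  const afterPending = toCount(afterSummary.pending_components);
  const beforeBlocking = toCount(beforeSummary.blocking_count);
  const afterBlocking = toCount(afterSummary.blocking_count);

  return {
    success: Boolean(migration && migration.success) && afterBlocking === 0,
    // Clamped at 0: drift that grows during the run is not reported as negative.
    pending_components_reduced: Math.max(0, beforePending - afterPending),
    blocking_reduced: Math.max(0, beforeBlocking - afterBlocking),
  };
}

const applied = summarizeReconcile(
  { summary: { pending_components: 3, blocking_count: 1 } },
  { success: true },
  { summary: { pending_components: 0, blocking_count: 0 } }
);
console.log(applied);
```

Because the command is dry-run unless `--apply` is passed, a dry run would typically leave the second doctor pass unchanged, so both reductions would be 0.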
@@ -28,6 +28,10 @@ const DEFAULT_RATE_LIMIT_SIGNAL_WINDOW_MS = 30000;
28
28
  const DEFAULT_RATE_LIMIT_SIGNAL_THRESHOLD = 3;
29
29
  const DEFAULT_RATE_LIMIT_SIGNAL_EXTRA_HOLD_MS = 3000;
30
30
  const DEFAULT_RATE_LIMIT_DYNAMIC_BUDGET_FLOOR = 1;
31
+ const DEFAULT_RATE_LIMIT_RETRY_SPREAD_MS = 600;
32
+ const DEFAULT_RATE_LIMIT_LAUNCH_HOLD_POLL_MS = 1000;
33
+ const DEFAULT_RATE_LIMIT_DECISION_EVENT_THROTTLE_MS = 1000;
34
+ const MAX_RATE_LIMIT_RETRY_SPREAD_MS = 60000;
31
35
  const DEFAULT_AGENT_WAIT_TIMEOUT_SECONDS = 600;
32
36
  const AGENT_WAIT_TIMEOUT_GRACE_MS = 30000;
33
37
  const RATE_LIMIT_BACKOFF_JITTER_RATIO = 0.5;
@@ -136,12 +140,22 @@ class OrchestrationEngine extends EventEmitter {
136
140
  this._rateLimitSignalExtraHoldMs = DEFAULT_RATE_LIMIT_SIGNAL_EXTRA_HOLD_MS;
137
141
  /** @type {number} minimum dynamic launch budget floor under sustained pressure */
138
142
  this._rateLimitDynamicBudgetFloor = DEFAULT_RATE_LIMIT_DYNAMIC_BUDGET_FLOOR;
143
+ /** @type {number} deterministic per-spec retry spread to prevent synchronized retry bursts */
144
+ this._rateLimitRetrySpreadMs = DEFAULT_RATE_LIMIT_RETRY_SPREAD_MS;
145
+ /** @type {number} polling interval while launch hold is active */
146
+ this._rateLimitLaunchHoldPollMs = DEFAULT_RATE_LIMIT_LAUNCH_HOLD_POLL_MS;
147
+ /** @type {number} minimum interval between repeated rate-limit decision events */
148
+ this._rateLimitDecisionEventThrottleMs = DEFAULT_RATE_LIMIT_DECISION_EVENT_THROTTLE_MS;
139
149
  /** @type {number[]} timestamps (ms) of recent spec launches for rolling budget accounting */
140
150
  this._rateLimitLaunchTimestamps = [];
141
151
  /** @type {number} last launch-budget hold telemetry emission timestamp (ms) */
142
152
  this._launchBudgetLastHoldSignalAt = 0;
143
153
  /** @type {number} last launch-budget hold duration emitted to telemetry (ms) */
144
154
  this._launchBudgetLastHoldMs = 0;
155
+ /** @type {number} last rate-limit decision event emission timestamp (ms) */
156
+ this._lastRateLimitDecisionAt = 0;
157
+ /** @type {string} dedupe key for last rate-limit decision event */
158
+ this._lastRateLimitDecisionKey = '';
145
159
  /** @type {Set<{timer: NodeJS.Timeout|null, resolve: (() => void)|null}>} cancellable sleep waiters */
146
160
  this._pendingSleeps = new Set();
147
161
  /** @type {number} fallback wait timeout to avoid indefinite hangs when lifecycle events are missing */
@@ -373,10 +387,29 @@ class OrchestrationEngine extends EventEmitter {
373
387
  const launchHoldMs = Math.max(rateLimitHoldMs, launchBudgetHoldMs);
374
388
  if (launchHoldMs > 0) {
375
389
  // Pause new launches when provider asks us to retry later or launch budget is exhausted.
390
+ const holdReason = launchBudgetHoldMs >= rateLimitHoldMs
391
+ ? 'launch-budget'
392
+ : 'rate-limit-retry-hold';
376
393
  if (launchBudgetHoldMs > 0) {
377
394
  this._onLaunchBudgetHold(launchBudgetHoldMs);
378
395
  }
379
- await this._sleep(Math.min(launchHoldMs, 1000));
396
+ const launchHoldPollMs = this._toPositiveInteger(
397
+ this._rateLimitLaunchHoldPollMs,
398
+ DEFAULT_RATE_LIMIT_LAUNCH_HOLD_POLL_MS
399
+ );
400
+ const holdSleepMs = Math.max(1, Math.min(launchHoldMs, launchHoldPollMs));
401
+ this._emitRateLimitDecision('launch-hold', {
402
+ reason: holdReason,
403
+ holdMs: launchHoldMs,
404
+ sleepMs: holdSleepMs,
405
+ pendingSpecs: pending.length,
406
+ inFlightSpecs: inFlight.size,
407
+ effectiveMaxParallel: this._toPositiveInteger(
408
+ this._effectiveMaxParallel,
409
+ this._toPositiveInteger(maxParallel, 1)
410
+ ),
411
+ });
412
+ await this._sleep(holdSleepMs);
380
413
  continue;
381
414
  }
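
The effect of the configurable poll interval on the hold loop can be sketched in isolation. `resolveHoldSleepMs` mirrors the `Math.max(1, Math.min(launchHoldMs, launchHoldPollMs))` expression above; the integer validation is a hypothetical stand-in for `_toPositiveInteger`, whose body is not part of this diff, and `planHoldSleeps` assumes the hold is not extended while waiting.

```javascript
// Sleep for at most one poll interval, and never less than 1 ms.
function resolveHoldSleepMs(launchHoldMs, pollMs, fallbackPollMs = 1000) {
  // Assumed validation: fall back when the poll interval is not a positive integer.
  const validPollMs = Number.isInteger(pollMs) && pollMs > 0 ? pollMs : fallbackPollMs;
  return Math.max(1, Math.min(launchHoldMs, validPollMs));
}

// Simulate the sequence of sleeps for a fixed hold (no new 429 signals arriving).
function planHoldSleeps(totalHoldMs, pollMs) {
  const sleeps = [];
  let remainingMs = totalHoldMs;
  while (remainingMs > 0) {
    const sleepMs = resolveHoldSleepMs(remainingMs, pollMs);
    sleeps.push(sleepMs);
    remainingMs -= sleepMs;
  }
  return sleeps;
}

console.log(planHoldSleeps(2500, 1000)); // [ 1000, 1000, 500 ]
console.log(planHoldSleeps(2500, 250).length); // 10
```

A smaller poll interval re-checks the hold more often, so a hold that is shortened or cancelled is noticed sooner, at the cost of more loop iterations.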
382
415
 
@@ -606,9 +639,10 @@ class OrchestrationEngine extends EventEmitter {
606
639
  this._statusMonitor.incrementRetry(specName);
607
640
  this._statusMonitor.updateSpecStatus(specName, 'pending', null, resolvedError);
608
641
 
609
- const retryDelayMs = isRateLimitError
610
- ? this._resolveRateLimitRetryDelayMs(resolvedError, retryCount)
611
- : 0;
642
+ const retryPlan = isRateLimitError
643
+ ? this._buildRateLimitRetryPlan(specName, retryCount, resolvedError)
644
+ : null;
645
+ const retryDelayMs = retryPlan ? retryPlan.totalDelayMs : 0;
612
646
  if (retryDelayMs > 0) {
613
647
  this._onRateLimitSignal(retryDelayMs);
614
648
  const launchHoldMs = this._getRateLimitLaunchHoldRemainingMs();
@@ -616,13 +650,33 @@ class OrchestrationEngine extends EventEmitter {
616
650
  specName,
617
651
  retryCount,
618
652
  retryDelayMs,
653
+ retryBaseDelayMs: retryPlan ? retryPlan.baseDelayMs : retryDelayMs,
654
+ retryHintMs: retryPlan ? retryPlan.retryAfterHintMs : 0,
655
+ retryBackoffMs: retryPlan ? retryPlan.computedBackoffMs : retryDelayMs,
656
+ retrySpreadMs: retryPlan ? retryPlan.spreadDelayMs : 0,
619
657
  launchHoldMs,
620
658
  error: resolvedError,
621
659
  });
660
+ this._emitRateLimitDecision('retry', {
661
+ reason: 'rate-limit-retry',
662
+ specName,
663
+ retryCount,
664
+ retryDelayMs,
665
+ retryBaseDelayMs: retryPlan ? retryPlan.baseDelayMs : retryDelayMs,
666
+ retryHintMs: retryPlan ? retryPlan.retryAfterHintMs : 0,
667
+ retryBackoffMs: retryPlan ? retryPlan.computedBackoffMs : retryDelayMs,
668
+ retrySpreadMs: retryPlan ? retryPlan.spreadDelayMs : 0,
669
+ launchHoldMs,
670
+ pendingRetryCount: this._retryCounts.get(specName) || 0,
671
+ });
622
672
  this.emit('spec:rate-limited', {
623
673
  specName,
624
674
  retryCount,
625
675
  retryDelayMs,
676
+ retryBaseDelayMs: retryPlan ? retryPlan.baseDelayMs : retryDelayMs,
677
+ retryHintMs: retryPlan ? retryPlan.retryAfterHintMs : 0,
678
+ retryBackoffMs: retryPlan ? retryPlan.computedBackoffMs : retryDelayMs,
679
+ retrySpreadMs: retryPlan ? retryPlan.spreadDelayMs : 0,
626
680
  launchHoldMs,
627
681
  error: resolvedError,
628
682
  });
@@ -638,6 +692,21 @@ class OrchestrationEngine extends EventEmitter {
638
692
  // Final failure (Req 5.3)
639
693
  this._failedSpecs.add(specName);
640
694
  this._statusMonitor.updateSpecStatus(specName, 'failed', agentId, resolvedError);
695
+ if (isRateLimitError) {
696
+ this._emitRateLimitDecision('retry-exhausted', {
697
+ reason: 'rate-limit-retry-budget-exhausted',
698
+ specName,
699
+ retryCount,
700
+ retryLimit,
701
+ error: resolvedError,
702
+ });
703
+ this.emit('spec:rate-limit-exhausted', {
704
+ specName,
705
+ retryCount,
706
+ retryLimit,
707
+ error: resolvedError,
708
+ });
709
+ }
641
710
 
642
711
  // Sync external status
643
712
  await this._syncExternalSafe(specName, 'failed');
@@ -1091,6 +1160,21 @@ class OrchestrationEngine extends EventEmitter {
1091
1160
  config && config.rateLimitDynamicBudgetFloor,
1092
1161
  DEFAULT_RATE_LIMIT_DYNAMIC_BUDGET_FLOOR
1093
1162
  );
1163
+ this._rateLimitRetrySpreadMs = Math.min(
1164
+ MAX_RATE_LIMIT_RETRY_SPREAD_MS,
1165
+ this._toNonNegativeInteger(
1166
+ config && config.rateLimitRetrySpreadMs,
1167
+ DEFAULT_RATE_LIMIT_RETRY_SPREAD_MS
1168
+ )
1169
+ );
1170
+ this._rateLimitLaunchHoldPollMs = this._toPositiveInteger(
1171
+ config && config.rateLimitLaunchHoldPollMs,
1172
+ DEFAULT_RATE_LIMIT_LAUNCH_HOLD_POLL_MS
1173
+ );
1174
+ this._rateLimitDecisionEventThrottleMs = this._toNonNegativeInteger(
1175
+ config && config.rateLimitDecisionEventThrottleMs,
1176
+ DEFAULT_RATE_LIMIT_DECISION_EVENT_THROTTLE_MS
1177
+ );
1094
1178
  }
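
The normalization applied to the three new knobs can be approximated as below. The defaults and the 60000 ms cap come from the constants in this diff; `toNonNegativeInteger` and `toPositiveInteger` are hypothetical stand-ins for the engine's private helpers, whose exact coercion rules are not shown, so edge cases such as numeric strings may behave differently in the real code.

```javascript
const MAX_RETRY_SPREAD_MS = 60000;
const DEFAULT_RETRY_SPREAD_MS = 600;
const DEFAULT_LAUNCH_HOLD_POLL_MS = 1000;
const DEFAULT_DECISION_EVENT_THROTTLE_MS = 1000;

// Assumed helper behavior: accept integers in range, otherwise use the fallback.
function toNonNegativeInteger(value, fallback) {
  return Number.isInteger(value) && value >= 0 ? value : fallback;
}
function toPositiveInteger(value, fallback) {
  return Number.isInteger(value) && value > 0 ? value : fallback;
}

function normalizeRateLimitKnobs(config) {
  return {
    // Spread may be 0 (disabled) and is capped at 60 seconds.
    rateLimitRetrySpreadMs: Math.min(
      MAX_RETRY_SPREAD_MS,
      toNonNegativeInteger(config && config.rateLimitRetrySpreadMs, DEFAULT_RETRY_SPREAD_MS)
    ),
    // The poll interval must be positive, otherwise the hold loop would spin.
    rateLimitLaunchHoldPollMs: toPositiveInteger(
      config && config.rateLimitLaunchHoldPollMs,
      DEFAULT_LAUNCH_HOLD_POLL_MS
    ),
    // A throttle of 0 is allowed and turns event de-duplication off.
    rateLimitDecisionEventThrottleMs: toNonNegativeInteger(
      config && config.rateLimitDecisionEventThrottleMs,
      DEFAULT_DECISION_EVENT_THROTTLE_MS
    ),
  };
}

console.log(normalizeRateLimitKnobs({ rateLimitRetrySpreadMs: 999999, rateLimitLaunchHoldPollMs: 0 }));
```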
1095
1179
 
1096
1180
  /**
@@ -1225,6 +1309,12 @@ class OrchestrationEngine extends EventEmitter {
1225
1309
  effectiveMaxParallel: next,
1226
1310
  floor,
1227
1311
  });
1312
+ this._emitRateLimitDecision('parallel-throttled', {
1313
+ reason: 'rate-limit',
1314
+ previousMaxParallel: current,
1315
+ effectiveMaxParallel: next,
1316
+ floor,
1317
+ });
1228
1318
  } else {
1229
1319
  this._effectiveMaxParallel = current;
1230
1320
  }
@@ -1269,6 +1359,12 @@ class OrchestrationEngine extends EventEmitter {
1269
1359
  effectiveMaxParallel: next,
1270
1360
  maxParallel: boundedMax,
1271
1361
  });
1362
+ this._emitRateLimitDecision('parallel-recovered', {
1363
+ reason: 'rate-limit-cooldown',
1364
+ previousMaxParallel: current,
1365
+ effectiveMaxParallel: next,
1366
+ maxParallel: boundedMax,
1367
+ });
1272
1368
  }
1273
1369
  }
1274
1370
 
@@ -1415,6 +1511,13 @@ class OrchestrationEngine extends EventEmitter {
1415
1511
  windowMs,
1416
1512
  used: this._rateLimitLaunchTimestamps.length,
1417
1513
  });
1514
+ this._emitRateLimitDecision('launch-budget-hold', {
1515
+ reason: 'rate-limit-launch-budget',
1516
+ holdMs,
1517
+ budgetPerMinute,
1518
+ windowMs,
1519
+ used: this._rateLimitLaunchTimestamps.length,
1520
+ });
1418
1521
  }
1419
1522
 
1420
1523
  /**
@@ -1606,6 +1709,11 @@ class OrchestrationEngine extends EventEmitter {
1606
1709
  if (extraHoldMs > 0) {
1607
1710
  const currentHoldUntil = this._toNonNegativeInteger(this._rateLimitLaunchHoldUntil, 0);
1608
1711
  this._rateLimitLaunchHoldUntil = Math.max(currentHoldUntil, now + extraHoldMs);
1712
+ this._emitRateLimitDecision('launch-hold-escalated', {
1713
+ reason: 'rate-limit-spike-hold',
1714
+ signalCount,
1715
+ extraHoldMs,
1716
+ });
1609
1717
  }
1610
1718
 
1611
1719
  const configuredBudget = this._toNonNegativeInteger(
@@ -1654,6 +1762,13 @@ class OrchestrationEngine extends EventEmitter {
1654
1762
  windowMs: launchBudgetConfig.windowMs,
1655
1763
  holdMs,
1656
1764
  });
1765
+ this._emitRateLimitDecision('launch-budget-throttled', {
1766
+ reason: 'rate-limit-spike',
1767
+ signalCount,
1768
+ budgetPerMinute: launchBudgetConfig.budgetPerMinute,
1769
+ windowMs: launchBudgetConfig.windowMs,
1770
+ holdMs,
1771
+ });
1657
1772
  }
1658
1773
 
1659
1774
  /**
@@ -1708,6 +1823,12 @@ class OrchestrationEngine extends EventEmitter {
1708
1823
  windowMs: launchBudgetConfig.windowMs,
1709
1824
  holdMs,
1710
1825
  });
1826
+ this._emitRateLimitDecision('launch-budget-recovered', {
1827
+ reason: 'rate-limit-cooldown',
1828
+ budgetPerMinute: launchBudgetConfig.budgetPerMinute,
1829
+ windowMs: launchBudgetConfig.windowMs,
1830
+ holdMs,
1831
+ });
1711
1832
  }
1712
1833
 
1713
1834
  /**
@@ -1730,6 +1851,113 @@ class OrchestrationEngine extends EventEmitter {
1730
1851
  return Math.max(1, Math.min(candidateDelayMs, maxDelayMs));
1731
1852
  }
1732
1853
 
1854
+ /**
1855
+ * Build retry delay details for a rate-limit failure.
1856
+ * Keeps backoff compliant with provider hint while spreading retries across specs.
1857
+ *
1858
+ * @param {string} specName
1859
+ * @param {number} retryCount
1860
+ * @param {string} error
1861
+ * @returns {{computedBackoffMs: number, retryAfterHintMs: number, baseDelayMs: number, spreadDelayMs: number, totalDelayMs: number}}
1862
+ * @private
1863
+ */
1864
+ _buildRateLimitRetryPlan(specName, retryCount, error) {
1865
+ const computedBackoffMs = this._calculateRateLimitBackoffMs(retryCount);
1866
+ const retryAfterHintMs = this._extractRateLimitRetryAfterMs(error);
1867
+ const baseDelayMs = this._resolveRateLimitRetryDelayMs(error, retryCount);
1868
+ const spreadDelayMs = this._calculateRateLimitRetrySpreadMs(specName, retryCount);
1869
+ return {
1870
+ computedBackoffMs,
1871
+ retryAfterHintMs,
1872
+ baseDelayMs,
1873
+ spreadDelayMs,
1874
+ totalDelayMs: baseDelayMs + spreadDelayMs,
1875
+ };
1876
+ }
1877
+
1878
+ /**
1879
+ * Spread same-round retries across specs to avoid synchronized 429 bursts.
1880
+ *
1881
+ * @param {string} specName
1882
+ * @param {number} retryCount
1883
+ * @returns {number}
1884
+ * @private
1885
+ */
1886
+ _calculateRateLimitRetrySpreadMs(specName, retryCount) {
1887
+ const spreadCapMs = Math.min(
1888
+ MAX_RATE_LIMIT_RETRY_SPREAD_MS,
1889
+ this._toNonNegativeInteger(
1890
+ this._rateLimitRetrySpreadMs,
1891
+ DEFAULT_RATE_LIMIT_RETRY_SPREAD_MS
1892
+ )
1893
+ );
1894
+ if (spreadCapMs <= 0) {
1895
+ return 0;
1896
+ }
1897
+
1898
+ const normalizedSpecName = `${specName || ''}`.trim() || 'unknown-spec';
1899
+ const retryOrdinal = this._toNonNegativeInteger(retryCount, 0) + 1;
1900
+ const seed = `${normalizedSpecName}#${retryOrdinal}`;
1901
+ const hash = this._hashString(seed);
1902
+ return hash % (spreadCapMs + 1);
1903
+ }
1904
+
1905
+ /**
1906
+ * Lightweight deterministic hash for retry spread.
1907
+ *
1908
+ * @param {string} value
1909
+ * @returns {number}
1910
+ * @private
1911
+ */
1912
+ _hashString(value) {
1913
+ let hash = 0;
1914
+ const input = `${value || ''}`;
1915
+ for (let idx = 0; idx < input.length; idx++) {
1916
+ hash = ((hash * 31) + input.charCodeAt(idx)) >>> 0;
1917
+ }
1918
+ return hash;
1919
+ }
1920
+
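
To see the deterministic spread in isolation, the hash and the modulo step can be run as standalone functions. The sketch below reproduces `_hashString` and the seed format `<specName>#<retryOrdinal>` from the methods above; `planRetry` is an illustrative wrapper, and its `baseDelayMs` argument is a made-up input standing in for the value that `_resolveRateLimitRetryDelayMs` would return.

```javascript
// Same polynomial hash as above: base 31, kept in the unsigned 32-bit range.
function hashString(value) {
  let hash = 0;
  const input = `${value || ''}`;
  for (let idx = 0; idx < input.length; idx++) {
    hash = ((hash * 31) + input.charCodeAt(idx)) >>> 0;
  }
  return hash;
}

// Offset in [0, spreadCapMs], fully determined by spec name and retry round.
function retrySpreadMs(specName, retryCount, spreadCapMs) {
  if (!(spreadCapMs > 0)) {
    return 0;
  }
  const normalizedName = `${specName || ''}`.trim() || 'unknown-spec';
  const seed = `${normalizedName}#${retryCount + 1}`;
  return hashString(seed) % (spreadCapMs + 1);
}

// Illustrative wrapper: total delay is the base delay plus the spread offset.
function planRetry(specName, retryCount, baseDelayMs, spreadCapMs) {
  const spreadDelayMs = retrySpreadMs(specName, retryCount, spreadCapMs);
  return { specName, baseDelayMs, spreadDelayMs, totalDelayMs: baseDelayMs + spreadDelayMs };
}

// Three specs hitting a 429 in the same round with the same (assumed) base delay.
for (const specName of ['spec-a', 'spec-b', 'spec-c']) {
  console.log(planRetry(specName, 0, 2000, 600));
}
```

Because the offset depends only on the spec name and retry round, reruns produce the same schedule, which makes retry timing reproducible in tests. Two specs can still hash to the same offset, so the spread lowers the chance of synchronized retries rather than ruling them out.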
1921
+ /**
1922
+ * Emit machine-readable rate-limit decision telemetry with simple de-dup throttling.
1923
+ *
1924
+ * @param {string} decision
1925
+ * @param {object} payload
1926
+ * @private
1927
+ */
1928
+ _emitRateLimitDecision(decision, payload = {}) {
1929
+ const normalizedDecision = `${decision || ''}`.trim();
1930
+ if (!normalizedDecision) {
1931
+ return;
1932
+ }
1933
+
1934
+ const now = this._getNow();
1935
+ const reason = payload && typeof payload.reason === 'string'
1936
+ ? payload.reason.trim()
1937
+ : '';
1938
+ const dedupeKey = `${normalizedDecision}:${reason}`;
1939
+ const throttleMs = this._toNonNegativeInteger(
1940
+ this._rateLimitDecisionEventThrottleMs,
1941
+ DEFAULT_RATE_LIMIT_DECISION_EVENT_THROTTLE_MS
1942
+ );
1943
+
1944
+ if (
1945
+ throttleMs > 0
1946
+ && dedupeKey === this._lastRateLimitDecisionKey
1947
+ && (now - this._lastRateLimitDecisionAt) < throttleMs
1948
+ ) {
1949
+ return;
1950
+ }
1951
+
1952
+ this._lastRateLimitDecisionAt = now;
1953
+ this._lastRateLimitDecisionKey = dedupeKey;
1954
+ this.emit('rate-limit:decision', {
1955
+ decision: normalizedDecision,
1956
+ at: new Date(now).toISOString(),
1957
+ ...(payload && typeof payload === 'object' ? payload : {}),
1958
+ });
1959
+ }
1960
+
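
The de-dup rule in `_emitRateLimitDecision` is easier to reason about with a controllable clock. The following sketch re-implements the same check (same `decision:reason` key, same "last key only" memory) in a standalone class; `DecisionEmitter`, `emitDecision`, and the injected `now` function are illustrative names, not part of the engine.

```javascript
const { EventEmitter } = require('node:events');

class DecisionEmitter extends EventEmitter {
  constructor({ throttleMs = 1000, now = () => Date.now() } = {}) {
    super();
    this.throttleMs = throttleMs;
    this.now = now;
    this.lastAt = 0;
    this.lastKey = '';
  }

  // Returns true when the event was emitted, false when it was suppressed.
  emitDecision(decision, payload = {}) {
    const reason = typeof payload.reason === 'string' ? payload.reason.trim() : '';
    const key = `${decision}:${reason}`;
    const at = this.now();
    if (this.throttleMs > 0 && key === this.lastKey && (at - this.lastAt) < this.throttleMs) {
      return false;
    }
    this.lastAt = at;
    this.lastKey = key;
    this.emit('rate-limit:decision', { decision, at: new Date(at).toISOString(), ...payload });
    return true;
  }
}

// Fake clock so the timing is deterministic.
let clockMs = 10000;
const emitter = new DecisionEmitter({ throttleMs: 1000, now: () => clockMs });
const seen = [];
emitter.on('rate-limit:decision', (event) => seen.push(event.decision));

emitter.emitDecision('launch-hold', { reason: 'launch-budget' }); // emitted
clockMs += 200;
emitter.emitDecision('launch-hold', { reason: 'launch-budget' }); // suppressed: same key within 1000 ms
clockMs += 200;
emitter.emitDecision('retry', { reason: 'rate-limit-retry' }); // emitted: different key
clockMs += 200;
emitter.emitDecision('launch-hold', { reason: 'launch-budget' }); // emitted: only the last key is remembered

console.log(seen); // [ 'launch-hold', 'retry', 'launch-hold' ]
```

Two consequences follow from the key being `decision:reason` only. Interleaved decision types are not throttled against each other, and back-to-back `retry` decisions for different specs share one key, so the second one inside the throttle window would be dropped. Consumers that need every retry can read `spec:rate-limited`, which is emitted alongside it, or set `rateLimitDecisionEventThrottleMs` to 0.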
1733
1961
  /**
1734
1962
  * @param {number} ms
1735
1963
  * @returns {Promise<void>}
@@ -1941,6 +2169,8 @@ class OrchestrationEngine extends EventEmitter {
1941
2169
  this._dynamicLaunchBudgetPerMinute = null;
1942
2170
  this._launchBudgetLastHoldSignalAt = 0;
1943
2171
  this._launchBudgetLastHoldMs = 0;
2172
+ this._lastRateLimitDecisionAt = 0;
2173
+ this._lastRateLimitDecisionKey = '';
1944
2174
  }
1945
2175
  }
1946
2176
 
@@ -37,6 +37,9 @@ const KNOWN_KEYS = new Set([
37
37
  'rateLimitSignalThreshold',
38
38
  'rateLimitSignalExtraHoldMs',
39
39
  'rateLimitDynamicBudgetFloor',
40
+ 'rateLimitRetrySpreadMs',
41
+ 'rateLimitLaunchHoldPollMs',
42
+ 'rateLimitDecisionEventThrottleMs',
40
43
  'apiKeyEnvVar',
41
44
  'bootstrapTemplate',
42
45
  'codexArgs',
@@ -57,6 +60,9 @@ const RATE_LIMIT_PROFILE_PRESETS = Object.freeze({
57
60
  rateLimitSignalThreshold: 2,
58
61
  rateLimitSignalExtraHoldMs: 5000,
59
62
  rateLimitDynamicBudgetFloor: 1,
63
+ rateLimitRetrySpreadMs: 1200,
64
+ rateLimitLaunchHoldPollMs: 1000,
65
+ rateLimitDecisionEventThrottleMs: 1000,
60
66
  }),
61
67
  balanced: Object.freeze({
62
68
  rateLimitMaxRetries: 8,
@@ -71,6 +77,9 @@ const RATE_LIMIT_PROFILE_PRESETS = Object.freeze({
71
77
  rateLimitSignalThreshold: 3,
72
78
  rateLimitSignalExtraHoldMs: 3000,
73
79
  rateLimitDynamicBudgetFloor: 1,
80
+ rateLimitRetrySpreadMs: 600,
81
+ rateLimitLaunchHoldPollMs: 1000,
82
+ rateLimitDecisionEventThrottleMs: 1000,
74
83
  }),
75
84
  aggressive: Object.freeze({
76
85
  rateLimitMaxRetries: 6,
@@ -85,6 +94,9 @@ const RATE_LIMIT_PROFILE_PRESETS = Object.freeze({
85
94
  rateLimitSignalThreshold: 4,
86
95
  rateLimitSignalExtraHoldMs: 2000,
87
96
  rateLimitDynamicBudgetFloor: 2,
97
+ rateLimitRetrySpreadMs: 250,
98
+ rateLimitLaunchHoldPollMs: 1000,
99
+ rateLimitDecisionEventThrottleMs: 1000,
88
100
  }),
89
101
  });
90
102
 
@@ -124,6 +136,9 @@ const DEFAULT_CONFIG = Object.freeze({
124
136
  rateLimitSignalThreshold: 3,
125
137
  rateLimitSignalExtraHoldMs: 3000,
126
138
  rateLimitDynamicBudgetFloor: 1,
139
+ rateLimitRetrySpreadMs: 600,
140
+ rateLimitLaunchHoldPollMs: 1000,
141
+ rateLimitDecisionEventThrottleMs: 1000,
127
142
  apiKeyEnvVar: 'CODEX_API_KEY',
128
143
  bootstrapTemplate: null,
129
144
  codexArgs: [],
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "scene-capability-engine",
3
- "version": "3.6.7",
3
+ "version": "3.6.9",
4
4
  "description": "SCE (Scene Capability Engine) - A CLI tool and npm package for spec-driven development with AI coding assistants.",
5
5
  "main": "index.js",
6
6
  "bin": {