universal-dev-standards 5.1.0-beta.6 → 5.1.0-beta.7
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/bin/uds.js +12 -0
- package/bundled/ai/standards/agent-communication-protocol.ai.yaml +34 -0
- package/bundled/ai/standards/anti-sycophancy-prompting.ai.yaml +111 -0
- package/bundled/ai/standards/capability-declaration.ai.yaml +113 -0
- package/bundled/ai/standards/circuit-breaker.ai.yaml +93 -0
- package/bundled/ai/standards/developer-memory.ai.yaml +13 -0
- package/bundled/ai/standards/dual-phase-output.ai.yaml +108 -0
- package/bundled/ai/standards/failure-source-taxonomy.ai.yaml +115 -0
- package/bundled/ai/standards/frontend-design-standards.ai.yaml +305 -0
- package/bundled/ai/standards/health-check-standards.ai.yaml +140 -0
- package/bundled/ai/standards/immutability-first.ai.yaml +112 -0
- package/bundled/ai/standards/model-selection.ai.yaml +111 -3
- package/bundled/ai/standards/packaging-standards.ai.yaml +142 -0
- package/bundled/ai/standards/recovery-recipe-registry.ai.yaml +200 -0
- package/bundled/ai/standards/retry-standards.ai.yaml +134 -0
- package/bundled/ai/standards/security-decision.ai.yaml +87 -0
- package/bundled/ai/standards/skill-standard-alignment-check.ai.yaml +119 -0
- package/bundled/ai/standards/standard-admission-criteria.ai.yaml +107 -0
- package/bundled/ai/standards/standard-lifecycle-management.ai.yaml +144 -0
- package/bundled/ai/standards/timeout-standards.ai.yaml +104 -0
- package/bundled/ai/standards/token-budget.ai.yaml +108 -0
- package/bundled/core/anti-sycophancy-prompting.md +184 -0
- package/bundled/core/capability-declaration.md +59 -0
- package/bundled/core/circuit-breaker.md +58 -0
- package/bundled/core/developer-memory.md +29 -1
- package/bundled/core/dual-phase-output.md +56 -0
- package/bundled/core/failure-source-taxonomy.md +72 -0
- package/bundled/core/frontend-design-standards.md +474 -0
- package/bundled/core/health-check-standards.md +72 -0
- package/bundled/core/immutability-first.md +105 -0
- package/bundled/core/model-selection.md +80 -0
- package/bundled/core/packaging-standards.md +216 -0
- package/bundled/core/recovery-recipe-registry.md +69 -0
- package/bundled/core/retry-standards.md +62 -0
- package/bundled/core/security-decision.md +65 -0
- package/bundled/core/skill-standard-alignment-check.md +79 -0
- package/bundled/core/standard-admission-criteria.md +84 -0
- package/bundled/core/standard-lifecycle-management.md +94 -0
- package/bundled/core/timeout-standards.md +63 -0
- package/bundled/core/token-budget.md +58 -0
- package/bundled/locales/zh-CN/CHANGELOG.md +22 -3
- package/bundled/locales/zh-CN/README.md +1 -1
- package/bundled/locales/zh-TW/CHANGELOG.md +22 -3
- package/bundled/locales/zh-TW/README.md +1 -1
- package/bundled/locales/zh-TW/core/anti-sycophancy-prompting.md +184 -0
- package/bundled/locales/zh-TW/core/packaging-standards.md +224 -0
- package/bundled/skills/e2e-assistant/SKILL.md +19 -5
- package/bundled/skills/testing-guide/SKILL.md +5 -0
- package/bundled/skills/testing-guide/test-skeleton-templates.md +316 -0
- package/package.json +1 -1
- package/src/commands/config.js +9 -0
- package/src/commands/init.js +91 -46
- package/src/commands/mcp.js +26 -0
- package/src/commands/run-intent.js +66 -0
- package/src/commands/update.js +35 -4
- package/src/core/command-router.js +85 -0
- package/src/core/project-config.js +91 -0
- package/src/flows/init-flow.js +6 -1
- package/src/i18n/messages.js +6 -6
- package/src/mcp/__tests__/server.test.js +251 -0
- package/src/mcp/server.js +352 -0
- package/src/prompts/init.js +157 -1
- package/src/reconciler/actual-state-scanner.js +24 -0
- package/src/uninstallers/hook-uninstaller.js +32 -1
- package/src/utils/e2e-analyzer.js +88 -5
- package/src/utils/e2e-detector.js +73 -1
- package/src/utils/integration-generator.js +22 -3
- package/standards-registry.json +193 -5
@@ -0,0 +1,107 @@
+# Standard Admission Criteria - AI Optimized
+# Source: XSPEC-070 (DEC-043 Wave 1 Governance Meta Pack)
+
+standard:
+  id: standard-admission-criteria
+  name: Standard Admission Criteria
+  description: Four conditions for admitting a new standard into UDS (Evidence / Scope / Non-overlapping / AI-executable) and explicit documentation of rejection reasons
+
+  meta:
+    version: "1.0.0"
+    updated: "2026-04-17"
+    status: trial
+    since: "2026-04-17"
+    expires: "2026-10-17"
+    source: XSPEC-070
+    borrowed_from: DEC-043
+    description: >
+      With DEC-043 proposing 60+ candidate standards, an explicit admission checklist is needed
+      to keep the standards library from bloating (overlapping or unused standards) and losing quality. This is a governance-layer meta standard for UDS,
+      a "standard for deciding standards". Every candidate standard must satisfy four criteria before it moves from Proposed to Trial.
+    scope: universal
+    industry_reference: "IETF RFC admission criteria, Python PEP process, W3C Recommendation Track"
+
+  guidelines:
+    - "Every candidate standard must complete the admission checklist; any missing item means the candidate fails"
+    - "A rejection must state an explicit reason; vague wording such as 'not suitable' must not be used to close a case hastily"
+    - "Passing admission only grants entry to Trial, not direct promotion to Active (trial-period validation is still required)"
+    - "This standard must itself satisfy the four criteria (self-applicability principle)"
+    - "The 66 existing Active standards are not affected retroactively; the criteria apply to standards added after this standard is published"
+
+  criteria:
+    evidence:
+      description: "At least 2 concrete usage scenarios (not hypothetical)"
+      checks:
+        - "Scenarios come from real projects, repos, papers, or DEC records"
+        - "Scenario descriptions are specific (can cite a file / function / commit), not generalities"
+        - "At least 1 scenario comes from an internal pain point in the three AsiaOstrich projects or from external industry evidence"
+      rejection_example: "'Might be useful in the future': no concrete scenario, rejected"
+
+    scope:
+      description: "A clearly defined scope (which activities / which sub-projects / which situations)"
+      checks:
+        - "meta.scope is set to universal / partial / uds-specific"
+        - "The frontmatter lists the applicable activity types (e.g. development, deployment, testing)"
+        - "If partial or uds-specific, explain why the standard is not universal"
+      rejection_example: "'Applies everywhere': over-generalized, rejected"
+
+    non_overlapping:
+      description: "No major overlap with existing UDS standards (< 30% duplicated content)"
+      checks:
+        - "List the 3 closest existing standards and explain how this one fills the gap"
+        - "If overlap is ≥ 30%, extend the existing standard instead of adding a new one"
+        - "Define integration_points describing the relationship to existing standards"
+      rejection_example: "80% of the content duplicates the existing `retry-standards`: should be merged, rejected"
+
+    ai_executable:
+      description: "At least one DevAP QualityGate / VibeOps Agent prompt / Skill can consume this standard"
+      checks:
+        - "Clearly defined guidelines (bullet points, each verifiable)"
+        - "At least 2 concrete scenarios (Given-When-Then format)"
+        - "If types / interfaces are needed, provide an interface / types block"
+        - "integration_points with existing Skills / QualityGates are explicit"
+      rejection_example: "Only abstract principles, with no AI-executable rules: rejected"
+
+  rejection_protocol:
+    rule_1: "The rejection reason must name the specific criterion that failed (evidence / scope / non-overlapping / ai-executable)"
+    rule_2: "Rejections are recorded in `cross-project/decisions/` or in the rejection log section of the DEC"
+    rule_3: "A candidate may be revised according to the reason and resubmitted (no permanent ban)"
+    rule_4: "If the rejection involves overlap with an existing standard, recommend extending that standard instead"
+
+  self_applicability:
+    description: "This standard must also satisfy the four criteria"
+    evidence: "The 60+ candidate standards in DEC-043 plus XSPEC-070 scenarios 1-3 serve as the concrete usage scenarios"
+    scope: "universal; applies to the governance of UDS itself"
+    non_overlapping: "Complements the existing adr-standards (ADR records decisions; admission records admission conditions)"
+    ai_executable: "The scenarios in this YAML can be consumed by the standard-lifecycle Skill"
+
+  scenarios:
+    scenario_1_new_standard_passes:
+      given: "Candidate standard `retry-standards` applies for admission"
+      when: "The admission check is run"
+      then: "Evidence (XSPEC-067 Scenario 1-1/1-2), Scope (universal), Non-overlapping (complements circuit-breaker), and AI-executable (9 guidelines + 3 scenarios) all pass"
+      and: "Status moves from Proposed to Trial"
+
+    scenario_2_rejected_for_overlap:
+      given: "Candidate standard `advanced-retry-with-jitter` applies for admission"
+      when: "The Non-overlapping check is run"
+      then: "Overlap with the existing retry-standards is > 30%; rejected"
+      and: "Rejection reason: '80% of the content overlaps with retry-standards; extend the existing standard as a Phase 2 instead'"
+
+    scenario_3_rejected_for_vague_evidence:
+      given: "Candidate standard `universal-best-practices` applies for admission"
+      when: "The Evidence check is run"
+      then: "No concrete scenarios, only 'might be useful in the future' descriptions; rejected"
+      and: "Rejection reason: 'Concrete scenarios are missing; provide at least 2 use cases that have actually occurred, or industry evidence'"
+
+  error_codes:
+    ADMISSION-001: "MISSING_EVIDENCE - Evidence criterion not met"
+    ADMISSION-002: "SCOPE_UNDEFINED - Scope criterion not met"
+    ADMISSION-003: "OVERLAP_EXCEEDED - overlap with an existing standard > 30%"
+    ADMISSION-004: "NOT_AI_EXECUTABLE - guidelines / scenarios missing; AI cannot consume the standard"
+
+  integration_points:
+    - "standard-lifecycle-management.ai.yaml - after passing admission, Proposed → Trial"
+    - "skill-standard-alignment-check.ai.yaml - only standards that passed admission can be anchored by a Skill"
+    - "adr-standards.ai.yaml - the admission decision itself may need an ADR record"
+    - "DEC-043 - this standard is a prerequisite for Wave 1; later Wave 2/3 candidates all follow this process"
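As a reading aid for the admission standard above, here is a minimal Python sketch of how a tool might evaluate the four criteria and report the matching `ADMISSION-00x` codes. The `Candidate` fields and the `check_admission` function are hypothetical names introduced for illustration; the package diff does not show any such implementation. The overlap test uses the `≥ 30%` wording from the `non_overlapping` checks.

```python
from dataclasses import dataclass
from typing import List, Optional

# Error codes copied from the standard's error_codes section.
ERROR_CODES = {
    "evidence": "ADMISSION-001",
    "scope": "ADMISSION-002",
    "non_overlapping": "ADMISSION-003",
    "ai_executable": "ADMISSION-004",
}

ALLOWED_SCOPES = {"universal", "partial", "uds-specific"}


@dataclass
class Candidate:
    """Hypothetical summary of a candidate standard (not defined by UDS)."""
    standard_id: str
    concrete_scenarios: int    # scenarios backed by a real project, repo, paper, or DEC
    scope: Optional[str]       # value of meta.scope, if declared
    max_overlap_ratio: float   # highest overlap with any existing standard, 0.0 to 1.0
    guideline_count: int       # verifiable guidelines
    gwt_scenario_count: int    # Given-When-Then scenarios


def check_admission(c: Candidate) -> List[str]:
    """Return the code of every failed criterion; an empty list means Proposed -> Trial."""
    failed = []
    if c.concrete_scenarios < 2:
        failed.append(ERROR_CODES["evidence"])
    if c.scope not in ALLOWED_SCOPES:
        failed.append(ERROR_CODES["scope"])
    if c.max_overlap_ratio >= 0.30:
        failed.append(ERROR_CODES["non_overlapping"])
    if c.guideline_count < 1 or c.gwt_scenario_count < 2:
        failed.append(ERROR_CODES["ai_executable"])
    return failed


# Illustrative numbers only; the 0.10 overlap for retry-standards is an assumption.
retry = Candidate("retry-standards", 2, "universal", 0.10, 9, 3)
jitter = Candidate("advanced-retry-with-jitter", 2, "universal", 0.80, 5, 2)
print(check_admission(retry))   # []
print(check_admission(jitter))  # ['ADMISSION-003']
```

Returning every failed code, rather than stopping at the first, lines up with `rejection_protocol.rule_1`, which asks the rejection reason to name the specific criterion that failed.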
@@ -0,0 +1,144 @@
+# Standard Lifecycle Management - AI Optimized
+# Source: XSPEC-070 (DEC-043 Wave 1 Governance Meta Pack)
+
+standard:
+  id: standard-lifecycle-management
+  name: Standard Lifecycle Management
+  description: UDS standard lifecycle state machine (Proposed → Trial → Active → Deprecated → Archived) and required frontmatter fields
+
+  meta:
+    version: "1.0.0"
+    updated: "2026-04-17"
+    status: trial
+    since: "2026-04-17"
+    expires: "2026-10-17"
+    source: XSPEC-070
+    borrowed_from: DEC-043
+    description: >
+      The 66 existing standards have no explicit state management: new standards get no trial period, outdated standards have no deprecation path, and abandoned standards are still referenced.
+      This standard defines a five-state machine (Proposed / Trial / Active / Deprecated / Archived) with legal transition rules,
+      and requires every .ai.yaml standard to declare status / since / expires / supersedes in its frontmatter.
+    scope: universal
+    industry_reference: "IETF RFC lifecycle (Proposed → Draft → Internet Standard), Python PEP states, W3C Recommendation Track"
+
+  guidelines:
+    - "The frontmatter of every .ai.yaml standard must include the status / since fields"
+    - "Trial status must specify expires (trial deadline, default 6 months)"
+    - "Deprecated status must specify supersedes (the replacement's standard id or a migration document path)"
+    - "Reverse transitions are forbidden (Active → Proposed and Archived → Active are meaningless)"
+    - "The Trial → Active validation decision must be made before expires; a standard left undecided past the deadline is automatically Archived"
+
+  states:
+    Proposed:
+      description: "Draft stage; has not yet passed admission-criteria"
+      allowed_locations: "DEC document / XSPEC document / PR branch; not merged into mainline ai/standards/"
+      can_be_referenced_by_skill: false
+      transition_out: ["Trial"]
+      transition_criteria:
+        Trial: "Passes the four standard-admission-criteria conditions"
+
+    Trial:
+      description: "Approved but under trial, with an explicit trial deadline"
+      allowed_locations: "ai/standards/, with status=trial in the frontmatter"
+      can_be_referenced_by_skill: "true (flagged as trial)"
+      default_trial_period_months: 6
+      transition_out: ["Active", "Archived"]
+      transition_criteria:
+        Active: "Actually used in at least 2 independent scenarios during the trial period, with no major revisions needed"
+        Archived: "Trial period ended without meeting the Active criteria, or a fundamental design flaw was found"
+
+    Active:
+      description: "Fully adopted; the standard-of-truth"
+      allowed_locations: "ai/standards/, with status=active in the frontmatter"
+      can_be_referenced_by_skill: true
+      transition_out: ["Deprecated"]
+      transition_criteria:
+        Deprecated: "A better replacement standard is found, the standard is outdated, or industry practice has changed"
+
+    Deprecated:
+      description: "Marked as deprecated; still usable but not recommended; a migration path must be provided"
+      allowed_locations: "ai/standards/, with status=deprecated + supersedes in the frontmatter"
+      can_be_referenced_by_skill: "true (the Skill should warn about the deprecation and point to supersedes)"
+      default_migration_period_months: 6
+      transition_out: ["Archived"]
+      transition_criteria:
+        Archived: "Migration period ended and all downstream consumers have switched to the replacement standard"
+
+    Archived:
+      description: "Removed; kept only as a historical record; no longer loaded"
+      allowed_locations: "ai/standards/archive/ or git history"
+      can_be_referenced_by_skill: false
+      transition_out: []
+
+  transition_matrix:
+    description: "Legal state transitions (all others are forbidden)"
+    legal_transitions:
+      - "Proposed → Trial"
+      - "Trial → Active"
+      - "Trial → Archived"
+      - "Active → Deprecated"
+      - "Deprecated → Archived"
+    forbidden_transitions:
+      - "Active → Proposed (meaningless)"
+      - "Archived → Active (must reapply for admission)"
+      - "Deprecated → Active (must go through Trial again)"
+      - "Proposed → Active (must be validated in Trial first)"
+
+  frontmatter_required_fields:
+    description: "Fields every .ai.yaml standard must include in its meta block"
+    always_required:
+      status: "proposed | trial | active | deprecated | archived"
+      since: "ISO-8601 date on which the standard entered its current state"
+      version: "semver string"
+    conditional_required:
+      expires:
+        when: "status = trial"
+        format: "ISO-8601 date; defaults to since + 6 months"
+      supersedes:
+        when: "status = deprecated"
+        format: "Replacement standard id (e.g. 'retry-standards-v2') or migration document path"
+      migration_guide:
+        when: "status = deprecated"
+        format: "Relative path to the migration guide (optional, but strongly recommended)"
+
+  scenarios:
+    scenario_1_trial_to_active:
+      given: "retry-standards is in trial status, since=2026-04-17, expires=2026-10-17"
+      when: "A usage review on 2026-08-01 finds that both DevAP Fix Loop and VibeOps Builder have adopted it, with no major defects"
+      then: "Transition to Active: set status=active, since=2026-08-01, and remove the expires field"
+      note: "The typical Trial → Active path"
+
+    scenario_2_trial_auto_archive:
+      given: "A standard's trial deadline of 2026-10-17 is reached and validation has not passed"
+      when: "The automatic review runs on 2026-10-17"
+      then: "Status changes to Archived and the reason for not passing is recorded"
+      note: "Prevents Trial standards from lingering indefinitely and cluttering the standards library"
+
+    scenario_3_deprecated_with_migration:
+      given: "legacy-retry-logic is found to have a better replacement, retry-standards"
+      when: "Deprecation is carried out"
+      then: "status=deprecated, since=2026-04-17, supersedes=retry-standards, migration_guide=docs/migrations/retry-v1-to-v2.md"
+      and: "A Skill that uses legacy-retry-logic shows a warning recommending migration to retry-standards"
+
+  telemetry_events:
+    standard_state_change:
+      fields:
+        standard_id: string
+        from_state: "proposed | trial | active | deprecated | archived"
+        to_state: "proposed | trial | active | deprecated | archived"
+        reason: string
+        timestamp: string
+      when: "Reported on every standard state change (aligned with telemetry-server)"
+
+  error_codes:
+    LIFECYCLE-001: "MISSING_STATUS - frontmatter is missing the status field"
+    LIFECYCLE-002: "MISSING_EXPIRES - trial status is missing the expires field"
+    LIFECYCLE-003: "MISSING_SUPERSEDES - deprecated status is missing the supersedes field"
+    LIFECYCLE-004: "FORBIDDEN_TRANSITION - attempted an illegal state transition"
+    LIFECYCLE-005: "TRIAL_EXPIRED - Trial deadline has passed without an Active / Archived decision"
+
+  integration_points:
+    - "standard-admission-criteria.ai.yaml - Proposed → Trial requires passing admission"
+    - "skill-standard-alignment-check.ai.yaml - a Skill may only anchor to standards in trial/active/deprecated status"
+    - "adr-standards.ai.yaml - an Active → Deprecated decision should be accompanied by an ADR record"
+    - "telemetry-server - aggregates standard_state_change events for analyzing how standards evolve"
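The lifecycle standard above is essentially a small state machine plus a set of conditional frontmatter rules, so a short Python sketch may help. `can_transition` and `validate_meta` are hypothetical helper names, not functions shipped in this package. Treating an unrecognized status the same as a missing one, and flagging `LIFECYCLE-005` only on days strictly after `expires`, are assumptions made here.

```python
from datetime import date
from typing import Dict, List, Set

# Legal transitions copied from transition_matrix.legal_transitions.
LEGAL_TRANSITIONS: Dict[str, Set[str]] = {
    "proposed": {"trial"},
    "trial": {"active", "archived"},
    "active": {"deprecated"},
    "deprecated": {"archived"},
    "archived": set(),
}


def can_transition(from_state: str, to_state: str) -> bool:
    """True only for the five legal transitions; everything else is forbidden."""
    return to_state in LEGAL_TRANSITIONS.get(from_state, set())


def validate_meta(meta: dict, today: date) -> List[str]:
    """Check a standard's meta block against the conditional frontmatter rules."""
    status = meta.get("status")
    if status not in LEGAL_TRANSITIONS:
        # Missing status (or, as an assumption here, an unrecognized one).
        return ["LIFECYCLE-001"]
    errors = []
    if status == "trial":
        expires = meta.get("expires")
        if not expires:
            errors.append("LIFECYCLE-002")
        elif date.fromisoformat(expires) < today:
            # Assumption: the deadline counts as passed the day after expires.
            errors.append("LIFECYCLE-005")
    if status == "deprecated" and not meta.get("supersedes"):
        errors.append("LIFECYCLE-003")
    return errors


trial_meta = {"status": "trial", "since": "2026-04-17", "expires": "2026-10-17"}
print(can_transition("proposed", "active"))          # False
print(validate_meta(trial_meta, date(2026, 8, 1)))   # []
print(validate_meta(trial_meta, date(2026, 10, 18))) # ['LIFECYCLE-005']
```

A caller could raise `LIFECYCLE-004` whenever `can_transition` returns `False`; that wiring is left out to keep the sketch short.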
@@ -0,0 +1,104 @@
+# Timeout Standards - AI Optimized
+# Source: XSPEC-067 (DEC-043 Wave 1 Reliability Pack)
+
+standard:
+  id: timeout-standards
+  name: Timeout Standards
+  description: Timeout standards - layered budgets (cascading 0.8×), deadline propagation, and circuit-breaker integration
+
+  meta:
+    version: "1.0.0"
+    updated: "2026-04-17"
+    status: trial
+    since: "2026-04-17"
+    expires: "2026-10-17"
+    source: XSPEC-067
+    borrowed_from: DEC-043
+    description: >
+      Prevents a lower layer in a multi-layer call chain from having a longer timeout than the layer above it (which wastes resources: the upper layer times out while the lower layer keeps running).
+      A cascading budget rule (each layer ≤ 0.8× the layer above) and deadline propagation (passing an absolute timestamp)
+      let the whole call chain fail fast precisely. Integrates with circuit-breaker: timeouts count toward the failure count.
+    scope: universal
+    industry_reference: "gRPC deadline propagation, Envoy timeout budgeting, Google SRE Book Ch.22"
+
+  guidelines:
+    - "Timeouts in a multi-layer call chain must decrease layer by layer; each lower layer ≤ 0.8 × the layer above (reserving a 20% buffer)"
+    - "Cross-service calls must propagate a deadline (absolute timestamp), not just a relative duration"
+    - "On receiving a request, if now > deadline, fail fast immediately; making downstream calls is forbidden"
+    - "A triggered timeout must count toward the failure count of the corresponding circuit-breaker"
+    - "A lower-layer timeout must never exceed the upper-layer timeout (this violates fail-fast and is equivalent to having no timeout)"
+
+  cascading_budget:
+    rule: "Each lower-layer timeout ≤ 0.8 × the upper-layer timeout"
+    rationale:
+      - "Reserves a 20% buffer for serialization, network transfer, retries, and similar overhead"
+      - "Avoids the upper layer timing out while the lower layer is still running (wasted resources + obscure errors)"
+      - "0.8 is an industry rule-of-thumb value (0.7~0.85 is common with gRPC / Envoy)"
+    example:
+      client_timeout_ms: 10000
+      gateway_timeout_ms: 8000 # 10000 * 0.8
+      service_a_timeout_ms: 6400 # 8000 * 0.8
+      downstream_db_timeout_ms: 5120 # 6400 * 0.8
+
+  deadline_propagation:
+    format: "absolute ISO-8601 timestamp (not a relative duration)"
+    header_name: "X-Deadline"
+    rule_1: "Before making a call, compute deadline = now + timeout and write it to the header"
+    rule_2: "On receiving a request, immediately check now > deadline_header → if so, fail fast (return DEADLINE_EXCEEDED)"
+    rule_3: "For downstream calls, timeout = min(cascading_budget, deadline - now), whichever is smaller"
+    rationale: >
+      A relative duration (e.g. timeout=5s) cannot account for time already spent across multiple layers;
+      an absolute timestamp lets every layer compute the remaining time precisely and avoids issuing pointless requests after the deadline has passed.
+
+  timeout_categories:
+    connect_timeout:
+      description: "Upper bound on establishing a TCP / TLS connection"
+      default_ms: 5000
+      note: "Usually much shorter than request_timeout"
+    request_timeout:
+      description: "Upper bound from sending the request to receiving the complete response"
+      default_ms: 30000
+      note: "The most common timeout type; constrained by the cascading budget"
+    idle_timeout:
+      description: "How long a connection may sit idle before it is closed"
+      default_ms: 60000
+      note: "Server-side setting; works together with the connection pool"
+    total_deadline:
+      description: "Overall upper bound, including all retries"
+      default_ms: 60000
+      note: "The sum of retry × attempt_timeout must not exceed this value"
+
+  circuit_breaker_integration:
+    rule_1: "Every triggered timeout counts as one failure toward the breaker's failure count"
+    rule_2: "When consecutive timeouts reach failureThreshold, the breaker enters OPEN"
+    rule_3: "Requests in the OPEN state should use a very short timeout (or fail fast outright) instead of the full request_timeout"
+
+  scenarios:
+    scenario_1_cascading_budget:
+      given: "Client timeout=10s; call chain Client → Gateway → Service A → DB"
+      when: "The request_timeout of each layer is configured"
+      then: "Gateway=8s, Service A=6.4s, DB=5.12s (0.8× per layer)"
+      note: "Ensures the lower layer times out first, giving the upper layer a chance to catch the error and fall back"
+
+    scenario_2_deadline_expired_on_receive:
+      given: "The X-Deadline header has already expired when the request reaches Service A"
+      when: "Service A receives the request"
+      then: "Return DEADLINE_EXCEEDED immediately; do not call the DB or consume resources"
+      note: "The fail-fast mechanism of deadline propagation"
+
+    scenario_3_timeout_triggers_breaker:
+      given: "3 consecutive downstream calls all time out (failureThreshold=3)"
+      when: "The 4th call is made"
+      then: "circuit-breaker enters OPEN and returns CircuitOpenError immediately"
+      note: "Timeouts count toward the breaker failure count, preventing continued waste of resources"
+
+  error_codes:
+    TIMEOUT-001: "REQUEST_TIMEOUT - a single request timed out"
+    TIMEOUT-002: "DEADLINE_EXCEEDED - the overall deadline has passed; refuse to issue / process the request"
+    TIMEOUT-003: "CASCADING_BUDGET_VIOLATION - lower-layer timeout > upper-layer timeout (configuration error)"
+
+  integration_points:
+    - "circuit-breaker.ai.yaml - timeouts count toward the failure count and can trigger OPEN"
+    - "retry-standards.ai.yaml - a single retry's timeout must not exceed deadline - now"
+    - "failure-source-taxonomy.ai.yaml - a timeout maps to upstream_unavailable or tool_failure"
+    - "observability-standards (XSPEC-063, planned) - timeout is one of the Error sources for RED metrics"
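To make the cascading budget and the three `deadline_propagation` rules above concrete, here is a small Python sketch. The function names are hypothetical and nothing here is taken from the package source. The 0.8 factor is applied as the exact ratio 4/5 in integer arithmetic, and a request arriving exactly at the deadline is treated as expired; both are choices made for this example.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def cascade_timeouts(client_timeout_ms: int, layers: int) -> List[int]:
    """Timeout budget for each layer below the client, each 0.8x the layer above."""
    budgets = []
    current = client_timeout_ms
    for _ in range(layers):
        current = current * 4 // 5  # 0.8 as 4/5, so integer math avoids float rounding
        budgets.append(current)
    return budgets


def make_deadline(now: datetime, timeout_ms: int) -> str:
    """rule_1: deadline = now + timeout, as an absolute ISO-8601 timestamp."""
    return (now + timedelta(milliseconds=timeout_ms)).isoformat()


def downstream_timeout_ms(deadline_header: str, now: datetime, budget_ms: int) -> int:
    """rule_2 and rule_3: fail fast if expired, else min(budget, deadline - now)."""
    deadline = datetime.fromisoformat(deadline_header)
    if now >= deadline:
        # Assumption: arriving exactly at the deadline also counts as expired.
        raise TimeoutError("DEADLINE_EXCEEDED")
    remaining_ms = (deadline - now) // timedelta(milliseconds=1)
    return min(budget_ms, remaining_ms)


start = datetime(2026, 4, 17, tzinfo=timezone.utc)
header = make_deadline(start, 10_000)  # value a client would send as X-Deadline
print(cascade_timeouts(10_000, 3))     # [8000, 6400, 5120]
# Gateway receives the request 7 s in: only 3 s remain, less than its 8 s budget.
print(downstream_timeout_ms(header, start + timedelta(seconds=7), 8_000))  # 3000
```

The second call shows why the standard insists on an absolute timestamp: the gateway's static budget of 8000 ms would overshoot the client's deadline, while the remaining-time calculation caps it at 3000 ms.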
@@ -0,0 +1,108 @@
+# Token Budget Zone Standard - AI Optimized
+# Source: XSPEC-036 (claude-code-book Ch.7 four-zone threshold model)
+
+standard:
+  id: token-budget
+  name: Token Budget Zone Standard
+  description: Four-zone token threshold model - progressively triggered protection strategies that avoid sudden failure
+
+  meta:
+    version: "1.0.0"
+    updated: "2026-04-15"
+    source: XSPEC-036
+    description: >
+      Divides operation into four zones by usage percentage, progressively triggering protection strategies of increasing strength.
+      More graceful than "stop only when the limit is hit"; gives users early warning and a chance for automatic degradation.
+    scope: universal
+    borrowed_from: "claude-code-book Ch.7 four-zone context management (0-85/85-90/90-95/95-100%)"
+
+  guidelines:
+    - "Every execution environment with a token budget limit must monitor usage with the four-zone model"
+    - "The WARNING zone must write a log entry and emit an observable event; it must not be silent"
+    - "The DANGER zone must trigger lightweight protection strategies (e.g. truncating tool results, shrinking the output budget)"
+    - "The BLOCKING zone must reject new requests and return TOKEN_BUDGET_EXCEEDED rather than letting requests time out and crash"
+    - "The compaction operation itself needs enough output room reserved (otherwise compaction itself may fail)"
+
+  zones:
+    SAFE:
+      range: "0% – 84%"
+      action: "Normal execution"
+      log_level: null
+
+    WARNING:
+      range: "85% – 89%"
+      action: "Emit a TOKEN_BUDGET_WARNING event and notify the Coordinator / user"
+      log_level: info
+      optional_actions:
+        - "Lower the model_tier (capable → standard)"
+        - "Prompt the user to consider splitting the task"
+
+    DANGER:
+      range: "90% – 94%"
+      action: "Trigger a lightweight compaction strategy"
+      log_level: warn
+      required_actions:
+        - "Truncate oversized tool results (Tool Result Snip)"
+        - "Reduce maxToolRounds for subsequent Agents (a 20% reduction is suggested)"
+      optional_actions:
+        - "Persist important output to disk and keep only a summary in context"
+
+    BLOCKING:
+      range: "95% – 100%"
+      action: "Reject new requests and return TOKEN_BUDGET_EXCEEDED"
+      log_level: error
+      required_actions:
+        - "Stop accepting new tool calls or Agent tasks"
+        - "Return a structured error (including the current usage ratio)"
+
+  thresholds:
+    WARNING_THRESHOLD: 0.85
+    DANGER_THRESHOLD: 0.90
+    BLOCKING_THRESHOLD: 0.95
+
+  types:
+    TokenBudgetZone:
+      values: [safe, warning, danger, blocking]
+
+    TokenBudgetStatus:
+      fields:
+        current_tokens: number
+        max_tokens: number
+        usage_ratio: number
+        zone: TokenBudgetZone
+
+    TokenBudgetExceededError:
+      fields:
+        code: "TOKEN_BUDGET_EXCEEDED"
+        current_tokens: number
+        max_tokens: number
+        usage_ratio: number
+        zone: "blocking"
+
+  post_compact_budget:
+    note: "Enough output room must remain after compaction (constants taken from the book for reference)"
+    constants:
+      MAX_FILES_TO_RESTORE: 5
+      TOTAL_TOKEN_BUDGET: 50000
+      MAX_TOKENS_PER_FILE: 5000
+      MAX_TOKENS_PER_SKILL: 5000
+
+  telemetry_events:
+    token_budget_zone_change:
+      fields:
+        from_zone: TokenBudgetZone
+        to_zone: TokenBudgetZone
+        usage_ratio: number
+        agent_name: string
+        timestamp: string
+
+  applicable_scenarios:
+    - "DevAP task execution (Task token budget monitoring)"
+    - "VibeOps 9-Agent Pipeline (monitoring context accumulated across Agents)"
+    - "Trigger condition for VibeOps PipelineMemory Snip"
+    - "Any Agent execution environment with a maxTotalTokens limit"
+
+  error_codes:
+    TB-001: "TOKEN_BUDGET_EXCEEDED - entered the BLOCKING zone; new requests are rejected"
+    TB-002: "TOKEN_BUDGET_WARNING - entered the WARNING zone; taking action is recommended"
+    TB-003: "SNIP_FAILED - lightweight compaction failed; still in the DANGER zone"
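The four-zone model above reduces to three threshold comparisons on the usage ratio. The following Python sketch shows one way to classify a budget and build the structured `TOKEN_BUDGET_EXCEEDED` payload. Field names follow the `types` section; the `classify` and `blocking_error` functions are hypothetical. Using the `thresholds` values rather than the whole-percent `range` strings means a fractional usage such as 84.5% is treated as SAFE, which is an interpretation made here.

```python
from dataclasses import dataclass
from typing import Optional

# Values from the standard's thresholds section.
WARNING_THRESHOLD = 0.85
DANGER_THRESHOLD = 0.90
BLOCKING_THRESHOLD = 0.95


@dataclass
class TokenBudgetStatus:
    current_tokens: int
    max_tokens: int
    usage_ratio: float
    zone: str  # one of: safe, warning, danger, blocking


def classify(current_tokens: int, max_tokens: int) -> TokenBudgetStatus:
    """Map current usage onto one of the four zones."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    ratio = current_tokens / max_tokens
    if ratio >= BLOCKING_THRESHOLD:
        zone = "blocking"
    elif ratio >= DANGER_THRESHOLD:
        zone = "danger"
    elif ratio >= WARNING_THRESHOLD:
        zone = "warning"
    else:
        zone = "safe"
    return TokenBudgetStatus(current_tokens, max_tokens, ratio, zone)


def blocking_error(status: TokenBudgetStatus) -> Optional[dict]:
    """Structured error for the BLOCKING zone, or None when requests may proceed."""
    if status.zone != "blocking":
        return None
    return {
        "code": "TOKEN_BUDGET_EXCEEDED",
        "current_tokens": status.current_tokens,
        "max_tokens": status.max_tokens,
        "usage_ratio": status.usage_ratio,
        "zone": "blocking",
    }


print(classify(84_999, 100_000).zone)  # safe
print(classify(92_000, 100_000).zone)  # danger
print(blocking_error(classify(96_000, 100_000))["code"])  # TOKEN_BUDGET_EXCEEDED
```

A caller that tracks the previous zone could emit the `token_budget_zone_change` telemetry event whenever two consecutive `classify` results differ.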
@@ -0,0 +1,184 @@
+# Anti-Sycophancy Prompting Standards
+
+> **Language**: English | [繁體中文](../locales/zh-TW/core/anti-sycophancy-prompting.md)
+
+**Version**: 1.0.0
+**Last Updated**: 2026-04-15
+**Applicability**: All AI agent implementations and LLM prompt design
+**Scope**: universal
+**Industry Standards**: None (UDS original, informed by RLHF sycophancy research)
+
+---
+
+## Purpose
+
+This standard defines techniques and rules for designing prompts that elicit genuine, critical responses from LLMs rather than sycophantic agreement with the user's implied preferences.
+
+Sycophancy in LLMs originates from RLHF training objectives where human raters prefer agreeable responses, causing models to optimize for user satisfaction over accuracy.
+
+---
+
+## Core Techniques
+
+### 1. Socratic Critique Framework (REQ-1)
+
+Reframe the task from "evaluate my idea" to "attack my idea" to eliminate the incentive for sycophancy.
+
+| DO | DO NOT |
+|----|--------|
+| ✅ Ask for the 3 most fatal objections to the idea | ❌ Ask "is this a good idea?" |
+| ✅ Require each objection to be technically grounded | ❌ Allow vague positive framing |
+| ✅ Prohibit positive opening phrases | ❌ Accept "Great idea, but..." patterns |
+
+**Prompt Template**:
+```
+Do not evaluate whether this is good or bad.
+List the 3 most fatal objections to: [idea]
+Each objection must be technically grounded and non-trivial to dismiss.
+```
+
+---
+
+### 2. Anchor Prevention Protocol (REQ-2)
+
+Obtain the LLM's independent judgment before revealing the user's position, preventing anchoring bias.
+
+| Step | Action |
+|------|--------|
+| 1 | Ask for neutral comparison without revealing preference |
+| 2 | Receive independent judgment |
+| 3 | Reveal user's position |
+| 4 | Require explicit technical justification if model changes stance |
+
+**Workflow**:
+```
+Round 1: "Compare [A] vs [B] for [context]. Which is better?"
+→ Wait for independent judgment
+
+Round 2: "I prefer [A]. Does this change your assessment? Why?"
+→ Model must justify any position change with technical facts
+```
+
+---
+
+### 3. Symmetric Dual-Column Output (REQ-3)
+
+Use format constraints to force balanced presentation of opposing viewpoints.
+
+**Required Format**:
+```
+| Arguments FOR the decision | Arguments AGAINST the decision |
+|---------------------------|-------------------------------|
+| [Equal weight content] | [Equal weight content] |
+
+Net Recommendation: [Must take a clear stance, may recommend against]
+```
+
+**Rules**:
+- Both columns must have similar length (< 20% difference)
+- Net recommendation must be explicit and may be negative
+- Model cannot escape the format by padding one side
+
+---
+
+### 4. Confidence and Uncertainty Labeling (REQ-4)
+
+Require confidence scores on all recommendations to surface uncertainty.
+
+**Format**:
+```
+Recommendation: [specific action]
+Confidence: [1-5] - [reason for uncertainty]
+Unknown: [what information would change this assessment]
+```
+
+**Confidence Scale**:
+
+| Level | Meaning |
+|-------|---------|
+| 5 | Validated at similar scale, high certainty |
+| 4 | Industry standard with sufficient documentation |
+| 3 | Reasonable inference, PoC recommended |
+| 2 | Uncertain, Spike strongly recommended |
+| 1 | Highly uncertain, not recommended for direct adoption |
+
+**Rules**:
+- Confidence < 3 must include "More information needed before confirming"
+- All major claims require confidence labeling
+- Uncertainty must be actionable (specify what information resolves it)
+
+---
+
+### 5. Sycophancy Detection Heuristics (REQ-5)
+
+Heuristics for identifying sycophantic responses, usable in automated post-processing.
+
+| Signal Type | Detection Rule |
+|-------------|---------------|
+| Positive opener | Response starts with agreeable phrase within first 50 tokens (e.g., "great", "interesting", "certainly", "of course") |
+| Position flip | Model reverses stance after user reveals preference without new technical evidence |
+| Risk minimization | Pattern: "While there are some minor issues, overall..." without specifying the issues |
+| Missing quantification | Major recommendation lacks confidence score or specific metrics |
+
+**Trigger**: If 2+ signals detected → invoke re-evaluation with explicit Red Team framing.
+
+---
+
+## Prohibited Behaviors
+
+| Prohibited | Correct Action |
+|-----------|----------------|
+| Opening critique with positive affirmation | Start directly with the analysis |
+| Reversing stance without new technical evidence | Maintain position or cite specific new information |
+| Describing risks as "minor" without evidence | Quantify risk or explain why it is bounded |
+| Providing major recommendations without confidence | Always include confidence (1-5) and uncertainty statement |
+
+---
+
+## Integration with Agent Prompts
+
+When applying to AI agents:
+
+| Agent Type | Apply Rules |
+|------------|-------------|
+| Code Review Agent | REQ-1 (Socratic) + REQ-3 (Dual-column) + REQ-5 (Detection) |
+| Architecture Advisor Agent | REQ-2 (Anchor Prevention) + REQ-4 (Confidence) + REQ-5 (Detection) |
+| Bug Analysis Agent | REQ-1 (Socratic) + REQ-4 (Confidence) |
+| General Consultation Agent | REQ-3 (Dual-column) + REQ-4 (Confidence) |
+
+---
+
+## Complete Anti-Sycophancy Prompt Template
+
+```
+You are a domain expert with no emotional investment in my satisfaction.
+Your role is to identify flaws in my thinking, not to make me feel good.
+
+Rules:
+- Do NOT open with positive phrases (good, interesting, nice, certainly)
+- Every recommendation must include a confidence level (1-5) and what you are uncertain about
+- If my direction is wrong, say so directly
+
+My question: [question]
+
+First, list the incorrect assumptions I may be holding about this problem.
+Then give your honest recommendation.
+```
+
+---
+
+## Checklist
+
+- [ ] Prompt does not invite agreement ("is this good?")
+- [ ] Positive opening phrases explicitly prohibited
+- [ ] Model's independent stance obtained before revealing user preference (if applicable)
+- [ ] Dual-column format enforced for evaluation tasks
+- [ ] Confidence levels required on major recommendations
+- [ ] Sycophancy detection applied to output before presenting to user
+
+---
+
+## Related Standards
+
+- [anti-hallucination.md](anti-hallucination.md) - Prevents fabrication; complements anti-sycophancy
+- [agent-epistemic-calibration.md](agent-epistemic-calibration.md) - Epistemic humility in agent design (where applicable)
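The REQ-5 heuristics in the document above are described as usable in automated post-processing, so a rough Python sketch of such a detector follows. It is a simplification under stated assumptions: "tokens" are approximated as whitespace-separated words, the risk-minimization rule matches the quoted phrase pattern only and does not check whether the issues are later specified, and a position flip has to be supplied by the caller because it needs conversation history. All function and signal names are hypothetical.

```python
import re
from typing import List

# Agreeable phrases listed as examples in the Positive opener rule.
POSITIVE_OPENERS = ("great", "interesting", "certainly", "of course")
# Approximation of the "some minor issues, overall..." pattern.
RISK_MINIMIZATION = re.compile(r"\bminor issues?\b.*\boverall\b", re.IGNORECASE | re.DOTALL)
# A "Confidence: N" label with N in 1-5, as in the REQ-4 format.
CONFIDENCE_LABEL = re.compile(r"confidence:\s*[1-5]\b", re.IGNORECASE)


def detect_signals(response: str, flipped_without_evidence: bool = False) -> List[str]:
    """Return the names of the sycophancy signals found in one response."""
    signals = []
    # Assumption: the first 50 whitespace-separated words stand in for 50 tokens.
    opening = " ".join(response.split()[:50]).lower()
    if any(re.search(rf"\b{re.escape(p)}\b", opening) for p in POSITIVE_OPENERS):
        signals.append("positive_opener")
    if flipped_without_evidence:
        signals.append("position_flip")
    if RISK_MINIMIZATION.search(response):
        signals.append("risk_minimization")
    if not CONFIDENCE_LABEL.search(response):
        signals.append("missing_quantification")
    return signals


def needs_red_team(signals: List[str]) -> bool:
    """Trigger rule: 2 or more signals call for a Red Team re-evaluation."""
    return len(signals) >= 2


flattering = "Great idea! While there are some minor issues, overall this looks solid."
calibrated = (
    "Recommendation: use PostgreSQL.\n"
    "Confidence: 4 - industry standard.\n"
    "Unknown: expected write volume."
)
print(detect_signals(flattering))  # ['positive_opener', 'risk_minimization', 'missing_quantification']
print(detect_signals(calibrated))  # []
```

Word-list matching of this kind will produce false positives (for example a response that quotes the word "great"), so it is better suited to triggering a second pass than to rejecting output outright.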