@sireai/optimus 0.1.9 → 0.1.11

Files changed (72)
  1. package/dist/cli/optimus.js +5 -5
  2. package/dist/cli/optimus.js.map +1 -1
  3. package/dist/cli/self-update.js +21 -2
  4. package/dist/cli/self-update.js.map +1 -1
  5. package/dist/integrations/feishu/feishu-auth-service.d.ts +2 -2
  6. package/dist/integrations/feishu/feishu-auth-service.js +2 -2
  7. package/dist/integrations/feishu/feishu-auth-service.js.map +1 -1
  8. package/dist/integrations/feishu/feishu-doc-service.d.ts +7 -1
  9. package/dist/integrations/feishu/feishu-doc-service.js +48 -6
  10. package/dist/integrations/feishu/feishu-doc-service.js.map +1 -1
  11. package/dist/integrations/jira/jira-auth-refresh.d.ts +12 -1
  12. package/dist/integrations/jira/jira-auth-refresh.js +76 -0
  13. package/dist/integrations/jira/jira-auth-refresh.js.map +1 -1
  14. package/dist/integrations/jira/jira-client.d.ts +7 -1
  15. package/dist/integrations/jira/jira-client.js +18 -0
  16. package/dist/integrations/jira/jira-client.js.map +1 -1
  17. package/dist/integrations/jira/jira-submit.d.ts +4 -0
  18. package/dist/integrations/jira/jira-submit.js +76 -0
  19. package/dist/integrations/jira/jira-submit.js.map +1 -1
  20. package/dist/problem-solving-core/codex/codex-runner.js +13 -0
  21. package/dist/problem-solving-core/codex/codex-runner.js.map +1 -1
  22. package/dist/problem-solving-core/codex/evolution-skill-guard.js +0 -2
  23. package/dist/problem-solving-core/codex/evolution-skill-guard.js.map +1 -1
  24. package/dist/task-environment/delivery/commit-message/bugfix-commit-message-template.js +4 -1
  25. package/dist/task-environment/delivery/commit-message/bugfix-commit-message-template.js.map +1 -1
  26. package/dist/task-environment/delivery/feishu-analysis-doc-service.js +5 -1
  27. package/dist/task-environment/delivery/feishu-analysis-doc-service.js.map +1 -1
  28. package/dist/task-environment/delivery/feishu-card-renderer.js +2 -2
  29. package/dist/task-environment/delivery/feishu-card-renderer.js.map +1 -1
  30. package/dist/task-environment/delivery/feishu-content/feishu-content-renderer.d.ts +7 -1
  31. package/dist/task-environment/delivery/feishu-content/feishu-content-renderer.js +51 -2
  32. package/dist/task-environment/delivery/feishu-content/feishu-content-renderer.js.map +1 -1
  33. package/dist/task-environment/delivery/feishu-notifier.js +7 -1
  34. package/dist/task-environment/delivery/feishu-notifier.js.map +1 -1
  35. package/dist/task-environment/delivery/feishu-templates/analysis-message-template.js +1 -1
  36. package/dist/task-environment/delivery/feishu-templates/analysis-message-template.js.map +1 -1
  37. package/dist/task-environment/delivery/feishu-templates/bugfix-message-template.js +27 -4
  38. package/dist/task-environment/delivery/feishu-templates/bugfix-message-template.js.map +1 -1
  39. package/dist/task-environment/delivery/feishu-templates/default-message-template.js +1 -1
  40. package/dist/task-environment/delivery/feishu-templates/default-message-template.js.map +1 -1
  41. package/dist/task-environment/delivery/feishu-templates/patch-message-template.js +27 -4
  42. package/dist/task-environment/delivery/feishu-templates/patch-message-template.js.map +1 -1
  43. package/dist/task-environment/delivery/feishu-templates/sentry-bugfix-message-template.js +27 -4
  44. package/dist/task-environment/delivery/feishu-templates/sentry-bugfix-message-template.js.map +1 -1
  45. package/dist/task-environment/delivery/feishu-templates/template-types.d.ts +7 -1
  46. package/dist/task-environment/delivery/sentry-feishu-card-renderer.js +1 -1
  47. package/dist/task-environment/delivery/sentry-feishu-card-renderer.js.map +1 -1
  48. package/dist/task-environment/delivery/task-delivery-service.d.ts +2 -0
  49. package/dist/task-environment/delivery/task-delivery-service.js +178 -6
  50. package/dist/task-environment/delivery/task-delivery-service.js.map +1 -1
  51. package/dist/task-environment/delivery/task-publication-service.d.ts +28 -2
  52. package/dist/task-environment/delivery/task-publication-service.js +205 -22
  53. package/dist/task-environment/delivery/task-publication-service.js.map +1 -1
  54. package/dist/task-environment/execution-addresses.d.ts +8 -0
  55. package/dist/task-environment/execution-addresses.js +8 -0
  56. package/dist/task-environment/execution-addresses.js.map +1 -1
  57. package/dist/task-environment/intake/triage-rejection-feedback-service.d.ts +12 -0
  58. package/dist/task-environment/intake/triage-rejection-feedback-service.js +64 -11
  59. package/dist/task-environment/intake/triage-rejection-feedback-service.js.map +1 -1
  60. package/dist/task-environment/observability/logger.js +3 -0
  61. package/dist/task-environment/observability/logger.js.map +1 -1
  62. package/dist/task-environment/orchestration/execution-context-assembler.d.ts +19 -0
  63. package/dist/task-environment/orchestration/execution-context-assembler.js +422 -1
  64. package/dist/task-environment/orchestration/execution-context-assembler.js.map +1 -1
  65. package/dist/task-environment/orchestration/task-orchestrator.js +49 -5
  66. package/dist/task-environment/orchestration/task-orchestrator.js.map +1 -1
  67. package/dist/types.d.ts +19 -0
  68. package/embedded-skills/shared/video-keyframe-analyzer/SKILL.md +0 -2
  69. package/package.json +2 -2
  70. package/task-harnesses/bugfix/STANDARD.md +121 -42
  71. package/embedded-skills/shared/video-keyframe-analyzer/references/encountered-problems.md +0 -12
  72. package/embedded-skills/shared/video-keyframe-analyzer/references/triage-checklist.md +0 -48
package/task-harnesses/bugfix/STANDARD.md +121 -42
@@ -19,30 +19,45 @@
  ## Validation policy
 
  - A claimed fix requires validation.
- - Prefer stronger evidence before weaker evidence.
- - Report the strongest level reached and why stronger levels were unavailable.
+ - Prefer the highest-reliability validation that is feasible now, not the cheapest one.
+ - If a stronger level is blocked, state the blocker and downgrade explicitly.
+ - Report exactly one strongest token and keep the method/result details below it.
+ - Also report exactly one validation grade `V1` to `V5`.
 
- 1. `L4 functional`: real device, simulator, or another directly runnable environment
- 2. `L3 self-check`: local unit tests, targeted custom tests, lightweight scenario injection
- 3. `L2 build`: relevant compile target, module build, or targeted test task
- 4. `L1 code evidence`: static reasoning, call-chain review, diff review
+ Reliability and cost order:
 
- ### Android order
+ 1. `V5`: `device_verified` - highest confidence, highest cost
+ 2. `V4`: `simulator_verified`, `scenario_verified` - very high or high confidence, high or medium-high cost
+ 3. `V3`: `regression_tests_passed`, `unit_tests_passed` - medium-high or medium confidence, medium or medium-high cost
+ 4. `V2`: `module_build_passed`, `compile_passed`, `targeted_tests_passed` - partial executable proof, medium to lower confidence, low-medium to medium cost
+ 5. `V1`: `code_reviewed` - lowest confidence, lowest cost
 
- 1. Prefer real-device or simulator validation when `adb` and runnable targets exist.
- 2. Otherwise prefer local tests or scenario injection.
- 3. Use device-side automated checks only when local validation cannot cover the behavior.
- 4. Fall back to compile validation such as `compileDebugKotlin` or relevant unit tests.
- 5. Use code-only validation only when all stronger forms are unavailable.
+ Token contract:
+
+ - `device_verified` -> `V5`: real device, real business path, fix behavior observed
+ - `simulator_verified` -> `V4`: simulator/emulator, real business path, fix behavior observed
+ - `scenario_verified` -> `V4`: directly runnable real path or end-to-end scenario executed in a near-real environment
+ - `regression_tests_passed` -> `V3`: multiple relevant existing regression/integration cases passed
+ - `unit_tests_passed` -> `V3`: real project unit tests passed
+ - `module_build_passed` -> `V2`: real module/package build passed
+ - `compile_passed` -> `V2`: real compile target passed
+ - `targeted_tests_passed` -> `V2`: targeted harness, stub, mock, injected script, minimal API-surface check, or one-off focused executable proof passed
+ - `code_reviewed` -> `V1`: no executable validation succeeded; conclusion relies on code/log/evidence review only
+
+ Never overstate:
+
+ - Stub, mock, temporary harness, minimal API surface, injected script, or ad hoc executable proof must be `targeted_tests_passed`, not `scenario_verified`.
+ - Real compile/build did not pass: do not claim `compile_passed` or `module_build_passed`.
+ - No real device/simulator path was exercised: do not claim `device_verified`, `simulator_verified`, or `scenario_verified`.
 
  ## Closure policy
 
  - Close as fix only when analysis, code changes, validation evidence, and residual-risk understanding are credible.
  - Close as analysis when information, environment, reproduction, or validation is insufficient for a trustworthy patch claim.
- - If code changed but validation reached only `L2` or `L1`, describe it as a repair candidate, not a verified fix.
- - If the issue is interaction, crash, device, integration, or resource related and validation stayed at `L2`, state what stronger environment or tooling was missing.
+ - If code changed but fix validation stayed at `V2` or `V1`, describe it as a repair candidate, not a verified fix.
+ - If the issue is interaction, crash, device, integration, or resource related and fix validation stayed at `V2`, state what stronger environment or tooling was missing.
  - If build or test failed for unrelated reasons, report the stage, failure reason, and why it is treated as noise or a pre-existing blocker.
- - If only `L1` evidence exists, do not submit a formal patch claim; close as analysis.
+ - If only `V1` evidence exists, do not submit a formal verified-fix claim; close as analysis unless a repair candidate is still justified.
  - Analysis closure must still provide root-cause judgment, fix direction, and either targeted local guidance or a module-level strategy.
 
  ## Runtime contract
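
Read as data, the validation policy in this hunk is a fixed ranking of tokens plus a token-to-grade table. The TypeScript sketch below is one possible encoding of that reading, not code from the package. The names `RELIABILITY_ORDER`, `GRADE_BY_TOKEN`, `strongestToken`, and `mustBeRepairCandidate` are hypothetical, and the relative rank of tokens that share a grade is assumed to follow the order in which the "Reliability and cost order" list names them.

```typescript
// Hypothetical sketch: only the token and grade strings come from the policy text.
type ValidationToken =
  | "device_verified"
  | "simulator_verified"
  | "scenario_verified"
  | "regression_tests_passed"
  | "unit_tests_passed"
  | "module_build_passed"
  | "compile_passed"
  | "targeted_tests_passed"
  | "code_reviewed";

type ValidationGrade = "V1" | "V2" | "V3" | "V4" | "V5";

// Strongest first. Ordering inside one grade is an assumption (list order).
const RELIABILITY_ORDER: readonly ValidationToken[] = [
  "device_verified",
  "simulator_verified",
  "scenario_verified",
  "regression_tests_passed",
  "unit_tests_passed",
  "module_build_passed",
  "compile_passed",
  "targeted_tests_passed",
  "code_reviewed",
];

// Token contract: every token maps to exactly one grade.
const GRADE_BY_TOKEN: Record<ValidationToken, ValidationGrade> = {
  device_verified: "V5",
  simulator_verified: "V4",
  scenario_verified: "V4",
  regression_tests_passed: "V3",
  unit_tests_passed: "V3",
  module_build_passed: "V2",
  compile_passed: "V2",
  targeted_tests_passed: "V2",
  code_reviewed: "V1",
};

// Pick the single token to report out of everything that actually passed.
function strongestToken(reached: readonly ValidationToken[]): ValidationToken {
  for (const token of RELIABILITY_ORDER) {
    if (reached.indexOf(token) !== -1) {
      return token;
    }
  }
  // No executable validation succeeded, so only code review remains.
  return "code_reviewed";
}

// Closure policy: a code change validated at V2 or V1 is a repair candidate.
// Higher grades are necessary but not sufficient for a verified fix, so this
// helper only answers the one question the policy states outright.
function mustBeRepairCandidate(token: ValidationToken): boolean {
  const grade = GRADE_BY_TOKEN[token];
  return grade === "V1" || grade === "V2";
}
```

Under these assumptions, a run that reached both `compile_passed` and `unit_tests_passed` reports `unit_tests_passed` with grade `V3`, and a run that reached only `targeted_tests_passed` has to be described as a repair candidate.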
@@ -81,40 +96,75 @@
  - Before writing `result.md`, determine `Closure Level`, then follow exactly one language mode:
  - `Verified Fix` or `Repair Candidate`: Patch Closure Mode; all narrative sections are English
  - `Analysis Only`: Analysis Closure Mode; narrative sections are Chinese
- - `Validation Summary` stays a single English token in all cases.
+ - `Reproduction` / `复现情况` uses the compact form `<token> (R*)` when grade is known.
+ - `Fix Validation` / `修复验证` uses the compact form `<token> (V*)` when grade is known.
  - Use repository-relative code paths only; never use absolute local paths.
  - Commands, logs, stack traces, API errors, and identifiers may stay in their original language when needed.
  - If closure is patch-related and any narrative field is Chinese, the output is invalid and must be rewritten.
 
  ## Mandatory field-name contract
 
- Do not rename these downstream-consumed keys:
+ These keys are parsed by downstream delivery code. Keep exact capitalization and wording. Do not rename them.
+
+ Two layers are consumed:
+
+ - Delivery Summary keys: compact, high-density fields for cards, comments, and status output
+ - Detail keys: strongest conclusion, evidence, and next-action fields reused by downstream renderers
 
- - English patch-mode keys:
- - `Validation Summary`
+ Patch mode:
+
+ - Delivery Summary keys:
+ - `Root Cause`
+ - `Fix`
+ - `Reproduction`
+ - `Fix Validation`
+ - `Impact Check`
+ - `Confidence`
+ - `Blocking Point`
+ - Detail keys:
  - `Strongest Current Conclusion`
- - `Analysis Summary`
  - `Key Evidence`
  - `Recommended Action`
  - `Analysis Doc URL`
- - Chinese analysis-mode keys:
- - `验证摘要`
+
+ Analysis mode:
+
+ - Delivery Summary keys:
+ - `根因摘要`
+ - `修复建议`
+ - `复现情况`
+ - `修复验证`
+ - `影响评估`
+ - `确定性`
+ - `阻塞点`
+ - Detail keys:
  - `当前最强结论`
  - `分析摘要`
  - `关键证据`
  - `建议动作`
  - `分析文档链接`
 
- Keep exact capitalization and wording.
-
  ## Result content
 
  At minimum, `result.md` must include:
 
+ - one compact `Delivery Summary` / `交付摘要` block using the exact Delivery Summary keys above
+ Keep each summary value dense and short enough for comment/card reuse.
+ - a reproduction summary as one high-density English token
+ Allowed values: `naturally_reproduced`, `induced_reproduced`, `historical_evidence_matched`, `not_reproduced`
+ - a reproduction grade inside `Reproduction` / `复现情况`
+ Allowed values: `R1`, `R2`, `R3`, `R4`
  - a validation summary as one high-density English token
  Allowed values: `device_verified`, `simulator_verified`, `scenario_verified`, `unit_tests_passed`, `targeted_tests_passed`, `regression_tests_passed`, `compile_passed`, `module_build_passed`, `code_reviewed`
  Forbidden generic values: `validation_completed`, `tests_passed`, `verified`, `passed`, `done`
- If multiple validations were performed, report only the strongest one.
+ If multiple validations were performed, report only the strongest one by the reliability order above.
+ - a validation grade inside `Fix Validation` / `修复验证`
+ Allowed values: `V1`, `V2`, `V3`, `V4`, `V5`
+ It must match the selected validation token.
+ - an impact-check summary token
+ Allowed values: `neighbor_paths_checked`, `partial_neighbor_check`, `not_checked`
+ - a confidence token
+ Allowed values: `C1`, `C2`, `C3`, `C4`
  - problem summary and impact scope
  - category: functional, stability, performance, or compatibility
  - reproduction likelihood: always, high, low, or unknown
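
The compact `Fix Validation` form and the rule that the grade must match the selected token lend themselves to a mechanical check. The sketch below is a hedged illustration of such a check, not the package's downstream parser. The function name `checkFixValidation` and the exact accepted spacing are assumptions; the allowed tokens and their grades are taken from the hunks above.

```typescript
// Hypothetical checker for the compact form "<token> (V*)".
const GRADE_BY_TOKEN: Record<string, string> = {
  device_verified: "V5",
  simulator_verified: "V4",
  scenario_verified: "V4",
  regression_tests_passed: "V3",
  unit_tests_passed: "V3",
  module_build_passed: "V2",
  compile_passed: "V2",
  targeted_tests_passed: "V2",
  code_reviewed: "V1",
};

// Returns a list of problems; an empty list means the value looks valid.
function checkFixValidation(value: string): string[] {
  // One token, optionally followed by a parenthesised grade V1..V5.
  const match = /^([a-z_]+)(?:\s+\((V[1-5])\))?$/.exec(value.trim());
  if (match === null) {
    return ["expected the compact form `<token> (V*)`"];
  }
  const token = match[1] ?? "";
  const grade = match[2]; // undefined when the grade is not known

  // Own-property lookup so that words like "constructor" are not accepted.
  // Forbidden generic values such as `tests_passed` fail here as well.
  if (!Object.prototype.hasOwnProperty.call(GRADE_BY_TOKEN, token)) {
    return ["unknown validation token: " + token];
  }

  const expected = GRADE_BY_TOKEN[token];
  if (grade !== undefined && grade !== expected) {
    return ["grade " + grade + " does not match " + token + " (" + expected + ")"];
  }
  return [];
}
```

With this sketch, `unit_tests_passed (V3)` and a bare `code_reviewed` produce no problems, while `compile_passed (V4)` and the forbidden generic value `tests_passed` each produce one.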
@@ -135,40 +185,49 @@ At minimum, `result.md` must include:
  - All narrative text must be English.
  - Do not emit Chinese prose in headings, bullets, conclusions, evidence, patch notes, validation narrative, risks, or next steps.
  - Do not emit `Analysis Summary` in patch closure output.
- - Patch closure must include:
- - `Strongest Current Conclusion`
- - `Key Evidence`
- - `Recommended Action`
+ - Patch closure must include the exact Patch Detail keys from the field-name contract.
 
  ### Patch Closure Template
 
  ```md
  # Bugfix Result
 
- ## Summary
+ ## Delivery Summary
+ - Root Cause:
+ - Fix:
+ - Reproduction:
+ - Fix Validation:
+ - Impact Check:
+ - Confidence:
+ - Blocking Point: `None` if not blocked
+
+ ## Detail
+
+ ### Summary
  - Problem:
  - Impact:
  - Category: Functional / Stability / Performance / Compatibility
  - Reproduction Likelihood: Always / High / Low / Unknown
 
- ## Root Cause
+ ### Root Cause
  - Strongest Current Conclusion:
  - Key Evidence:
  - Relevant Code Locations:
 
- ## Change
+ ### Change
  - Closure Level: Verified Fix / Repair Candidate
  - Patch Notes:
  - Fix Strategy:
  - Blocking Point: `None` if not blocked
 
- ## Validation
+ ### Validation
  - Validation Summary: exactly one short English token, strongest validation only
+ - Validation Grade: exactly one token, `V1` to `V5`, matching `Validation Summary`
  - Method:
  - Result:
  - Unverified Items:
 
- ## Risks
+ ### Risks
  - Residual Risk:
  - Recommended Action:
  ```
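
Because the Delivery Summary keys in the template above are described as parsed by downstream code, a simple presence check can catch a renamed or dropped key before delivery. The following is a minimal sketch of such a check under stated assumptions: the function name `missingSummaryKeys` is hypothetical, and the parsing rule (a `## Delivery Summary` heading followed by `- Key:` bullets up to the next `## ` heading) is inferred from the template, not taken from the package.

```typescript
// Exact patch-mode Delivery Summary keys from the field-name contract.
const PATCH_SUMMARY_KEYS: readonly string[] = [
  "Root Cause",
  "Fix",
  "Reproduction",
  "Fix Validation",
  "Impact Check",
  "Confidence",
  "Blocking Point",
];

// Returns the Delivery Summary keys that are absent from a result.md body.
function missingSummaryKeys(resultMd: string): string[] {
  const lines = resultMd.split(/\r?\n/).map((line) => line.trim());
  const start = lines.indexOf("## Delivery Summary");
  if (start === -1) {
    // No summary block at all: every key is missing.
    return PATCH_SUMMARY_KEYS.slice();
  }

  const found: string[] = [];
  for (let i = start + 1; i < lines.length; i++) {
    const line = lines[i] ?? "";
    // The next second-level heading (for example "## Detail") ends the block.
    if (line.indexOf("## ") === 0) {
      break;
    }
    // Keys are matched case-sensitively, as the contract requires.
    const match = /^- ([^:]+):/.exec(line);
    if (match !== null) {
      found.push(match[1] ?? "");
    }
  }
  return PATCH_SUMMARY_KEYS.filter((key) => found.indexOf(key) === -1);
}
```

A `Blocking Point` bullet that appears only under `### Change` does not satisfy the check, because the scan stops at `## Detail`; that matches the template, which lists the key in both places.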
@@ -181,7 +240,7 @@ At minimum, `result.md` must include:
  - Narrative text must remain Chinese.
  - `验证摘要` must still be a single English token.
  - Do not force patch-delivery prose such as `Patch Notes` into analysis closure output.
- - Analysis closure must include the Chinese contract keys for strongest conclusion, summary, evidence, and recommendation.
+ - Analysis closure must include the exact Analysis Detail keys from the field-name contract.
 
  ### Analysis Closure Template
 
@@ -190,29 +249,41 @@ Use the following Chinese output structure exactly:
  ```md
  # 缺陷分析结果
 
- ## 问题概述
+ ## 交付摘要
+ - 根因摘要:
+ - 修复建议:
+ - 复现情况: `<token> (R*)` when grade is known
+ - 修复验证: `<token> (V*)` when grade is known
+ - 影响评估:
+ - 确定性:
+ - 阻塞点:
+
+ ## 详细分析
+
+ ### 问题概述
  - 问题:
  - 影响:
  - 分类: 功能 / 稳定性 / 性能 / 兼容性
  - 复现概率: 必现 / 高概率 / 低概率 / 未知
 
- ## 分析结论
+ ### 分析结论
  - 当前最强结论:
  - 分析摘要:
  - 关键证据:
  - 相关代码位置:
 
- ## 修复判断
+ ### 修复判断
  - Closure Level: Analysis Only
  - 阻塞点:
  - 建议动作:
 
- ## 验证情况
+ ### 验证情况
  - 验证摘要: exactly one short English token, strongest validation only
+ - 验证等级: exactly one token, `V1` to `V5`, matching `验证摘要`
  - 已验证内容:
  - 未验证内容:
 
- ## 风险
+ ### 风险
  - 主要风险:
  - 建议动作:
  - 分析文档链接:
@@ -236,6 +307,8 @@ Patch closure examples:
  - If code changed, ensure `result.md` and `patch.diff` do not contradict each other.
  - If important code changed, ensure explanatory comments are present.
  - If validation was performed, ensure claims are not overstated.
+ - Ensure validation token matches the strongest proof actually executed, not the intended proof.
+ - Ensure `Delivery Summary` / `交付摘要` is present and consistent with the detailed sections below it.
  - Distinguish confirmed facts from inference.
  - For patch closure, ensure `Strongest Current Conclusion`, `Key Evidence`, and `Recommended Action` are English before returning.
 
@@ -245,7 +318,13 @@ Patch closure examples:
  - if `patch.diff` exists, closure mode is not `Analysis Only`
  - patch closure uses English narrative only
  - analysis closure uses Chinese narrative only
- - required downstream field names are exact
+ - Delivery Summary keys are exact
+ - Detail keys are exact
  - patch closure does not emit `Analysis Summary`
  - analysis closure does not mix in patch-delivery sections
- - `Validation Summary` or `验证摘要` is exactly one strongest English token
+ - `Reproduction` or `复现情况` uses exactly one allowed token and an `R*` grade when grade is present
+ - `Fix Validation` or `修复验证` uses exactly one strongest allowed validation token and a matching `V*` grade when grade is present
+ - `Impact Check` or `影响评估` uses exactly one allowed token
+ - `Confidence` or `确定性` uses exactly one allowed token
+ - detailed validation section still contains `Validation Summary` / `验证摘要`
+ - detailed validation section still contains `Validation Grade` / `验证等级`
package/embedded-skills/shared/video-keyframe-analyzer/references/encountered-problems.md +0 -12
@@ -1,12 +0,0 @@
- # Encountered Problems
-
- Known operational issues from real Jira recording analysis:
-
- - Jira attachment links may require browser/CAS auth; direct curl can return a login page.
- - Browser downloads may appear first as hidden temporary files before the final filename is visible.
- - Interactive shell environment variables may not reach non-interactive task commands.
- - Cloud video understanding can be blocked by network sandboxing.
- - Remote video processing can be slow even for short recordings.
- - Remote polling can fail with transient partial reads.
-
- The local keyframe workflow avoids most of these by reducing video analysis to local image artifacts.
package/embedded-skills/shared/video-keyframe-analyzer/references/triage-checklist.md +0 -48
@@ -1,48 +0,0 @@
- # Triage Checklist
-
- Use this compact structure after reviewing keyframes.
-
- ## Summary
-
- - User flow
- - Visible failure
- - Post-failure state
-
- ## Timeline
-
- - `00:00-00:03`: entry state
- - `00:03-00:08`: visible user action or transition
- - `00:08-00:10`: anomaly
- - `00:10+`: post-failure behavior
-
- ## Repro Steps
-
- Infer only what the recording supports:
-
- 1. Open app and enter the visible module.
- 2. Perform the visible interaction.
- 3. Observe the incorrect outcome.
-
- ## Observed vs Expected
-
- - Observed: what the UI actually did.
- - Expected: what should happen in a healthy flow.
-
- ## Evidence
-
- Useful visual evidence includes:
-
- - app returns to launcher unexpectedly
- - relaunch lands on login or wrong page
- - dialog remains visible after state changes
- - loading spinner or white screen stays unchanged
- - expected toast, navigation, or dismissal never appears
-
- ## Open Questions
-
- Keep speculation separate:
-
- - account or environment state
- - network condition
- - exact trigger if not visible
- - whether the issue reproduces across devices or accounts