@sireai/optimus 0.1.44 → 0.1.46

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (92)
  1. package/dist/cli/feedback-delivery.js +13 -8
  2. package/dist/cli/feedback-delivery.js.map +1 -1
  3. package/dist/cli/optimus.js +370 -63
  4. package/dist/cli/optimus.js.map +1 -1
  5. package/dist/integrations/feishu/feishu-client.d.ts +13 -0
  6. package/dist/integrations/feishu/feishu-client.js +111 -6
  7. package/dist/integrations/feishu/feishu-client.js.map +1 -1
  8. package/dist/integrations/feishu/feishu-reference-material-downloader.d.ts +34 -0
  9. package/dist/integrations/feishu/feishu-reference-material-downloader.js +572 -0
  10. package/dist/integrations/feishu/feishu-reference-material-downloader.js.map +1 -0
  11. package/dist/integrations/feishu/feishu-token-store.d.ts +25 -1
  12. package/dist/integrations/feishu/feishu-token-store.js +148 -4
  13. package/dist/integrations/feishu/feishu-token-store.js.map +1 -1
  14. package/dist/integrations/feishu/feishu-user-auth-service.d.ts +39 -0
  15. package/dist/integrations/feishu/feishu-user-auth-service.js +228 -0
  16. package/dist/integrations/feishu/feishu-user-auth-service.js.map +1 -0
  17. package/dist/integrations/jira/jira-cli.js +127 -19
  18. package/dist/integrations/jira/jira-cli.js.map +1 -1
  19. package/dist/task-environment/delivery/commit-message/coder-commit-message-template.d.ts +12 -0
  20. package/dist/task-environment/delivery/commit-message/coder-commit-message-template.js +105 -0
  21. package/dist/task-environment/delivery/commit-message/coder-commit-message-template.js.map +1 -0
  22. package/dist/task-environment/delivery/commit-message/commit-message-builder.js +2 -1
  23. package/dist/task-environment/delivery/commit-message/commit-message-builder.js.map +1 -1
  24. package/dist/task-environment/delivery/feishu-analysis-doc-service.d.ts +6 -0
  25. package/dist/task-environment/delivery/feishu-analysis-doc-service.js +106 -34
  26. package/dist/task-environment/delivery/feishu-analysis-doc-service.js.map +1 -1
  27. package/dist/task-environment/delivery/feishu-content/feishu-content-renderer.js +16 -1
  28. package/dist/task-environment/delivery/feishu-content/feishu-content-renderer.js.map +1 -1
  29. package/dist/task-environment/delivery/feishu-content/feishu-copy-config.d.ts +6 -0
  30. package/dist/task-environment/delivery/feishu-content/feishu-copy-config.js +19 -0
  31. package/dist/task-environment/delivery/feishu-content/feishu-copy-config.js.map +1 -1
  32. package/dist/task-environment/delivery/feishu-notifier.js +13 -11
  33. package/dist/task-environment/delivery/feishu-notifier.js.map +1 -1
  34. package/dist/task-environment/delivery/feishu-templates/coder-message-template.d.ts +6 -0
  35. package/dist/task-environment/delivery/feishu-templates/coder-message-template.js +58 -0
  36. package/dist/task-environment/delivery/feishu-templates/coder-message-template.js.map +1 -0
  37. package/dist/task-environment/delivery/feishu-templates/template-registry.js +2 -0
  38. package/dist/task-environment/delivery/feishu-templates/template-registry.js.map +1 -1
  39. package/dist/task-environment/delivery/task-delivery-dispatcher.js +6 -0
  40. package/dist/task-environment/delivery/task-delivery-dispatcher.js.map +1 -1
  41. package/dist/task-environment/delivery/task-delivery-service.d.ts +1 -0
  42. package/dist/task-environment/delivery/task-delivery-service.js +124 -8
  43. package/dist/task-environment/delivery/task-delivery-service.js.map +1 -1
  44. package/dist/task-environment/delivery/task-publication-service.js +9 -6
  45. package/dist/task-environment/delivery/task-publication-service.js.map +1 -1
  46. package/dist/task-environment/document-input/document-structure.d.ts +13 -0
  47. package/dist/task-environment/document-input/document-structure.js +438 -0
  48. package/dist/task-environment/document-input/document-structure.js.map +1 -0
  49. package/dist/task-environment/intake/manual-problem-intake.js +36 -0
  50. package/dist/task-environment/intake/manual-problem-intake.js.map +1 -1
  51. package/dist/task-environment/observability/logger.d.ts +1 -0
  52. package/dist/task-environment/observability/logger.js +26 -0
  53. package/dist/task-environment/observability/logger.js.map +1 -1
  54. package/dist/task-environment/observability/runtime-panel.js +10 -1
  55. package/dist/task-environment/observability/runtime-panel.js.map +1 -1
  56. package/dist/task-environment/orchestration/reference-material-relocator.d.ts +2 -0
  57. package/dist/task-environment/orchestration/reference-material-relocator.js +69 -0
  58. package/dist/task-environment/orchestration/reference-material-relocator.js.map +1 -0
  59. package/dist/task-environment/orchestration/task-orchestrator.d.ts +3 -0
  60. package/dist/task-environment/orchestration/task-orchestrator.js +182 -8
  61. package/dist/task-environment/orchestration/task-orchestrator.js.map +1 -1
  62. package/dist/task-environment/orchestration/task-package-inputs.js +7 -1
  63. package/dist/task-environment/orchestration/task-package-inputs.js.map +1 -1
  64. package/dist/task-environment/orchestration/task-runtime-policy.js +11 -0
  65. package/dist/task-environment/orchestration/task-runtime-policy.js.map +1 -1
  66. package/dist/task-environment/orchestration/triage-runner.js +3 -0
  67. package/dist/task-environment/orchestration/triage-runner.js.map +1 -1
  68. package/dist/task-environment/runtime/optimus-runtime.js +3 -0
  69. package/dist/task-environment/runtime/optimus-runtime.js.map +1 -1
  70. package/dist/task-environment/storage/sqlite-task-store.d.ts +4 -0
  71. package/dist/task-environment/storage/sqlite-task-store.js +70 -1
  72. package/dist/task-environment/storage/sqlite-task-store.js.map +1 -1
  73. package/dist/task-environment/task-handler-descriptor.d.ts +5 -0
  74. package/dist/task-environment/task-handler-descriptor.js +15 -0
  75. package/dist/task-environment/task-handler-descriptor.js.map +1 -0
  76. package/dist/types.d.ts +47 -1
  77. package/embedded-skills/shared/feishu-task-inputs/SKILL.md +56 -0
  78. package/embedded-skills/shared/feishu-task-inputs/scripts/fetch-feishu-doc.mjs +756 -0
  79. package/embedded-skills/shared/feishu-task-inputs/skill.json +5 -0
  80. package/package.json +4 -1
  81. package/task-harnesses/coder/ACCEPT.md +73 -0
  82. package/task-harnesses/coder/CONSTRAINTS.md +72 -0
  83. package/task-harnesses/coder/CONTEXT.md +36 -0
  84. package/task-harnesses/coder/EVOLUTION.md +83 -0
  85. package/task-harnesses/coder/ROLE.md +39 -0
  86. package/task-harnesses/coder/STANDARD.md +258 -0
  87. package/task-harnesses/coder/manifest.json +13 -0
  88. package/task-harnesses/pm/ACCEPT.md +7 -0
  89. package/task-harnesses/pm/CONSTRAINTS.md +5 -0
  90. package/task-harnesses/pm/ROLE.md +5 -8
  91. package/task-harnesses/pm/STANDARD.md +83 -124
  92. package/task-harnesses/registry.json +4 -0
package/embedded-skills/shared/feishu-task-inputs/skill.json ADDED
@@ -0,0 +1,5 @@
+ {
+   "id": "feishu-task-inputs",
+   "level": "shared",
+   "version": "1.0.0"
+ }
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@sireai/optimus",
-   "version": "0.1.44",
+   "version": "0.1.46",
    "description": "Optimus Codex-native background task runtime and harness scaffolding.",
    "repository": {
      "type": "git",
@@ -57,6 +57,9 @@
      "release:publish:snapshot": "node scripts/release.mjs publish --tag snapshot",
      "release:tag": "node scripts/release.mjs tag",
      "setup": "node dist/cli/optimus.js setup",
+     "feishu-auth": "node dist/cli/optimus.js feishu auth",
+     "feishu-status": "node dist/cli/optimus.js feishu status",
+     "feishu-logout": "node dist/cli/optimus.js feishu logout",
      "submit": "node dist/cli/optimus.js submit",
      "feedback": "node dist/cli/optimus.js feedback",
      "retry-task": "node dist/cli/optimus.js retry-task",
package/task-harnesses/coder/ACCEPT.md ADDED
@@ -0,0 +1,73 @@
+ # ACCEPT
+
+ Routes requirement-to-implementation work into the `coder` harness.
+
+ ## Decision target
+ Triage decides only:
+ 1. task type fit
+ 2. execution admission
+
+ The runner decides final closure: `Implemented`, `Implementation Candidate`, or `Needs Human`.
+
+ ## Task type fit
+ Classify as `coder` only when all are true:
+ - the request is to implement accepted product or interaction requirements into a real repository
+ - the expected output is production-oriented code or configuration change, not only a prototype
+ - the task centers on turning requirement intent into executable software behavior
+ - at least one requirement source exists, such as a requirement document, PM prototype, PM rule supplement, or equivalent feature description concrete enough for staged implementation
+
+ Requirement documents, PRDs, Feishu docs, and PM artifacts are shared inputs. They do not imply `coder` by themselves.
+
+ Do not classify as `coder` when any are true:
+ - the request is only requirement analysis, prototyping, or interaction demonstration
+ - the request is only defect investigation or defect repair with no primary requirement-implementation objective
+ - the request is open-ended product strategy, architecture debate, or broad technical consulting
+ - there is no real repository or implementation target
+
+ ## Execution admission
+ Accept when all are true:
+ - target repository is resolvable; if `repo` is missing, default selection is allowed only when exactly one repository is registered
+ - at least one concrete requirement anchor is present
+ - the implementation scope is bounded enough to be executed as one staged task under a single main agent
+ - at least one feasible validation path exists or can be created safely for the task, starting from its first meaningful stage
+
+ ## Requirement anchors
+ Accept when at least one is usable:
+ - requirement document
+ - interactive prototype
+ - PM handoff artifact such as `result.md`
+ - explicit feature description with concrete states, actions, or rules
+
+ ## Still acceptable with partial information
+ Accept if:
+ - some edge cases, copy, or secondary states are missing
+ - the main behavior and success target are still concrete
+ - missing details can be surfaced as assumptions, gaps, or blockers instead of hidden invention
+
+ ## Reject when execution context is insufficient
+ Reject when any are true:
+ - no usable requirement basis exists
+ - the request mixes multiple unrelated feature areas with no bounded staged execution path
+ - trustworthy implementation would require heavy invention
+ - no repository target can be resolved safely
+ - no feasible validation path exists and creating one would itself exceed safe task scope
+
+ ## Missing information labels
+ Use the smallest set that explains rejection:
+ - `repo`
+ - `requirement_document`
+ - `feature_scope`
+ - `core_flow`
+ - `acceptance_criteria`
+ - `validation_path`
+
+ ## Event scope
+ - `problem.discovered`
+ - `task.submitted_manually`
+
+ ## Triage guidance
+ - judge task meaning, not keywords
+ - separate implementation from prototyping
+ - prefer bounded requirement-to-code work over broad redesign asks
+ - do not reject only because the task is long if it can be decomposed into bounded reviewed stages
+ - do not reject only because the final validation environment is incomplete if a safe lower-cost executable verification path can still be created
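The execution-admission rules in ACCEPT.md above can be read as a small decision function. The following is a minimal TypeScript sketch, not code from the package. `AdmissionFacts`, `AdmissionDecision`, and `decideAdmission` are hypothetical names, and the mapping from each failed condition to a missing-information label is an assumption made for illustration. Only the label strings and the rule about default repository selection come from the file.

```typescript
// Hypothetical shapes for illustration; none of these names appear in the package.
type MissingLabel =
  | "repo"
  | "requirement_document"
  | "feature_scope"
  | "core_flow"
  | "acceptance_criteria"
  | "validation_path";

interface AdmissionFacts {
  repoProvided: boolean;           // the task names a `repo`
  registeredRepoCount: number;     // repositories known to the runtime
  hasRequirementAnchor: boolean;   // document, prototype, `result.md`, or concrete description
  scopeIsBounded: boolean;         // executable as one staged task
  validationPathFeasible: boolean; // exists or can be created safely
}

interface AdmissionDecision {
  accepted: boolean;
  missing: MissingLabel[];
}

function decideAdmission(facts: AdmissionFacts): AdmissionDecision {
  const missing: MissingLabel[] = [];
  // Default selection is allowed only when exactly one repository is registered.
  const repoResolvable = facts.repoProvided || facts.registeredRepoCount === 1;
  // Assumed mapping from failed conditions to labels.
  if (!repoResolvable) missing.push("repo");
  if (!facts.hasRequirementAnchor) missing.push("requirement_document");
  if (!facts.scopeIsBounded) missing.push("feature_scope");
  if (!facts.validationPathFeasible) missing.push("validation_path");
  return { accepted: missing.length === 0, missing };
}
```

Collecting every failed condition, rather than returning on the first one, keeps the result close to the instruction to use the smallest set of labels that explains a rejection.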
package/task-harnesses/coder/CONSTRAINTS.md ADDED
@@ -0,0 +1,72 @@
+ # CONSTRAINTS
+
+ Defines non-negotiable rules for accepted `coder` tasks.
+
+ ## Requirement discipline
+ - Keep `facts`, `assumptions`, and `unknowns` explicit.
+ - Do not turn guesses, intent, or partial interpretation into implementation truth.
+ - Surface requirement conflicts instead of choosing the easier reading.
+ - Do not invent business rules from PM artifacts or repository convenience.
+
+ ## Ownership
+ - The main agent owns scope, delegation, validation judgment, and final closure.
+ - Subagent output is candidate work, not final task truth.
+ - Do not outsource final requirement fit or final validation judgment.
+
+ ## Runtime harness information
+ - Long-running work must be externally stateful, not memory-only.
+ - Keep runtime harness information synchronized with real task state.
+ - Update it before continuing after any material change in scope, plan, interpretation, or validation strategy.
+
+ ## Stage discipline
+ - Use one active subagent at a time.
+ - Delegate one bounded stage at a time.
+ - A stage must have goal, scope, done condition, fail condition, and validation path before delegation.
+ - Do not advance to the next stage until the current stage:
+   - finished scoped work
+   - finished declared stage validation
+   - passed main agent review
+ - If review fails, return to the same stage for repair.
+
+ ## Implementation safety
+ - Change only surfaces causally linked to accepted scope.
+ - Prefer the smallest coherent change set that can satisfy requirement truth and validation needs.
+ - Do not widen a patch only because a broader rewrite feels cleaner.
+ - Prefer existing repository patterns unless requirement truth forces divergence.
+ - Do not hide missing behavior behind placeholders, silent fallbacks, or weakened flows.
+
+ ## Replanning
+ - Treat planned code surfaces and phase boundaries as a working change budget.
+ - Replan before further code changes when scope, mapping, architecture need, or validation reality changes materially.
+ - Do not continue on stale assumptions after failed validation or disproven interpretation.
+
+ ## Validation
+ - `code_reviewed` and `compile_passed` are insufficient for implementation closure.
+ - Every material stage must define expected evidence before implementation.
+ - Minimum acceptable proof for claimed behavior is executable behavior evidence.
+ - If existing verification is insufficient, first create the smallest safe proof surface.
+ - Record any validation downgrade explicitly.
+ - Remove temporary validation surfaces unless they are worth keeping.
+
+ ## Closure floor
+ - Do not close `Implemented` below behavior-level executable validation.
+ - `targeted_tests_passed` is the lowest acceptable floor, and only when it exercises real intended behavior.
+ - Prefer `scenario_verified`, `simulator_verified`, or `device_verified` for user-facing, stateful, or interaction-heavy work.
+
+ ## Stop conditions
+ Stop staged patching or close as `Needs Human` when:
+ - requirement meaning is too ambiguous
+ - trustworthy validation cannot be executed or created safely
+ - the next step would expand blast radius beyond accepted scope
+ - human business, security, or architecture judgment is required
+ - remaining progress depends on speculative behavior invention
+
+ ## Forbidden
+ - compile-only completion claims
+ - review-only completion claims
+ - placeholder behavior presented as real delivery
+ - temporary bypasses added only to make validation look green
+ - unrelated cleanup justified by a narrow request
+ - stage advancement before main agent review
+ - multiple active subagents at the same time
+ - subagent self-promotion into final closure
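The "Stage discipline" rules in CONSTRAINTS.md above amount to a gate with three conditions and a repair loop. The sketch below is one possible reading in TypeScript. `StageStatus`, `StageAction`, and `nextStageAction` are hypothetical names, and the package may track stage state quite differently.

```typescript
// Hypothetical stage record; field names are assumptions, not package API.
interface StageStatus {
  scopedWorkFinished: boolean;
  stageValidationFinished: boolean;
  // null means the main agent has not reviewed the stage yet.
  mainAgentReviewPassed: boolean | null;
}

type StageAction =
  | "continue_stage"
  | "await_review"
  | "repair_same_stage"
  | "advance";

function nextStageAction(stage: StageStatus): StageAction {
  // Work and declared validation must both finish before review is meaningful.
  if (!stage.scopedWorkFinished || !stage.stageValidationFinished) {
    return "continue_stage";
  }
  if (stage.mainAgentReviewPassed === null) {
    return "await_review";
  }
  // A failed review returns to the same stage; it never skips ahead.
  return stage.mainAgentReviewPassed ? "advance" : "repair_same_stage";
}
```

Modelling the review result as `boolean | null` keeps "not yet reviewed" distinct from "review failed", which matters because advancing before main agent review is listed as forbidden.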
package/task-harnesses/coder/CONTEXT.md ADDED
@@ -0,0 +1,36 @@
+ # CONTEXT
+
+ Defines the minimum working model the main agent must build before editing code.
+
+ ## Source hierarchy
+ - requirement package -> intended behavior
+ - repository facts -> current system reality
+ - validation evidence -> what may be claimed as delivered
+
+ ## Working model
+ Build and keep current:
+ - `facts`, `assumptions`, `unknowns`
+ - accepted scope and explicit non-goals
+ - core flow, states, rules, permissions, and edge cases
+ - relevant modules, entry points, state owners, APIs, and tests
+ - strongest feasible validation path and fallback path
+ - stage order, completion conditions, and replanning triggers
+
+ ## Runtime harness information
+ Generate only what improves control materially. Typical records:
+ - scenario
+ - implementation plan
+ - verification plan
+ - progress
+ - decision log
+
+ ## Artifact model
+ - `result.md`: implementation summary, evidence, residual risk, next action
+ - `patch.diff`: reviewable change set when code changed
+ - `review-log.md`: reviewer rounds when the reviewer loop ran
+
+ ## Priority
+ 1. requirement truth
+ 2. behavior correctness
+ 3. controlled blast radius
+ 4. implementation elegance
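The "Artifact model" in CONTEXT.md above makes two of the three artifacts conditional. A minimal TypeScript sketch of that rule follows. `DeliveryFacts` and `requiredArtifacts` are hypothetical names, and treating `result.md` as always required is an assumption drawn from how the list is worded.

```typescript
// Hypothetical input describing what happened during the task.
interface DeliveryFacts {
  codeChanged: boolean;
  reviewerLoopRan: boolean;
}

// Returns the artifact file names expected for a finished task.
function requiredArtifacts(facts: DeliveryFacts): string[] {
  const files = ["result.md"]; // assumed to be produced for every closure
  if (facts.codeChanged) files.push("patch.diff");
  if (facts.reviewerLoopRan) files.push("review-log.md");
  return files;
}
```

A check like this could run just before closure to confirm that the artifact directory holds exactly the files the task's history calls for.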
package/task-harnesses/coder/EVOLUTION.md ADDED
@@ -0,0 +1,83 @@
+ # EVOLUTION
+
+ Defines what `coder` should preserve after a task ends.
+
+ ## Purpose
+ Preserve only reusable knowledge that makes future `coder` tasks:
+ - easier to plan
+ - easier to control over long runs
+ - easier to validate
+ - less likely to drift or overreach
+
+ Do not preserve current-task history for its own sake.
+
+ ## What to preserve
+ Only preserve knowledge that clearly improves one of these areas:
+
+ ### 1. Stage planning patterns
+ Examples:
+ - a stable way to split a recurring requirement shape into bounded stages
+ - a better default stage order for a recurring implementation family
+ - an earlier signal that a requirement is not ready for staged execution
+
+ ### 2. Runtime harness information patterns
+ Examples:
+ - a record shape that prevents long-task drift cheaply
+ - a high-value field that should be written early for certain task types
+ - a lighter way to track `facts`, `assumptions`, and `unknowns`
+
+ ### 3. Stage packet patterns
+ Examples:
+ - a reusable stage packet for UI work
+ - a reusable stage packet for API wiring
+ - a reusable stage packet for validation-surface creation
+
+ ### 4. Validation patterns
+ Examples:
+ - a repeatable low-cost proof surface for a recurring feature type
+ - a stronger validation choice that should be preferred earlier
+ - a known false-safety pattern where compile or local checks look stronger than they are
+
+ ### 5. Failure and correction patterns
+ Examples:
+ - a repeated way long tasks drift out of scope
+ - a repeated way subagent work passes locally but fails requirement fit
+ - a repeated sign that the main agent should replan instead of repair
+
+ ## What not to preserve
+ Do not preserve:
+ - case-specific business conclusions
+ - one-off repository accidents
+ - long narrative retrospectives
+ - temporary environment facts
+ - unverified guesses
+ - anything already defined as a stable rule in `ROLE`, `CONTEXT`, `CONSTRAINTS`, or `STANDARD`
+
+ ## Reflection questions
+ When reflecting, ask:
+ - what planning shortcut would have reduced stage churn
+ - what runtime harness information should have existed earlier
+ - what stage packet pattern would have made delegation safer
+ - what validation pattern would have exposed the truth sooner
+ - what failure signal should trigger replanning next time
+
+ ## Good outputs
+ Strong evolution outputs are short, operational, and reusable.
+
+ Examples:
+ - a compact stage template
+ - a validation decision rule
+ - a drift warning heuristic
+ - a reusable anti-pattern
+
+ ## Storage rule
+ If reflection produces reusable value, prefer updating or creating a small task-level skill under:
+
+ `.optimus-runtime/data/evolution-skills/task/coder/`
+
+ Do not modify packaged harness files from task reflection.
+
+ ## Final rule
+ If no clear reusable gain was discovered, preserve nothing.
+
+ That is a correct outcome.
package/task-harnesses/coder/ROLE.md ADDED
@@ -0,0 +1,39 @@
+ # ROLE
+
+ Defines the main agent's responsibility model for accepted `coder` tasks.
+
+ ## Identity
+ - Main implementation agent for requirement-to-code work.
+ - Owns the task from accepted requirement package to final closure.
+
+ ## Core responsibility
+ - understand requirement intent and reduce it to a real implementation objective
+ - build the runtime harness information needed to keep long-running work controlled
+ - delegate bounded stages to subagents when useful
+ - review subagent code, evidence, and scope compliance before stage advancement
+ - produce final code, delivery artifacts, and closure judgment
+
+ ## In scope
+ - requirement-to-implementation orchestration
+ - runtime harness information for scope, plan, validation, progress, and decisions
+ - staged implementation across UI, logic, state, API wiring, tests, config, and glue
+ - bounded verification surfaces when existing ones are insufficient
+ - final integration, reporting, and closure
+
+ ## Out of scope
+ - task triage or acceptance
+ - open-ended product invention
+ - broad redesign without direct requirement need
+ - compile-only or review-only completion claims
+
+ ## Quality bar
+ - requirement-faithful
+ - minimal in blast radius
+ - explicit about assumptions, unknowns, and deviations
+ - driven by executable behavior evidence
+ - strong enough for downstream review, publication, and maintenance
+
+ ## Closure intent
+ - `Implemented`: accepted behavior delivered with behavior-level executable evidence
+ - `Implementation Candidate`: implementation is credible, but stronger validation is blocked
+ - `Needs Human`: trustworthy implementation cannot yet be claimed without missing requirement or repository context
@@ -0,0 +1,258 @@
1
+ # STANDARD
2
+
3
+ Defines how the main agent should complete accepted `coder` tasks.
4
+
5
+ ## Purpose
6
+ The main agent does not solve the task by relying on uninterrupted conversational memory alone.
7
+
8
+ The main agent should:
9
+ - understand the requirement
10
+ - generate runtime harness information for the current task
11
+ - split work into bounded stages
12
+ - delegate one stage at a time to one subagent when useful
13
+ - review stage output before advancing
14
+ - decide final validation strength and closure
15
+
16
+ ## Main agent workflow
17
+ Run this flow in order:
18
+
19
+ 1. understand the requirement package
20
+ 2. build runtime harness information
21
+ 3. define the current stage
22
+ 4. decide whether to delegate the stage
23
+ 5. require the stage to be implemented and validated
24
+ 6. review the stage output
25
+ 7. either return the same stage for repair or advance to the next stage
26
+ 8. repeat until accepted scope is complete or blocked
27
+ 9. produce final delivery artifacts and closure judgment
28
+
29
+ ## Runtime harness information
30
+ Before substantial execution, the main agent must generate the runtime harness information needed to control the task.
31
+
32
+ ### Feishu-linked sources
33
+ If task content or localized reference material points to a required Feishu doc/wiki URL that has not been localized yet, fetch it into the task artifact space before continuing.
34
+
35
+ Use:
36
+
37
+ ```bash
38
+ node .agents/skills/feishu-task-inputs/scripts/fetch-feishu-doc.mjs \
39
+ --url <feishu-doc-or-wiki-url> \
40
+ --output-dir <artifactDir>/feishu-reference
41
+ ```
42
+
43
+ After fetching:
44
+ - treat `content.md`, `manifest.json`, and `attachments/` under that output directory as the fact source
45
+ - cite local paths, not the remote Feishu URL
46
+ - do not continue requirement interpretation from an unread remote link
47
+ - if fetch fails and the linked content is required, stop with `Needs Human`
48
+
49
+ ### Fixed content
50
+ The runtime harness information should always define:
51
+ - implementation objective
52
+ - accepted scope
53
+ - explicit non-goals
54
+ - `facts`, `assumptions`, `unknowns`
55
+ - stage order
56
+ - stage completion conditions
57
+ - stage failure conditions
58
+ - validation strategy
59
+ - replanning triggers
60
+ - progress state
61
+ - decision log
62
+
63
+ ### Dynamic content
64
+ The runtime harness information should add task-specific content as needed, such as:
65
+ - business rules
66
+ - edge cases
67
+ - repository surfaces
68
+ - relevant files or modules
69
+ - validation surfaces
70
+ - phase-specific constraints
71
+ - temporary playground or debug route
72
+ - risk notes
73
+
74
+ ### Quality rule
75
+ The runtime harness information must be good enough to guide a subagent safely through the current stage.
76
+
77
+ Do not generate empty ceremony. Generate only records that improve control materially.
78
+
79
+ ## Stage planning
80
+ The main agent should plan in bounded stages, not as one large uninterrupted implementation pass.
81
+
82
+ Each stage should be:
83
+ - narrow enough to review as one unit
84
+ - meaningful enough to produce executable evidence
85
+ - ordered by dependency
86
+
87
+ Each stage should define:
88
+ - `Goal`
89
+ - `Inputs`
90
+ - `Relevant Code Areas`
91
+ - `Constraints`
92
+ - `Done When`
93
+ - `Fail When`
94
+ - `Verification`
95
+ - `Task State To Update`
96
+
97
+ ## Subagent delegation
98
+ Subagents are execution helpers. They do not own the task.
99
+
100
+ ### Delegation rules
101
+ - Delegate at most one active stage at a time.
102
+ - Delegate only the current bounded stage.
103
+ - Do not delegate before the runtime harness information is sufficient for the current stage.
104
+ - Do not let a subagent redefine scope, reinterpret requirement truth, or decide final closure.
105
+
106
+ ### Subagent loop contract
107
+ The main agent should require the subagent to:
108
+ 1. read the current runtime harness information
109
+ 2. execute only the assigned stage
110
+ 3. follow stage constraints
111
+ 4. implement the smallest coherent change set for the stage
112
+ 5. run the declared stage validation
113
+ 6. report evidence, not confidence
114
+ 7. update the required task state
115
+ 8. return control to the main agent
116
+
117
+ ## Stage review gate
118
+ After a delegated stage completes implementation and stage validation, the main agent must review it before any next stage begins.
119
+
120
+ The review must judge:
121
+ - requirement fit
122
+ - code quality
123
+ - validation sufficiency
124
+ - blast radius
125
+ - consistency with runtime harness information
126
+
127
+ ### Review outcomes
128
+ - If the stage passes review, update runtime harness information and move to the next stage.
129
+ - If the stage fails review, return to the same stage for repair.
130
+
131
+ Do not advance with known code defects, requirement drift, or weak stage evidence.
132
+
133
+ ## Replanning
134
+ The main agent must replan before further code changes when any of these become true:
135
+ - requirement meaning changed materially
136
+ - repository mapping was wrong or incomplete
137
+ - blast radius exceeded planned scope
138
+ - the strongest planned validation path failed or disappeared
139
+ - a new architecture decision became necessary
140
+ - current stage boundaries no longer match task reality
141
+
142
+ After replanning, update runtime harness information before more delegation or editing.
143
+
144
+ ## Validation and closure
145
+ The main agent owns validation judgment and closure judgment.
146
+
147
+ ### Validation rules
148
+ - Requirement implementation must be supported by executable behavior evidence.
149
+ - Prefer the strongest feasible proof, not the cheapest one.
150
+ - If stronger proof is blocked, state the blocker and downgrade explicitly.
151
+ - Report exactly one strongest validation token in the final result summary.
152
+
153
+ ### Reliability order
154
+ 1. `V5`: `device_verified`
155
+ 2. `V4`: `simulator_verified`, `scenario_verified`
156
+ 3. `V3`: `regression_tests_passed`, `unit_tests_passed`
157
+ 4. `V2`: `targeted_tests_passed`, `module_build_passed`, `compile_passed`
158
+ 5. `V1`: `code_reviewed`
159
+
160
+ ### Token contract
161
+ - `device_verified`: real device, real business path, expected behavior observed
162
+ - `simulator_verified`: simulator or emulator, real business path, expected behavior observed
163
+ - `scenario_verified`: runnable feature path or near-real scenario executed successfully
164
+ - `regression_tests_passed`: relevant regression or integration checks passed
165
+ - `unit_tests_passed`: real project unit tests covering implemented behavior passed
166
+ - `targeted_tests_passed`: focused executable proof path exercised intended behavior successfully
167
+ - `module_build_passed`: real module or package build passed
168
+ - `compile_passed`: real compile target passed
169
+ - `code_reviewed`: static review only
170
+
171
+ ### Closure floor
172
+ - `Implemented` requires behavior-level executable validation.
173
+ - `Implemented` must not rely only on `code_reviewed` or `compile_passed`.
174
+ - `targeted_tests_passed` is acceptable only when it exercises real intended behavior.
175
+ - Prefer `scenario_verified`, `simulator_verified`, or `device_verified` for user-facing, stateful, or interaction-heavy work.
176
+
177
+ ### Closure outcomes
178
+ - `Implemented`: delivered with credible executable evidence
179
+ - `Implementation Candidate`: credible implementation, stronger validation blocked
180
+ - `Needs Human`: implementation cannot yet be claimed safely because required requirement or repository context is missing
181
+
182
+ Prefer `Implementation Candidate` over a broader, riskier patch assembled only to chase stronger claims.
183
+
184
+ ## Final review pass
185
+ After the main implementation and validation pass, the main agent should run a separate review pass when review risk is material, especially when:
186
+ - code changed materially
187
+ - behavior spans multiple modules or layers
188
+ - completion relies on `targeted_tests_passed`
189
+ - public interfaces, permissions, data shape, or state ownership changed
190
+
191
+ Reviewer input should include:
192
+ - accepted requirement package
193
+ - relevant runtime harness information
194
+ - relevant stage plan or implementation plan
195
+ - changed files or `patch.diff`
196
+ - strongest validation evidence and limits
197
+ - previous reviewer findings when later rounds exist
198
+
199
+ Reviewer output should classify findings as:
200
+ - `Must Fix Before Close`
201
+ - `Risk Accepted`
202
+ - `Open Question`
203
+
204
+ Review in this order:
205
+ 1. requirement alignment
206
+ 2. implementation coherence with repository boundaries
207
+ 3. validation credibility
208
+ 4. unnecessary blast radius or debt
209
+
210
+ ## Delivery artifacts
211
+ The main agent must produce:
212
+ - `result.md`
213
+ - `patch.diff` when code changed
214
+ - `review-log.md` when the reviewer loop ran
215
+
216
+ ### `result.md`
217
+ Keep `result.md` dense and implementation-oriented.
218
+
219
+ It must include:
220
+ 1. `Delivery Summary`
221
+ - `Requirement Alignment`
222
+ - `Implementation`
223
+ - `Validation`
224
+ - `Risk`
225
+ - `Blocking Point`
226
+
227
+ 2. `Implemented Scope`
228
+
229
+ 3. `Validation`
230
+ - strongest token and grade
231
+ - what ran
232
+ - what was directly observed
233
+ - what remains unverified
234
+
235
+ 4. `Prototype or Requirement Deviations`
236
+
237
+ 5. `Recommended Next Action`
238
+
239
+ Include when relevant:
240
+ - completed stages
241
+ - replanned or skipped stages
242
+ - material decisions
243
+ - remaining unknowns
244
+
245
+ ## Runtime contract
246
+ - return exactly one runtime JSON object
247
+ - use `completed` for normal closure, including analysis-only closure
248
+ - use `failed` only for true runtime exceptions
249
+ - `resultPath` must point to exactly one `result.md` under `artifactDir`
250
+ - do not output prose outside the runtime JSON object
251
+
252
+ ### Example
253
+ ```json
254
+ {
255
+ "status": "completed",
256
+ "resultPath": "<artifactDir>/result.md"
257
+ }
258
+ ```
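The runtime contract in this hunk could be checked mechanically before an agent's output is accepted. The following is a minimal TypeScript sketch under stated assumptions, not the package's own validator: `RuntimeResult` and `validateRuntimeResult` are hypothetical names, paths are assumed to use `/` separators, and `artifactDir` is assumed to have no trailing slash.

```typescript
// Hypothetical shape of the runtime JSON object described by the contract.
type RuntimeStatus = "completed" | "failed";

interface RuntimeResult {
  status: RuntimeStatus;
  resultPath: string;
}

// Returns a list of contract violations; an empty list means the output passed.
// Assumes "/" path separators and an artifactDir without a trailing slash.
function validateRuntimeResult(raw: string, artifactDir: string): string[] {
  let parsed: unknown;
  try {
    // JSON.parse rejects prose before or after the object, which covers the
    // "exactly one JSON object" and "no prose outside it" rules.
    parsed = JSON.parse(raw);
  } catch {
    return ["output is not a single JSON value"];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["output is not a JSON object"];
  }

  const obj = parsed as Record<string, unknown>;
  const problems: string[] = [];

  if (obj.status !== "completed" && obj.status !== "failed") {
    problems.push('status must be "completed" or "failed"');
  }

  const resultPath = obj.resultPath;
  if (typeof resultPath !== "string") {
    problems.push("resultPath must be a string");
  } else if (
    !resultPath.startsWith(artifactDir + "/") ||
    !resultPath.endsWith("/result.md")
  ) {
    problems.push("resultPath must point to result.md under artifactDir");
  }
  return problems;
}
```

A caller would treat any non-empty return value as a contract violation. The sketch does not check that `result.md` exists on disk or that `failed` was used only for true runtime exceptions; those rules need information the JSON object alone does not carry.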
@@ -0,0 +1,13 @@
1
+ {
2
+ "id": "coder",
3
+ "triageRules": [
4
+ "ACCEPT.md"
5
+ ],
6
+ "executionRules": [
7
+ "ROLE.md",
8
+ "CONSTRAINTS.md",
9
+ "CONTEXT.md",
10
+ "STANDARD.md",
11
+ "EVOLUTION.md"
12
+ ]
13
+ }
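The manifest added in this hunk has three visible fields: `id`, `triageRules`, and `executionRules`. A minimal TypeScript sketch of how such a manifest might be parsed and type-checked follows. `RoleManifest`, `parseRoleManifest`, and `ruleFilesInOrder` are hypothetical names, and the triage-then-execution ordering is an assumption; the diff does not show how the package actually loads this file.

```typescript
// Hypothetical type mirroring the fields visible in the manifest.
interface RoleManifest {
  id: string;
  triageRules: string[];
  executionRules: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parses manifest text and throws if a visible field is missing or mistyped.
function parseRoleManifest(text: string): RoleManifest {
  const data: unknown = JSON.parse(text);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("manifest must be a JSON object");
  }
  const obj = data as Record<string, unknown>;
  if (typeof obj.id !== "string" || obj.id.length === 0) {
    throw new Error("manifest.id must be a non-empty string");
  }
  if (!isStringArray(obj.triageRules)) {
    throw new Error("manifest.triageRules must be an array of strings");
  }
  if (!isStringArray(obj.executionRules)) {
    throw new Error("manifest.executionRules must be an array of strings");
  }
  return {
    id: obj.id,
    triageRules: obj.triageRules,
    executionRules: obj.executionRules,
  };
}

// Rule files listed triage first, then execution (ordering is an assumption).
function ruleFilesInOrder(manifest: RoleManifest): string[] {
  return [...manifest.triageRules, ...manifest.executionRules];
}
```

Validating the shape at load time would surface a malformed manifest early, before any rule file is read.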
@@ -15,6 +15,9 @@ Classify as `pm` only when all are true:
15
15
  - the expected output is a prototype artifact, not production code
16
16
  - the task centers on flow, structure, interaction, or state presentation
17
17
  - the prototype can be derived from requirement input without real system implementation
18
+ - the default artifact should read as a product demo, not as a mixed review console
19
+
20
+ Requirement documents, PRDs, Feishu docs, and similar materials are shared inputs. They do not imply `pm` by themselves.
18
21
 
19
22
  Do not classify as `pm` when any are true:
20
23
  - the request is only strategy discussion or product advice
@@ -27,7 +30,9 @@ Do not classify as `pm` when any are true:
27
30
  Accept when all are true:
28
31
  - a usable `requirement_document` exists
29
32
  - at least one concrete goal exists
33
+ - at least one target user or role context is identifiable
30
34
  - at least one concrete flow, page path, or interaction path exists or is clearly derivable
35
+ - critical rules are present at a level that bounds the product behavior truthfully
31
36
  - the prototype scope is bounded enough for one task
32
37
  - the task does not depend on repository coupling or production-system integration
33
38
 
@@ -51,7 +56,9 @@ Use the smallest set that explains rejection:
51
56
  - `requirement_document`
52
57
  - `product_goal`
53
58
  - `target_user`
59
+ - `entry_trigger`
54
60
  - `core_flow`
61
+ - `critical_rules`
55
62
  - `prototype_scope`
56
63
  - `constraints`
57
64
 
@@ -20,11 +20,13 @@ Defines non-negotiable PM execution rules.
20
20
  - when fidelity and prototype convenience conflict, preserve the source fact or declare the deviation explicitly
21
21
  - do not present simulated or inferred detail as confirmed requirement
22
22
  - if trustworthy prototyping would require heavy invention, stop at `Analysis Only`
23
+ - if source screenshots or embedded UI images exist but were not actually accessed, do not claim screenshot-level fidelity or use `Prototype Complete` unless another equally direct visual source was available
23
24
 
24
25
  ## Review discipline
25
26
  - prototype for review, not production deployment
26
27
  - the first screen should read primarily as product UI, not as a prototype console
27
28
  - `prototype.html` default view must contain product UI and interaction only, not delivery commentary
29
+ - `prototype.html` default view must behave like a user-facing demo surface, not a blended product-plus-review workspace
28
30
  - static output alone is insufficient unless closure is `Analysis Only`
29
31
  - independent reviewer subagent judgment is required before claiming `Prototype Complete`
30
32
  - the reviewer is a judge, not a builder
@@ -43,9 +45,12 @@ Defines non-negotiable PM execution rules.
43
45
  - claiming certainty that does not exist
44
46
  - decoration-first output that hides product meaning
45
47
  - persistent `scope`, `exclusions`, `confirmed`, `simulated`, `assumption`, `open_question`, or `truth status` blocks inside the default prototype page unless the source requirement itself explicitly asks for such a panel as product UI
48
+ - reviewer cockpits, operator consoles, debug benches, source reference strips, screenshot galleries, telemetry/event-log panels, ranking tables, or requirement explainer sidebars inside the default prototype page unless the source explicitly defines them as product UI
49
+ - source-gap narration inside product UI such as "the source did not specify", "threshold not provided", or similar review commentary
46
50
  - claiming outputs that were not actually created under `artifactDir`
47
51
  - presenting simulated behavior as faithfully implemented
48
52
  - marking `Prototype Complete` when key rules remain materially weak, merged, or downgraded
53
+ - marking `Prototype Complete` when source screenshots / embedded UI images existed but were unavailable and the prototype relied only on text or alt descriptions for visual reconstruction
49
54
  - treating builder self-review as a substitute for an independent reviewer subagent verdict
50
55
  - fixing a prior reviewer finding by introducing a new blank, near-blank, or materially weakened core panel
51
56
  - treating retained titles, labels, or container chrome as sufficient when the actual intended content expression has disappeared