npm - @sireai/optimus - Versions diffs - 0.1.43 → 0.1.45 - Mend

@sireai/optimus 0.1.43 → 0.1.45

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (28)

package/task-harnesses/pm/STANDARD.md CHANGED

@@ -2,6 +2,10 @@
 Defines the mandatory PM execution flow, artifact contract, and delivery standard.
+### Required outputs
+- `prototype.html`: the main review artifact
+- `result.md`: dense rule supplement for downstream implementation
 ## Execution sequence
 Complete work in this order:
@@ -12,25 +16,14 @@ Complete work in this order:
 2. `Map`
 - build a requirement map before designing screens or writing HTML
-- include:
-  - product goal
-  - target user
-  - bounded scope
-  - entry point
-  - core flow
-  - key states
-  - critical rules
-  - explicit non-goals
-  - open questions
+- include: product goal, target user, bounded scope, entry point, core flow, key states, critical rules, explicit non-goals, open questions
 3. `Plan Representation`
 - for each critical rule, choose exactly one:
   - `Represented Interactively`
-  - `Represented via Annotation`
   - `Downgraded / Simulated`
   - `Not Represented`
 - use `Represented Interactively` only when the prototype will contain direct behavioral evidence
-- use `Represented via Annotation` when the rule is review-critical but not faithfully expressible through lightweight interaction
 4. `Frame`
 - define the minimum screen and state set that makes the accepted scope reviewable
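
Reader's sketch, not part of the package: with `Represented via Annotation` removed, the `Plan Representation` step leaves three modes, and each critical rule must carry exactly one of them. The check below illustrates that constraint under assumptions: the plan is held as a mapping from rule name to mode label, and `Representation` and `validate_plan` are hypothetical names introduced here.

```python
from enum import Enum


class Representation(Enum):
    """The three modes that remain in 0.1.45."""
    INTERACTIVE = "Represented Interactively"
    SIMULATED = "Downgraded / Simulated"
    NOT_REPRESENTED = "Not Represented"


def validate_plan(critical_rules, plan):
    """Return a list of problems; an empty list means every rule has one valid mode.

    `critical_rules` is a list of rule names and `plan` maps rule name -> mode label.
    Both shapes are assumptions made for this sketch.
    """
    allowed = {mode.value for mode in Representation}
    problems = []
    for rule in critical_rules:
        mode = plan.get(rule)
        if mode is None:
            problems.append(f"{rule}: no representation mode chosen")
        elif mode not in allowed:
            problems.append(f"{rule}: unknown mode {mode!r}")
    return problems
```

A plan written against 0.1.43 that still uses `Represented via Annotation` would be reported as an unknown mode by this check.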
@@ -40,27 +33,15 @@ Complete work in this order:
 5. `Build`
 - produce one reviewable `prototype.html`
 - make the main path, major states, and key transitions inspectable
-- add anchored review annotations where interaction is insufficient
+- keep `prototype.html` focused on product interaction; move scope, exclusion, truth-status, simulation, and open-question commentary into `result.md` unless the source explicitly defines them as in-product UI
 6. `Review`
 - explicitly spawn one reviewer subagent after the first build
-- reviewer input must include:
-  - source requirement document
-  - source URL when available
-  - accepted scope
-  - requirement map
-  - representation plan
-  - `prototype.html` when present
-  - `result.md`
-  - `annotations.json` when present
-  - previous review findings and builder revisions when this is round 2 or 3
-  - current round number and maximum round count
+- reviewer input must include the source requirement document, source URL when available, accepted scope, requirement map, representation plan, `prototype.html` when present, `result.md`, current round number, and maximum round count
+- include previous reviewer findings and builder revisions in rounds 2 and 3
 - reviewer must judge against the reviewer rubric defined below
-- record one review round log entry containing:
-  - round number
-  - reviewer verdict
-  - key gaps
-  - recommended revisions or explicit approval
+- record one review round log entry with round number, reviewer verdict, key gaps, and recommended revisions or explicit approval
+- if reviewer subagent invocation fails, log the failure, skip the reviewer loop for this run, and continue normal artifact finalization; do not hang waiting for the reviewer path to recover
 7. `Revise`
 - if the reviewer finds material gaps, revise the prototype and rerun the reviewer subagent
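
Reader's sketch, not part of the package: the `Review` and `Revise` steps above describe a bounded loop with a failure fallback added in 0.1.45 (log the failure, skip the reviewer loop, continue finalization). One way this could look, assuming a maximum of three rounds as the standard's references to rounds 2 and 3 suggest. `run_review_loop`, `invoke_reviewer`, `revise`, and the list-based log are hypothetical stand-ins.

```python
MAX_ROUNDS = 3  # the standard mentions rounds 2 and 3, so three rounds at most


def run_review_loop(invoke_reviewer, revise, log):
    """Run up to MAX_ROUNDS review rounds; return the last verdict, or None if skipped.

    `invoke_reviewer(round_number, previous)` returns a dict with a "verdict" key
    and may raise; `revise(review)` applies the builder's changes; `log` is a list
    standing in for review-log.md. All three are assumptions of this sketch.
    """
    previous = None
    for round_number in range(1, MAX_ROUNDS + 1):
        try:
            review = invoke_reviewer(round_number, previous)
        except Exception as exc:
            # Reviewer path failed: record it and stop reviewing instead of waiting.
            log.append(
                f"Round {round_number}: reviewer invocation failed ({exc}); review loop skipped"
            )
            return None
        log.append(f"Round {round_number}: {review['verdict']}")
        if review["verdict"] != "Must Fix Before Complete":
            return review["verdict"]
        if round_number < MAX_ROUNDS:
            revise(review)  # the builder reads the latest findings before revising
        previous = review
    # Material gaps remain after the last round: the caller downgrades closure.
    return previous["verdict"]
```

Returning `None` on failure lets the caller proceed to artifact finalization without a reviewer verdict, which matches the "do not hang" rule.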
@@ -70,74 +51,30 @@ Complete work in this order:
   - another revise-and-review pass is unlikely to improve trustworthiness materially
 - if the final reviewer verdict still finds material gaps after the maximum rounds, downgrade closure instead of looping further
 - the builder must read the latest reviewer output before revising
-- each new review round must include:
-  - the previous reviewer findings
-  - what the builder changed
-  - what the builder intentionally did not change and why
+- each new review round must include the previous reviewer findings, what changed, and what was intentionally not changed and why
 8. `Deliver`
 - generate `result.md`
-- generate `annotations.json` whenever a prototype exists
 - generate `review-log.md` whenever the reviewer loop ran
 - declare deviations, simulations, and unresolved implementation-critical points precisely
 If execution stops early, explain which step could not be completed and why.
-## Review loop standard
-- the reviewer subagent must be explicitly requested; do not assume it will appear automatically
-- the reviewer must not rewrite the prototype directly; it judges and recommends
-- the builder applies revisions and owns the updated artifacts
-- provide the reviewer with facts first, not builder conclusions first
-- the reviewer should treat the source requirement and produced artifacts as primary evidence; builder framing is supporting context only
-- reviewer findings do not justify unlimited scope growth, denser chrome, or extra speculative screens by default
-- each review round should classify every material finding as one of:
+## Reviewer protocol
+- the reviewer subagent must be explicitly requested; it is a judge, not a builder
+- provide facts first: source requirement, source URL when available, accepted scope, produced artifacts, then builder framing
+- required reviewer inputs: requirement document, source URL when available, accepted scope, requirement map, representation plan, `prototype.html` when present, `result.md`, current round number, maximum rounds, and prior findings/revisions for rounds 2 and 3
+- reviewer findings do not justify unlimited scope growth, denser chrome, or speculative screens by default
+- each review round must classify material findings as:
   - `Must Fix Before Complete`
   - `Can Ship As Partial`
   - `Open Question`
-- examples of `Must Fix Before Complete`:
-  - missing core flow coverage
-  - important requirement content is omitted, misunderstood, or inconsistent with the source document
-  - missing or materially wrong key rule representation
-  - deeper business or interaction logic is only described textually and not demonstrated clearly enough through the prototype
-  - explicit source fact drift that changes review interpretation
-  - simulated behavior presented as if it were faithful interaction
-  - annotation layer and `annotations.json` disagree on important meaning
-  - the artifact set is too weak to guide downstream AI coding without major guessing about rules, state behavior, or implementation intent
-  - UI quality is materially below design-review expectations, especially when the requirement document already contains screenshots, diagrams, or strong structural visual cues that were not meaningfully used
-- examples of `Can Ship As Partial`:
-  - useful prototype exists, but some non-core states remain merged or downgraded
-  - some important but secondary rules remain annotation-only or simulated
-  - the core design direction is usable, but parts of the visual system or interaction polish still fall below designer-grade expectations
-- examples of `Open Question`:
-  - requirement ambiguity prevents stronger representation without invention
-## Reviewer input contract
-Each reviewer round should receive these inputs in this priority order:
-1. `Source facts`
-- requirement document
-- source URL when available
-- explicit accepted scope
-2. `Produced artifacts`
-- `prototype.html` when present
-- `result.md`
-- `annotations.json` when present
-3. `Builder framing`
-- requirement map
-- representation plan
-- declared deviations, simulations, and unresolved points
-4. `Round context`
-- current round number
-- maximum rounds: `3`
-- previous reviewer findings and builder revisions for rounds 2 and 3
-## Reviewer output contract
-Each reviewer round must produce a compact review record that the builder can consume directly.
-Minimum fields:
+- `Must Fix Before Complete` covers missing or wrong core flow/rule coverage, source inconsistency, weak logic demonstration, misleading simulation, artifact weakness that blocks downstream AI coding, or materially subpar UI fidelity against available source cues
+- `Can Ship As Partial` covers useful but incomplete review artifacts: merged non-core states, simulated secondary rules, or design quality gaps that do not destroy core reviewability
+- `Open Question` covers requirement ambiguity that cannot be resolved without invention
+## Reviewer record
+Each review round must produce a compact record with:
 - `Round`
 - `Reviewer Verdict`
 - `Coverage Assessment`
@@ -156,60 +93,48 @@ Allowed `Reviewer Verdict` values:
 - `Can Ship As Partial`
 - `Approved for Prototype Complete`
-## Builder handoff contract
-After each review round, the builder must:
-- read the full reviewer output
-- revise the prototype and companion artifacts when `Must Fix Before Complete` findings exist
-- preserve unresolved but non-blocking findings inside `result.md` when closure remains partial
-- run a full core-panel self-check before requesting the next review, not only a point fix for the previous `Must Fix`
-- prefer downgrade over another revise round when the next change would require speculative product invention, materially larger scope, or a noisier prototype that reduces reviewability
-- write a concise round handoff into `review-log.md` containing:
-  - what was changed
-  - what was intentionally left unchanged
-  - why the next review should or should not still expect a gap
-## Reviewer rubric
-The reviewer must evaluate the prototype against these principles in order:
+## Review judgment order
+The reviewer must judge in this order:
 1. `Requirement completeness and correctness`
-- the interactive prototype must reflect the requirement document fully enough for the accepted scope
-- important requirement content must not be missing, misunderstood, or inconsistent
+- accepted-scope content must be present, correctly understood, and consistent with the source
 2. `Logic demonstrability`
-- deeper logic chains, dependencies, and cause-effect behavior must be demonstrated clearly
-- the prototype must not stop at first-layer description when the requirement meaning depends on richer interaction logic
+- deeper dependencies and cause-effect behavior must be demonstrated, not left at first-layer description
 3. `Representation correctness`
-- interaction, annotation, simulation, and omission must each be used deliberately and truthfully
-- if a rule needs to be understood for review, the prototype must either demonstrate it or annotate it clearly
-- avoid reviewer-driven overbuilding: do not add extra flows, fake data breadth, or decorative complexity unless they directly improve requirement understanding
+- interaction, simulation, and omission must be deliberate and truthful
+- if a rule cannot be shown faithfully, declare the gap in `result.md`
+- avoid reviewer-driven overbuilding that adds decorative or speculative complexity without improving requirement understanding
 4. `Implementation readiness for AI coding`
-- the artifact set must give downstream AI coding enough rule context, implementation guardrails, and annotation coverage to avoid major guessing
-- `result.md` and `annotations.json` must supplement the prototype where direct interaction is insufficient
-5. `UI design quality and visual fidelity`
-- the prototype should reach designer-grade quality for its intended fidelity level
-- visual style, layout cues, screenshots, diagrams, and structural hints from the requirement document should be used as first-priority guidance when present
-- avoid generic or template-looking UI when the source document provides stronger visual direction
-- every core panel must remain visibly populated and reviewable after each revision; do not accept a panel that retains only a title, labels, or empty chrome while its intended information carrier disappears
-- for `Represented Interactively` modules, visible content must remain materially present:
-  - charts must still show visible bars, lines, points, or equivalent graphical carriers rather than only captions or axes
-  - explanatory panels such as scope / exclusion / guardrail areas must retain enough information density to communicate their product meaning, not collapse into thin placeholder copy
-- if a revision fixes one issue but causes another panel to become visually blank, materially thinner, or harder to interpret, treat it as a new blocking regression
-6. `Artifact consistency`
-- `prototype.html`, `result.md`, and `annotations.json` must agree on key rules, limitations, and truth status
-7. `Regression control after revision`
-- every later review round must re-check the whole accepted prototype surface, not only the previous `Must Fix` list
-- the reviewer must explicitly check for:
-  - newly blank or near-blank core panels
-  - graph containers whose DOM exists but whose intended visual content is no longer meaningfully visible
-  - explanatory regions whose copy became materially thinner or lost review-critical meaning
-  - regressions where a previously acceptable panel becomes weaker after a later fix
-- if such regressions appear, record them under `New Regression Since Previous Round` and classify them as `Must Fix Before Complete` when they reduce reviewability materially
-- if a proposed follow-up revision would mainly add complexity, speculative branches, or visual clutter without improving requirement truth, prefer `Prototype Partial` over further churn
+- the artifact set must give downstream AI coding enough rule context and guardrails to avoid major guessing
+- `result.md` must supplement the prototype where interaction alone is insufficient
+5. `UI fidelity and panel visibility`
+- use screenshots, diagrams, layout cues, and structural hints from the source as first-priority guidance when present
+- the prototype should read as product UI, not a delivery brief or review console
+- every core panel must stay visibly populated and reviewable after each revision
+- for `Represented Interactively` modules, intended visual carriers must remain materially visible
+  - charts still need visible bars, lines, points, or equivalent graphical carriers
+  - product-facing explanatory regions defined by the source UI must not collapse into thin placeholder copy
+- if one fix makes another panel visually blank, materially thinner, or harder to interpret, treat that as a new blocking regression
+6. `Artifact consistency and regression control`
+- `prototype.html` and `result.md` must agree on key rules, limitations, and truth status
+- later rounds must re-check the whole accepted surface, not only the previous `Must Fix` list
+- explicitly check for newly blank panels, invisible graph content, materially weakened product-facing explanatory regions, and regressions where a previously acceptable panel becomes weaker after a later fix
+- record material regressions under `New Regression Since Previous Round`
+- prefer `Prototype Partial` over further churn when the next revision would mainly add complexity, speculative branches, or visual clutter without improving requirement truth
+## Builder response to review
+After each review round, the builder must:
+- read the full reviewer output
+- revise the prototype and companion artifacts when `Must Fix Before Complete` findings exist
+- preserve unresolved but non-blocking findings in `result.md` when closure remains partial
+- run a full core-panel self-check before requesting the next review, not only a point fix for the previous `Must Fix`
+- prefer downgrade over another revise round when the next change would require speculative invention, materially larger scope, or a noisier prototype that reduces reviewability
+- write a concise handoff into `review-log.md` containing what changed, what stayed unchanged, and why the next review should or should not still expect a gap
 ## Closure policy
 - decide closure only after the final review round
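
Reader's sketch, not part of the package: the regression-control rules in the review judgment order above ask later rounds to re-check the whole accepted surface for panels that became blank or weaker. A minimal illustration, assuming each round records a per-panel visibility flag. `find_regressions` and the dictionary shape are hypothetical.

```python
def find_regressions(previous_panels, current_panels):
    """List core panels that were visibly populated last round but are not now.

    Each argument maps a panel id to True when its intended visual carrier
    (bars, lines, points, or product-facing copy) is materially visible.
    A panel missing from `current_panels` counts as not visible.
    """
    regressions = []
    for panel, was_visible in previous_panels.items():
        if was_visible and not current_panels.get(panel, False):
            regressions.append(panel)
    return sorted(regressions)
```

Anything this returns would be recorded under `New Regression Since Previous Round`; judging whether a panel is "materially visible" remains the reviewer's call.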
@@ -234,41 +159,14 @@ The reviewer must evaluate the prototype against these principles in order:
 ## Prototype standard
 - `prototype.html` is the main review artifact
+- `prototype.html` is not the place for delivery portal content or truth-status summaries
 - the prototype must make the main user flow inspectable
 - prefer depth on the core path over shallow breadth
 - support meaningful review actions such as navigation, state changes, or key transitions
-- keep the prototype understandable when annotations are hidden or minimized
 - keep core panels visibly reviewable across revisions
 - do not count a panel as interactively represented when its main carrier is visually missing, nearly invisible, or replaced by bare labels only
 - if the accepted scope includes charted data, preserve visible graphical expression rather than text-only scaffolding
-## Annotation standard
-- use a dedicated review mode when annotation density would otherwise disrupt normal reading
-- keep review-only controls secondary, collapsible, or minimized
-- assign stable target ids
-- recommended target attribute: `data-pm-target="<id>"`
-- recommended annotation attribute: `data-pm-annotation="<id>"`
-- each annotation should have one primary target whenever possible
-- clicking an annotation should focus its target; clicking a target affordance should reveal the linked annotation
-- use stronger linkage such as numbered mapping or connector lines only when it improves readability
-### Truth semantics
-- `Confirmed`: faithful to the source requirement
-- `Simulated`: represented approximately for review
-- `Open Question`: unresolved and review-relevant
-### Good annotation targets
-- server-side rules that materially affect UX
-- formulas, thresholds, ordering, permissions, and gating
-- approximations that could be mistaken as final behavior
-- unresolved behavior reviewers must notice
-### Bad annotation targets
-- long PRD excerpts without interpretation
-- screen facts already obvious from the UI
-- visual design commentary with no product impact
-- missing core flows that should have been prototyped directly
 ## Runtime result contract
 - return exactly one runtime JSON object
 - normal completion and analysis-only closure both return `completed`
@@ -287,10 +185,10 @@ The reviewer must evaluate the prototype against these principles in order:
 ## Artifact contract
 - always generate `result.md` on normal completion
 - generate `prototype.html` for `Prototype Complete` and `Prototype Partial`
-- generate `annotations.json` whenever `prototype.html` is generated
 - generate `review-log.md` when the PM review loop ran
 - use repository-independent artifact paths
 - `result.md` is the only framework-required result artifact
+- put scope boundaries, exclusions, simulations, assumptions, truth-status notes, and open questions in `result.md`, not in the default prototype page
 - task-private files are optional and must support review directly
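
Reader's sketch, not part of the package: after this change the artifact contract no longer includes `annotations.json`. The remaining rules can be read as a small function, assuming a normal completion. `required_artifacts` and its parameters are hypothetical names.

```python
def required_artifacts(closure_level, review_loop_ran):
    """Return the artifacts the 0.1.45 contract expects on normal completion."""
    artifacts = ["result.md"]  # always generated on normal completion
    if closure_level in ("Prototype Complete", "Prototype Partial"):
        artifacts.append("prototype.html")
    if review_loop_ran:
        artifacts.append("review-log.md")
    return artifacts
```

For example, a `Prototype Partial` run whose reviewer invocation failed would produce `result.md` and `prototype.html` only if the failed loop is not counted as having run; the standard itself does not settle that case.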
 ## Feishu delivery document
@@ -298,7 +196,6 @@ The reviewer must evaluate the prototype against these principles in order:
 - treat the Feishu delivery document as a delivery portal, not as a second PRD
 - detailed interaction belongs in `prototype.html`
 - rule supplements belong in `result.md`
-- structured annotation handoff belongs in `annotations.json`
 ### Required structure
 1. `需求文档`
@@ -309,95 +206,6 @@ The reviewer must evaluate the prototype against these principles in order:
 - keep artifact purpose attached to the artifact itself
 - do not duplicate artifact content in the delivery document
-## `annotations.json` contract
-- purpose: machine-readable export of anchored review annotations
-- paired artifact: normally `prototype.html`
-- top-level fields:
-  - `version`
-  - `artifact`
-  - `annotations`
-- `annotations` may be empty, but must exist when the file is generated
-### Each annotation record must include
-- `id`
-- `target`
-- `type`
-- `status`
-- `priority`
-- `title`
-- `content`
-### `target` must include
-- `elementId`
-- optional `page`
-- optional `anchor`: `top` | `right` | `bottom` | `left` | `inline` | `state`
-### Recommended enums
-- `type`:
-  - `interaction`
-  - `state`
-  - `calculation`
-  - `data_rule`
-  - `edge_case`
-  - `empty_state`
-  - `error_state`
-  - `permission`
-  - `out_of_scope`
-  - `open_question`
-- `status`:
-  - `confirmed`
-  - `assumption`
-  - `simulated`
-  - `open_question`
-- `priority`:
-  - `high`
-  - `medium`
-  - `low`
-### Optional fields
-- `secondaryTargets`
-- `structured`
-- `sourceRefs`
-- `implementationImpact`
-- `reviewerNote`
-### Generation rules
-- create one record per review-meaningful annotation, not per DOM node
-- if a rule is represented via annotation in the requirement map, it must appear here
-- keep truth status aligned with the prototype
-- use `structured` when formulas, state keys, limits, or forbidden interpretations matter
-### Example
-```json
-{
-  "version": "1.0",
-  "artifact": "prototype.html",
-  "annotations": [
-    {
-      "id": "ann-load-rate-all-rule",
-      "target": {
-        "page": "power-dashboard",
-        "elementId": "load-rate-trend-chart",
-        "anchor": "right"
-      },
-      "type": "calculation",
-      "status": "confirmed",
-      "priority": "high",
-      "title": "All Load Rate calculation rule",
-      "content": "All Load Rate must be calculated as total power divided by total capacity. Average-based calculation is not allowed.",
-      "structured": {
-        "formula": "sum(room.power) / sum(room.capacity) * 100%",
-        "forbiddenFormula": "avg(room.loadRate)"
-      },
-      "implementationImpact": "Affects trend aggregation logic, KPI cards, and chart tooltip consistency.",
-      "sourceRefs": [
-        "Load Rate Trend > All view"
-      ]
-    }
-  ]
-}
-```
 ## `result.md` contract
 Keep `result.md` dense, implementation-oriented, and in Chinese unless the task explicitly requires another language.
@@ -438,7 +246,7 @@ Keep `result.md` dense, implementation-oriented, and in Chinese unless the task
 # 闭环判定
 - Closure Level：Prototype Partial
-- 判定依据：核心路径可评审，但历史场景下 Power 统计口径仍未明确，且部分规则仍以 annotation 承载。
+- 判定依据：核心路径可评审，但历史场景下 Power 统计口径仍未明确，且部分规则仍未能通过交互完全表达。
 - 主要剩余问题：历史 Power KPI 的精确定义仍是 Open Question。
 ```
@@ -462,13 +270,13 @@ Keep `result.md` dense, implementation-oriented, and in Chinese unless the task
 ## Round 1
 - Reviewer Verdict: Must Fix Before Complete
 - Must Fix:
-  - `All Load Rate` 计算口径未在原型或 annotation 中明确。
+  - `All Load Rate` 计算口径未在原型或 `result.md` 中明确。
 - Can Ship As Partial:
   - 历史 Power KPI 口径仍可暂时保留为开放问题。
 - Open Questions:
   - `Last 90 Days` 聚合粒度是否按月展示仍待确认。
 - Builder Action:
-  - 新增 `All Load Rate` 计算口径 annotation，并调整结果说明。
+  - 补强原型交互表达，并在 `result.md` 中补充规则说明。
 ```
 ## Final self-check
@@ -477,7 +285,6 @@ Keep `result.md` dense, implementation-oriented, and in Chinese unless the task
 - final closure matches the last reviewer verdict
 - every material key rule has a declared representation mode
 - any remaining downgraded or missing behavior is disclosed in `result.md`
-- `annotations.json` matches the actual annotation layer in `prototype.html`
 - `result.md` contains no delivery-index or artifact-list sections
 - source facts are preserved or deviations are explicitly declared
 - sample data is not presented as source truth