nimai-core 0.4.6 → 0.4.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1 +1 @@
1
- {"version":3,"file":"prompts.d.ts","sourceRoot":"","sources":["../src/prompts.ts"],"names":[],"mappings":"AAAA;;;;GAIG;AAEH;;;GAGG;AACH,wBAAgB,YAAY,CAAC,OAAO,EAAE,MAAM,EAAE,cAAc,EAAE,MAAM,GAAG,MAAM,CAkC5E;AAED;;;;;;;;;;;;GAYG;AACH,wBAAgB,aAAa,CAAC,WAAW,EAAE,MAAM,GAAG,MAAM,CA6FzD;AAED;;;GAGG;AACH,wBAAgB,YAAY,CAAC,WAAW,EAAE,MAAM,GAAG,MAAM,CAgBxD"}
1
+ {"version":3,"file":"prompts.d.ts","sourceRoot":"","sources":["../src/prompts.ts"],"names":[],"mappings":"AAAA;;;;GAIG;AAEH;;;GAGG;AACH,wBAAgB,YAAY,CAAC,OAAO,EAAE,MAAM,EAAE,cAAc,EAAE,MAAM,GAAG,MAAM,CAkC5E;AAED;;;;;;;;;;;;GAYG;AACH,wBAAgB,aAAa,CAAC,WAAW,EAAE,MAAM,GAAG,MAAM,CAkGzD;AAED;;;GAGG;AACH,wBAAgB,YAAY,CAAC,WAAW,EAAE,MAAM,GAAG,MAAM,CAgBxD"}
package/dist/prompts.js CHANGED
@@ -74,15 +74,15 @@ A verdict without citation is treated as NO_GO — do not assert PASS without ev
74
74
 
75
75
  **Dimensions:**
76
76
 
77
- 1. **Binary acceptance criteria** — are all sub-task ACs measurable and unambiguous? Are any ACs pre-checked (- [x]) in the draft, which is always invalid?
77
+ 1. **Binary acceptance criteria** — are all sub-task ACs measurable and unambiguous? Are any ACs pre-checked (- [x]) in the draft, which is always invalid? If Section 1.1 defines a quantitative quality bar, Section 1.4 must include at least one AC that directly enforces that bar (same metric family and threshold intent). Otherwise, FAIL.
78
78
  2. **Scope coherence** — are in-scope and out-of-scope boundaries clearly stated and non-contradictory? Check for conflicts between conceptual terminology (e.g., state names, entity names used in descriptions) and persisted/modelled representations (e.g., enums, schemas, data shapes). Any mismatch is a HARD_FAIL.
79
79
  3. **Constraint sufficiency** — do Must / Must-Not / Prefer / Escalate constraints cover the key risks?
80
- 4. **Decomposition realism** — can each sub-task be completed within the stated 2-hour limit by a skilled agent? Check that sub-task dependencies are stated explicitly (if task B requires task A's output, that must be documented).
80
+ 4. **Decomposition realism** — can each sub-task be completed within the stated 2-hour limit by a skilled agent? Check that sub-task dependencies are stated explicitly (if task B requires task A's output, that must be documented). If Section 1.1 defines a benchmark/dataset path, at least one sub-task must explicitly own creating or curating that benchmark artifact.
81
81
  5. **Start-without-clarification viability** — can an agent begin immediately with the context provided, without asking the human for more information?
82
- 6. **Internal consistency** — are terms, names, and concepts used consistently throughout the spec? Flag any case where the same entity is described differently in different sections (e.g., "webhook event" in scope, "push notification" in ACs — are these the same thing?).
83
- 7. **Mechanism lock** — does every core flow (data pipeline, primary algorithm, key architecture choice) commit to exactly ONE implementation approach? Scan for "e.g.", "or", "could", "might", "such as" in Deliverable, Scope, Constraints, and Task decomposition sections. If any of these offer multiple options without making an explicit decision, that is a HARD_FAIL — a spec that leaves the mechanism unresolved forces the builder to choose architecture, creating drift in multi-builder workflows.
82
+ 6. **Internal consistency** — are terms, names, and concepts used consistently throughout the spec? Flag any case where the same entity is described differently in different sections (e.g., "webhook event" in scope, "push notification" in ACs — are these the same thing?). Treat data-type/precision mismatches as consistency defects (e.g., float vs decimal/cents representation). Also FAIL if obvious template artifact text remains (example guidance that should have been replaced by project-specific content).
83
+ 7. **Mechanism lock** — does every core flow (data pipeline, primary algorithm, key architecture choice) commit to exactly ONE implementation approach? Scan for "e.g.", "or", "could", "might", "such as" in Deliverable, Scope, Constraints, and Task decomposition sections. If any of these offer multiple options without making an explicit decision, that is a HARD_FAIL — a spec that leaves the mechanism unresolved forces the builder to choose architecture, creating drift in multi-builder workflows. Additional lock checks: (a) external service lock requires concrete vendor/model (env var names alone are not sufficient), (b) auth lock requires token/session type plus expiry policy, (c) endpoint lists without request/response field shapes are SOFT_FAIL.
84
84
  8. **Convergence declaration** — does the spec include a \`## Spec Convergence\` section with \`open_questions: 0\` and \`ambiguities_remaining: 0\`? If the section is absent, or either value is non-zero, or \`ready_for_build\` is "no", that is a HARD_FAIL. This is the formal GO gate — a spec with declared open questions is not agent-ready regardless of how well the other sections are written.
85
- 9. **Adversarial gap scan** — does the spec include an \`## Edge Cases and Failure Modes\` section that is substantively filled? Steelman the spec as a builder who will follow it exactly: what will you hit mid-execution that the spec does not answer? What implicit assumption is baked in but undeclared? What failure mode from the FORGE taxonomy (scope creep, hallucinated completion, intent drift, context collapse, runaway cost, overconfident output) does the spec not address? An absent or blank section is a HARD_FAIL for medium/high risk specs. Placeholder-only content (\`___\`) is treated as absent.
85
+ 9. **Adversarial gap scan** — does the spec include an \`## Edge Cases and Failure Modes\` section that is substantively filled? Steelman the spec as a builder who will follow it exactly: what will you hit mid-execution that the spec does not answer? What implicit assumption is baked in but undeclared? What failure mode from the FORGE taxonomy (scope creep, hallucinated completion, intent drift, context collapse, runaway cost, overconfident output) does the spec not address? For mobile/capture flows, explicitly check offline/no-network behavior at capture time and retry/sync handling. An absent or blank section is a HARD_FAIL for medium/high risk specs. Placeholder-only content (\`___\`) is treated as absent.
86
86
  10. **Architecture completeness** — for medium/high coding specs, does the spec include \`## 1.6 Architecture Lock\` with all required fields resolved: Persistence layer, File/object storage, External model/service + env var, API trigger flow, Entity status state machine, Auth implementation? Any missing field, blank placeholder, or \`TBD\` is a HARD_FAIL. For low/unknown/non-coding specs, this is NOTE-only advisory.
87
87
 
88
88
  ## Severity classification
@@ -96,24 +96,29 @@ A spec passes (passed: true) only if it has zero HARD_FAIL issues.
96
96
 
97
97
  ## Output format
98
98
 
99
- For each dimension, write:
99
+ For each dimension, write exactly:
100
100
  **[PASS|FAIL] Dimension name** — one-sentence rationale. *Evidence: cite the specific section/line/text.*
101
101
 
102
- Then write a consolidated remediation list for all FAIL dimensions.
102
+ Keep it brief: one sentence + one evidence citation per dimension. No preamble, no summaries, no commentary before or after the verdict block.
103
103
 
104
- Note: implementation correctness is explicitly out of scope for this review.
104
+ Then write a consolidated remediation list for all FAIL dimensions.
105
105
 
106
- ## Re-review closure requirement
106
+ After writing the 10 dimensions, run one extra ambiguity sweep:
107
+ - Ask: "Could two competent builders implement this spec in materially different ways while both claiming compliance?"
108
+ - If yes, add those gaps as issues (SOFT_FAIL or HARD_FAIL based on impact) and include exact sections to fix.
109
+ - If no, state one sentence: "No additional ambiguity gaps found in final sweep."
107
110
 
108
- If the user told you this is a re-review (the builder has fixed issues since the last verdict), check ## Review Closure before evaluating any dimension.
109
- Look for a section like:
111
+ Note: implementation correctness is explicitly out of scope for this review.
110
112
 
111
- > ## Review Closure
112
- > | Issue | fixed_in_section | exact_text_changed | proof |
113
+ ## Re-review protocol
113
114
 
114
- **Any prior HARD_FAIL with no closure entry = automatic FAIL.**
115
- Do not evaluate the other dimensions until you confirm closure entries exist for all prior HARD_FAILs.
116
- If this is a first review (no prior verdict), skip this check.
115
+ If this is a re-review (you were given the builder's fix summary):
116
+ - Do NOT require any additional sections in the spec.
117
+ - Do NOT require a Review Closure table or any tracking artifact.
118
+ - Re-evaluate each previously-failed dimension directly against the current spec content.
119
+ - Your evidence citation must quote or reference the specific text in the spec that shows the fix was applied.
120
+ - If the fix is present, mark that dimension PASS. If it is absent or incomplete, mark it FAIL.
121
+ - You are only authorized to evaluate the 10 dimensions defined above. Do not invent new dimensions, process gates, or required sections not in the FORGE spec template.
117
122
 
118
123
  ## Important
119
124
 
@@ -139,7 +144,7 @@ If the spec passes all dimensions with no HARD_FAIL issues, use:
139
144
  {"passed": true, "schema_version": "2", "issues": []}
140
145
  \`\`\`
141
146
 
142
- If and only if \`passed\` is false, immediately append this section after the verdict block:
147
+ If \`issues\` is non-empty (including SOFT_FAIL/NOTE with \`passed: true\`), immediately append this section after the verdict block:
143
148
 
144
149
  ## Paste this to your builder session:
145
150
 
@@ -1 +1 @@
1
- {"version":3,"file":"prompts.js","sourceRoot":"","sources":["../src/prompts.ts"],"names":[],"mappings":";AAAA;;;;GAIG;;AAMH,oCAkCC;AAeD,sCA6FC;AAMD,oCAgBC;AAxKD;;;GAGG;AACH,SAAgB,YAAY,CAAC,OAAe,EAAE,cAAsB;IAClE,OAAO;;;;;;;;;;;;;;;;;;;;;;;;;;;;;EA6BP,cAAc,IAAI,2FAA2F;;;iBAG9F,OAAO,EAAE,CAAC;AAC3B,CAAC;AAED;;;;;;;;;;;;GAYG;AACH,SAAgB,aAAa,CAAC,WAAmB;IAC/C,OAAO;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;EA2FP,WAAW,EAAE,CAAC;AAChB,CAAC;AAED;;;GAGG;AACH,SAAgB,YAAY,CAAC,WAAmB;IAC9C,OAAO;;;;;;;;;;;;;;EAcP,WAAW,EAAE,CAAC;AAChB,CAAC"}
1
+ {"version":3,"file":"prompts.js","sourceRoot":"","sources":["../src/prompts.ts"],"names":[],"mappings":";AAAA;;;;GAIG;;AAMH,oCAkCC;AAeD,sCAkGC;AAMD,oCAgBC;AA7KD;;;GAGG;AACH,SAAgB,YAAY,CAAC,OAAe,EAAE,cAAsB;IAClE,OAAO;;;;;;;;;;;;;;;;;;;;;;;;;;;;;EA6BP,cAAc,IAAI,2FAA2F;;;iBAG9F,OAAO,EAAE,CAAC;AAC3B,CAAC;AAED;;;;;;;;;;;;GAYG;AACH,SAAgB,aAAa,CAAC,WAAmB;IAC/C,OAAO;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;EAgGP,WAAW,EAAE,CAAC;AAChB,CAAC;AAED;;;GAGG;AACH,SAAgB,YAAY,CAAC,WAAmB;IAC9C,OAAO;;;;;;;;;;;;;;EAcP,WAAW,EAAE,CAAC;AAChB,CAAC"}
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "nimai-core",
3
- "version": "0.4.6",
3
+ "version": "0.4.8",
4
4
  "description": "Nimai core library — template loading, lint engine, context extraction. No LLM dependencies.",
5
5
  "keywords": [
6
6
  "nimai",