nimai-mcp 0.1.0 → 0.1.2

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Balagopalaji
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,277 @@
1
+ # FORGE Quick Reference
2
+ > **You are here:** This is the operational tool — the doc an agent uses during active work.
3
+ >
4
+ > **The other docs in this system:**
5
+ > - **Canonical Framework** (`FORGE-canonical.md`) — deep explanation of every concept. Go there to understand *why* something works, not just *what* to do.
6
+ > - **Spec Template** (`FORGE-spec-template.md`) — fill-in-the-blank form for a concrete project. Go there when you have a direction and need to turn it into an agent-ready brief.
7
+
8
+ ---
9
+
10
+ ## Agent Routing: What To Do Based On The Request
11
+
12
+ *Read the request and match it to a route before doing anything else.*
13
+
14
+ | If the request is… | Route |
15
+ |---|---|
16
+ | A new idea or creative problem — no direction yet | Run **Divergence → Convergence Loop** below. Stay in this doc. |
17
+ | A loose or vague request that needs structuring | Run **Self-Spec Agent** prompt (see below) to generate a draft spec. |
18
+ | A specific project ready to execute | Go to **Spec Template** → fill it out → deploy. |
19
+ | How to approach or frame something | Use **4 Layers + 5 Primitives** below. Deeper detail in Canonical doc. |
20
+ | Reviewing, validating, or debugging existing work | Run **Failure Mode Taxonomy** red-team below. |
21
+ | Something feels wrong mid-project | Check **Failure Mode Taxonomy** first, then check Intent Layer for drift. |
22
+ | A concept here that needs deeper explanation | Go to **Canonical Framework** for the full treatment. |
23
+ | Building a multi-agent or long-running system | Go to **Canonical Framework** → Execution Architecture + Extended Patterns. |
24
+
25
+ **Default rule:** Start here. Go to Canonical to understand. Go to Spec Template to execute.
26
+
27
+ ---
28
+
29
+ ## The Deeper Thesis
30
+ Agents fail not from lack of intelligence but from **underspecified control surfaces**. The four control surfaces are specification, intent, context, and prompt. Engineer all four.
31
+
32
+ ---
33
+
34
+ ## The 4 Engineering Disciplines / Control Surfaces
35
+
36
+ | Discipline | Control Surface | Skip It And… |
37
+ |---|---|---|
38
+ | **Prompt Craft** | Sub-task trigger | Agent doesn't know what to do right now |
39
+ | **Context Engineering** | Information environment | Agent works from noise or stale data |
40
+ | **Intent Engineering** | Values, trade-offs, deployment purpose | Agent resolves ambiguity the wrong way |
41
+ | **Specification Engineering** | Complete blueprint for "done" | Agent can't define done and never stops |
42
+
43
+ ---
44
+
45
+ ## The 4 Layers (in order of leverage)
46
+
47
+ | Layer | Role | Anti-Pattern |
48
+ |---|---|---|
49
+ | **1. Specification** | Defines exact "done" state + scope | Vague goals ("improve this") |
50
+ | **2. Intent** | Agent's deployment purpose + trade-off hierarchy | Agent doesn't know its own role |
51
+ | **3. Context** | Curated information environment | Dumping everything you have |
52
+ | **4. Prompt** | Fires sub-task + assigns cognitive mode | Over-engineering this layer |
53
+
54
+ ---
55
+
56
+ ## The 5 Primitives (every sub-task must pass all 5)
57
+
58
+ 1. **Self-Contained** — Zero questions needed to start
59
+ 2. **Constrained** — Must / Must-Not / Prefer / Escalate defined
60
+ 3. **Modular** — Under 2 hours; independently verifiable
61
+ 4. **Acceptance Criteria** — Binary, measurable (not "looks good")
62
+ 5. **Evaluation** — Built-in check before reporting complete
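The five primitives above lend themselves to a mechanical pre-check. The following is a minimal TypeScript sketch, not part of the package: the `SubTask` shape and `failedPrimitives` are hypothetical names, and it assumes "defined" constraints means each of the four lists has at least one entry. Whether a criterion is truly binary, or a sub-task truly independently verifiable, still needs human judgment.

```typescript
// Hypothetical shape for a sub-task; field names are illustrative, not part of FORGE.
interface SubTask {
  name: string;
  openQuestions: string[]; // questions the agent would need answered before starting
  constraints: { must: string[]; mustNot: string[]; prefer: string[]; escalate: string[] };
  estimatedHours: number;
  acceptanceCriteria: string[]; // each item should be a binary, measurable check
  evaluation: string; // the built-in check to run before reporting complete
}

// Returns the names of the primitives a sub-task fails; an empty array means it passes all 5.
function failedPrimitives(task: SubTask): string[] {
  const failed: string[] = [];
  if (task.openQuestions.length > 0) failed.push("Self-Contained");
  const c = task.constraints;
  // Assumption: "defined" is read strictly as "each list has at least one entry".
  if ([c.must, c.mustNot, c.prefer, c.escalate].some((list) => list.length === 0)) {
    failed.push("Constrained");
  }
  if (!(task.estimatedHours < 2)) failed.push("Modular");
  if (task.acceptanceCriteria.length === 0) failed.push("Acceptance Criteria");
  if (task.evaluation.trim() === "") failed.push("Evaluation");
  return failed;
}
```

A sub-task would be handed to an agent only when `failedPrimitives` returns an empty array.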
63
+
64
+ ---
65
+
66
+ ## Pre-Execution Decisions (set BEFORE decomposition)
67
+
68
+ **Risk Tier:**
69
+ | Tier | Validation |
70
+ |---|---|
71
+ | Low | Self-check only |
72
+ | Medium | Validator pass |
73
+ | High | Validator + Adversarial Reflection + Human gate |
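The tier table above can be read as a lookup from risk tier to validation steps. This is a small TypeScript sketch under that reading; the step identifiers and `validationFor` are hypothetical names, and the mapping follows this table only (the later "Adversarial Reflection (Medium/High risk)" heading suggests Medium may also call for reflection, which is not modeled here).

```typescript
type RiskTier = "Low" | "Medium" | "High";
type ValidationStep = "self-check" | "validator" | "adversarial-reflection" | "human-gate";

// One entry per row of the Risk Tier table.
const VALIDATION_BY_TIER: Record<RiskTier, readonly ValidationStep[]> = {
  Low: ["self-check"],
  Medium: ["validator"],
  High: ["validator", "adversarial-reflection", "human-gate"],
};

// Returns the validation steps required for a tier, in the order they would run.
function validationFor(tier: RiskTier): readonly ValidationStep[] {
  return VALIDATION_BY_TIER[tier];
}
```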
74
+
75
+ **Resource Governance:** Model tier · Max runtime · Cost budget · Retry limit · Cost escalation trigger
76
+
77
+ ---
78
+
79
+ ## Cognitive Modes (reasoning postures, not personas)
80
+
81
+ | Mode | Use When |
82
+ |---|---|
83
+ | **Deterministic** | Implementation; answer is knowable |
84
+ | **Exploratory** | Research, brainstorming, hypothesis generation |
85
+ | **Adversarial** | Security, risk, stress-testing |
86
+ | **Synthesis** | Integrating multiple outputs |
87
+ | **Audit** | Validation only — do not produce |
88
+
89
+ **Over-specification warning:** Constrain evaluation, not divergence. In Exploratory phases, tight constraints on generation defeat the purpose. Reserve hard constraints for Adversarial and Deterministic phases.
90
+
91
+ **Mode × Risk compatibility:** Exploratory mode is a pre-commitment mode. High-risk tasks should be in Deterministic or Audit mode by final delivery. Exploratory + High risk as a final-delivery combination is a design error.
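The compatibility rule can be expressed as a small guard. This is a sketch only; `finalDeliveryWarning` and its message strings are hypothetical, and it applies the rule strictly at final delivery, leaving earlier phases unconstrained.

```typescript
type CognitiveMode = "Deterministic" | "Exploratory" | "Adversarial" | "Synthesis" | "Audit";
type RiskTier = "Low" | "Medium" | "High";

// Returns a warning for a final-delivery mode/tier pairing, or null when the pairing is fine.
function finalDeliveryWarning(mode: CognitiveMode, tier: RiskTier): string | null {
  if (tier !== "High") return null;
  if (mode === "Exploratory") {
    return "design error: Exploratory + High risk at final delivery";
  }
  if (mode !== "Deterministic" && mode !== "Audit") {
    return "High-risk work should finish in Deterministic or Audit mode";
  }
  return null;
}
```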
92
+
93
+ ---
94
+
95
+ ## Divergence → Convergence Loop (brainstorming)
96
+
97
+ ```
98
+ Exploratory → Synthesis → Adversarial → Deterministic
99
+ DIVERGE CLUSTER STRESS-TEST FORMALIZE
100
+ ```
101
+ Generate wide → cluster patterns → attack top candidates → execute selected direction.
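As a data structure, the loop is an ordered list of phases, each tied to one cognitive mode. A minimal TypeScript sketch follows; `LOOP` and `nextPhase` are hypothetical names, and the `action` strings paraphrase the one-line summary above.

```typescript
// The four phases in order, each paired with its cognitive mode.
const LOOP = [
  { phase: "DIVERGE", mode: "Exploratory", action: "generate wide" },
  { phase: "CLUSTER", mode: "Synthesis", action: "cluster patterns" },
  { phase: "STRESS-TEST", mode: "Adversarial", action: "attack top candidates" },
  { phase: "FORMALIZE", mode: "Deterministic", action: "execute selected direction" },
] as const;

type Phase = (typeof LOOP)[number]["phase"];

// Returns the phase that follows `current`, or null once FORMALIZE is reached.
function nextPhase(current: Phase): Phase | null {
  const phases: readonly Phase[] = LOOP.map((step) => step.phase);
  const next = phases[phases.indexOf(current) + 1];
  return next === undefined ? null : next;
}
```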
102
+
103
+ ---
104
+
105
+ ## Failure Mode Taxonomy (red-team before deployment)
106
+
107
+ | Failure | Cause | Fix |
108
+ |---|---|---|
109
+ | Scope Creep | Ambiguous boundaries | Explicit Must-Not list |
110
+ | Hallucinated Completion | Subjective criteria | Binary acceptance criteria |
111
+ | Intent Drift | Unclear trade-offs or role | Ranked priorities + deployment purpose |
112
+ | Context Collapse | Too much noise | Aggressive curation; MCP for live sources |
113
+ | Runaway Cost | No resource ceiling | Hard caps before decomposition |
114
+ | Overconfident Output | No uncertainty surfacing | Uncertainty reporting for high-stakes tasks |
115
+
116
+ *If a failure doesn't fit these categories, extend the taxonomy — log it, identify root cause, add prevention.*
117
+
118
+ ---
119
+
120
+ ## Intent = Deployment Purpose + Trade-offs
121
+ Every agent must know: **what it is · what it isn't · who consumes its output · what to optimize for when uncertain**
122
+
123
+ ## Adversarial Reflection (Medium/High risk)
124
+ Produce → steelman critique → evaluate → revise → re-validate
125
+
126
+ ## Uncertainty Reporting (non-deterministic domains)
127
+ Confidence % · uncertainty drivers · what would change the answer · alternative interpretations
128
+
129
+ ## Escalation Contract
130
+ WHEN to stop · WHAT to report (include confidence) · WHO decides (named + SLA) · WHAT if no response
131
+
132
+ ## Context Hygiene
133
+ 2k high-signal tokens > 20k low-signal tokens · MCP for live sources · specify when to re-fetch
134
+
135
+ ---
136
+
137
+ ## When To Use This Framework
138
+
139
+ The overhead should be proportional to the cost of failure. Use this as your guide:
140
+
141
+ | Situation | What to use |
142
+ |---|---|
143
+ | Task is reversible, low-stakes, under 1 hour | Routing table above + nothing else. Just start. |
144
+ | Task is consequential or takes more than 1 hour | Quickref checklist minimum. Self-Spec agent if unsure. |
145
+ | Task is high-risk, irreversible, or affects others | Full Spec Template + Adversarial Reflection + Validator |
146
+ | You're not sure | Ask: "What's the cost if this goes wrong?" Scale accordingly. |
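The scaling table can be approximated as a decision function. This TypeScript sketch is illustrative only: `TaskProfile` and `overheadFor` are hypothetical, the checks run from highest stakes downward, and a task of exactly one hour (which the table leaves unspecified) is treated conservatively as needing the checklist.

```typescript
// Hypothetical description of a task; field names are illustrative.
interface TaskProfile {
  reversible: boolean;
  highRisk: boolean;
  affectsOthers: boolean;
  consequential: boolean;
  estimatedHours: number;
}

type Overhead = "routing-only" | "quickref-checklist" | "full-spec";

// Picks the amount of framework overhead, checking the highest-stakes row first.
function overheadFor(task: TaskProfile): Overhead {
  if (task.highRisk || !task.reversible || task.affectsOthers) return "full-spec";
  // Assumption: exactly 1 hour counts as "more than 1 hour".
  if (task.consequential || task.estimatedHours >= 1) return "quickref-checklist";
  return "routing-only";
}
```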
147
+
148
+ **The philosophy:** This framework is meant to absorb any project (coding, science, writing, business, anything) and give it the structure it needs to succeed. Use as much or as little as the project demands. The principles apply universally; the overhead scales with stakes.
149
+
150
+ ---
151
+
152
+ ## Solo Operator Mode
153
+
154
+ If you are working alone without a team, validators, or named escalation contacts, adapt as follows:
155
+
156
+ **Escalation contract:** You are the reviewer. Set a time delay instead of an SLA — *"I will not proceed on this decision for 24 hours."* Distance and time create the same circuit-break that a second person would.
157
+
158
+ **Validation:** For Medium risk, run the Adversarial Reflection yourself or have a second agent session do it cold — paste the output into a fresh context with no prior history and ask it to find flaws.
159
+
160
+ **High risk solo:** Add a mandatory pause before any irreversible action. Write out the decision, the alternatives you considered, and your confidence level before executing. This is your human gate.
161
+
162
+ **The Self-Spec + Reviewer pipeline replaces team infrastructure** — see below.
163
+
164
+ ---
165
+
166
+ ## Self-Spec Agent + Reviewer Pipeline
167
+
168
+ This is the solo operator's complete workflow. Two prompts replace an entire team process.
169
+
170
+ ---
171
+
172
+ ### Prompt 1 — Self-Spec Agent
173
+ *Use when you have a loose idea or vague request and need it turned into a complete spec.*
174
+
175
+ Paste this prompt to any capable model along with your loose request:
176
+
177
+ ```
178
+ You are a Specification Engineering agent operating under the FORGE framework.
179
+
180
+ Your job is to take the loose request below and generate a complete draft spec
181
+ using the framework's structure. Do not execute the request — only spec it.
182
+
183
+ For each section, fill in what you can infer and mark anything uncertain
184
+ with [NEEDS HUMAN INPUT: reason].
185
+
186
+ Generate:
187
+ 1. Final deliverable (precise, format, measurable quality bar)
188
+ 2. Scope boundaries (in / out)
189
+ 3. Agent deployment purpose (what it is, is not, who consumes output)
190
+ 4. Trade-off hierarchy (ranked: accuracy / speed / cost / safety / other)
191
+ 5. Constraint architecture (Must / Must-Not / Prefer / Escalate)
192
+ 6. Task decomposition (sub-tasks under 2 hours, with acceptance criteria)
193
+ 7. Risk tier (Low / Medium / High with reasoning)
194
+ 8. Cognitive mode per sub-task
195
+ 9. Context needed (what the executing agent requires)
196
+ 10. Proposed validator prompt (what a reviewer should check)
197
+
198
+ Loose request: [PASTE YOUR REQUEST HERE]
199
+ ```
200
+
201
+ Review the output. Resolve all [NEEDS HUMAN INPUT] flags. Adjust anything that
202
+ doesn't match your intent. Approved spec = your deployment brief.
203
+
204
+ ---
205
+
206
+ ### Prompt 2 — Reviewer / Validator Prompt Generator
207
+ *Run this after Self-Spec is approved. Paste along with the approved spec.*
208
+
209
+ ```
210
+ You are a Specification Engineering agent.
211
+
212
+ Given the approved spec below, generate a Reviewer Prompt — precise instructions
213
+ for a validator agent or solo reviewer to check the executing agent's output.
214
+
215
+ The Reviewer Prompt must:
216
+ - State exactly what is being checked and why
217
+ - List binary pass/fail criteria from the spec's acceptance criteria
218
+ - Include Adversarial Reflection sequence if risk tier is Medium or High
219
+ - Include uncertainty reporting requirements if domain is non-deterministic
220
+ - Specify what PASS looks like and what FAIL triggers (revise / escalate / abort)
221
+ - Be usable by a solo operator with no additional context
222
+
223
+ Approved spec: [PASTE APPROVED SPEC HERE]
224
+ ```
225
+
226
+ ---
227
+
228
+ ### The Complete Solo Pipeline
229
+
230
+ ```
231
+ Loose request
232
+
233
+ Self-Spec Agent → draft spec
234
+
235
+ Human reviews + resolves [NEEDS HUMAN INPUT] flags
236
+
237
+ Approved spec → Reviewer Prompt Generator → validator prompt
238
+
239
+ Deploy executing agent with approved spec
240
+
241
+ Paste output + validator prompt into fresh agent session
242
+
243
+ PASS → done | FAIL → revise or escalate
244
+ ```
245
+
246
+ ---
247
+
248
+ ## Pre-Deployment Checklist
249
+
250
+ **Specification**
251
+ - [ ] Deliverable precise; scope boundaries explicit
252
+ - [ ] Sub-tasks satisfy all 5 Primitives
253
+
254
+ **Intent**
255
+ - [ ] Each agent has explicit deployment purpose
256
+ - [ ] Trade-off hierarchy ranked; escalation contract complete
257
+
258
+ **Context**
259
+ - [ ] Context curated; MCP for live sources; freshness strategy set
260
+
261
+ **Execution** *(set before decomposition)*
262
+ - [ ] Risk tier assigned
263
+ - [ ] Resource Governance parameters set
264
+ - [ ] Cognitive mode per sub-task assigned
265
+ - [ ] Validation routes by risk tier
266
+ - [ ] Adversarial Reflection for medium/high risk
267
+ - [ ] Uncertainty reporting for non-deterministic domains
268
+
269
+ **Meta**
270
+ - [ ] Spec red-teamed against Failure Mode Taxonomy
271
+ - [ ] Planner execution plan will be saved as artifact
272
+ - [ ] Living Specification log set up (multi-session)
273
+
274
+ ---
275
+
276
+ *FORGE v1.0 — Framework for Orchestrating Reliable Generative Execution*
277
+ *System docs: Quick Reference (this doc) · Canonical Framework · Spec Template*
@@ -0,0 +1,376 @@
1
+ # FORGE Spec Template
2
+ > **You are here:** This is the Spec Template — the execution tool. Use it to turn a project direction into a complete agent-ready brief.
3
+ >
4
+ > **The other docs in this system:**
5
+ > - **Quick Reference** (`FORGE-quickref.md`) — start here if you don't have a project direction yet. If the request is a new idea or open-ended, go to the Quick Reference and run the Divergence → Convergence Loop or the Self-Spec Agent first. Come back here once you have a chosen direction.
6
+ > - **Canonical Framework** (`FORGE-canonical.md`) — go there if you encounter a concept in this template you don't fully understand. Each section maps to a section in the canonical doc.
7
+ >
8
+ > **When to be in this doc:** You have a specific project or direction. You are ready to define it completely before handing it to an agent. If you can't fill in a field, that is an unresolved decision — resolve it here, not mid-run.
9
+ >
10
+ > **Solo operator shortcut:** Instead of filling this manually, use the Self-Spec Agent prompt in the Quick Reference to generate a draft, then review and approve it here.
11
+
12
+ ---
13
+
14
+ > Fill in every field before handing to an agent. A blank field is an unresolved decision — resolve it here, not mid-run.
15
+ > *Based on the FORGE framework*
16
+
17
+ ---
18
+
19
+ ## 0. Pre-Flight Decisions
20
+ > These must be set first. Everything else is built on top of them.
21
+
22
+ **Risk Tier:** [ ] Low [ ] Medium [ ] High
23
+ *(If unsure, go one tier higher)*
24
+
25
+ **Primary Cognitive Mode for this project:**
26
+ [ ] Deterministic [ ] Exploratory [ ] Adversarial [ ] Synthesis [ ] Audit
27
+ [ ] Multi-phase — using Divergence → Convergence Loop (see Section 6)
28
+ *→ Unsure which mode? See Cognitive Mode table in Quick Reference or Canonical doc.*
29
+
30
+ **Resource Governance:**
31
+ - Model tier (Planner): `_______________`
32
+ - Model tier (Workers): `_______________`
33
+ - Max runtime per sub-task: `_______________`
34
+ - Total compute / cost budget: `_______________`
35
+ - Retry limit before escalation: `_______________`
36
+ - Cost threshold that triggers stop-and-report: `_______________`
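Once filled in, the Resource Governance fields can drive a simple runtime check. The sketch below is one possible TypeScript encoding, not something the template prescribes: the field names, the units (minutes, a unitless cost number), and `nextAction` are all assumptions made for illustration.

```typescript
// Hypothetical encoding of the six Resource Governance fields.
interface ResourceGovernance {
  plannerModelTier: string;
  workerModelTier: string;
  maxRuntimeMinutesPerSubTask: number;
  totalCostBudget: number;
  retryLimit: number; // retries allowed before escalation
  costStopThreshold: number; // cost at which the agent stops and reports
}

type Action = "continue" | "escalate" | "stop-and-report";

// Decides what the agent should do next given its usage so far.
function nextAction(
  gov: ResourceGovernance,
  usage: { costSoFar: number; retriesUsed: number },
): Action {
  if (usage.costSoFar >= gov.costStopThreshold) return "stop-and-report";
  if (usage.retriesUsed >= gov.retryLimit) return "escalate";
  return "continue";
}
```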
37
+
38
+ ---
39
+
40
+ ## 1. Specification Layer — The Blueprint
41
+
42
+ ### 1.1 Final Deliverable
43
+ *Describe precisely. Not "a good analysis" — describe the exact artifact, format, and measurable quality.*
44
+
45
+ ```
46
+ Deliverable: _______________________________________________
47
+
48
+ Format: ___________________________________________________
49
+
50
+ Length / Size: _____________________________________________
51
+
52
+ Measurable quality bar: ____________________________________
53
+ ```
54
+
55
+ ### 1.2 Scope Boundaries
56
+ *Be explicit. Ambiguity here becomes scope creep mid-run.*
57
+
58
+ **In scope:**
59
+ - `_______________`
60
+ - `_______________`
61
+ - `_______________`
62
+
63
+ **Out of scope (Must-Not):**
64
+ - `_______________`
65
+ - `_______________`
66
+ - `_______________`
67
+
68
+ ### 1.3 Task Decomposition
69
+ *Break the project into sub-tasks. Each must satisfy the 5 Primitives and complete in under 2 hours.*
70
+ *→ Need a reminder of the 5 Primitives? See Quick Reference or Canonical doc Section "5 Primitives."*
71
+
72
+ | # | Sub-task | Cognitive Mode | Risk Tier | Acceptance Criteria | Eval Method |
73
+ |---|---|---|---|---|---|
74
+ | 1 | | | | | |
75
+ | 2 | | | | | |
76
+ | 3 | | | | | |
77
+ | 4 | | | | | |
78
+ | 5 | | | | | |
79
+
80
+ *Add rows as needed. Every sub-task needs all five columns after `#` filled.*
81
+
82
+ *Note: Sub-task risk tiers may differ from the overall project tier. A research sub-task and a production deployment sub-task in the same project have different risk profiles — assign each independently. The project-level tier sets the floor for escalation; sub-task tiers govern validation routing.*
83
+
84
+ ### 1.4 Acceptance Criteria — Master Definition of Done
85
+ *Binary. Measurable. No subjective criteria.*
86
+
87
+ The project is complete when ALL of the following are true:
88
+ - [ ] `_______________`
89
+ - [ ] `_______________`
90
+ - [ ] `_______________`
91
+
92
+ ---
93
+
94
+ ## 2. Intent Layer — The Compass
95
+
96
+ ### 2.1 Agent Deployment Purpose
97
+ *Tell the agent what it is, what it is not, and who consumes its output. One clear paragraph.*
98
+
99
+ ```
100
+ You are: ___________________________________________________
101
+
102
+ You are NOT responsible for: _______________________________
103
+
104
+ Your output is consumed by: ________________________________
105
+ (human decision-maker / another agent / automated pipeline)
106
+
107
+ Your output feeds into: ____________________________________
108
+ ```
109
+
110
+ ### 2.2 Trade-off Hierarchy
111
+ *Rank these in order of priority. The agent will use this when it hits a fork.*
112
+
113
+ Rank the following from 1 (highest) to n (lowest) for this task:
114
+
115
+ | Priority | Value |
116
+ |---|---|
117
+ | `___` | Accuracy / Correctness |
118
+ | `___` | Speed / Efficiency |
119
+ | `___` | Cost / Token economy |
120
+ | `___` | Safety / Risk avoidance |
121
+ | `___` | Novelty / Creativity |
122
+ | `___` | Completeness |
123
+ | `___` | Simplicity / Readability |
124
+ | `___` | *(add domain-specific value)* |
125
+
126
+ ### 2.3 Constraint Architecture
127
+
128
+ **Must (non-negotiable requirements):**
129
+ - `_______________`
130
+ - `_______________`
131
+
132
+ **Must-Not (hard prohibitions):**
133
+ - `_______________`
134
+ - `_______________`
135
+
136
+ **Prefer (soft preferences when trade-offs arise):**
137
+ - `_______________`
138
+ - `_______________`
139
+
140
+ **Escalate (stop and surface to human when):**
141
+ - `_______________`
142
+ - `_______________`
143
+
144
+ ### 2.4 Forbidden Approaches
145
+ *Specific methods, tools, frameworks, or reasoning patterns to avoid.*
146
+
147
+ - `_______________`
148
+ - `_______________`
149
+
150
+ ---
151
+
152
+ ## 3. Context Layer — The Environment
153
+
154
+ ### 3.1 Provided Context
155
+ *List all documents, artifacts, or data sources being provided. Mark each as AUTHORITATIVE or REFERENCE.*
156
+
157
+ | Source | Type | Authority Level | Notes |
158
+ |---|---|---|---|
159
+ | | | AUTHORITATIVE / REFERENCE | |
160
+ | | | AUTHORITATIVE / REFERENCE | |
161
+ | | | AUTHORITATIVE / REFERENCE | |
162
+
163
+ ### 3.2 Known State
164
+ *What has already been tried? What failed? What is known?*
165
+
166
+ ```
167
+ Prior attempts: ____________________________________________
168
+
169
+ Known failures / dead ends: ________________________________
170
+
171
+ Known constraints: _________________________________________
172
+
173
+ Current blockers: __________________________________________
174
+ ```
175
+
176
+ ### 3.3 Context Freshness
177
+ *Tell the agent when to trust the provided context vs. when to re-fetch or verify.*
178
+
179
+ [ ] All provided context is current — use as ground truth
180
+ [ ] The following sources may be stale and should be verified: `_______________`
181
+ [ ] The agent must re-fetch live data for: `_______________`
182
+ [ ] MCP connections available: `_______________`
183
+
184
+ ### 3.4 Domain Conventions
185
+ *Terminology, style, standards, or norms the agent must match.*
186
+
187
+ ```
188
+ Terminology to use: ________________________________________
189
+
190
+ Terminology to avoid: ______________________________________
191
+
192
+ Style / format standards: __________________________________
193
+
194
+ Domain-specific conventions: _______________________________
195
+ ```
196
+
197
+ ---
198
+
199
+ ## 4. Prompt Layer — The Trigger
200
+
201
+ ### 4.1 Opening System Instruction
202
+ *The master instruction that frames the entire run. Reference Sections 1–3 explicitly.*
203
+
204
+ ```
205
+ You are [deployment purpose from 2.1].
206
+
207
+ Your task is to produce [deliverable from 1.1].
208
+
209
+ You are operating within the following constraints: [from 2.3].
210
+
211
+ Your trade-off priority order is: [from 2.2].
212
+
213
+ The context you have been provided is: [from 3.1].
214
+
215
+ You will complete this task by executing the following sub-tasks in order: [from 1.3].
216
+
217
+ For each sub-task, your completion criterion is: [from 1.3]. The project is complete when: [from 1.4].
218
+ ```
219
+
220
+ ### 4.2 Per-Sub-Task Prompt Template
221
+ *Use this template for each sub-task trigger.*
222
+
223
+ ```
224
+ Sub-task [#]: [name]
225
+ Cognitive mode: [mode]
226
+ Your input: [what you're working from]
227
+ Your output: [exact format and content required]
228
+ Acceptance criteria: [binary criteria from 1.3]
229
+ Evaluation: [how this will be checked]
230
+ Resource cap: [max runtime / tokens for this sub-task]
231
+ ```
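The per-sub-task template is regular enough to render from structured data. A minimal TypeScript sketch follows, assuming a hypothetical `SubTaskPrompt` shape and a `renderSubTaskPrompt` helper; joining multiple acceptance criteria with "; " is a formatting choice made here, not something the template specifies.

```typescript
// Hypothetical structured form of one sub-task trigger.
interface SubTaskPrompt {
  index: number;
  name: string;
  mode: string;
  input: string;
  output: string;
  acceptanceCriteria: string[];
  evaluation: string;
  resourceCap: string;
}

// Renders the seven template lines in the same order as the template above.
function renderSubTaskPrompt(p: SubTaskPrompt): string {
  return [
    `Sub-task ${p.index}: ${p.name}`,
    `Cognitive mode: ${p.mode}`,
    `Your input: ${p.input}`,
    `Your output: ${p.output}`,
    `Acceptance criteria: ${p.acceptanceCriteria.join("; ")}`,
    `Evaluation: ${p.evaluation}`,
    `Resource cap: ${p.resourceCap}`,
  ].join("\n");
}
```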
232
+
233
+ ---
234
+
235
+ ## 5. Governance & Validation
236
+
237
+ ### 5.1 Escalation Contract
238
+
239
+ **Escalation triggers** *(from 2.3 — repeated here for executor clarity)*:
240
+ - `_______________`
241
+ - `_______________`
242
+
243
+ **Who reviews escalations:**
244
+ ```
245
+ Name / Role: _______________________________________________
246
+ Contact: __________________________________________________
247
+ Response SLA: ______________________________________________
248
+ ```
249
+
250
+ **If no response arrives within SLA, the agent should:**
251
+ [ ] Hold and wait
252
+ [ ] Attempt alternative path: `_______________`
253
+ [ ] Abort and report
254
+
255
+ ### 5.2 Adversarial Reflection Trigger
256
+ *(Required for Medium and High risk tasks)*
257
+
258
+ [ ] Not required (Low risk)
259
+ [ ] Required after sub-task(s): `_______________`
260
+ [ ] Required before final delivery
261
+
262
+ The agent should critique: `_______________`
263
+ The revision threshold is: `_______________` *(what level of critique warrants a revision?)*
264
+
265
+ ### 5.3 Uncertainty Reporting
266
+ *(Required for non-deterministic domains)*
267
+
268
+ [ ] Not required
269
+ [ ] Required — agent must report:
270
+ - Confidence estimate (0–100%) with justification
271
+ - Primary uncertainty drivers
272
+ - What data would most reduce uncertainty
273
+ - Alternative plausible interpretations
274
+
275
+ ---
276
+
277
+ ## 6. Brainstorming Mode (Divergence → Convergence)
278
+ *Complete this section only if primary mode is multi-phase brainstorming.*
279
+
280
+ **Phase 1 — DIVERGE (Exploratory Mode)**
281
+ ```
282
+ Generate: _________________________________________________
283
+ Constraint: No filtering or judgment at this stage.
284
+ Output format: ____________________________________________
285
+ Volume target: ____________________________________________
286
+ ```
287
+
288
+ **Phase 2 — CLUSTER (Synthesis Mode)**
289
+ ```
290
+ Organize the output of Phase 1 by: ________________________
291
+ Identify: _________________________________________________
292
+ Output format: ____________________________________________
293
+ ```
294
+
295
+ **Phase 3 — STRESS-TEST (Adversarial Mode)**
296
+ ```
297
+ Attack the top [n] candidates from Phase 2.
298
+ Criteria to stress-test against: __________________________
299
+ Output format: ____________________________________________
300
+ What constitutes a fatal flaw: ____________________________
301
+ ```
302
+
303
+ **Phase 4 — FORMALIZE (Deterministic Mode)**
304
+ ```
305
+ Selected direction: ______________________________________
306
+ Formalize as: ____________________________________________
307
+ Acceptance criteria: _____________________________________
308
+ ```
309
+
310
+ ---
311
+
312
+ ## 7. Domain-Specific Additions
313
+
314
+ ### If Coding:
315
+ - Language / framework / version: `_______________`
316
+ - Performance targets: `_______________`
317
+ - API / interface contracts: `_______________`
318
+ - Security requirements: `_______________`
319
+ - Test coverage expectation: `_______________`
320
+
321
+ ### If Science / Research:
322
+ - Null hypothesis: `_______________`
323
+ - Falsification criteria: `_______________`
324
+ - Authorized data sources: `_______________`
325
+ - Forbidden data sources: `_______________`
326
+ - Statistical significance threshold: `_______________`
327
+ - Correction method: `_______________`
328
+ - Reproducibility requirements: `_______________`
329
+
330
+ ### If Writing / Content:
331
+ - Audience: `_______________`
332
+ - Purpose: `_______________`
333
+ - Structure: `_______________`
334
+ - Word count / length: `_______________`
335
+ - Style reference: `_______________`
336
+ - Mandatory inclusions: `_______________`
337
+ - Mandatory exclusions: `_______________`
338
+
339
+ ### If Business / Strategy:
340
+ - Decision-maker: `_______________`
341
+ - Decision to be made: `_______________`
342
+ - Options to evaluate: `_______________`
343
+ - Recommendation format: `_______________`
344
+ - Regulatory Must-Nots: `_______________`
345
+ - Stakeholder sensitivities: `_______________`
346
+
347
+ ---
348
+
349
+ ## 8. Spec Validation (Red-Team Checklist)
350
+ *Complete before handing to the agent. Every unchecked box is a known risk.*
351
+
352
+ **5 Primitives — does every sub-task have:**
353
+ - [ ] A self-contained problem statement (zero questions needed to start)
354
+ - [ ] A Constraint Architecture (Must / Must-Not / Prefer / Escalate)
355
+ - [ ] A runtime under 2 hours
356
+ - [ ] Binary acceptance criteria
357
+ - [ ] A built-in evaluation method
358
+
359
+ **Failure Mode Taxonomy — does this spec prevent:**
360
+ - [ ] Scope Creep — explicit Must-Not list and scope boundaries
361
+ - [ ] Hallucinated Completion — binary acceptance criteria defined
362
+ - [ ] Intent Drift — ranked trade-offs and deployment purpose explicit
363
+ - [ ] Context Collapse — context curated; noise removed
364
+ - [ ] Runaway Cost — resource caps set before decomposition
365
+ - [ ] Overconfident Output — uncertainty reporting required (if applicable)
366
+
367
+ **Final gate:**
368
+ - [ ] Risk tier and resource governance set *before* decomposition
369
+ - [ ] Every agent has a deployment purpose statement
370
+ - [ ] Escalation contract complete (who, when, SLA, no-response behavior)
371
+ - [ ] Planner execution plan will be saved as artifact
372
+
373
+ ---
374
+
375
+ *FORGE Spec Template v1.0 — companion to the FORGE system. A blank field is an unresolved decision.*
376
+ *System docs: Quick Reference · Canonical Framework · Spec Template (this doc)*
@@ -1 +1 @@
1
- {"version":3,"file":"prompts.d.ts","sourceRoot":"","sources":["../src/prompts.ts"],"names":[],"mappings":"AAoBA,eAAO,MAAM,UAAU,QAAkB,CAAC"}
1
+ {"version":3,"file":"prompts.d.ts","sourceRoot":"","sources":["../src/prompts.ts"],"names":[],"mappings":"AAyBA,eAAO,MAAM,UAAU,QAAkB,CAAC"}
package/dist/prompts.js CHANGED
@@ -41,6 +41,11 @@ const path = __importStar(require("path"));
  * FORGE-quickref.md. Works regardless of where the package is installed.
  */
 function findForgeRoot() {
+    // 1. Check bundled data/ directory (works when installed via npm)
+    const bundled = path.join(__dirname, '..', 'data');
+    if (fs.existsSync(path.join(bundled, 'FORGE-quickref.md')))
+        return bundled;
+    // 2. Walk up from __dirname (works in monorepo dev)
     let dir = __dirname;
     for (let i = 0; i < 10; i++) {
         if (fs.existsSync(path.join(dir, 'FORGE-quickref.md')))
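The lookup order in this hunk (bundled `data/` first, then an upward walk) can be sketched in TypeScript as follows. `resolveForgeRoot`, its parameters, and the error message are hypothetical names for illustration; the package's own function is `findForgeRoot`, takes no arguments, and its error text is not shown in this hunk. The start directory and existence check are injectable here only so the behaviour can be exercised without touching the real filesystem.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper mirroring the lookup order above; not the package's API.
function resolveForgeRoot(
  startDir: string,
  exists: (p: string) => boolean = fs.existsSync,
  marker = "FORGE-quickref.md",
  maxLevels = 10,
): string {
  // 1. Bundled data/ directory next to dist/ (the npm-install case).
  const bundled = path.join(startDir, "..", "data");
  if (exists(path.join(bundled, marker))) return bundled;

  // 2. Walk up from startDir (the monorepo development case).
  let dir = startDir;
  for (let i = 0; i < maxLevels; i++) {
    if (exists(path.join(dir, marker))) return dir;
    const parent = path.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  throw new Error(`Could not locate ${marker} starting from ${startDir}`);
}
```

Checking the bundled directory first means an installed copy never depends on what happens to sit above `node_modules`, while the upward walk keeps a monorepo checkout working without a copy step.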
@@ -1 +1 @@
1
- {"version":3,"file":"prompts.js","sourceRoot":"","sources":["../src/prompts.ts"],"names":[],"mappings":";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;AAAA,uCAAyB;AACzB,2CAA6B;AAE7B;;;GAGG;AACH,SAAS,aAAa;IACpB,IAAI,GAAG,GAAG,SAAS,CAAC;IACpB,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,EAAE,EAAE,CAAC,EAAE,EAAE,CAAC;QAC5B,IAAI,EAAE,CAAC,UAAU,CAAC,IAAI,CAAC,IAAI,CAAC,GAAG,EAAE,mBAAmB,CAAC,CAAC;YAAE,OAAO,GAAG,CAAC;QACnE,MAAM,MAAM,GAAG,IAAI,CAAC,OAAO,CAAC,GAAG,CAAC,CAAC;QACjC,IAAI,MAAM,KAAK,GAAG;YAAE,MAAM;QAC1B,GAAG,GAAG,MAAM,CAAC;IACf,CAAC;IACD,MAAM,IAAI,KAAK,CACb,sFAAsF,CACvF,CAAC;AACJ,CAAC;AAEY,QAAA,UAAU,GAAG,aAAa,EAAE,CAAC"}
1
+ {"version":3,"file":"prompts.js","sourceRoot":"","sources":["../src/prompts.ts"],"names":[],"mappings":";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;AAAA,uCAAyB;AACzB,2CAA6B;AAE7B;;;GAGG;AACH,SAAS,aAAa;IACpB,kEAAkE;IAClE,MAAM,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC,SAAS,EAAE,IAAI,EAAE,MAAM,CAAC,CAAC;IACnD,IAAI,EAAE,CAAC,UAAU,CAAC,IAAI,CAAC,IAAI,CAAC,OAAO,EAAE,mBAAmB,CAAC,CAAC;QAAE,OAAO,OAAO,CAAC;IAE3E,oDAAoD;IACpD,IAAI,GAAG,GAAG,SAAS,CAAC;IACpB,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,EAAE,EAAE,CAAC,EAAE,EAAE,CAAC;QAC5B,IAAI,EAAE,CAAC,UAAU,CAAC,IAAI,CAAC,IAAI,CAAC,GAAG,EAAE,mBAAmB,CAAC,CAAC;YAAE,OAAO,GAAG,CAAC;QACnE,MAAM,MAAM,GAAG,IAAI,CAAC,OAAO,CAAC,GAAG,CAAC,CAAC;QACjC,IAAI,MAAM,KAAK,GAAG;YAAE,MAAM;QAC1B,GAAG,GAAG,MAAM,CAAC;IACf,CAAC;IACD,MAAM,IAAI,KAAK,CACb,sFAAsF,CACvF,CAAC;AACJ,CAAC;AAEY,QAAA,UAAU,GAAG,aAAa,EAAE,CAAC"}
package/package.json CHANGED
@@ -1,34 +1,47 @@
 {
   "name": "nimai-mcp",
-  "version": "0.1.0",
+  "version": "0.1.2",
   "description": "Nimai MCP server — exposes nimai_spec, nimai_review, nimai_validate, nimai_new as tools. No internal LLM calls.",
-  "keywords": ["nimai", "forge", "mcp", "model-context-protocol", "ai", "llm", "specification", "claude", "codex"],
+  "keywords": [
+    "nimai",
+    "forge",
+    "mcp",
+    "model-context-protocol",
+    "ai",
+    "llm",
+    "specification",
+    "claude",
+    "codex"
+  ],
   "license": "MIT",
   "repository": {
     "type": "git",
     "url": "https://github.com/Balagopalaji/nimai.git",
     "directory": "packages/mcp"
   },
-  "files": ["dist/"],
+  "files": [
+    "dist/",
+    "data/"
+  ],
   "main": "dist/index.js",
   "types": "dist/index.d.ts",
   "bin": {
     "nimai-mcp": "dist/index.js"
   },
-  "scripts": {
-    "build": "tsc",
-    "test": "vitest run",
-    "dev": "tsc --watch",
-    "clean": "rm -rf dist"
-  },
   "devDependencies": {
     "@types/node": "^22.0.0",
     "typescript": "^5.4.5",
     "vitest": "^1.6.0"
   },
   "dependencies": {
-    "nimai-core": "workspace:*",
     "@modelcontextprotocol/sdk": "^1.0.0",
-    "zod": "^3.23.8"
+    "zod": "^3.23.8",
+    "nimai-core": "0.1.0"
+  },
+  "scripts": {
+    "build": "tsc",
+    "test": "vitest run",
+    "dev": "tsc --watch",
+    "clean": "rm -rf dist"
   }
-}
+}