@moreih29/nexus-core 0.1.1 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (43)

package/README.md +4 -3
package/agents/architect/body.md +7 -6
package/agents/designer/body.md +3 -3
package/agents/engineer/body.md +8 -8
package/agents/postdoc/body.md +4 -4
package/agents/researcher/body.md +4 -4
package/agents/reviewer/body.md +2 -2
package/agents/strategist/body.md +4 -4
package/agents/tester/body.md +2 -2
package/agents/writer/body.md +1 -1
package/conformance/README.md +125 -0
package/conformance/scenarios/full-plan-cycle.json +132 -0
package/conformance/scenarios/task-deps-ordering.json +83 -0
package/conformance/schema/fixture.schema.json +224 -0
package/conformance/state-schemas/agent-tracker.schema.json +58 -0
package/conformance/state-schemas/history.schema.json +124 -0
package/conformance/state-schemas/plan.schema.json +72 -0
package/conformance/state-schemas/runtime.schema.json +25 -0
package/conformance/state-schemas/tasks.schema.json +93 -0
package/conformance/tools/plan-decide.json +70 -0
package/conformance/tools/plan-start.json +67 -0
package/conformance/tools/task-add.json +73 -0
package/conformance/tools/task-close.json +98 -0
package/docs/behavioral-contracts.md +145 -0
package/docs/consumer-implementation-guide.md +844 -0
package/docs/nexus-layout.md +234 -0
package/docs/nexus-state-overview.md +185 -0
package/docs/nexus-tools-contract.md +427 -0
package/manifest.json +124 -111
package/package.json +5 -1
package/schema/common.schema.json +0 -4
package/schema/skill.schema.json +16 -1
package/schema/vocabulary.schema.json +14 -9
package/skills/nx-init/body.md +6 -9
package/skills/nx-init/meta.yml +1 -0
package/skills/nx-plan/body.md +14 -11
package/skills/nx-plan/meta.yml +3 -0
package/skills/nx-run/body.md +4 -4
package/skills/nx-run/meta.yml +3 -0
package/skills/nx-setup/body.md +9 -9
package/skills/nx-setup/meta.yml +1 -0
package/skills/nx-sync/meta.yml +1 -0
package/vocabulary/capabilities.yml +58 -25

package/README.md CHANGED

@@ -46,6 +46,8 @@ CONSUMING.md is a document intended only for LLM agents. For human readers, this RE
 - `skills/{id}/meta.yml` — neutral skill metadata
 - `vocabulary/*.yml` — definitions of capabilities, categories, resume-tiers, and tags
 - `schema/*.json` — JSON Schemas for the files above (validated with AJV)
+- `conformance/` — state file schemas, tool behavior fixtures, and scenario fixtures for verifying cross-harness compatibility
+- `docs/` — tool semantics specification (nexus-tools-contract), state file overview, .nexus/ directory layout, and behavioral contracts
 - `scripts/` — migration and validation scripts
 **What is not included**
@@ -68,14 +70,13 @@ CONSUMING.md is a document intended only for LLM agents. For human readers, this RE
 ## Status
-Plan session #2 (2026-04-11): implementation decisions complete. Preparing the first release, `v0.1.0` (bootstrap import + validation pipeline + CI workflows + CONSUMING protocol). See [CHANGELOG.md](./CHANGELOG.md) for the detailed change history.
-> **Note**: After the first publish, this section will be updated to "Phase 1 complete, entering Phase 2". The final update follows Task 22 (bootstrap run) and Task 23 (first release).
+v0.2.0 (2026-04-12). Decisions from plan sessions #1–#4 are complete. Includes the harness-agnostic capabilities redesign, the conformance test suite, and the consumer implementation guide. See [CHANGELOG.md](./CHANGELOG.md) for the detailed change history.
 ## References
 - [CHANGELOG.md](./CHANGELOG.md) — version history
 - [CONSUMING.md](./CONSUMING.md) — consumer LLM upgrade protocol
+- [RELEASING.md](./RELEASING.md) — harness-neutral release runbook (for LLM agents or humans cutting a release)
 - [.nexus/context/boundaries.md](./.nexus/context/boundaries.md) — scope & rejection rationale
 - [.nexus/context/ecosystem.md](./.nexus/context/ecosystem.md) — 3-layer model
 - [.nexus/context/evolution.md](./.nexus/context/evolution.md) — Forward-only relaxation policy

package/agents/architect/body.md CHANGED

@@ -6,7 +6,7 @@ You advise — you do not decide scope, and you do not write code.
 ## Constraints
-- NEVER write, edit, or create code files
+- NEVER create or modify code files
 - NEVER create or update tasks (advise Lead, who owns tasks)
 - Do NOT make scope decisions — that's Lead's domain
 - Do NOT approve work you haven't reviewed — always read before opining
@@ -23,12 +23,13 @@ Your job is technical judgment, not project direction. When Lead says "we need t
 4. **Risk identification**: Flag technical debt, hidden complexity, breaking changes, performance concerns
 5. **Technical escalation support**: When engineer or tester face a hard technical problem, advise on resolution
-## Read-Only Diagnostics
+## Diagnostic Commands (Inspection Only)
 You may run the following types of commands to inform your analysis:
 - `git log`, `git diff`, `git blame` — understand history and context
 - `tsc --noEmit` — check type correctness
 - `bun test` — observe test results (do not modify tests)
-- Use Glob, Grep, Read tools for codebase exploration (prefer dedicated tools over Bash)
+- Use file search, content search, and file reading tools for codebase exploration (prefer dedicated tools over shell commands)
 You must NOT run commands that modify files, install packages, or mutate state.
 ## Decision Framework
@@ -37,11 +38,11 @@ When evaluating options:
 2. Is this the simplest solution that works? (YAGNI, avoid premature abstraction)
 3. What breaks if this goes wrong? (risk surface)
 4. Does this introduce new dependencies or coupling? (maintainability)
-5. Is there a precedent in the codebase or decisions log? (check .nexus/context/ and .nexus/memory/ via Read/Glob)
+5. Is there a precedent in the codebase or decisions log? (check .nexus/context/ and .nexus/memory/)
 ## Critical Review Process
 When reviewing code or design proposals:
-1. Read all affected files and their context
+1. Review all affected files and their context
 2. Understand the intent — what is this trying to achieve?
 3. Challenge assumptions — ask "what could go wrong?" and "is this necessary?"
 4. Rate each finding by severity
@@ -91,7 +92,7 @@ All claims about impossibility, infeasibility, or platform limitations MUST incl
 ## Review Process
 Follow these stages in order when conducting a review:
-1. **Analyze current state**: Read all affected files, understand existing patterns, and map dependencies
+1. **Analyze current state**: Review all affected files, understand existing patterns, and map dependencies
 2. **Clarify requirements**: Confirm what the proposed change must achieve — do not assume intent
 3. **Evaluate approach**: Apply the Decision Framework; check against anti-patterns (see below)
 4. **Propose design**: If changes are needed, state a concrete alternative with reasoning

package/agents/designer/body.md CHANGED

@@ -6,7 +6,7 @@ You advise — you do not decide scope, and you do not write code.
 ## Constraints
-- NEVER write, edit, or create code files
+- NEVER create or modify code files
 - NEVER create or update tasks (advise Lead, who owns tasks)
 - Do NOT make scope decisions — that's Lead's domain
 - Do NOT make technical implementation decisions — that's architect's domain
@@ -26,7 +26,7 @@ Your job is user experience judgment, not technical or project direction. When L
 ## Read-Only Diagnostics
 You may run the following types of commands to inform your analysis:
-- Use Glob, Grep, Read tools for codebase exploration (prefer dedicated tools over Bash)
+- Use file search, content search, and file reading tools for codebase exploration (prefer dedicated tools over shell commands)
 - `git log`, `git diff` — understand history and context
 You must NOT run commands that modify files, install packages, or mutate state.
@@ -36,7 +36,7 @@ When evaluating UX options:
 2. Is this the simplest interaction that accomplishes the goal?
 3. What confusion or frustration could this cause?
 4. Is this consistent with existing patterns in the product?
-5. Is there precedent in decisions log? (check .nexus/context/ and .nexus/memory/ via Read/Glob)
+5. Is there precedent in decisions log? (check .nexus/context/ and .nexus/memory/)
 ## Collaboration with Architect
 Architect owns technical structure; Designer owns user experience. These are complementary:

package/agents/engineer/body.md CHANGED

@@ -18,13 +18,13 @@ When you hit a problem during implementation, you debug it yourself before escal
 Implement what is specified, nothing more. Follow existing patterns, keep changes minimal and focused, and verify your work before reporting completion. When something breaks, trace the root cause before applying a fix.
 ## Implementation Process
-1. **Requirements Review**: Read the task spec fully before touching any file — understand scope and acceptance criteria
-2. **Design Understanding**: Read existing code in the affected area — understand patterns, conventions, and dependencies
+1. **Requirements Review**: Review the task spec fully before touching any file — understand scope and acceptance criteria
+2. **Design Understanding**: Review existing code in the affected area — understand patterns, conventions, and dependencies
 3. **Implementation**: Make the minimal focused changes that satisfy the spec
 4. **Build Gate**: Run the build gate checks before reporting (see below)
 ## Implementation Rules
-1. Read existing code before modifying — understand context and patterns first
+1. Review existing code before modifying — understand context and patterns first
 2. Follow the project's established conventions (naming, structure, file organization)
 3. Keep changes minimal and focused on the task — do not refactor unrelated code
 4. Do not add features, abstractions, or "improvements" beyond what was specified
@@ -39,7 +39,7 @@ When you encounter a problem during implementation:
 5. **Verify**: Confirm the fix works and doesn't break other things
 Debugging techniques:
-- Read error messages and stack traces carefully before doing anything else
+- Review error messages and stack traces carefully before doing anything else
 - Check git diff/log for recent changes that may have caused a regression
 - Add temporary logging to trace execution paths if needed
 - Test hypotheses by running code with modified inputs
@@ -58,13 +58,13 @@ Scope boundary: Build Gate covers compilation and static analysis only. Function
 ## Output Format
 When reporting completion, always include these four fields:
-- **Task ID**: The task identifier from the spec
+- **Work Item ID**: The identifier from the spec
 - **Modified Files**: Absolute paths of all changed files
 - **Implementation Summary**: What was done and why (1–3 sentences)
 - **Caveats**: Scope decisions deferred, known limitations, or documentation impact (omit if none)
 ## Completion Report
-After passing the Build Gate, report to Lead via SendMessage using the Output Format above.
+After passing the Build Gate, report to Lead using the Output Format above.
 Also include documentation impact when relevant:
 - Added or changed module public interfaces
@@ -80,12 +80,12 @@ These are included so Lead can update the Phase 5 (Document) manifest.
 3. Wait for Lead or Architect guidance before attempting anything else
 **Technical blockers** — when stuck on a technical issue or unclear on design direction:
-- Escalate to architect via SendMessage for technical guidance
+- Escalate to architect for technical guidance
 - Notify Lead as well to maintain shared context
 - Do not guess at implementations — ask when uncertain
 **Scope expansion** — when the task requires more than initially expected:
-- If changes touch 3+ files or multiple modules, report to Lead via SendMessage
+- If changes touch 3+ files or multiple modules, report to Lead
 - Include: affected file list, reason for scope expansion, whether design review is needed
 - Do not proceed with expanded scope without Lead acknowledgment

package/agents/postdoc/body.md CHANGED

@@ -9,7 +9,7 @@ You advise — you do not set research scope, and you do not run shell commands.
 - NEVER run shell commands or modify the codebase
 - NEVER create or update tasks (advise Lead, who owns tasks)
 - Do NOT make scope decisions — that's Lead's domain
-- Do NOT write conclusions stronger than the evidence supports
+- Do NOT state conclusions stronger than the evidence supports
 - Do NOT omit contradicting evidence from synthesis documents
 - Do NOT approve conclusions you haven't critically evaluated
@@ -73,7 +73,7 @@ When researcher submits findings:
 - Escalate to Lead if researcher's findings reveal the original question was malformed
 ## Saving Artifacts
-When writing synthesis documents or other deliverables, use `nx_artifact_write` (filename, content) instead of Write. This ensures the file is saved to the correct branch workspace.
+When producing synthesis documents or other deliverables, use `nx_artifact_write` (filename, content) instead of a generic file-writing tool. This ensures the file is saved to the correct branch workspace.
 ## Planning Gate
 You serve as the methodology approval gate before Lead finalizes research tasks.
@@ -88,7 +88,7 @@ When Lead proposes a research plan, your approval is required before execution b
 All claims about impossibility, infeasibility, or platform limitations MUST include evidence: documentation URLs, code paths, or issue numbers. Unsupported claims trigger re-investigation via researcher.
 ## Completion Report
-When synthesis or methodology work is complete, report to Lead via SendMessage. Include:
+When synthesis or methodology work is complete, report to Lead. Include:
 - Task ID completed
 - Artifact produced (filename or description)
 - Evidence quality grade (strong / moderate / weak / inconclusive)
@@ -97,7 +97,7 @@ When synthesis or methodology work is complete, report to Lead via SendMessage.
 Note: The Synthesis Document Format above is the primary output artifact. The completion report is a brief operational signal to Lead — separate from the synthesis document itself.
 ## Escalation Protocol
-Escalate to Lead via SendMessage when:
+Escalate to Lead when:
 - The research question is methodologically unanswerable with available sources — propose a scoped-down alternative
 - Researcher's findings reveal the original question was malformed — describe the malformation and suggest a corrected question
 - Findings conflict so severely that no defensible synthesis is possible without additional investigation — specify what is missing

package/agents/researcher/body.md CHANGED

@@ -47,9 +47,9 @@ For each research question:
 5. **Track what you searched**: Report your search terms so postdoc can evaluate coverage
 ## Escalation Protocol
-**Unproductive search**: If WebSearch returns unhelpful results 3 consecutive times on the same question:
+**Unproductive search**: If web search returns unhelpful results 3 consecutive times on the same question:
 1. Stop that search line immediately — do not try a fourth variation
-2. Report to Lead via SendMessage using this format:
+2. Report to Lead using this format:
    - Question: [exact research question]
    - Queries tried: [list all 3+ queries]
    - What was found: [any partial results or nothing]
@@ -58,7 +58,7 @@ For each research question:
 **Ambiguous question**: If the research question is unclear or self-contradictory:
 1. Ask postdoc to clarify methodology before searching
-2. If the question itself seems malformed, flag it to Lead via SendMessage — do not guess at intent
+2. If the question itself seems malformed, flag it to Lead — do not guess at intent
 Do not continue searching variations of a query that has already failed 3 times. Diminishing returns are a signal, not a challenge.
@@ -89,7 +89,7 @@ Before sending any findings report to Lead or postdoc, verify all of the followi
 - [ ] No unsourced claim is presented as fact — inferences are labeled `[Inference: ...]`
 ## Completion Report
-After finishing all assigned research questions, send a completion report to Lead via SendMessage using this format:
+After finishing all assigned research questions, send a completion report to Lead using this format:
 ```
 RESEARCH COMPLETE

package/agents/reviewer/body.md CHANGED

@@ -89,7 +89,7 @@ Reason: <one sentence>
 - **BLOCKED**: One or more CRITICAL issues. Delivery is halted until resolved and re-reviewed.
 ## Completion Report
-After completing review, always report results to Lead via SendMessage.
+After completing review, always report results to Lead.
 Format:
 ```
@@ -107,7 +107,7 @@ Artifact: <filename of saved review report>
 All claims about impossibility, infeasibility, or platform limitations MUST include evidence: documentation URLs, code paths, error messages, or issue numbers. Unsupported claims trigger re-investigation.
 ## Escalation Protocol
-Escalate to Lead via SendMessage when:
+Escalate to Lead when:
 - **Source unavailable**: The source material required to verify a claim cannot be accessed or located. Flag the claim as UNVERIFIABLE (not incorrect) and request that Writer trace it to its origin before re-submission.
 - **Judgment ambiguous**: A claim falls in a gray area where reasonable reviewers could disagree on severity, and the decision affects the verdict.
 - **Scope conflict**: The document makes claims outside the stated scope, and it is unclear whether Lead intended that scope to be expanded.

package/agents/strategist/body.md CHANGED

@@ -26,7 +26,7 @@ Your job is business and market judgment, not technical or project direction. Wh
 ## Read-Only Diagnostics
 You may run the following types of commands to inform your analysis:
-- Use Glob, Grep, Read tools for codebase exploration (prefer dedicated tools over Bash)
+- Use file search, content search, and file reading tools for codebase exploration (prefer dedicated tools over shell commands)
 - `git log`, `git diff` — understand project history and context
 You must NOT run commands that modify files, install packages, or mutate state.
@@ -36,7 +36,7 @@ When evaluating strategic options:
 2. How does this compare to what competitors offer?
 3. What is the adoption path — who uses this first and how does it spread?
 4. What is the strategic risk if this doesn't work?
-5. Is there precedent in decisions log? (check .nexus/context/ and .nexus/memory/ via Read/Glob)
+5. Is there precedent in decisions log? (check .nexus/context/ and .nexus/memory/)
 ## Collaboration with Lead
 Lead owns scope and project goals; Strategist informs those decisions with market reality:
@@ -75,7 +75,7 @@ Structure strategic responses as follows:
 For brief advisory responses (a focused question, not a full analysis), condense to Assessment + Recommendation + Risks. Label which mode you are using.
 ## Evidence Requirement
-All market claims — size, growth rate, competitor capabilities, user behavior — MUST be grounded in data or cited sources. Acceptable evidence: published reports, documented benchmarks, verifiable product comparisons, or codebase findings from Read/Grep.
+All market claims — size, growth rate, competitor capabilities, user behavior — MUST be grounded in data or cited sources. Acceptable evidence: published reports, documented benchmarks, verifiable product comparisons, or codebase findings from file and content search.
 If supporting data is unavailable, state the limitation explicitly: "This assessment is based on available information; market sizing figures are estimates pending verification." Do not present estimates as facts.
@@ -89,7 +89,7 @@ When Lead requests a formal deliverable or closes a strategy engagement, report
 - **Strategic Recommendation**: One clear direction with the primary rationale
 - **Open Questions**: Any market questions that remain unanswered and would change the recommendation if resolved
-Send this report to Lead via SendMessage when analysis is complete.
+Send this report to Lead when analysis is complete.
 ## Escalation Protocol
 Escalate to Lead when:

package/agents/tester/body.md CHANGED

@@ -145,7 +145,7 @@ Reason: <one sentence summary>
 If there are no findings, state "No issues found" explicitly.
 ## Completion Report
-After completing verification, always report to Lead via SendMessage using this format:
+After completing verification, always report to Lead using this format:
 ```
 Task ID: <id>
@@ -173,7 +173,7 @@ When claiming verification cannot be completed, you MUST provide: the environmen
 ## Escalation
 When encountering structural issues that are difficult to assess technically:
-- Escalate to architect via SendMessage for technical assessment
+- Escalate to architect for technical assessment
 - If the issue is a design flaw (not just a bug), notify both architect and Lead
 ## Saving Artifacts

package/agents/writer/body.md CHANGED

@@ -85,7 +85,7 @@ Before sending output to Reviewer or reporting completion, verify:
 This is Writer's self-check scope. **Content accuracy — whether facts match the original source — is Reviewer's responsibility, not Writer's.**
 ## Completion Report
-After completing a document, report to Lead via SendMessage with the following fields:
+After completing a document, report to Lead with the following fields:
 - **File**: artifact filename written via `nx_artifact_write`
 - **Audience**: who the document is for and what they will do with it
 - **Sources**: which agents or documents provided the source material

package/conformance/README.md ADDED

@@ -0,0 +1,125 @@
+# Nexus Conformance Fixtures
+Declarative behavioral tests for Nexus MCP tools. Each fixture describes a tool invocation (or sequence of invocations) and the state assertions that must hold afterwards. Fixtures are harness-neutral: they use abstract tool names and JSONPath assertions, so any consumer can write a runner against their own harness implementation.
+## What conformance fixtures are
+A conformance fixture is a JSON document that specifies:
+1. **Precondition** — the state files that must exist (or must not exist) before the test runs.
+2. **Action** (or **Steps**) — one or more tool invocations with concrete parameters.
+3. **Postcondition** — assertions on the tool return value and on state files after the invocation.
+Fixtures do not contain any test runner code. Consumers load the JSON, reconstruct precondition state, call their own tool implementation, and verify the postconditions.
+## Fixture format
+All fixtures must validate against [`schema/fixture.schema.json`](schema/fixture.schema.json).
+### Single-action fixture
+```json
+{
+  "test_id": "plan_start_happy_path",
+  "description": "...",
+  "precondition": {
+    "state_files": {
+      ".nexus/state/plan.json": null
+    }
+  },
+  "action": {
+    "tool": "plan_start",
+    "params": { "topic": "...", "issues": ["..."], "research_summary": "..." }
+  },
+  "postcondition": {
+    "return_value": { "$.created": true },
+    "state_files": {
+      ".nexus/state/plan.json": { "$.topic": "..." }
+    }
+  }
+}
+```
+### Multi-step scenario
+```json
+{
+  "test_id": "full_plan_cycle",
+  "description": "...",
+  "steps": [
+    {
+      "description": "...",
+      "action": { "tool": "plan_start", "params": { ... } },
+      "assert_return": { "$.created": true },
+      "assert_state": { ".nexus/state/plan.json": { "$.issues.length": 2 } }
+    }
+  ]
+}
+```
+## Assertion conventions
+Assertions are key/value objects where keys are JSONPath expressions and values are expected results or matchers.
+| Pattern | Meaning |
+|---|---|
+| `"$.field": "expected"` | Exact string match |
+| `"$.field": 42` | Exact number match |
+| `"$.field": true` | Boolean match |
+| `"$.array.length": 3` | Array length check |
+| `"$.field": { "type": "iso8601" }` | Value is a valid ISO 8601 timestamp |
+| `"$.field": { "type": "number", "min": 1 }` | Numeric value >= 1 |
+| `"$.field": { "type": "string", "minLength": 5 }` | String with minimum length |
+| `".nexus/state/plan.json": null` | File must not exist |
+For `state_files`, a `null` value at the file path key means the file must be absent. A `null` value at a JSONPath key within a file assertion means that field must be `null`.
+## Writing a test runner
+A conformance test runner does the following for each fixture:
+1. **Load** the fixture JSON file.
+2. **Establish precondition**: for each entry in `precondition.state_files`, write the content object as JSON to the specified path, or delete the file if the value is `null`.
+3. **Execute**:
+   - For single-action fixtures: call the tool named by `action.tool` with `action.params`.
+   - For multi-step scenarios: iterate `steps` in order, calling each `action` and evaluating `assert_return` and `assert_state` after each step before proceeding.
+4. **Evaluate postconditions**:
+   - Check `postcondition.return_value` assertions against the tool's return value.
+   - Check `postcondition.state_files` assertions against the actual file system state.
+   - If `postcondition.error` is `true`, the tool call must have produced an error.
+   - If `postcondition.error_contains` is set, the error message must contain that substring.
+5. **Report** pass/fail per `test_id`.
+Example runner sketch (TypeScript):
+```typescript
+import fixtures from "./tools/plan-start.json";
+for (const fixture of fixtures) {
+  applyPrecondition(fixture.precondition);
+  const result = await callTool(fixture.action.tool, fixture.action.params);
+  assertPostcondition(fixture.postcondition, result);
+}
+```
+## Coverage
+These fixtures cover the 11 Nexus-core abstract tool names:
+| Abstract name | Description |
+|---|---|
+| `plan_start` | Start a new plan session |
+| `plan_decide` | Record a decision on a plan issue |
+| `plan_status` | Query the current plan state |
+| `plan_update` | Add, remove, edit, or reopen plan issues |
+| `task_add` | Add a task to the task list |
+| `task_update` | Update a task's status |
+| `task_list` | List tasks with dependency-aware ready set |
+| `task_close` | Archive cycle into history and delete source files |
+| `history_search` | Search past cycles in history.json |
+| `context` | Read or write .nexus/context/ knowledge files |
+| `artifact_write` | Write an artifact output file |
+## Excluded tools
+AST and LSP tools (`ast_search`, `ast_replace`, `lsp_diagnostics`, `lsp_goto_definition`, etc.) are harness utilities that depend on language server infrastructure. They are not ecosystem contracts and are excluded from conformance coverage.
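
Note on the assertion conventions in the README above: the package ships fixtures only, so each consumer has to write its own assertion evaluator. The following is a minimal sketch of one, not part of the package. The names `resolvePath`, `matches`, and `checkAssertions` are hypothetical, the path resolver handles only the subset of JSONPath that appears in these fixtures (`$.a.b`, `$.items[0]`, `$.items[-1]`, `$.items.length`), and the `iso8601` check is a simplification that accepts only date-time strings.

```typescript
// Hypothetical helper names; nothing here is exported by @moreih29/nexus-core.
type Matcher =
  | { type: "iso8601" }
  | { type: "number"; min?: number }
  | { type: "string"; minLength?: number };

type Expected = string | number | boolean | null | Matcher;

// Resolve the small JSONPath subset used by the fixtures.
function resolvePath(root: unknown, path: string): unknown {
  if (path.charAt(0) !== "$") throw new Error("unsupported path: " + path);
  const tokens = path.slice(1).match(/[^.\[\]]+/g) || [];
  let current: unknown = root;
  for (const token of tokens) {
    if (current === null || current === undefined) return undefined;
    if (Array.isArray(current)) {
      if (token === "length") {
        current = current.length;
        continue;
      }
      if (!/^-?\d+$/.test(token)) return undefined;
      const index = Number(token);
      // Negative indices count from the end, as in "$.cycles[-1]".
      current = current[index < 0 ? current.length + index : index];
    } else if (typeof current === "object") {
      current = (current as Record<string, unknown>)[token];
    } else {
      return undefined;
    }
  }
  return current;
}

function isMatcher(value: Expected): value is Matcher {
  return typeof value === "object" && value !== null && "type" in value;
}

// Literal expectations use strict equality; matcher objects use their rule.
function matches(actual: unknown, expected: Expected): boolean {
  if (!isMatcher(expected)) return actual === expected;
  switch (expected.type) {
    case "iso8601":
      // Simplified: requires a "YYYY-MM-DDT" prefix and a parseable date.
      return (
        typeof actual === "string" &&
        /^\d{4}-\d{2}-\d{2}T/.test(actual) &&
        !isNaN(Date.parse(actual))
      );
    case "number":
      return (
        typeof actual === "number" &&
        (expected.min === undefined || actual >= expected.min)
      );
    case "string":
      return (
        typeof actual === "string" &&
        (expected.minLength === undefined || actual.length >= expected.minLength)
      );
    default:
      return false;
  }
}

// Returns one message per failed assertion; an empty array means all passed.
function checkAssertions(
  actual: unknown,
  assertions: Record<string, Expected>
): string[] {
  const failures: string[] = [];
  for (const path of Object.keys(assertions)) {
    const value = resolvePath(actual, path);
    if (!matches(value, assertions[path])) {
      failures.push(
        path + ": expected " + JSON.stringify(assertions[path]) +
        ", got " + JSON.stringify(value)
      );
    }
  }
  return failures;
}
```

In this sketch a `null` expectation passes only when the field is present and `null`; a missing field resolves to `undefined` and fails. Whether a missing field should also satisfy `null` is a choice the README leaves to the runner author.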

package/conformance/scenarios/full-plan-cycle.json ADDED

@@ -0,0 +1,132 @@
+{
+  "test_id": "full_plan_cycle",
+  "description": "Verifies the complete plan → decide → task_add → close lifecycle across 5 sequential tool invocations",
+  "precondition": {
+    "state_files": {
+      ".nexus/state/plan.json": null,
+      ".nexus/state/tasks.json": null
+    }
+  },
+  "steps": [
+    {
+      "description": "Start a new plan with 2 issues",
+      "action": {
+        "tool": "plan_start",
+        "params": {
+          "topic": "Refactor state persistence layer",
+          "issues": [
+            "Should state files live in .nexus/state/ or project root?",
+            "What is the migration path for existing users?"
+          ],
+          "research_summary": "Surveyed 3 consumer repos. All use .nexus/state/ already. Migration: provide a one-shot relocate script."
+        }
+      },
+      "assert_return": {
+        "$.created": true,
+        "$.plan_id": { "type": "number", "min": 1 },
+        "$.issueCount": 2
+      },
+      "assert_state": {
+        ".nexus/state/plan.json": {
+          "$.topic": "Refactor state persistence layer",
+          "$.issues.length": 2,
+          "$.issues[0].status": "pending",
+          "$.issues[1].status": "pending"
+        }
+      }
+    },
+    {
+      "description": "Decide issue 1 — confirm .nexus/state/ location",
+      "action": {
+        "tool": "plan_decide",
+        "params": {
+          "issue_id": 1,
+          "summary": "Keep .nexus/state/ as the canonical location. Documented in state-schemas README."
+        }
+      },
+      "assert_return": {
+        "$.decided": true,
+        "$.allComplete": false,
+        "$.remaining.length": 1
+      },
+      "assert_state": {
+        ".nexus/state/plan.json": {
+          "$.issues[0].status": "decided",
+          "$.issues[0].decision": "Keep .nexus/state/ as the canonical location. Documented in state-schemas README.",
+          "$.issues[1].status": "pending"
+        }
+      }
+    },
+    {
+      "description": "Decide issue 2 — confirm migration path; all issues now decided",
+      "action": {
+        "tool": "plan_decide",
+        "params": {
+          "issue_id": 2,
+          "summary": "Ship a one-shot migration script as a standalone npm script. Document in RELEASING.md."
+        }
+      },
+      "assert_return": {
+        "$.decided": true,
+        "$.allComplete": true
+      },
+      "assert_state": {
+        ".nexus/state/plan.json": {
+          "$.issues[0].status": "decided",
+          "$.issues[1].status": "decided"
+        }
+      }
+    },
+    {
+      "description": "Add a task derived from plan issue 1",
+      "action": {
+        "tool": "task_add",
+        "params": {
+          "title": "Update state-schemas README for .nexus/state/ location",
+          "context": "Decision from plan: .nexus/state/ is the canonical location. Document this clearly.",
+          "deps": [],
+          "plan_issue": 1,
+          "goal": "Land refactored state persistence layer with migration support"
+        }
+      },
+      "assert_return": {
+        "$.task.id": 1,
+        "$.task.status": "pending",
+        "$.task.plan_issue": 1
+      },
+      "assert_state": {
+        ".nexus/state/tasks.json": {
+          "$.goal": "Land refactored state persistence layer with migration support",
+          "$.tasks.length": 1,
+          "$.tasks[0].id": 1,
+          "$.tasks[0].plan_issue": 1
+        }
+      }
+    },
+    {
+      "description": "Close the cycle — archive plan and tasks into history, delete source files",
+      "action": {
+        "tool": "task_close",
+        "params": {}
+      },
+      "assert_return": {
+        "$.closed": true,
+        "$.archived.plan": true,
+        "$.archived.decisions": 2,
+        "$.archived.tasks": 1,
+        "$.total_cycles": { "type": "number", "min": 1 }
+      },
+      "assert_state": {
+        ".nexus/state/plan.json": null,
+        ".nexus/state/tasks.json": null,
+        ".nexus/history.json": {
+          "$.cycles.length": { "type": "number", "min": 1 },
+          "$.cycles[-1].plan.topic": "Refactor state persistence layer",
+          "$.cycles[-1].plan.issues.length": 2,
+          "$.cycles[-1].tasks.length": 1,
+          "$.cycles[-1].completed_at": { "type": "iso8601" }
+        }
+      }
+    }
+  ]
+}
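
Note on running a multi-step scenario like the one above: the conformance README says a runner iterates `steps` in order and evaluates `assert_return` and `assert_state` after each step before proceeding. A rough sketch of that loop follows. It is an illustration under stated assumptions, not the package's runner: `Harness`, `runScenario`, `callTool`, `readState`, and `check` are hypothetical names, the precondition is assumed to have been applied already, and the tool call is synchronous for brevity (a real harness would likely `await` it, as in the README's own sketch).

```typescript
// Hypothetical types and names; the package ships fixtures, not a runner.
type Assertions = Record<string, unknown>;

interface Step {
  description: string;
  action: { tool: string; params: Record<string, unknown> };
  assert_return?: Assertions;
  // A null value means the file must be absent after the step.
  assert_state?: Record<string, Assertions | null>;
}

interface Scenario {
  test_id: string;
  steps: Step[];
}

interface Harness {
  // Invokes the consumer's own implementation of the abstract tool.
  callTool(tool: string, params: Record<string, unknown>): unknown;
  // Returns parsed JSON content, or null when the file does not exist.
  readState(path: string): unknown;
  // Returns failure messages; empty when every assertion holds.
  check(actual: unknown, assertions: Assertions): string[];
}

function runScenario(scenario: Scenario, harness: Harness): string[] {
  const failures: string[] = [];
  let stepNumber = 0;
  for (const step of scenario.steps) {
    stepNumber += 1;
    const label = scenario.test_id + " step " + stepNumber;
    const result = harness.callTool(step.action.tool, step.action.params);

    if (step.assert_return) {
      for (const f of harness.check(result, step.assert_return)) {
        failures.push(label + " return " + f);
      }
    }

    const stateAssertions = step.assert_state || {};
    for (const path of Object.keys(stateAssertions)) {
      const expected = stateAssertions[path];
      const content = harness.readState(path);
      if (expected === null) {
        if (content !== null) failures.push(label + ": " + path + " should be absent");
      } else if (content === null) {
        failures.push(label + ": " + path + " is missing");
      } else {
        for (const f of harness.check(content, expected)) {
          failures.push(label + " " + path + " " + f);
        }
      }
    }

    // Design choice in this sketch: stop at the first failing step,
    // because later steps depend on the state left by earlier ones.
    if (failures.length > 0) break;
  }
  return failures;
}
```

Stopping at the first failing step keeps the report focused on the root cause. A runner that prefers a full report could drop the `break` at the cost of noisier output.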

package/conformance/scenarios/task-deps-ordering.json ADDED

@@ -0,0 +1,83 @@
+{
+  "test_id": "task_deps_ordering",
+  "description": "Verifies that dependency ordering is enforced by task_list: a task with an incomplete dep is not ready, and becomes ready after the dep is completed",
+  "precondition": {
+    "state_files": {
+      ".nexus/state/tasks.json": null
+    }
+  },
+  "steps": [
+    {
+      "description": "Add task A with no dependencies",
+      "action": {
+        "tool": "task_add",
+        "params": {
+          "title": "Task A — foundation work",
+          "context": "Must complete before Task B can start",
+          "deps": [],
+          "goal": "Validate dependency ordering"
+        }
+      },
+      "assert_return": {
+        "$.task.id": 1,
+        "$.task.status": "pending",
+        "$.task.deps.length": 0
+      }
+    },
+    {
+      "description": "Add task B that depends on task A (id=1)",
+      "action": {
+        "tool": "task_add",
+        "params": {
+          "title": "Task B — requires A",
+          "context": "Can only start after Task A is completed",
+          "deps": [1]
+        }
+      },
+      "assert_return": {
+        "$.task.id": 2,
+        "$.task.status": "pending",
+        "$.task.deps.length": 1,
+        "$.task.deps[0]": 1
+      }
+    },
+    {
+      "description": "List tasks — only task A should be ready because task B's dep is not complete",
+      "action": {
+        "tool": "task_list",
+        "params": {}
+      },
+      "assert_return": {
+        "$.summary.total": 2,
+        "$.summary.ready.length": 1,
+        "$.summary.ready[0]": 1
+      }
+    },
+    {
+      "description": "Mark task A as completed",
+      "action": {
+        "tool": "task_update",
+        "params": {
+          "id": 1,
+          "status": "completed"
+        }
+      },
+      "assert_return": {
+        "$.task.id": 1,
+        "$.task.status": "completed"
+      }
+    },
+    {
+      "description": "List tasks again — task B should now be ready because its dep (A) is completed",
+      "action": {
+        "tool": "task_list",
+        "params": {}
+      },
+      "assert_return": {
+        "$.summary.ready.length": 1,
+        "$.summary.ready[0]": 2,
+        "$.summary.completed": 1
+      }
+    }
+  ]
+}
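
Note on the ready set this scenario checks: the assertions imply that a task is ready when it is still pending and every task it depends on is completed. Below is a small sketch of a summary function that would satisfy these assertions. It is inferred from the fixture alone: the `Task` shape and the `summarize` name are assumptions, and the real `task_list` tool may return more fields or use more statuses than the two shown here.

```typescript
// Hypothetical shape, reduced to the fields the scenario touches.
interface Task {
  id: number;
  status: string; // the fixture shows "pending" and "completed"
  deps: number[];
}

interface Summary {
  total: number;
  completed: number;
  ready: number[];
}

// A task is ready when it is pending and all of its deps are completed.
function summarize(tasks: Task[]): Summary {
  const completedIds = tasks
    .filter((t) => t.status === "completed")
    .map((t) => t.id);
  const ready = tasks
    .filter(
      (t) =>
        t.status === "pending" &&
        t.deps.every((dep) => completedIds.indexOf(dep) !== -1)
    )
    .map((t) => t.id);
  return { total: tasks.length, completed: completedIds.length, ready };
}

// Mirrors the scenario: B depends on A, so only A is ready at first.
const tasks: Task[] = [
  { id: 1, status: "pending", deps: [] },
  { id: 2, status: "pending", deps: [1] },
];
const before = summarize(tasks);

// After A is completed, B becomes the only ready task.
tasks[0].status = "completed";
const after = summarize(tasks);
```

Here `before.ready` holds only task 1 and `after.ready` holds only task 2, matching the `$.summary.ready` assertions in steps 3 and 5 of the scenario.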