npm - mindforge-cc - Versions diffs - 4.3.0 → 5.1.0 - Mend

mindforge-cc 4.3.0 → 5.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (150)

package/.agent/skills/mindforge-neural-orchestrator/references/codex-tools.md ADDED

@@ -0,0 +1,100 @@
+# Codex Tool Mapping
+Skills use Claude Code tool names. When you encounter these in a skill, use your platform equivalent:
+| Skill references | Codex equivalent |
+|-----------------|------------------|
+| `Task` tool (dispatch subagent) | `spawn_agent` (see [Named agent dispatch](#named-agent-dispatch)) |
+| Multiple `Task` calls (parallel) | Multiple `spawn_agent` calls |
+| Task returns result | `wait` |
+| Task completes automatically | `close_agent` to free slot |
+| `TodoWrite` (task tracking) | `update_plan` |
+| `Skill` tool (invoke a skill) | Skills load natively — just follow the instructions |
+| `Read`, `Write`, `Edit` (files) | Use your native file tools |
+| `Bash` (run commands) | Use your native shell tools |
+## Subagent dispatch requires multi-agent support
+Add to your Codex config (`~/.codex/config.toml`):
+```toml
+[features]
+multi_agent = true
+```
+This enables `spawn_agent`, `wait`, and `close_agent` for skills like `dispatching-parallel-agents` and `mindforge-swarm-execution`.
+## Named agent dispatch
+Claude Code skills reference named agent types like `mindforge:code-reviewer`.
+Codex does not have a named agent registry — `spawn_agent` creates generic agents
+from built-in roles (`default`, `explorer`, `worker`).
+When a skill says to dispatch a named agent type:
+1. Find the agent's prompt file (e.g., `agents/code-reviewer.md` or the skill's
+   local prompt template like `code-quality-reviewer-prompt.md`)
+2. Read the prompt content
+3. Fill any template placeholders (`{BASE_SHA}`, `{WHAT_WAS_IMPLEMENTED}`, etc.)
+4. Spawn a `worker` agent with the filled content as the `message`
+| Skill instruction | Codex equivalent |
+|-------------------|------------------|
+| `Task tool (mindforge:code-reviewer)` | `spawn_agent(agent_type="worker", message=...)` with `code-reviewer.md` content |
+| `Task tool (general-purpose)` with inline prompt | `spawn_agent(message=...)` with the same prompt |
+### Message framing
+The `message` parameter is user-level input, not a system prompt. Structure it
+for maximum instruction adherence:
+```
+Your task is to perform the following. Follow the instructions below exactly.
+<agent-instructions>
+[filled prompt content from the agent's .md file]
+</agent-instructions>
+Execute this now. Output ONLY the structured response following the format
+specified in the instructions above.
+```
+- Use task-delegation framing ("Your task is...") rather than persona framing ("You are...")
+- Wrap instructions in XML tags — the model treats tagged blocks as authoritative
+- End with an explicit execution directive to prevent summarization of the instructions
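Editorial illustration, not part of the package diff: the dispatch steps and message framing above can be sketched in Python. The function name `frame_agent_message` and the plain string-replacement approach are assumptions made for this example; the actual `spawn_agent` call is left out because its exact signature is not shown beyond `agent_type` and `message`.

```python
def frame_agent_message(prompt_template: str, values: dict) -> str:
    """Fill {NAME} placeholders, then wrap the prompt in the framing shown above."""
    filled = prompt_template
    for name, value in values.items():
        # Placeholders look like {BASE_SHA}; replace each one literally.
        filled = filled.replace("{" + name + "}", value)
    return (
        "Your task is to perform the following. "
        "Follow the instructions below exactly.\n"
        "<agent-instructions>\n"
        f"{filled}\n"
        "</agent-instructions>\n"
        "Execute this now. Output ONLY the structured response following the format\n"
        "specified in the instructions above."
    )


if __name__ == "__main__":
    # Hypothetical template text standing in for the content of code-reviewer.md.
    print(frame_agent_message("Review changes since {BASE_SHA}.", {"BASE_SHA": "abc123"}))
```

The returned string would then be passed as the `message` argument when spawning a `worker` agent.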
+### When this workaround can be removed
+This approach compensates for Codex's plugin system not yet supporting an `agents`
+field in `plugin.json`. When `RawPluginManifest` gains an `agents` field, the
+plugin can symlink to `agents/` (mirroring the existing `skills/` symlink) and
+skills can dispatch named agent types directly.
+## Environment Detection
+Skills that create worktrees or finish branches should detect their
+environment with read-only git commands before proceeding:
+```bash
+GIT_DIR=$(cd "$(git rev-parse --git-dir)" 2>/dev/null && pwd -P)
+GIT_COMMON=$(cd "$(git rev-parse --git-common-dir)" 2>/dev/null && pwd -P)
+BRANCH=$(git branch --show-current)
+```
+- `GIT_DIR != GIT_COMMON` → already in a linked worktree (skip creation)
+- `BRANCH` empty → detached HEAD (cannot branch/push/PR from sandbox)
+See `using-git-worktrees` Step 0 and `mindforge-ship_extended`
+Step 1 for how each skill uses these signals.
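Editorial illustration, not part of the package diff: the two signals above can be interpreted with a small pure function. The names `classify_environment` and `detect_environment` are hypothetical; the git subcommands are the same read-only ones used in the shell snippet.

```python
import os
import subprocess


def classify_environment(git_dir: str, git_common: str, branch: str) -> dict:
    """Interpret the three values the same way the bullet list above does."""
    return {
        "in_linked_worktree": git_dir != git_common,  # skip worktree creation
        "detached_head": branch == "",                # cannot branch/push/PR
    }


def _git(*args: str) -> str:
    # Read-only git query; returns an empty string if the command fails.
    done = subprocess.run(["git", *args], capture_output=True, text=True, check=False)
    return done.stdout.strip()


def detect_environment() -> dict:
    # realpath resolves symlinks, approximating `pwd -P` in the shell version.
    git_dir = os.path.realpath(_git("rev-parse", "--git-dir"))
    git_common = os.path.realpath(_git("rev-parse", "--git-common-dir"))
    return classify_environment(git_dir, git_common, _git("branch", "--show-current"))
```

Keeping the classification separate from the git calls makes the decision logic testable without a repository.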
+## Codex App Finishing
+When the sandbox blocks branch/push operations (detached HEAD in an
+externally managed worktree), the agent commits all work and informs
+the user to use the App's native controls:
+- **"Create branch"** — names the branch, then commit/push/PR via App UI
+- **"Hand off to local"** — transfers work to the user's local checkout
+The agent can still run tests, stage files, and output suggested branch
+names, commit messages, and PR descriptions for the user to copy.

package/.agent/skills/mindforge-neural-orchestrator/references/gemini-tools.md ADDED

@@ -0,0 +1,33 @@
+# Gemini CLI Tool Mapping
+Skills use Claude Code tool names. When you encounter these in a skill, use your platform equivalent:
+| Skill references | Gemini CLI equivalent |
+|-----------------|----------------------|
+| `Read` (file reading) | `read_file` |
+| `Write` (file creation) | `write_file` |
+| `Edit` (file editing) | `replace` |
+| `Bash` (run commands) | `run_shell_command` |
+| `Grep` (search file content) | `grep_search` |
+| `Glob` (search files by name) | `glob` |
+| `TodoWrite` (task tracking) | `write_todos` |
+| `Skill` tool (invoke a skill) | `activate_skill` |
+| `WebSearch` | `google_web_search` |
+| `WebFetch` | `web_fetch` |
+| `Task` tool (dispatch subagent) | No equivalent — Gemini CLI does not support subagents |
+## No subagent support
+Gemini CLI has no equivalent to Claude Code's `Task` tool. Skills that rely on subagent dispatch (`mindforge-swarm-execution`, `dispatching-parallel-agents`) will fall back to single-session execution via `mindforge-execute-phase_extended`.
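Editorial illustration, not part of the package diff: the mapping table above can be expressed as a lookup, with `None` standing for "no equivalent". The names `GEMINI_TOOLS` and `gemini_equivalent` are hypothetical.

```python
from typing import Optional

# Claude Code tool name -> Gemini CLI tool name, copied from the table above.
GEMINI_TOOLS = {
    "Read": "read_file",
    "Write": "write_file",
    "Edit": "replace",
    "Bash": "run_shell_command",
    "Grep": "grep_search",
    "Glob": "glob",
    "TodoWrite": "write_todos",
    "Skill": "activate_skill",
    "WebSearch": "google_web_search",
    "WebFetch": "web_fetch",
}


def gemini_equivalent(tool: str) -> Optional[str]:
    """Return the Gemini CLI tool name, or None when there is no equivalent (e.g. Task)."""
    return GEMINI_TOOLS.get(tool)
```

A caller that receives `None` for `Task` would take the single-session fallback described above.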
+## Additional Gemini CLI tools
+These tools are available in Gemini CLI but have no Claude Code equivalent:
+| Tool | Purpose |
+|------|---------|
+| `list_directory` | List files and subdirectories |
+| `save_memory` | Persist facts to GEMINI.md across sessions |
+| `ask_user` | Request structured input from the user |
+| `tracker_create_task` | Rich task management (create, update, list, visualize) |
+| `enter_plan_mode` / `exit_plan_mode` | Switch to read-only research mode before making changes |

package/.agent/skills/mindforge-parallel-mesh_extended/SKILL.md ADDED

@@ -0,0 +1,182 @@
+---
+name: dispatching-parallel-agents
+description: Use when facing 2+ independent tasks that can be worked on without shared state or sequential dependencies
+---
+# Dispatching Parallel Agents
+## Overview
+You delegate tasks to specialized agents with isolated context. By precisely crafting their instructions and context, you ensure they stay focused and succeed at their task. They should never inherit your session's context or history — you construct exactly what they need. This also preserves your own context for coordination work.
+When you have multiple unrelated failures (different test files, different subsystems, different bugs), investigating them sequentially wastes time. Each investigation is independent and can happen in parallel.
+**Core principle:** Dispatch one agent per independent problem domain. Let them work concurrently.
+## When to Use
+```dot
+digraph when_to_use {
+    "Multiple failures?" [shape=diamond];
+    "Are they independent?" [shape=diamond];
+    "Single agent investigates all" [shape=box];
+    "One agent per problem domain" [shape=box];
+    "Can they work in parallel?" [shape=diamond];
+    "Sequential agents" [shape=box];
+    "Parallel dispatch" [shape=box];
+    "Multiple failures?" -> "Are they independent?" [label="yes"];
+    "Are they independent?" -> "Single agent investigates all" [label="no - related"];
+    "Are they independent?" -> "Can they work in parallel?" [label="yes"];
+    "Can they work in parallel?" -> "Parallel dispatch" [label="yes"];
+    "Can they work in parallel?" -> "Sequential agents" [label="no - shared state"];
+}
+```
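Editorial illustration, not part of the package diff: the connected edges of the graph above reduce to a three-question function. The name `choose_strategy` is hypothetical, and returning `None` when there are not multiple failures is an assumption, since the graph defines no edge for that case.

```python
from typing import Optional


def choose_strategy(multiple_failures: bool, independent: bool,
                    can_run_in_parallel: bool) -> Optional[str]:
    """Walk the decision graph and return the label of the box that is reached."""
    if not multiple_failures:
        return None  # assumption: the graph has no "no" edge from the first question
    if not independent:
        return "Single agent investigates all"  # related failures
    if can_run_in_parallel:
        return "Parallel dispatch"
    return "Sequential agents"  # independent, but shared state blocks parallelism
```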
+**Use when:**
+- 3+ test files failing with different root causes
+- Multiple subsystems broken independently
+- Each problem can be understood without context from others
+- No shared state between investigations
+**Don't use when:**
+- Failures are related (fixing one might fix others)
+- You need to understand the full system state
+- Agents would interfere with each other
+## The Pattern
+### 1. Identify Independent Domains
+Group failures by what's broken:
+- File A tests: Tool approval flow
+- File B tests: Batch completion behavior
+- File C tests: Abort functionality
+Each domain is independent - fixing tool approval doesn't affect abort tests.
+### 2. Create Focused Agent Tasks
+Each agent gets:
+- **Specific scope:** One test file or subsystem
+- **Clear goal:** Make these tests pass
+- **Constraints:** Don't change other code
+- **Expected output:** Summary of what you found and fixed
+### 3. Dispatch in Parallel
+```typescript
+// In Claude Code / AI environment
+Task("Fix agent-tool-abort.test.ts failures")
+Task("Fix batch-completion-behavior.test.ts failures")
+Task("Fix tool-approval-race-conditions.test.ts failures")
+// All three run concurrently
+```
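Editorial illustration, not part of the package diff: the concurrent dispatch above can be modeled with `asyncio`. `run_agent` is a hypothetical stand-in that only returns a placeholder summary; a real implementation would call the platform's subagent tool there.

```python
import asyncio


async def run_agent(task: str) -> str:
    """Hypothetical stand-in for a subagent call; returns a placeholder summary."""
    await asyncio.sleep(0)  # yield control so the tasks can interleave
    return f"summary: {task}"


async def dispatch_all(tasks: list) -> list:
    # gather() runs all coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(run_agent(t) for t in tasks))


if __name__ == "__main__":
    summaries = asyncio.run(dispatch_all([
        "Fix agent-tool-abort.test.ts failures",
        "Fix batch-completion-behavior.test.ts failures",
        "Fix tool-approval-race-conditions.test.ts failures",
    ]))
    for summary in summaries:
        print(summary)
```

Because results come back in input order, each summary can be matched to its task during review.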
+### 4. Review and Integrate
+When agents return:
+- Read each summary
+- Verify fixes don't conflict
+- Run full test suite
+- Integrate all changes
+## Agent Prompt Structure
+Good agent prompts are:
+1. **Focused** - One clear problem domain
+2. **Self-contained** - All context needed to understand the problem
+3. **Specific about output** - What should the agent return?
+```markdown
+Fix the 3 failing tests in src/agents/agent-tool-abort.test.ts:
+1. "should abort tool with partial output capture" - expects 'interrupted at' in message
+2. "should handle mixed completed and aborted tools" - fast tool aborted instead of completed
+3. "should properly track pendingToolCount" - expects 3 results but gets 0
+These are timing/race condition issues. Your task:
+1. Read the test file and understand what each test verifies
+2. Identify root cause - timing issues or actual bugs?
+3. Fix by:
+   - Replacing arbitrary timeouts with event-based waiting
+   - Fixing bugs in abort implementation if found
+   - Adjusting test expectations if testing changed behavior
+Do NOT just increase timeouts - find the real issue.
+Return: Summary of what you found and what you fixed.
+```
+## Common Mistakes
+**❌ Too broad:** "Fix all the tests" - agent gets lost
+**✅ Specific:** "Fix agent-tool-abort.test.ts" - focused scope
+**❌ No context:** "Fix the race condition" - agent doesn't know where
+**✅ Context:** Paste the error messages and test names
+**❌ No constraints:** Agent might refactor everything
+**✅ Constraints:** "Do NOT change production code" or "Fix tests only"
+**❌ Vague output:** "Fix it" - you don't know what changed
+**✅ Specific:** "Return summary of root cause and changes"
+## When NOT to Use
+**Related failures:** Fixing one might fix others - investigate together first
+**Need full context:** Understanding requires seeing the entire system
+**Exploratory debugging:** You don't know what's broken yet
+**Shared state:** Agents would interfere (editing same files, using same resources)
+## Real Example from Session
+**Scenario:** 6 test failures across 3 files after major refactoring
+**Failures:**
+- agent-tool-abort.test.ts: 3 failures (timing issues)
+- batch-completion-behavior.test.ts: 2 failures (tools not executing)
+- tool-approval-race-conditions.test.ts: 1 failure (execution count = 0)
+**Decision:** Independent domains - abort logic separate from batch completion separate from race conditions
+**Dispatch:**
+```
+Agent 1 → Fix agent-tool-abort.test.ts
+Agent 2 → Fix batch-completion-behavior.test.ts
+Agent 3 → Fix tool-approval-race-conditions.test.ts
+```
+**Results:**
+- Agent 1: Replaced timeouts with event-based waiting
+- Agent 2: Fixed event structure bug (threadId in wrong place)
+- Agent 3: Added wait for async tool execution to complete
+**Integration:** All fixes independent, no conflicts, full suite green
+**Time saved:** 3 problems solved in parallel vs sequentially
+## Key Benefits
+1. **Parallelization** - Multiple investigations happen simultaneously
+2. **Focus** - Each agent has narrow scope, less context to track
+3. **Independence** - Agents don't interfere with each other
+4. **Speed** - 3 problems solved in the time of 1
+## Verification
+After agents return:
+1. **Review each summary** - Understand what changed
+2. **Check for conflicts** - Did agents edit same code?
+3. **Run full suite** - Verify all fixes work together
+4. **Spot check** - Agents can make systematic errors
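Editorial illustration, not part of the package diff: the "check for conflicts" step can be approximated by comparing the sets of files each agent reports editing. The function name and the input shape (agent name mapped to a set of paths) are assumptions; this detects overlapping files only, not semantic conflicts.

```python
from collections import defaultdict


def find_conflicts(edits: dict) -> dict:
    """Return {path: [agents]} for every file edited by more than one agent."""
    touched = defaultdict(list)
    for agent, files in edits.items():
        for path in files:
            touched[path].append(agent)
    return {path: sorted(agents) for path, agents in touched.items() if len(agents) > 1}
```

An empty result means no two agents edited the same file; running the full suite is still needed to confirm the fixes work together.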
+## Real-World Impact
+From a debugging session (2025-10-03):
+- 6 failures across 3 files
+- 3 agents dispatched in parallel
+- All investigations completed concurrently
+- All fixes integrated successfully
+- Zero conflicts between agent changes

package/.agent/skills/mindforge-plan-phase_extended/SKILL.md ADDED

@@ -0,0 +1,152 @@
+---
+name: mindforge-plan-phase_extended
+description: Use when you have a spec or requirements for a multi-step task, before touching code
+---
+# Writing Plans
+## Overview
+Write comprehensive implementation plans assuming the engineer has zero context for our codebase and questionable taste. Document everything they need to know: which files to touch for each task, code, testing, docs they might need to check, how to test it. Give them the whole plan as bite-sized tasks. DRY. YAGNI. TDD. Frequent commits.
+Assume they are a skilled developer who knows almost nothing about our toolset or problem domain. Assume they don't know good test design very well.
+**Announce at start:** "I'm using the mindforge-plan-phase_extended skill to create the implementation plan."
+**Context:** This should be run in a dedicated worktree (created by the brainstorming skill).
+**Save plans to:** `docs/mindforge/plans/YYYY-MM-DD-<feature-name>.md`
+- (User preferences for plan location override this default)
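Editorial illustration, not part of the package diff: the default save location above can be built from a date and a feature name. The function name `plan_path` and the slug rule (lowercase, non-alphanumeric runs collapsed to hyphens) are assumptions; the skill only specifies the `YYYY-MM-DD-<feature-name>.md` pattern.

```python
import re
from datetime import date


def plan_path(feature_name: str, day: date) -> str:
    """Build docs/mindforge/plans/YYYY-MM-DD-<feature-name>.md for a plan file."""
    # Assumed slug rule: lowercase, runs of other characters become one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", feature_name.lower()).strip("-")
    return f"docs/mindforge/plans/{day.isoformat()}-{slug}.md"
```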
+## Scope Check
+If the spec covers multiple independent subsystems, it should have been broken into sub-project specs during brainstorming. If it wasn't, suggest breaking this into separate plans — one per subsystem. Each plan should produce working, testable software on its own.
+## File Structure
+Before defining tasks, map out which files will be created or modified and what each one is responsible for. This is where decomposition decisions get locked in.
+- Design units with clear boundaries and well-defined interfaces. Each file should have one clear responsibility.
+- You reason best about code you can hold in context at once, and your edits are more reliable when files are focused. Prefer smaller, focused files over large ones that do too much.
+- Files that change together should live together. Split by responsibility, not by technical layer.
+- In existing codebases, follow established patterns. If the codebase uses large files, don't unilaterally restructure - but if a file you're modifying has grown unwieldy, including a split in the plan is reasonable.
+This structure informs the task decomposition. Each task should produce self-contained changes that make sense independently.
+## Bite-Sized Task Granularity
+**Each step is one action (2-5 minutes):**
+- "Write the failing test" - step
+- "Run it to make sure it fails" - step
+- "Implement the minimal code to make the test pass" - step
+- "Run the tests and make sure they pass" - step
+- "Commit" - step
+## Plan Document Header
+**Every plan MUST start with this header:**
+```markdown
+# [Feature Name] Implementation Plan
+> **For agentic workers:** REQUIRED SUB-SKILL: Use mindforge:swarm-execution (recommended) or mindforge:execute-phase_extended to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+**Goal:** [One sentence describing what this builds]
+**Architecture:** [2-3 sentences about approach]
+**Tech Stack:** [Key technologies/libraries]
+---
+```
+## Task Structure
+````markdown
+### Task N: [Component Name]
+**Files:**
+- Create: `exact/path/to/file.py`
+- Modify: `exact/path/to/existing.py:123-145`
+- Test: `tests/exact/path/to/test.py`
+- [ ] **Step 1: Write the failing test**
+```python
+def test_specific_behavior():
+    result = function(input)
+    assert result == expected
+```
+- [ ] **Step 2: Run test to verify it fails**
+Run: `pytest tests/path/test.py::test_name -v`
+Expected: FAIL with "function not defined"
+- [ ] **Step 3: Write minimal implementation**
+```python
+def function(input):
+    return expected
+```
+- [ ] **Step 4: Run test to verify it passes**
+Run: `pytest tests/path/test.py::test_name -v`
+Expected: PASS
+- [ ] **Step 5: Commit**
+```bash
+git add tests/path/test.py src/path/file.py
+git commit -m "feat: add specific feature"
+```
+````
+## No Placeholders
+Every step must contain the actual content an engineer needs. These are **plan failures** — never write them:
+- "TBD", "TODO", "implement later", "fill in details"
+- "Add appropriate error handling" / "add validation" / "handle edge cases"
+- "Write tests for the above" (without actual test code)
+- "Similar to Task N" (repeat the code — the engineer may be reading tasks out of order)
+- Steps that describe what to do without showing how (code blocks required for code steps)
+- References to types, functions, or methods not defined in any task
+## Remember
+- Exact file paths always
+- Complete code in every step — if a step changes code, show the code
+- Exact commands with expected output
+- DRY, YAGNI, TDD, frequent commits
+## Self-Review
+After writing the complete plan, look at the spec with fresh eyes and check the plan against it. This is a checklist you run yourself — not a subagent dispatch.
+**1. Spec coverage:** Skim each section/requirement in the spec. Can you point to a task that implements it? List any gaps.
+**2. Placeholder scan:** Search your plan for red flags — any of the patterns from the "No Placeholders" section above. Fix them.
+**3. Type consistency:** Do the types, method signatures, and property names you used in later tasks match what you defined in earlier tasks? A function called `clearLayers()` in Task 3 but `clearFullLayers()` in Task 7 is a bug.
+If you find issues, fix them inline. No need to re-review — just fix and move on. If you find a spec requirement with no task, add the task.
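Editorial illustration, not part of the package diff: the placeholder scan can be partly automated with a substring search. The phrase list is copied from the "No Placeholders" section; the function name is hypothetical, and the match is a rough heuristic (case-insensitive substrings), so every hit still needs a human look.

```python
# Phrases taken from the "No Placeholders" section, lowercased for matching.
PLACEHOLDER_PHRASES = [
    "tbd",
    "todo",
    "implement later",
    "fill in details",
    "add appropriate error handling",
    "add validation",
    "handle edge cases",
    "write tests for the above",
    "similar to task",
]


def scan_placeholders(plan_text: str) -> list:
    """Return (line_number, phrase) pairs for each red-flag phrase found in the plan."""
    hits = []
    for number, line in enumerate(plan_text.splitlines(), start=1):
        lowered = line.lower()
        for phrase in PLACEHOLDER_PHRASES:
            if phrase in lowered:
                hits.append((number, phrase))
    return hits
```

This covers only the literal phrases; steps that describe work without showing code, or references to undefined types, still require reading the plan.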
+## Execution Handoff
+After saving the plan, offer execution choice:
+**"Plan complete and saved to `docs/mindforge/plans/<filename>.md`. Two execution options:**
+**1. Subagent-Driven (recommended)** - I dispatch a fresh subagent per task, review between tasks, fast iteration
+**2. Inline Execution** - Execute tasks in this session using mindforge-execute-phase_extended, batch execution with checkpoints
+**Which approach?"**
+**If Subagent-Driven chosen:**
+- **REQUIRED SUB-SKILL:** Use mindforge:swarm-execution
+- Fresh subagent per task + two-stage review
+**If Inline Execution chosen:**
+- **REQUIRED SUB-SKILL:** Use mindforge:execute-phase_extended
+- Batch execution with checkpoints for review

package/.agent/skills/mindforge-plan-phase_extended/plan-document-reviewer-prompt.md ADDED

@@ -0,0 +1,49 @@
+# Plan Document Reviewer Prompt Template
+Use this template when dispatching a plan document reviewer subagent.
+**Purpose:** Verify the plan is complete, matches the spec, and has proper task decomposition.
+**Dispatch after:** The complete plan is written.
+```
+Task tool (general-purpose):
+  description: "Review plan document"
+  prompt: |
+    You are a plan document reviewer. Verify this plan is complete and ready for implementation.
+    **Plan to review:** [PLAN_FILE_PATH]
+    **Spec for reference:** [SPEC_FILE_PATH]
+    ## What to Check
+    | Category | What to Look For |
+    |----------|------------------|
+    | Completeness | TODOs, placeholders, incomplete tasks, missing steps |
+    | Spec Alignment | Plan covers spec requirements, no major scope creep |
+    | Task Decomposition | Tasks have clear boundaries, steps are actionable |
+    | Buildability | Could an engineer follow this plan without getting stuck? |
+    ## Calibration
+    **Only flag issues that would cause real problems during implementation.**
+    An implementer building the wrong thing or getting stuck is an issue.
+    Minor wording, stylistic preferences, and "nice to have" suggestions are not.
+    Approve unless there are serious gaps — missing requirements from the spec,
+    contradictory steps, placeholder content, or tasks so vague they can't be acted on.
+    ## Output Format
+    ## Plan Review
+    **Status:** Approved | Issues Found
+    **Issues (if any):**
+    - [Task X, Step Y]: [specific issue] - [why it matters for implementation]
+    **Recommendations (advisory, do not block approval):**
+    - [suggestions for improvement]
+```
+**Reviewer returns:** Status, Issues (if any), Recommendations
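Editorial illustration, not part of the package diff: a reviewer response in the output format above can be split into its three parts with a small parser. The function name `parse_review` and the returned dictionary shape are assumptions, and the parser expects the reviewer to follow the template headings exactly.

```python
import re


def parse_review(text: str) -> dict:
    """Extract status, issues, and recommendations from a plan review response."""
    status_match = re.search(r"\*\*Status:\*\*\s*(.+)", text)
    result = {
        "status": status_match.group(1).strip() if status_match else None,
        "issues": [],
        "recommendations": [],
    }
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("**Issues"):
            section = "issues"
        elif line.startswith("**Recommendations"):
            section = "recommendations"
        elif line.startswith("- ") and section:
            # Bullet under the current heading; drop the leading "- ".
            result[section].append(line[2:])
    return result
```

A caller could treat any status other than `Approved` as a signal to revise the plan before execution.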