agent-mind 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (38)
  1. package/LICENSE +21 -0
  2. package/README.md +229 -0
  3. package/bin/cli.js +38 -0
  4. package/package.json +33 -0
  5. package/src/commands/doctor.js +269 -0
  6. package/src/commands/init.js +345 -0
  7. package/src/commands/meta.js +34 -0
  8. package/src/commands/upgrade.js +177 -0
  9. package/src/index.js +18 -0
  10. package/src/utils/detect-tools.js +62 -0
  11. package/src/utils/inject-adapter.js +65 -0
  12. package/src/utils/template.js +103 -0
  13. package/src/utils/version.js +71 -0
  14. package/template/.am-tools/compact.sh +171 -0
  15. package/template/.am-tools/guide.md +274 -0
  16. package/template/.am-tools/health-check.sh +165 -0
  17. package/template/.am-tools/validate.sh +174 -0
  18. package/template/BOOT.md +71 -0
  19. package/template/README.md +109 -0
  20. package/template/VERSION.md +57 -0
  21. package/template/adapters/claude.md +56 -0
  22. package/template/adapters/codex.md +33 -0
  23. package/template/adapters/cursor.md +35 -0
  24. package/template/adapters/gemini.md +32 -0
  25. package/template/config.md +33 -0
  26. package/template/history/episodes/_index.md +13 -0
  27. package/template/history/maintenance-log.md +9 -0
  28. package/template/history/reflections/_index.md +11 -0
  29. package/template/knowledge/domains/_template/failures/_index.md +19 -0
  30. package/template/knowledge/domains/_template/patterns.md +21 -0
  31. package/template/knowledge/insights.md +23 -0
  32. package/template/knowledge/stack/_template.md +20 -0
  33. package/template/protocols/compaction.md +101 -0
  34. package/template/protocols/maintenance.md +99 -0
  35. package/template/protocols/memory-ops.md +89 -0
  36. package/template/protocols/quality-gate.md +66 -0
  37. package/template/protocols/workflow.md +81 -0
  38. package/template/workspace/.gitkeep +0 -0
@@ -0,0 +1,35 @@
+ # Cursor Integration
+
+ ## Setup
+
+ Create `.cursor/rules/agent-mind.md` in your project root:
+
+ ```markdown
+ ## Agent Mind Memory System
+ This project uses Agent Mind for structured memory management.
+ At the start of every session, read `.agent-mind/BOOT.md` and follow its protocols.
+ Use `.agent-mind/workspace/` as working memory for the current task.
+ After completing a task, follow `.agent-mind/protocols/compaction.md`.
+ When asked about memory health, follow `.agent-mind/protocols/maintenance.md`.
+ ```
+
+ Alternatively, add the snippet to an existing `.cursorrules` file in your project root.
+
+ ## How It Works
+
+ - Cursor reads `.cursor/rules/*.md` and `.cursorrules` at session start
+ - The snippet points Cursor to the `.agent-mind/` system
+ - Cursor can read project files, so all `.agent-mind/` content is accessible
+
+ ## Coexistence
+
+ - Cursor has its own rules system and "Memory Bank" community patterns
+ - Agent Mind provides a more structured approach with explicit protocols
+ - If you use Cursor's Memory Bank pattern (productContext.md etc.), Agent Mind's `knowledge/` serves a similar but more rigorous purpose
+ - You can use both — they don't conflict
+
+ ## Cursor-Specific Tips
+
+ - Cursor's composer and chat modes both read rules files
+ - In composer mode, the full workflow protocol may be too heavy. Consider a lighter version for quick edits.
+ - Cursor works well with the warm-tier knowledge files — it can load domain patterns alongside your code context
@@ -0,0 +1,32 @@
+ # Gemini CLI Integration
+
+ ## Setup
+
+ Add this block to your project's `GEMINI.md`:
+
+ ```markdown
+ ## Agent Mind Memory System
+ This project uses Agent Mind for structured memory management.
+ At the start of every session, read `.agent-mind/BOOT.md` and follow its protocols.
+ Use `.agent-mind/workspace/` as working memory for the current task.
+ After completing a task, follow `.agent-mind/protocols/compaction.md`.
+ When asked about memory health, follow `.agent-mind/protocols/maintenance.md`.
+ ```
+
+ ## How It Works
+
+ - Gemini CLI reads `GEMINI.md` at session start
+ - The snippet points Gemini to the `.agent-mind/` system
+ - Gemini CLI can read files from the filesystem, so all content is accessible
+
+ ## Coexistence
+
+ - Gemini CLI has its own instruction system via GEMINI.md
+ - Agent Mind provides structured persistent memory that Gemini lacks natively
+ - Particularly valuable for Gemini since its native memory across sessions is limited
+
+ ## Gemini-Specific Tips
+
+ - Gemini's instruction following varies by model tier. Gemini Pro follows protocols well. Flash may skip steps.
+ - Keep the GEMINI.md snippet short — let BOOT.md handle the detailed instructions
+ - Gemini handles structured markdown well. The format of Agent Mind files is compatible.
@@ -0,0 +1,33 @@
+ # Agent Mind — Configuration
+
+ ## Project
+ - **Name:** {{PROJECT_NAME}}
+ - **Description:** {{PROJECT_DESCRIPTION}}
+ - **Created:** {{DATE}}
+
+ ## Domains
+ Knowledge domains relevant to this project. Each domain listed here should have a folder in `knowledge/domains/`.
+
+ {{DOMAINS}}
+
+ ## Stack
+ Technologies used in this project. Each entry can have a matching file in `knowledge/stack/`.
+
+ {{STACK}}
+
+ ## Agent Preferences
+ - **Primary agent:** {{PRIMARY_AGENT}}
+ - **Thinking depth:** adaptive (scale protocol depth to task size)
+ - **Memory writes:** quality-gated (all writes to knowledge/ pass through protocols/quality-gate.md)
+ - **Maintenance frequency:** every 2 weeks or on request
+
+ ## Project Context
+ Add project-specific context the agent should know across all sessions. Things like: architecture decisions, team conventions, important constraints, deployment targets.
+
+ (Add as the project progresses. Keep under 50 lines — this loads every session as hot memory.)
+
+ ## Notes
+ - This file loads every session (hot tier). Keep it concise.
+ - For domain-specific knowledge, use `knowledge/domains/` instead.
+ - For tech-specific knowledge, use `knowledge/stack/` instead.
+ - This file is for project-wide context that doesn't fit elsewhere.
@@ -0,0 +1,13 @@
+ # Episode Index
+
+ Task history. One line per completed task. Append-only — never delete entries, only archive old ones during maintenance.
+
+ **Format:** `YYYY-MM-DD | domain(s) | outcome | task-slug | One-line summary`
+
+ **Outcomes:** completed | failed | abandoned
+
+ ---
+
+ <!-- Episodes will accumulate here as tasks are completed and compacted.
+ Search this file to find related past work before starting a new task.
+ During maintenance, entries older than 90 days may be moved to _archive.md. -->
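Because the index is one pipe-delimited line per task, it can be parsed mechanically when searching past work. A minimal sketch in JavaScript (the function name and field names are illustrative, not part of the package):

```javascript
// Parse one episode-index line of the form:
// YYYY-MM-DD | domain(s) | outcome | task-slug | One-line summary
function parseEpisodeLine(line) {
  const parts = line.split("|").map((s) => s.trim());
  if (parts.length < 5) return null; // not a valid index entry
  const [date, domains, outcome, slug, ...rest] = parts;
  return {
    date,
    domains: domains.split(",").map((d) => d.trim()),
    outcome, // completed | failed | abandoned
    slug,
    summary: rest.join(" | "), // rejoin in case the summary itself contains "|"
  };
}

const entry = parseEpisodeLine(
  "2026-03-15 | auth, api | completed | jwt-refresh | Added token refresh flow"
);
// entry.domains → ["auth", "api"], entry.outcome → "completed"
```

Filtering the parsed entries by domain or outcome then gives the "search this file for related past work" step a concrete shape.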
@@ -0,0 +1,9 @@
+ # Maintenance Log
+
+ Record of every maintenance action taken on the memory system.
+
+ **Format:** `YYYY-MM-DD | actions taken | triggered by (human request / scheduled / agent-initiated)`
+
+ ---
+
+ <!-- Log maintenance actions here. This provides an audit trail for how the memory system has evolved. -->
@@ -0,0 +1,11 @@
+ # Reflection Index
+
+ Failure analysis log. One line per failed task that was analyzed. Each entry has a matching detailed reflection file.
+
+ **Format:** `YYYY-MM-DD | domain | slug | What went wrong (one line)`
+
+ ---
+
+ <!-- Reflections accumulate here when tasks fail and are analyzed during compaction.
+ These are Reflexion-style self-critiques: what went wrong, why, what to do differently.
+ Search this when debugging similar problems or when a domain keeps having issues. -->
@@ -0,0 +1,19 @@
+ # [Domain Name] — Known Failures
+
+ One line per failure. Scan this during Phase 3 (Think Critically) to catch known problems before they happen.
+
+ **Format:** `date | slug | trigger-condition | one-line summary`
+
+ <!-- Add failures below. Each should have a matching detailed file: [slug].md
+
+ Example:
+ 2026-03-15 | jwt-token-expiry | task involves JWT + multiple sessions | Token expiry race condition when user has multiple tabs open
+
+ For each failure in this index, there should be a detailed file with:
+ - What was built
+ - What was undefined at spec time
+ - What broke
+ - Root cause
+ - What fixed it
+ - Detection condition (how to spot this in future tasks)
+ -->
@@ -0,0 +1,21 @@
+ # [Domain Name] — Patterns
+
+ Proven approaches for this domain. Each entry was quality-gated and verified.
+
+ ## How to Use This File
+ - Load this when a task touches this domain (see `protocols/workflow.md` Phase 2)
+ - Each pattern describes WHAT works, WHEN to use it, and WHY
+ - Patterns with more recent dates are more likely to be relevant
+ - Check `failures/_index.md` alongside this — knowing what breaks is as valuable as knowing what works
+
+ ---
+
+ <!-- Add patterns below. Format:
+
+ ### [Pattern Name]
+ **When:** [conditions when this pattern applies]
+ **What:** [the approach/technique/solution]
+ **Why:** [reasoning — why this works, what it prevents]
+ **Added:** YYYY-MM-DD | From: [originating task slug]
+
+ -->
@@ -0,0 +1,23 @@
+ # Cross-Domain Insights
+
+ Generalizable learnings that apply across multiple domains. Managed by vote count — high-vote insights are promoted to domain patterns, low-vote insights are pruned.
+
+ **Operations:** ADD (new insight) | UPVOTE (confirmed) | DOWNVOTE (contradicted) | PROMOTE (votes>5 → move to domain patterns) | REMOVE (votes<-2 after 10+ tasks)
+
+ **Format:**
+ ```
+ ### [Insight title]
+ - **Insight:** [the learning]
+ - **Domains:** [which domains this applies to]
+ - **Votes:** [number]
+ - **Added:** YYYY-MM-DD | **Last touched:** YYYY-MM-DD
+ - **Evidence:** [brief: what tasks confirmed or contradicted this]
+ ```
+
+ ---
+
+ <!-- No insights yet. They emerge from work.
+
+ As you complete tasks and follow the compaction protocol, insights will accumulate here.
+ High-vote insights get promoted to domain patterns. Low-vote insights get pruned.
+ This file is the system's learning frontier — where new knowledge is tested before becoming established. -->
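For illustration, a hypothetical filled-in entry in the format above might look like this (all content here is invented for the example, not shipped with the template):

```markdown
### Prefer idempotent migrations
- **Insight:** Write schema migrations so re-running them is a no-op.
- **Domains:** database, deployment
- **Votes:** 3
- **Added:** 2026-02-01 | **Last touched:** 2026-03-10
- **Evidence:** Confirmed by three deploy tasks; no contradictions yet.
```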
@@ -0,0 +1,20 @@
+ # [Technology Name] — Stack Knowledge
+
+ What the agent should know when working with this technology in this project.
+
+ ## Project-Specific Setup
+ <!-- How this tech is configured in THIS project. Versions, config, conventions. -->
+
+ ## Patterns
+ <!-- Proven approaches specific to this technology. -->
+
+ ## Gotchas
+ <!-- Common pitfalls. Things that look right but aren't. -->
+
+ ## Conventions
+ <!-- Team/project conventions for this technology. Naming, structure, etc. -->
+
+ <!--
+ Added: YYYY-MM-DD
+ Keep this file under 200 lines. If it grows beyond that, split into sub-files.
+ -->
@@ -0,0 +1,101 @@
+ # Compaction Protocol
+
+ Run this at the end of every task. Goal: capture what matters, discard noise, keep memory healthy. This is the learning loop — the mechanism that makes the system smarter over time.
+
+ Backed by: ExpeL (cross-task learning, +31% on ALFWorld), Reflexion (failure analysis, +22% on ALFWorld), SimpleMem (quality-gated writes, 26.4% improvement over Mem0).
+
+ ---
+
+ ## Step 1: Create Episode Summary
+
+ Create a new file: `history/episodes/YYYY-MM/[task-slug].md`
+
+ Use this format:
+ ```
+ # [Task Slug]
+ **Date:** YYYY-MM-DD
+ **Domain(s):** [domains touched]
+ **Outcome:** completed | failed | abandoned
+ **Summary:** [2-3 sentences: what was done and what the result was]
+ **Key insight:** [1 sentence: the most important thing learned, or "none"]
+ **Assumptions made:** [brief list, or "none"]
+ ```
+
+ Add a one-line entry to `history/episodes/_index.md`:
+ ```
+ YYYY-MM-DD | domain(s) | outcome | task-slug | One-line summary
+ ```
+
+ **Every task gets an episode.** Even trivial ones get a one-liner in the index.
+
+ ## Step 2: Quality Gate
+
+ Before writing ANYTHING to `knowledge/`, pass through `protocols/quality-gate.md`.
+
+ Ask three questions:
+ 1. **Is it new?** Not already captured in knowledge/.
+ 2. **Is it generalizable?** Applies beyond this specific task.
+ 3. **Was the outcome verified?** Tests passed, human confirmed, or logic holds.
+
+ If yes to all three → proceed to Step 3.
+ If uncertain → tag `[UNVERIFIED]` and proceed.
+ If no → stop here. The episode summary is enough.
+
+ ## Step 3: Extract Learnings
+
+ ### Path A: Task Completed Successfully
+
+ **Check insights:** Does this task confirm an existing insight in `knowledge/insights.md`?
+ - Yes → UPVOTE that insight (increment vote count)
+ - No → Is there a new generalizable learning? → ADD with `votes: 1`
+
+ **Check patterns:** Did you use an approach worth remembering?
+ - If new reusable pattern → append to `knowledge/domains/[domain]/patterns.md`
+ - Include: what the pattern is, when to use it, date, originating task
+
+ **Check for promotion:** Any insight in insights.md with votes > 5?
+ - Yes → move it to the relevant domain's patterns.md (it's proven enough)
+
+ ### Path B: Task Failed
+
+ **Write reflection** to `history/reflections/YYYY-MM-DD-[slug].md`:
+ ```
+ # Reflection: [Task Slug]
+ **Date:** YYYY-MM-DD
+ **What was attempted:** [brief description]
+ **What went wrong:** [what actually happened]
+ **Root cause:** [why it happened — not symptoms, the actual cause]
+ **What to do differently:** [concrete change for next time]
+ **Detection condition:** [how to spot this failure pattern in future tasks]
+ ```
+
+ Add entry to `history/reflections/_index.md`:
+ ```
+ YYYY-MM-DD | domain | slug | One-line: what went wrong
+ ```
+
+ **Update failure library:** Check `knowledge/domains/[domain]/failures/`:
+ - New failure pattern → create entry file, add to `_index.md`
+ - Known failure that wasn't caught → update its detection conditions
+ - Failure in `_index.md` format: `date | slug | trigger condition | one-line summary`
+
+ **Check insights:**
+ - Should an existing insight have prevented this? → UPVOTE that insight
+ - New insight from this failure? → ADD with `votes: 1`
+
+ ### Path C: Task Abandoned
+
+ Just log the episode (Step 1) with outcome "abandoned" and a note on why. No knowledge extraction. Abandoned tasks don't teach reliably — the outcome is unknown.
+
+ ## Step 4: Clear Workspace
+
+ After Steps 1-3 are complete:
+ - All valuable information is now in `history/` or `knowledge/`
+ - Delete all files from `workspace/`
+ - Workspace is now clean for the next task
+
+ ## Scaling
+
+ **Quick task:** Step 1 (1-line index entry only), Step 4. Skip Steps 2-3.
+ **Normal task:** Full Steps 1-4.
+ **Failed task:** Full Steps 1-4 with emphasis on Step 3 Path B.
@@ -0,0 +1,99 @@
+ # Maintenance Protocol
+
+ Memory degrades over time without maintenance. Stale insights mislead. Oversized files get partially ignored. Wrong patterns compound errors. This protocol catches those problems.
+
+ Run this when:
+ - The human asks for a memory health check
+ - You notice memory is getting large or stale during normal work
+ - It's been 2+ weeks since last maintenance (check `history/maintenance-log.md`)
+ - After a cluster of failed tasks (something might be wrong with knowledge/)
+
+ ---
+
+ ## Step 1: Size Check
+
+ Check every file against its size limit (from `protocols/memory-ops.md`):
+
+ | File | Max | Action if exceeded |
+ |------|-----|-------------------|
+ | BOOT.md | 150 lines | Must trim. This file's adherence matters most. |
+ | Each protocol file | 200 lines | Split or trim. |
+ | Each domain patterns.md | 200 lines | Archive older/less-used patterns to `_archive.md` |
+ | Each failure _index.md | 100 lines | Archive old entries |
+ | insights.md | 100 entries | Prune lowest-vote entries |
+ | episode _index.md | Unlimited | Archive entries older than 90 days to `_archive.md` |
+
+ Flag any oversized files with their exact line counts.
+
+ ## Step 2: Stale Memory Check
+
+ - **Zero-vote insights** untouched for 30+ days → flag for review. Are they worth keeping?
+ - **[UNVERIFIED] entries** older than 14 days → ask human to verify or remove
+ - **Domain patterns** not referenced by any task in 60+ days → flag as potentially stale
+ - **Stack knowledge** for tech no longer in `config.md` → flag for archival
+
+ ## Step 3: Contradiction Check
+
+ This is the most important step. Bad memories compound.
+
+ - Did any recent task **fail** in a domain where `patterns.md` was loaded?
+   → The loaded pattern might have been wrong. Cross-reference the failure with the pattern.
+ - Are there insights with **negative votes**?
+   → List them with vote counts. Recommend removal for votes < -2.
+ - Are there **contradictory entries** — two patterns that give conflicting advice?
+   → Flag both with the contradiction. Human decides which to keep.
+ - Did the agent **ignore a failure pattern** that turned out to be relevant?
+   → The failure's detection conditions need updating.
+
+ ## Step 4: Growth Review
+
+ - How many episodes were created since last maintenance?
+ - How many new knowledge entries were written?
+ - How many insights were added vs promoted vs removed?
+ - Is the system learning? (Are insights getting upvoted? Are failures being caught?)
+ - Is the system degrading? (Increasing failure rate? Patterns not helping?)
+
+ ## Step 5: Produce Report
+
+ Create a report for the human. Don't act on it — present it.
+
+ ```markdown
+ ## Memory Health Report — YYYY-MM-DD
+
+ ### Overall Status: [Healthy | Needs Attention | Issues Found]
+
+ ### Size Audit
+ - [File]: [current] / [max] lines — [OK | OVER — recommend: trim/split/archive]
+
+ ### Stale Entries (need human decision)
+ - [Entry description] — last relevant: [date] — recommend: [keep/remove/verify]
+
+ ### Suspicious Patterns (possible memory poisoning)
+ - [Pattern] in [domain] — evidence: [what went wrong] — recommend: [review/remove/update]
+
+ ### Contradictions Found
+ - [Pattern A] vs [Pattern B] — recommend: [human decides]
+
+ ### Growth Summary
+ - Episodes: [N] new since last maintenance
+ - Knowledge writes: [N] (passed gate: [N], tagged unverified: [N])
+ - Insights: [N] added, [N] upvoted, [N] downvoted, [N] promoted, [N] removed
+
+ ### Recommendations
+ 1. [Specific actionable recommendation]
+ 2. [...]
+ ```
+
+ ## Step 6: Execute Approved Changes
+
+ After the human reviews the report:
+ - Execute only the changes they approve
+ - Log what was done to `history/maintenance-log.md`:
+ ```
+ YYYY-MM-DD | Actions: [what was done] | Triggered by: [human request / scheduled]
+ ```
+
+ ## Critical Rule
+
+ You NEVER autonomously delete or modify `knowledge/` during maintenance.
+ You analyze. You report. You recommend. You wait. The human decides.
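The Step 2 staleness thresholds (30, 14, and 60 days) all reduce to one date comparison. A sketch of that check, assuming entries carry an ISO last-touched date — the field names and helper are illustrative, not part of the package:

```javascript
// Whole days elapsed between an entry's last-touched date and "now".
function daysSince(isoDate, now = new Date()) {
  return Math.floor((now - new Date(isoDate)) / 86_400_000);
}

// Apply the Step 2 thresholds from the maintenance protocol to one entry.
function staleFlags(entry, now = new Date()) {
  const age = daysSince(entry.lastTouched, now);
  return {
    zeroVoteStale: entry.votes === 0 && age >= 30,      // zero-vote insight, 30+ days
    unverifiedStale: entry.unverified === true && age >= 14, // [UNVERIFIED], 14+ days
    patternStale: entry.kind === "pattern" && age >= 60,     // unreferenced pattern, 60+ days
  };
}

staleFlags(
  { votes: 0, unverified: false, kind: "insight", lastTouched: "2026-01-01" },
  new Date("2026-02-15")
);
// → zeroVoteStale: true (45 days untouched), the other flags false
```

Note the sketch only flags; per the Critical Rule above, acting on the flags stays with the human.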
@@ -0,0 +1,89 @@
+ # Memory Operations Protocol
+
+ How to read from and write to each memory tier. The goal: load the right context at the right time, write only what deserves to persist.
+
+ ---
+
+ ## Reading Memory
+
+ ### Hot Tier — Always Loaded
+ These load at session start. No decision needed.
+ - `BOOT.md` — operating instructions
+ - `config.md` — project context
+ - `protocols/*` — operating procedures (loaded as needed during work)
+
+ ### Warm Tier — Loaded by Relevance
+ Load these based on what the current task needs.
+ - `knowledge/domains/[domain]/patterns.md` — when task touches that domain
+ - `knowledge/domains/[domain]/failures/_index.md` — scan before implementation
+ - `knowledge/stack/[tech].md` — when task involves that technology
+ - `knowledge/insights.md` — scan for applicable cross-domain learnings
+
+ **How to decide what to load:**
+ 1. Identify the task's domain(s) from the description
+ 2. Load matching domain patterns + failure indexes
+ 3. Load relevant stack knowledge
+ 4. Scan insights.md for entries tagged with matching domains
+ 5. Write what you loaded and why to `workspace/context.md`
+
+ **Don't overload context.** If you match 5+ domains, prioritize the 2-3 most relevant. Loading too much degrades performance. Research shows context rot is real — more isn't better.
+
+ ### Cold Tier — Searched on Demand
+ Only access when you specifically need historical context.
+ - `history/episodes/_index.md` — search for related past work
+ - `history/episodes/YYYY-MM/[slug].md` — read specific episode details
+ - `history/reflections/_index.md` — search for relevant failure analysis
+ - `knowledge/domains/[domain]/failures/[slug].md` — detailed failure context
+
+ ---
+
+ ## Writing Memory
+
+ ### workspace/ (Working Memory)
+ - **When:** During any active task
+ - **Rules:** Write freely. This is scratch space. Cleared after compaction.
+ - **Files:** task.md, context.md, questions.md, assumptions.md, decisions.md, progress.md
+ - **No gate required.** This is ephemeral.
+
+ ### history/episodes/ (Episodic Memory)
+ - **When:** During compaction (protocols/compaction.md) only
+ - **Rules:** Append-only. Never edit or delete existing episodes.
+ - **Format:** Add entry to `_index.md`, create episode file in `YYYY-MM/`
+ - **No gate required.** Every completed task gets an episode. But keep summaries concise (5-10 lines).
+
+ ### history/reflections/ (Failure Analysis)
+ - **When:** During compaction, only when a task failed
+ - **Rules:** Append-only. Follow the reflection format in compaction.md.
+ - **Format:** Add entry to `_index.md`, create reflection file
+
+ ### knowledge/ (Semantic Memory)
+ - **When:** During compaction, and ONLY after passing quality gate
+ - **Rules:** MUST pass `protocols/quality-gate.md` before writing
+ - **Prefer updates over creation.** If a pattern already exists, update it rather than creating a new entry.
+ - **Include provenance.** Every entry should note the date and originating task.
+ - **Tag uncertainty.** If outcome wasn't verified, tag `[UNVERIFIED]`.
+
+ ### knowledge/insights.md (Cross-Domain Learnings)
+ - **Operations:**
+   - `ADD` — new generalizable learning, set `votes: 1`
+   - `UPVOTE` — task confirms existing insight, `votes + 1`
+   - `DOWNVOTE` — task contradicts existing insight, `votes - 1`
+   - `PROMOTE` — insight with `votes > 5` moves to relevant domain's patterns.md
+   - `REMOVE` — insight with `votes < -2` after appearing in 10+ tasks
+
+ ---
+
+ ## File Size Limits
+
+ Keep these in check. Oversized files degrade agent performance.
+
+ | File | Max Size | Action if exceeded |
+ |------|----------|-------------------|
+ | BOOT.md | 150 lines | Trim or split into referenced files |
+ | Each protocol file | 200 lines | Split into sub-protocols |
+ | Each domain patterns.md | 200 lines | Archive older patterns, keep most relevant |
+ | Each failure _index.md | 100 lines | Archive old entries to _archive.md |
+ | insights.md | 100 entries | Prune low-vote entries, promote high-vote ones |
+ | Episode _index.md | Unlimited | But archive entries older than 90 days |
+
+ These limits come from research: files under 200 lines achieve >92% instruction adherence. Beyond that, agents start skipping content.
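The line-count limits in the table can be enforced mechanically. A minimal sketch that counts lines and flags files over their cap — the limits are copied from the table, but the path matching and function names are simplified assumptions of mine (insights.md is capped by entries rather than lines, so it is omitted from this line-based check):

```javascript
// Line-count limits from the File Size Limits table.
const LINE_LIMITS = [
  { match: /(^|\/)BOOT\.md$/, max: 150 },
  { match: /(^|\/)protocols\/[^/]+\.md$/, max: 200 },
  { match: /(^|\/)domains\/[^/]+\/patterns\.md$/, max: 200 },
  { match: /(^|\/)failures\/_index\.md$/, max: 100 },
];

// Report whether one file is within its limit.
function sizeCheck(path, content) {
  const rule = LINE_LIMITS.find((r) => r.match.test(path));
  if (!rule) return { path, status: "no-limit" };
  const lines = content.split("\n").length;
  return { path, lines, max: rule.max, status: lines > rule.max ? "OVER" : "OK" };
}

sizeCheck("BOOT.md", "line\n".repeat(200));
// flags BOOT.md as OVER (201 split lines against a 150-line cap)
```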
@@ -0,0 +1,66 @@
+ # Quality Gate Protocol
+
+ Not everything deserves to be remembered. Bad memories poison the system. Research confirms: agents using naive "remember everything" strategies show sustained performance decline after an initial improvement. This gate prevents that.
+
+ Inspired by: SimpleMem (quality-gated writes, 26.4% improvement over Mem0), Xiong et al. (self-degradation in long-running agents).
+
+ ---
+
+ ## The Three Questions
+
+ Before writing anything to `knowledge/`, answer these:
+
+ ### 1. Is it new?
+ - Does this information already exist in `knowledge/`?
+ - If it's a **duplicate** → don't write. The existing entry is enough.
+ - If it's a **correction** → edit the existing entry. Note the date and why it changed.
+ - If it's an **extension** → update the existing entry with the new information.
+
+ ### 2. Is it generalizable?
+ - Will this apply to future tasks beyond this specific one?
+ - **Generalizable:** "Always validate JWT expiry with a clock skew buffer" — applies to any JWT implementation
+ - **Not generalizable:** "The user table has a column called `display_name`" — specific to this project
+ - Project-specific facts belong in `config.md` or episode summaries, not in `knowledge/`
+
+ ### 3. Was the outcome verified?
+ - Did the approach actually work? Evidence:
+   - Tests passed
+   - Human confirmed the result
+   - The logic holds under scrutiny
+   - The approach was used successfully in production
+ - **Unverified outcomes** (agent finished, but no confirmation) should be tagged `[UNVERIFIED]`
+
+ ## The Decision
+
+ | Question 1 (New?) | Question 2 (General?) | Question 3 (Verified?) | Action |
+ |---|---|---|---|
+ | Yes | Yes | Yes | Write to knowledge/ |
+ | Yes | Yes | Uncertain | Write with `[UNVERIFIED]` tag |
+ | Yes | No | Any | Don't write. Episode summary is enough. |
+ | No (duplicate) | Any | Any | Don't write. |
+ | No (correction) | Yes | Yes | Edit existing entry. |
+ | No (extension) | Yes | Yes | Update existing entry. |
+
+ ## Memory Poisoning Prevention
+
+ The biggest risk to this system is a wrong entry in `knowledge/`. A bad pattern will be loaded and applied to every future task in that domain. Defenses:
+
+ 1. **Provenance:** Every entry in `knowledge/` includes the date and originating task, so you can trace where a questionable pattern came from.
+
+ 2. **Uncertainty tagging:** If you're not sure, tag `[UNVERIFIED]`. The maintenance protocol reviews these.
+
+ 3. **Contradiction detection:** If a task fails and the failure matches a pattern you loaded from `knowledge/`, that pattern might be wrong. Flag it immediately — don't wait for maintenance.
+
+ 4. **Vote decay:** Insights in `insights.md` with negative votes after multiple tasks are likely wrong. Remove at `votes < -2` after 10+ task appearances.
+
+ 5. **Human review:** During maintenance (`protocols/maintenance.md`), surface all `[UNVERIFIED]` entries and suspicious patterns for human decision.
+
+ ## Insight Voting Rules
+
+ `knowledge/insights.md` uses a vote system to surface what's true and prune what's not:
+
+ - **ADD:** New generalizable learning. Set `votes: 1`. Tag with relevant domain(s).
+ - **UPVOTE:** A subsequent task confirms this insight. `votes + 1`.
+ - **DOWNVOTE:** A subsequent task contradicts this insight. `votes - 1`.
+ - **PROMOTE:** `votes > 5` → move to the relevant domain's `patterns.md`. It's proven.
+ - **REMOVE:** `votes < -2` after the insight has existed for 10+ tasks. It's wrong or useless.
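The voting lifecycle (ADD at 1, PROMOTE above 5, REMOVE below -2 after 10+ tasks) is small enough to sketch as one state update. This is an illustration under my own naming — the package itself keeps these rules as prose, and the decision table's unlisted cells (e.g. a verified "no") are treated conservatively as "don't write":

```javascript
// Apply one vote operation to an insight record and report its fate.
// ADD is just creating a record with votes: 1 before calling this.
function applyVote(insight, op) {
  const next = { ...insight };
  if (op === "UPVOTE") next.votes += 1;
  if (op === "DOWNVOTE") next.votes -= 1;
  if (next.votes > 5) return { ...next, fate: "promote" }; // → domain patterns.md
  if (next.votes < -2 && next.tasksSeen >= 10) return { ...next, fate: "remove" };
  return { ...next, fate: "keep" };
}

applyVote({ title: "Prefer updates over creation", votes: 5, tasksSeen: 7 }, "UPVOTE");
// votes reach 6 → fate "promote"
```

One design note: REMOVE requires both the vote threshold and a minimum task count, so a new insight cannot be pruned off a single bad outing.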
@@ -0,0 +1,81 @@
+ # Workflow Protocol
+
+ How you approach every task. Not a rigid pipeline — a thinking process. Scale depth to task size. A 5-minute fix doesn't need a full spec. A multi-day feature does.
+
+ ---
+
+ ## Phase 1: Understand
+
+ Before anything else, understand what's really being asked.
+
+ - What is the actual goal? (Not just what was said — what is the human trying to achieve?)
+ - What is the scope? (What's included? What's explicitly not?)
+ - Is this a new task, or a continuation of something in `workspace/`?
+
+ Write your understanding to `workspace/task.md`. Keep it to 5-15 lines. If you can't summarize it concisely, you don't understand it yet.
+
+ ## Phase 2: Load Context
+
+ Check what you already know that's relevant.
+
+ 1. **Domain knowledge**: Scan `knowledge/domains/` — does this task touch a known domain?
+    - If yes: read that domain's `patterns.md` and `failures/_index.md`
+ 2. **Stack knowledge**: Scan `knowledge/stack/` — does the task involve a known technology?
+    - If yes: read the relevant stack file
+ 3. **Cross-domain insights**: Check `knowledge/insights.md` for applicable learnings
+ 4. **Past work**: Search `history/episodes/_index.md` — have you done something similar before?
+    - If yes: read that episode for context
+
+ Don't load everything. Load what's relevant. Write what you loaded and why to `workspace/context.md`. This audit trail matters for maintenance.
+
+ ## Phase 3: Think Critically
+
+ This is where the real value is. Before doing any work:
+
+ **Check failures:** For each matched domain, scan the failures index. Does this task have conditions that match a known failure pattern? If yes, explicitly address it in your approach.
+
+ **Identify unknowns:**
+ - **BLOCKING unknowns** — you cannot proceed without the answer
+   - Write to `workspace/questions.md`
+   - HALT. Surface these to the human. Do not continue until resolved.
+ - **Assumable unknowns** — reasonable defaults exist
+   - Write to `workspace/assumptions.md` with the default you're choosing and why
+
+ **Check edges:** What's the simplest thing that could go wrong?
+ - What if the input is empty, null, huge, or malformed?
+ - What if this runs concurrently? What about race conditions?
+ - What does this interact with that could break?
+ - What's the failure mode? How does the human know something went wrong?
+
+ ## Phase 4: Work
+
+ Now do the actual work. As you work:
+
+ - Write key decisions to `workspace/decisions.md` with your reasoning
+ - If you discover new unknowns mid-work, go back to Phase 3
+ - If something breaks that matches a known failure pattern, note it — your knowledge was correct
+ - If something breaks that's NOT in your failure library, note it — this is learning fuel for Phase 5
+
+ ## Phase 5: Capture
+
+ After the task is done (or failed), follow `protocols/compaction.md` to:
+
+ 1. Summarize what happened → episode log
+ 2. Extract any insights worth remembering → quality gate
+ 3. If failed: write a reflection (what went wrong, why, what to do differently)
+ 4. Clear `workspace/`
+
+ ---
+
+ ## Scaling to Task Size
+
+ **Quick task (< 30 min):**
+ Phase 1 (2-3 lines). Phase 2 (quick scan). Phase 3 (mental check only). Phase 4. Phase 5 (1-line episode entry).
+
+ **Medium task (1-4 hours):**
+ Full Phases 1-5. Write task.md before coding. Check failures properly.
+
+ **Large task (multi-day):**
+ Full Phases 1-5 with deep Phase 3. Break into sub-tasks. Multiple episode entries. Maintain workspace/ across sessions.
+
+ The protocol adapts to the work. The principle doesn't change: understand before you build, load what you know, think critically, capture what you learn.