npm - gaia-framework - Versions diffs - 1.60.0 → 1.61.0 - Mend

gaia-framework 1.60.0 → 1.61.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13) hide show

package/CLAUDE.md +1 -1
package/README.md +1 -1
package/_gaia/_config/global.yaml +1 -1
package/_gaia/_config/manifest.yaml +1 -1
package/_gaia/core/config.yaml +1 -1
package/_gaia/core/engine/workflow.xml +1 -10
package/_gaia/lifecycle/skills/memory-management.md +133 -9
package/_gaia/lifecycle/workflows/4-implementation/val-refresh-ground-truth/checklist.md +15 -0
package/_gaia/lifecycle/workflows/4-implementation/val-refresh-ground-truth/instructions.xml +153 -57
package/_gaia/lifecycle/workflows/4-implementation/val-refresh-ground-truth/workflow.yaml +5 -0
package/gaia-install.sh +1 -1
package/package.json +1 -1
package/_gaia/lifecycle/skills/memory-management-cross-agent.md +0 -218

package/CLAUDE.md CHANGED Viewed

@@ -1,5 +1,5 @@
-# GAIA Framework v1.60.0
+# GAIA Framework v1.61.0
 This project uses the **GAIA** (Generative Agile Intelligence Architecture) framework — an AI agent framework for Claude Code that orchestrates software product development through 26 specialized agents, 65 workflows, and 8 shared skills.

package/README.md CHANGED Viewed

@@ -460,7 +460,7 @@ The single source of truth is `_gaia/_config/global.yaml`:
 ```yaml
 framework_name: "GAIA"
-framework_version: "1.60.0"
+framework_version: "1.61.0"
 user_name: "your-name"
 project_name: "your-project"
 ```

package/_gaia/_config/global.yaml CHANGED Viewed

@@ -3,7 +3,7 @@
 # After modifying this file, run /gaia-build-configs to regenerate resolved configs.
 framework_name: "GAIA"
-framework_version: "1.60.0"
+framework_version: "1.61.0"
 # User settings
 user_name: "jlouage"

package/_gaia/_config/manifest.yaml CHANGED Viewed

@@ -3,7 +3,7 @@
 modules:
   - name: core
-    version: "1.13.0"
+    version: "1.12.0"
     path: "_gaia/core"
     type: built-in
   - name: lifecycle

package/_gaia/core/config.yaml CHANGED Viewed

@@ -2,7 +2,7 @@
 inherits: "{project-root}/_gaia/_config/global.yaml"
 module_name: "core"
-module_version: "1.13.0"
+module_version: "1.12.0"
 engine_path: "{project-root}/_gaia/core/engine/workflow.xml"
 tasks_path: "{project-root}/_gaia/core/tasks"

package/_gaia/core/engine/workflow.xml CHANGED Viewed

@@ -68,17 +68,8 @@ execution modes (normal/yolo/planning), checkpoints, and quality gates.
     <action if="agent is 'dev-*' (wildcard)">Check if workflow.yaml declares agent_auto_detect: true. If true, execute the agent_auto_detect_strategy from workflow.yaml to determine the stack. If auto-detection succeeds, resolve to {project-root}/_gaia/dev/agents/{detected_stack}-dev.md and inform user: "Auto-detected developer: {agent_name} based on {detection_source}." If auto-detection fails or agent_auto_detect is false/absent: ask user which stack developer to use (typescript, angular, flutter, java, python, mobile, go), then resolve to {project-root}/_gaia/dev/agents/{stack}-dev.md</action>
     <action if="agent file resolved">Read the agent persona file — adopt its persona, name, communication style, rules, and domain knowledge for the duration of this workflow</action>
     <action if="agent file resolved">Do NOT execute the agent's activation protocol, greeting, or menu — the workflow engine controls step execution</action>
+    <action if="agent file resolved">Do NOT load the agent's memory sidecar — the workflow engine manages its own checkpoints</action>
     <action if="agent file resolved">Check for {project-root}/_gaia/_config/agents/{agent-id}.customize.yaml. If found, load and merge overrides (persona_overrides, menu_additions, menu_removals, skill_additions, rule_additions, skill_overrides, skill_section_overrides). For dev agents, also check all-dev.customize.yaml and merge (agent-specific takes precedence over all-dev).</action>
-    <!-- Memory Load Protocol (E9-S7) — Load agent memory sidecar files after persona, before instructions -->
-    <action if="agent file resolved">Read {project-root}/_memory/config.yaml to resolve agent tier and sidecar path. If the file does not exist, log "Memory config not found — skipping memory loading." and skip all remaining memory load actions in this step.</action>
-    <action if="agent file resolved">Resolve the agent's tier classification: look up the resolved agent ID in _memory/config.yaml tier lists — check tiers.tier_1.agents, tiers.tier_2.agents, and tiers.tier_3.agents in that order. If the agent ID is not found in any tier list (untiered), treat as Tier 3 by default — load decision-log.md only if it exists, with no budget enforcement.</action>
-    <action if="agent file resolved">Resolve sidecar path: read agents.{agent-id}.sidecar from _memory/config.yaml, construct absolute path as {project-root}/_memory/{sidecar_value}/. If the agent entry is not found in the agents section, construct path as {project-root}/_memory/{agent-id}-sidecar/.</action>
-    <action if="agent file resolved">Check if the sidecar directory exists. If the directory does not exist or is empty, skip all file loading silently — no error, no warning. Proceed to Step 4.</action>
-    <action if="agent file resolved and tier == 1">Load canonical sidecar files (silently skip any that do not exist): ground-truth.md, decision-log.md, conversation-context.md (3 files). These provide the agent's persistent ground truth, decision history, and recent conversation context.</action>
-    <action if="agent file resolved and tier == 2">Load canonical sidecar files (silently skip any that do not exist): decision-log.md, conversation-context.md (2 files). Tier 2 agents do not have ground-truth.md.</action>
-    <action if="agent file resolved and tier == 3 or untiered">Load canonical sidecar file (silently skip if it does not exist): decision-log.md (1 file). Tier 3 and untiered agents receive minimal memory context.</action>
-    <action if="agent file resolved and tier in [1, 2]">Estimate token budget usage: sum the byte lengths of all loaded sidecar files, divide by the token_approximation value (default: 4 chars per token, from _memory/config.yaml archival.token_approximation). Compare against the agent's session_budget from the tier definition in _memory/config.yaml. If usage reaches or exceeds the budget_warn_at threshold (0.8 / 80% from _memory/config.yaml archival.budget_warn_at): display warning "Memory budget at {N}% for {agent_name} — {used}K of {budget}K tokens." For Tier 3 and untiered agents (session_budget: null), skip the token budget check entirely.</action>
   </step>
   <step n="4" title="Load Instructions">

package/_gaia/lifecycle/skills/memory-management.md CHANGED Viewed

@@ -1,8 +1,8 @@
 ---
 name: memory-management
-version: '1.1'
+version: '1.0'
 applicable_agents: [all]
-description: 'Core memory operations: session load/save, decision formatting, stale detection, deduplication, context summarization. Cross-agent sections in memory-management-cross-agent.md.'
+description: 'Session load/save, decision formatting, stale detection, deduplication, context summarization'
 ---
 <!-- SECTION: decision-formatting -->
@@ -96,13 +96,13 @@ Persist agent session data to sidecar files. Agent-agnostic — takes sidecar pa
    - Read entire file, replace in memory, write entire file back
 **Token budget enforcement (before write):**
-- Before writing, invoke the `budget-monitoring` section (from `memory-management-cross-agent.md`) to check projected usage against tier budget
-- Pass: sidecar_path, tier_budget, projected_size (current size + new entries size)
-- `budget-monitoring` returns the threshold status: ok, warn, alert, or archive_needed
-- If `archive_needed`: move oldest N entries to `{sidecar_path}/{archive_subdir}/` subdirectory to make room, then re-check
-- If user declines archival: force save anyway, exceeding the budget
+- Calculate projected file size after append (current size + new entries size)
+- Approximate tokens: character count / 4
+- If projected size exceeds `tier_budget`: warn with current/max budget and offer options:
+  - **Archive oldest entries** — move oldest N entries to `archive/` subdirectory to make room
+  - **Force save** — save anyway, exceeding the budget
 - Never silently truncate or block a save operation
-- See `budget-monitoring` section in `memory-management-cross-agent.md` for threshold definitions and config source
+- Warn at 80% of budget ("approaching limit"), warn at 90% ("near limit"), trigger archival prompt at 100%
 <!-- END SECTION -->
 <!-- SECTION: context-summarization -->
@@ -195,4 +195,128 @@ Detect and merge duplicate decision entries within a decision log.
 **Output:** List of duplicate pairs with recommended action (auto-archive or review).
 <!-- END SECTION -->
-<!-- Cross-agent extensions (cross-reference-loading, budget-monitoring) are in memory-management-cross-agent.md -->
+<!-- SECTION: cross-ref-loading -->
+## Cross-Reference Loading (ADR-015)
+Load another agent's sidecar files as read-only cross-references. All cross-ref loading is JIT (just-in-time) — never preloaded at session start.
+### Schema: `<memory-reads>` Tag
+Agent persona files declare cross-references using this XML schema inside the `<agent>` block:
+```xml
+<memory-reads>
+  <cross-ref agent="{agent-id}" file="{file-name}" mode="{recent|full|summary}" required="{true|false}" />
+  <!-- Additional cross-ref entries -->
+</memory-reads>
+```
+**Attributes:**
+- `agent` (required) — the agent ID whose sidecar to read (e.g., "architect", "validator")
+- `file` (required) — the sidecar file to read (e.g., "decision-log", "ground-truth", "conversation-context")
+- `mode` (required) — loading mode: `recent`, `full`, or `summary`
+- `required` (optional, default: "true") — if "false", skip gracefully when the sidecar is absent
+### Read-Only Access: `load_cross_ref()`
+Cross-references are loaded through `load_cross_ref()`, which is a **read-only** path — separate from `load_own()` (the agent's own sidecar, which supports read/write).
+**Parameters:**
+- `sidecar_path` — absolute path to the **target** agent's sidecar directory (e.g., `_memory/architect-sidecar/`)
+- `file_name` — which file to read (e.g., "decision-log.md", "ground-truth.md")
+- `mode` — loading mode: `recent`, `full`, or `summary`
+- `budget_remaining` — remaining token budget for cross-references
+**Write-Guard:**
+Any attempt to write to a sidecar that is not the current agent's own sidecar MUST raise a **hard error** (not a warning). The write-guard blocks all write operations through `load_cross_ref()`. Only `load_own()` permits writes, and only to the agent's own sidecar directory.
+**Procedure:**
+1. Validate the target sidecar path is NOT the current agent's own sidecar (self-reference guard — see Validation section below)
+2. Check budget_remaining BEFORE loading — if insufficient, apply progressive downgrade
+3. Read the target file from disk (read-only — no file creation, no modification)
+4. Apply mode filtering (see Loading Modes below)
+5. Return filtered content as read-only data
+### Loading Modes
+**`recent` mode** — Load entries from the last 2 sprints only:
+- Parse decision-log entries using ADR-016 format
+- Filter by Sprint field: keep entries where sprint matches current sprint or previous sprint
+- If sprint metadata is absent from entries, fall back to date-based filtering (last 30 days)
+- Approximate token cost: character count / 4
+**`full` mode** — Load the entire file content:
+- Read full file, respect budget cap
+- If file exceeds budget_remaining: truncate oldest entries first (keep most recent)
+- Approximate token cost: character count / 4
+**`summary` mode** — Load section headers only:
+- Parse the file for markdown headers (lines starting with `#`, `##`, `###`)
+- Stop at section boundaries — do not load body content
+- Minimal token cost
+### JIT Loading
+Cross-references are loaded **just-in-time** during workflow step execution — NOT at agent activation or session start. Loading is triggered only when a workflow step explicitly requires cross-agent context.
+**Trigger:** The workflow engine checks `<memory-reads>` declarations and loads cross-references only when the current step's execution context requires cross-agent data. No cross-references are preloaded eagerly.
+### Budget Enforcement
+**Pre-load budget check:** Before each cross-reference load, calculate:
+- Tokens already consumed by own sidecar
+- Tokens already consumed by previously loaded cross-references
+- Estimated tokens for the next cross-reference (based on mode and file size)
+**Progressive downgrade chain:** When budget is insufficient for the requested mode:
+1. `full` → downgrade to `recent`
+2. `recent` → downgrade to `summary`
+3. `summary` → downgrade to `skip` (do not load, log warning)
+Log every downgrade as a warning in the session checkpoint:
+`"Cross-ref downgraded: {agent}/{file} from {original_mode} to {new_mode} — budget {used}/{total}"`
+**Val's 50% cross-ref budget cap:**
+Val (validator) has a `cross_ref_budget_cap` of 0.5 in `_memory/config.yaml`, meaning cross-references may consume at most 50% of Val's 300K session budget (= 150K tokens max for cross-refs). Val's load priority order: architect → pm → sm. If the 150K cap is hit mid-load, remaining cross-references are downgraded progressively.
+**Tier budget ceilings:**
+- Tier 1 agents: 300K session budget (own sidecar + cross-refs combined)
+- Tier 2 agents: 100K session budget (own sidecar + cross-refs combined)
+- Tier 3 agents: no explicit budget enforcement
+### Graceful Error Handling
+**Missing sidecar directory or file:**
+- If the target sidecar directory does not exist: log a warning ("Cross-ref skipped: {agent} sidecar directory not found"), skip this cross-reference, and continue workflow execution
+- If the target file within the sidecar does not exist: log a warning ("Cross-ref skipped: {agent}/{file} not found"), skip, continue
+**Malformed or corrupt sidecar files:**
+- If the target file exists but cannot be parsed (malformed markdown, corrupt content, unparseable entries): log a warning ("Cross-ref skipped: {agent}/{file} is malformed"), skip this cross-reference, and continue
+- Never halt the workflow due to a cross-reference loading failure
+**Absent optional cross-references:**
+- If `<cross-ref required="false">` and the target is absent: skip silently (no warning)
+- If `<cross-ref required="true">` (default) and the target is absent: log warning but still continue
+### Consistency Validation
+At session start, validate that agent persona `<memory-reads>` declarations are consistent with `_memory/config.yaml` cross_references matrix:
+1. Parse the agent's `<memory-reads>` block from the persona file
+2. Load the agent's entry from `_memory/config.yaml` → `cross_references.{agent_id}.reads_from`
+3. Compare: every entry in config.yaml should have a matching `<cross-ref>` in the persona file
+4. If mismatch found: produce a **warning** (not an error) — `_memory/config.yaml` is the authoritative source. The persona file declaration should mirror it.
+**Self-reference guard:**
+If an agent's `<memory-reads>` declares a `<cross-ref>` where the `agent` attribute matches the current agent's own ID, reject it with a validation error at session start. An agent must not reference its own sidecar as a cross-reference — own-sidecar access is through `load_own()`.
+### Checkpoint Provenance
+After loading cross-references, record in the session checkpoint:
+- Which cross-references were loaded (agent, file, mode)
+- File checksums (`shasum -a 256`) of each loaded cross-reference file
+- Any downgrades or skips that occurred
+- Total tokens consumed by cross-references
+This enables `/gaia-resume` to detect stale cross-references when resuming a session.
+<!-- END SECTION -->

package/_gaia/lifecycle/workflows/4-implementation/val-refresh-ground-truth/checklist.md CHANGED Viewed

@@ -42,6 +42,21 @@ validation-target: 'Ground truth refresh workflow'
 - [ ] ground-truth-management sections loaded JIT
 - [ ] Sections: full-refresh, incremental-refresh, entry-structure, conflict-resolution, token-budget
+## --agent Parameter (E9-S11)
+- [ ] workflow.yaml declares agent parameter with flag --agent and allowed values [val, theo, derek, nate, all]
+- [ ] Default agent is val (backward compatible — no --agent behaves identically to pre-E9-S11)
+- [ ] Invalid agent names produce clear error with valid values list
+- [ ] Per-agent sidecar initialization creates missing ground-truth.md for any Tier 1 agent
+- [ ] Theo inventory scans filesystem structure + architecture.md
+- [ ] Derek inventory scans prd.md + epics-and-stories.md + sprint-status.yaml
+- [ ] Nate inventory scans sprint-status.yaml + story files in implementation-artifacts/
+- [ ] Val inventory uses existing 6-target scan (unchanged)
+- [ ] Decision log entries route to target agent's own decision-log.md
+- [ ] Per-agent ground_truth_budget enforced (Val 200K, Theo 150K, Derek 100K, Nate 100K)
+- [ ] --agent all runs val → theo → derek → nate sequentially
+- [ ] --agent all continues on per-agent failure and reports which succeeded/failed
+- [ ] --agent all presents combined summary with per-agent status
 ## Integration
 - [ ] Manifest entry exists in workflow-manifest.csv
 - [ ] Works identically standalone or as sub-step

package/_gaia/lifecycle/workflows/4-implementation/val-refresh-ground-truth/instructions.xml CHANGED Viewed

@@ -7,95 +7,166 @@
   <mandate>Behavior must be identical whether called standalone or as sub-step from another workflow</mandate>
 </critical>
-<step n="1" title="Initialize Validator Sidecar">
-  <action>Check if {memory_path}/validator-sidecar/ directory exists</action>
-  <action if="directory missing">Create {memory_path}/validator-sidecar/ directory</action>
-  <action if="ground-truth.md missing">Create {memory_path}/validator-sidecar/ground-truth.md with empty header containing last-refresh timestamp set to "never"</action>
-  <action if="decision-log.md missing">Create {memory_path}/validator-sidecar/decision-log.md with header: "# Val Decision Log" and empty entries</action>
-  <action if="conversation-context.md missing">Create {memory_path}/validator-sidecar/conversation-context.md with header: "# Val Conversation Context" and empty rolling state</action>
+<step n="1" title="Resolve Agent Target">
+  <action>Parse $ARGUMENTS for --agent value. If --agent is absent, default to "val" for backward compatibility.</action>
+  <action>Validate agent name against allowed values: val, theo, derek, nate, all.
+    If the agent name is not in the allowed values list: HALT with error:
+    "Unknown agent '{agent_name}'. Valid values: val, theo, derek, nate, all."</action>
+  <action>Resolve target sidecar path and inventory source files based on agent:
+    - val: sidecar = {memory_path}/validator-sidecar/, inventory = 6-target scan (existing)
+    - theo: sidecar = {memory_path}/architect-sidecar/, inventory = filesystem structure + {planning_artifacts}/architecture.md
+    - derek: sidecar = {memory_path}/pm-sidecar/, inventory = {planning_artifacts}/prd.md + {planning_artifacts}/epics-and-stories.md + {implementation_artifacts}/sprint-status.yaml
+    - nate: sidecar = {memory_path}/sm-sidecar/, inventory = {implementation_artifacts}/sprint-status.yaml + story files in {implementation_artifacts}/
+    - all: run sequentially for val, theo, derek, nate (see Step 13)</action>
+  <action if="agent == all">Set agent_queue = [val, theo, derek, nate]. Proceed to Step 13 for orchestration.</action>
 </step>
-<step n="2" title="Determine Refresh Mode">
+<step n="2" title="Initialize Agent Sidecar">
+  <action>Check if the resolved target sidecar directory exists (e.g., {memory_path}/validator-sidecar/ for val, {memory_path}/architect-sidecar/ for theo, {memory_path}/pm-sidecar/ for derek, {memory_path}/sm-sidecar/ for nate)</action>
+  <action if="directory missing">Create the resolved sidecar directory</action>
+  <action if="ground-truth.md missing">Create ground-truth.md in the resolved sidecar directory with empty header containing last-refresh timestamp set to "never"</action>
+  <action if="decision-log.md missing">Create decision-log.md in the resolved sidecar directory with header: "# {agent_display_name} Decision Log" and empty entries</action>
+  <action if="conversation-context.md missing">Create conversation-context.md in the resolved sidecar directory with header: "# {agent_display_name} Conversation Context" and empty rolling state</action>
+</step>
+<step n="3" title="Determine Refresh Mode">
   <action>Check if --incremental flag was passed</action>
   <action if="incremental">Set mode to incremental — only scan files modified since last-refresh timestamp from ground-truth.md header</action>
   <action if="not incremental">Set mode to full — scan all targets completely. Full refresh catches deletions and renames that incremental would miss.</action>
 </step>
-<step n="3" title="Load Entry Structure">
+<step n="4" title="Load Entry Structure">
   <action>Load ground-truth-management skill section: entry-structure (JIT)</action>
   <action>Use entry-structure format for all ground truth entries written in subsequent steps</action>
 </step>
-<step n="4" title="Parse Previous State">
-  <action>Read existing {memory_path}/validator-sidecar/ground-truth.md</action>
+<step n="5" title="Parse Previous State">
+  <action>Read existing ground-truth.md from the resolved target sidecar directory</action>
   <action>Extract last-refresh timestamp from header</action>
   <action>Parse all existing entries into a lookup map keyed by file path for diff comparison</action>
   <action if="incremental mode">Filter scan targets to only files modified after last-refresh timestamp</action>
 </step>
-<step n="5" title="Scan Inventory Targets">
+<step n="6" title="Scan Inventory Targets">
   <action>Load ground-truth-management skill section based on mode: full-refresh or incremental-refresh (JIT)</action>
-  <action>Scan the following 6 inventory targets, showing section-by-section progress to the user after each target completes.</action>
-  <action title="Exclusion list">
-    ALWAYS exclude these directories and files from scanning — they are framework internals, not project code:
-    _gaia/, .claude/, bin/, _memory/, node_modules/, .git/, build/, dist/, .DS_Store, *.lock
+  <action if="agent == val">
+    Scan the following 6 inventory targets (existing Val scan — unchanged for backward compatibility), showing section-by-section progress to the user after each target completes.
+    <action title="Exclusion list">
+      ALWAYS exclude these directories and files from scanning — they are framework internals, not project code:
+      _gaia/, .claude/, bin/, _memory/, node_modules/, .git/, build/, dist/, .DS_Store, *.lock
+    </action>
+    <action title="Target 1: Project Source Files">
+      Scan {project-path}/**/* (excluding the exclusion list above)
+      Extract: file inventory, directory structure, languages used, entry points
+      Report progress: "Scanning project source files... found N files across N directories."
+    </action>
+    <action title="Target 2: Project Config Files">
+      Scan {project-path}/*.{json,yaml,yml,toml,xml,env.example} (root-level config files)
+      Extract: config keys, settings, dependency declarations
+      Report progress: "Scanning project config files... found N config files."
+    </action>
+    <action title="Target 3: Project Package Manifests">
+      Scan {project-path}/**/package.json, pubspec.yaml, pom.xml, build.gradle, requirements.txt, Cargo.toml, go.mod, Gemfile, *.csproj (whichever exist)
+      Extract: dependencies, versions, scripts, build targets
+      Report progress: "Scanning package manifests... found N manifests."
+    </action>
+    <action title="Target 4: Planning Artifacts">
+      Scan {project-root}/docs/planning-artifacts/*.md
+      Extract: artifact name, type, date
+      Report progress: "Scanning planning artifacts... found N artifacts."
+    </action>
+    <action title="Target 5: Implementation Artifacts">
+      Scan {project-root}/docs/implementation-artifacts/*.md
+      Extract: artifact name, type, story key if applicable
+      Report progress: "Scanning implementation artifacts... found N artifacts."
+    </action>
+    <action title="Target 6: Test Artifacts">
+      Scan {project-root}/docs/test-artifacts/*.md
+      Extract: artifact name, type, coverage area
+      Report progress: "Scanning test artifacts... found N artifacts."
+    </action>
   </action>
-  <action title="Target 1: Project Source Files">
-    Scan {project-path}/**/* (excluding the exclusion list above)
-    Extract: file inventory, directory structure, languages used, entry points
-    Report progress: "Scanning project source files... found N files across N directories."
-  </action>
+  <action if="agent == theo">
+    Scan theo-specific inventory targets (architecture ground truth):
-  <action title="Target 2: Project Config Files">
-    Scan {project-path}/*.{json,yaml,yml,toml,xml,env.example} (root-level config files)
-    Extract: config keys, settings, dependency declarations
-    Report progress: "Scanning project config files... found N config files."
-  </action>
+    <action title="Target 1: Filesystem Structure">
+      Scan {project-path}/ for directory structure, file types, and project layout.
+      Extract: tech stack, components, module boundaries, entry points, build configuration.
+      Report progress: "Scanning filesystem structure for Theo... found N directories, N files."
+    </action>
-  <action title="Target 3: Project Package Manifests">
-    Scan {project-path}/**/package.json, pubspec.yaml, pom.xml, build.gradle, requirements.txt, Cargo.toml, go.mod, Gemfile, *.csproj (whichever exist)
-    Extract: dependencies, versions, scripts, build targets
-    Report progress: "Scanning package manifests... found N manifests."
+    <action title="Target 2: Architecture Document">
+      Scan {planning_artifacts}/architecture.md
+      Extract: ADRs, architectural decisions, component diagrams, dependency info, integration points, tech stack decisions.
+      Report progress: "Scanning architecture.md for Theo... extracted N ADRs, N components."
+    </action>
   </action>
-  <action title="Target 4: Planning Artifacts">
-    Scan {project-root}/docs/planning-artifacts/*.md
-    Extract: artifact name, type, date
-    Report progress: "Scanning planning artifacts... found N artifacts."
+  <action if="agent == derek">
+    Scan derek-specific inventory targets (product ground truth):
+    <action title="Target 1: Product Requirements">
+      Scan {planning_artifacts}/prd.md
+      Extract: functional requirements, non-functional requirements, user stories overview, feature list, product goals.
+      Report progress: "Scanning prd.md for Derek... extracted N requirements."
+    </action>
+    <action title="Target 2: Epics and Stories">
+      Scan {planning_artifacts}/epics-and-stories.md
+      Extract: epic list, story breakdown, acceptance criteria summaries, dependency graph, sizing data.
+      Report progress: "Scanning epics-and-stories.md for Derek... extracted N epics, N stories."
+    </action>
+    <action title="Target 3: Sprint Status">
+      Scan {implementation_artifacts}/sprint-status.yaml
+      Extract: current sprint state, story statuses, velocity data, blocked items, completion rates.
+      Report progress: "Scanning sprint-status.yaml for Derek... extracted sprint state."
+    </action>
   </action>
-  <action title="Target 5: Implementation Artifacts">
-    Scan {project-root}/docs/implementation-artifacts/*.md
-    Extract: artifact name, type, story key if applicable
-    Report progress: "Scanning implementation artifacts... found N artifacts."
-  </action>
+  <action if="agent == nate">
+    Scan nate-specific inventory targets (sprint ground truth):
+    <action title="Target 1: Sprint Status">
+      Scan {implementation_artifacts}/sprint-status.yaml
+      Extract: sprint metadata, story statuses, velocity metrics, blocked items, wave assignments.
+      Report progress: "Scanning sprint-status.yaml for Nate... extracted sprint state."
+    </action>
-  <action title="Target 6: Test Artifacts">
-    Scan {project-root}/docs/test-artifacts/*.md
-    Extract: artifact name, type, coverage area
-    Report progress: "Scanning test artifacts... found N artifacts."
+    <action title="Target 2: Story Files">
+      Scan all story files in {implementation_artifacts}/ matching pattern *-*.md (story files)
+      Extract: story statuses, completion rates, subtask progress, blockers, review gate states.
+      Report progress: "Scanning story files in implementation-artifacts for Nate... found N stories."
+    </action>
   </action>
 </step>
-<step n="6" title="Compare and Detect Changes">
-  <action>Compare scan results against previous state from Step 4</action>
+<step n="7" title="Compare and Detect Changes">
+  <action>Compare scan results against previous state from Step 5</action>
   <action>Classify each entry as: ADDED (new file not in previous state), UPDATED (file exists but metadata changed), UNCHANGED (no changes detected)</action>
   <action if="full mode">For entries in previous state not found in scan results: mark as REMOVED with detection date (e.g., "REMOVED (file deleted, detected 2026-03-19)"). Do NOT silently delete entries.</action>
   <action if="incremental mode">Skip deletion detection — incremental mode cannot detect deletions. This is a documented limitation.</action>
   <action>Load ground-truth-management skill section: conflict-resolution (JIT) if any conflicts are detected between scan results and existing entries</action>
 </step>
-<step n="7" title="Write Ground Truth">
-  <action>Update {memory_path}/validator-sidecar/ground-truth.md with all scan results</action>
+<step n="8" title="Write Ground Truth">
+  <action>Update ground-truth.md in the resolved target agent's sidecar directory with all scan results</action>
   <action>Write header with last-refresh timestamp set to current date/time</action>
-  <action>Organize entries by category: Agents, Workflows, Skills, Commands, Manifests, Config, Artifacts</action>
+  <action>Organize entries by category appropriate to the target agent</action>
   <action>Include verified counts, locations, and structural patterns for each category</action>
   <action>Preserve REMOVED entries with their detection dates — do not purge</action>
 </step>
-<step n="8" title="Generate Diff Report">
+<step n="9" title="Generate Diff Report">
   <action>Generate diff/delta report summarizing changes since last refresh</action>
   <action>Include counts by category: added, removed, updated entries</action>
   <action>Include total entry count across all categories</action>
@@ -103,23 +174,48 @@
   <action>Present the full diff report to the user</action>
 </step>
-<step n="9" title="Log to Decision Log">
-  <action>Append the diff/delta report to {memory_path}/validator-sidecar/decision-log.md</action>
-  <action>Include date, refresh mode (full or incremental), and summary</action>
-  <action>Format: "## Refresh — {date} ({mode})\n{summary}"</action>
+<step n="10" title="Log to Decision Log">
+  <action>Append the diff/delta report to the target agent's own decision-log.md in the resolved sidecar directory.
+    Route to the correct file based on resolved agent target:
+    - val: {memory_path}/validator-sidecar/decision-log.md
+    - theo: {memory_path}/architect-sidecar/decision-log.md
+    - derek: {memory_path}/pm-sidecar/decision-log.md
+    - nate: {memory_path}/sm-sidecar/decision-log.md
+    Never write to another agent's decision-log.md — this would violate cross-agent write isolation.</action>
+  <action>Include date, refresh mode (full or incremental), target agent name, and summary</action>
+  <action>Format: "## Refresh — {date} ({mode}) — Agent: {agent_name}\n{summary}"</action>
 </step>
-<step n="10" title="Check Token Budget">
+<step n="11" title="Check Token Budget">
   <action>Load ground-truth-management skill section: token-budget (JIT)</action>
-  <action>Estimate token count of ground-truth.md</action>
-  <action>If token count exceeds budget threshold: trigger archival of oldest REMOVED entries per token-budget skill guidance</action>
+  <action>Load the correct per-agent ground_truth_budget from {memory_path}/config.yaml based on target agent:
+    - Val: 200K (200,000 tokens)
+    - Theo: 150K (150,000 tokens)
+    - Derek: 100K (100,000 tokens)
+    - Nate: 100K (100,000 tokens)
+    These are distinct from the 300K session_budget — ground_truth_budget controls only ground-truth.md size.</action>
+  <action>Estimate token count of the target agent's ground-truth.md</action>
+  <action>If token count exceeds the per-agent budget threshold: trigger archival of oldest REMOVED entries per token-budget skill guidance</action>
+  <action if="agent == all">Report per-agent usage separately (e.g., "Theo: 42K/150K, Derek: 18K/100K")</action>
   <action>Report token usage to user</action>
 </step>
-<step n="11" title="Present Results">
+<step n="12" title="Present Results">
   <template-output file="{memory_path}/validator-sidecar/ground-truth.md">
-    Ground truth refreshed with verified inventory of all framework components.
+    Ground truth refreshed for the target agent with verified inventory.
+    Output path is resolved at runtime based on --agent parameter (defaults to validator-sidecar).
     Includes last-refresh timestamp, categorized entries, and REMOVED markers for deleted files.
   </template-output>
 </step>
+<step n="13" title="Orchestrate --agent all" if="agent == all">
+  <action>Run refresh sequentially for each agent in order: val, theo, derek, nate.
+    Each agent's refresh completes fully (Steps 2-12) before the next begins.
+    No cross-contamination of sidecar writes — each agent writes only to its own sidecar.</action>
+  <action>On per-agent failure (e.g., missing source file like prd.md): log the error with reason, continue with remaining agents. Do not halt the entire sequence.</action>
+  <action>After all agents complete: present a combined summary with per-agent status:
+    - Which agents succeeded and their entry counts
+    - Which agents failed and the failure reasons
+    Format: "Refresh complete. Results: Val: OK (N entries), Theo: OK (N entries), Derek: FAILED (prd.md not found), Nate: OK (N entries)."</action>
+</step>
 </workflow>

package/_gaia/lifecycle/workflows/4-implementation/val-refresh-ground-truth/workflow.yaml CHANGED Viewed

@@ -9,6 +9,11 @@ installed_path: "{project-root}/_gaia/lifecycle/workflows/4-implementation/val-r
 instructions: "{installed_path}/instructions.xml"
 validation: "{installed_path}/checklist.md"
 parameters:
+  agent:
+    flag: "--agent"
+    description: "Which Tier 1 agent's ground truth to refresh. Defaults to val for backward compatibility."
+    default: "val"
+    allowed_values: [val, theo, derek, nate, all]
   incremental:
     flag: "--incremental"
     description: "Only scan files modified since last refresh timestamp. Full refresh is default."

package/gaia-install.sh CHANGED Viewed

@@ -6,7 +6,7 @@ set -euo pipefail
 # Installs, updates, validates, and reports on GAIA installations.
 # ─────────────────────────────────────────────────────────────────────────────
-readonly VERSION="1.60.0"
+readonly VERSION="1.61.0"
 readonly GITHUB_REPO="https://github.com/jlouage/Gaia-framework.git"
 readonly MANIFEST_REL="_gaia/_config/manifest.yaml"

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "gaia-framework",
-  "version": "1.60.0",
+  "version": "1.61.0",
   "description": "GAIA — Generative Agile Intelligence Architecture installer",
   "bin": {
     "gaia-framework": "./bin/gaia-framework.js"

package/_gaia/lifecycle/skills/memory-management-cross-agent.md DELETED Viewed

@@ -1,218 +0,0 @@
----
-name: memory-management-cross-agent
-version: '1.0'
-applicable_agents: [all]
-description: 'Cross-agent memory extensions: cross-reference loading (ADR-015) and budget monitoring (ADR-014). Split from memory-management.md per 300-line skill limit.'
----
-<!-- SECTION: cross-reference-loading -->
-## Cross-Reference Loading (ADR-015)
-Load another agent's sidecar files as read-only cross-references. All cross-ref loading is JIT (just-in-time) — never preloaded at session start.
-### Schema: `<memory-reads>` Tag
-Agent persona files declare cross-references using this XML schema inside the `<agent>` block:
-```xml
-<memory-reads>
-  <cross-ref agent="{agent-id}" file="{file-name}" mode="{recent|full|summary}" required="{true|false}" />
-</memory-reads>
-```
-**Attributes:**
-- `agent` (required) — the agent ID whose sidecar to read (e.g., "architect", "validator")
-- `file` (required) — the sidecar file to read (e.g., "decision-log", "ground-truth", "conversation-context")
-- `mode` (required) — loading mode: `recent`, `full`, or `summary`
-- `required` (optional, default: "true") — if "false", skip gracefully when the sidecar is absent
-### Read-Only Access: `load_cross_ref()`
-Cross-references are loaded through `load_cross_ref()`, which is a **read-only** path — separate from `load_own()` (the agent's own sidecar, which supports read/write).
-**Parameters:**
-- `sidecar_path` — absolute path to the **target** agent's sidecar directory (e.g., `_memory/architect-sidecar/`)
-- `file_name` — which file to read (e.g., "decision-log.md", "ground-truth.md")
-- `mode` — loading mode: `recent`, `full`, or `summary`
-- `budget_remaining` — remaining token budget for cross-references
-**Write-Guard:**
-Any attempt to write to a sidecar that is not the current agent's own sidecar MUST raise a **hard error** (not a warning). The write-guard blocks all write operations through `load_cross_ref()`. Only `load_own()` permits writes, and only to the agent's own sidecar directory.
-**Procedure:**
-1. Validate the target sidecar path is NOT the current agent's own sidecar (self-reference guard)
-2. Check budget_remaining BEFORE loading — if insufficient, apply progressive downgrade
-3. Read the target file from disk (read-only — no file creation, no modification)
-4. Apply mode filtering (see Loading Modes below)
-5. Calculate token estimate: character count / 4 (using `token_approximation` from `_memory/config.yaml`)
-6. Return filtered content as read-only data with token estimate for caller budget deduction
-### Loading Modes
-**`recent` mode** — Load entries from the last 2 sprints only:
-- Parse decision-log entries using ADR-016 format
-- Filter by Sprint field: keep entries where sprint matches current sprint or previous sprint
-- If sprint metadata is absent from entries, fall back to date-based filtering (last 30 days)
-- Approximate token cost: character count / 4
-**`full` mode** — Load the entire file content:
-- Read full file, respect budget cap
-- If file exceeds budget_remaining: truncate oldest entries first (keep most recent)
-- Approximate token cost: character count / 4
-**`summary` mode** — Load section headers only:
-- Parse the file for markdown headers (lines starting with `#`, `##`, `###`)
-- Stop at section boundaries — do not load body content
-- Minimal token cost
-### JIT Loading
-Cross-references are loaded **just-in-time** during workflow step execution — NOT at agent activation or session start. Loading is triggered only when a workflow step explicitly requires cross-agent context.
-**Trigger:** The workflow engine checks `<memory-reads>` declarations and loads cross-references only when the current step's execution context requires cross-agent data. No cross-references are preloaded eagerly.
-### Budget Enforcement
-**Pre-load budget check:** Before each cross-reference load, calculate:
-- Tokens already consumed by own sidecar
-- Tokens already consumed by previously loaded cross-references
-- Estimated tokens for the next cross-reference (based on mode and file size)
-**Per-agent `cross_ref_budget_cap`:** Some agents have a `cross_ref_budget_cap` in `_memory/config.yaml` (e.g., validator: 0.5). This caps cross-reference token consumption at that fraction of the agent's session budget. If loading would exceed the cap, halt loading and downgrade remaining cross-refs progressively.
-**Progressive downgrade chain:** When budget is insufficient for the requested mode:
-1. `full` → downgrade to `recent`
-2. `recent` → downgrade to `summary`
-3. `summary` → downgrade to `skip` (do not load, log warning)
-Log every downgrade as a warning in the session checkpoint:
-`"Cross-ref downgraded: {agent}/{file} from {original_mode} to {new_mode} — budget {used}/{total}"`
-**Val's 50% cross-ref budget cap:**
-Val (validator) has a `cross_ref_budget_cap` of 0.5 in `_memory/config.yaml`, meaning cross-references may consume at most 50% of Val's 300K session budget (= 150K tokens max for cross-refs). Val's load priority order: architect, pm, sm. If the 150K cap is hit mid-load, remaining cross-references are downgraded progressively.
-**Tier budget ceilings:**
-- Tier 1 agents: 300K session budget (own sidecar + cross-refs combined)
-- Tier 2 agents: 100K session budget (own sidecar + cross-refs combined)
-- Tier 3 agents: no explicit budget enforcement
-### Graceful Error Handling
-**Missing sidecar directory or file:**
-- If the target sidecar directory does not exist: log a warning ("Cross-ref skipped: {agent} sidecar directory not found"), skip this cross-reference, and continue workflow execution without error
-- If the target file within the sidecar does not exist: log a warning ("Cross-ref skipped: {agent}/{file} not found"), skip, continue
-**Empty sidecar directory:**
-- If the target sidecar directory exists but contains no files: return empty result gracefully with no error — matching the contract of the `session-load` section
-**Malformed or corrupt sidecar files:**
-- If the target file exists but cannot be parsed (malformed markdown, corrupt content, unparseable entries): log a warning ("Cross-ref skipped: {agent}/{file} is malformed"), skip this cross-reference, and continue
-- Never halt the workflow due to a cross-reference loading failure
-**Absent optional cross-references:**
-- If `<cross-ref required="false">` and the target is absent: skip silently (no warning)
-- If `<cross-ref required="true">` (default) and the target is absent: log warning but still continue
-**Agent not in cross-reference matrix:**
-- If the calling agent has no entry in `_memory/config.yaml` `cross_references`, return empty result with no error
-### Consistency Validation
-At session start, validate that agent persona `<memory-reads>` declarations are consistent with `_memory/config.yaml` cross_references matrix:
-1. Parse the agent's `<memory-reads>` block from the persona file
-2. Load the agent's entry from `_memory/config.yaml` → `cross_references.{agent_id}.reads_from`
-3. Compare: every entry in config.yaml should have a matching `<cross-ref>` in the persona file
-4. If mismatch found: produce a **warning** (not an error) — `_memory/config.yaml` is the authoritative source
-**Self-reference guard:**
-If an agent's `<memory-reads>` declares a `<cross-ref>` where the `agent` attribute matches the current agent's own ID, reject it with a validation error at session start. An agent must not reference its own sidecar as a cross-reference — own-sidecar access is through `load_own()`.
-### Checkpoint Provenance
-After loading cross-references, record in the session checkpoint:
-- Which cross-references were loaded (agent, file, mode)
-- File checksums (`shasum -a 256`) of each loaded cross-reference file
-- Any downgrades or skips that occurred
-- Total tokens consumed by cross-references
-This enables `/gaia-resume` to detect stale cross-references when resuming a session.
-<!-- END SECTION -->
-<!-- SECTION: budget-monitoring -->
-## Budget Monitoring (ADR-014)
-Shared utility for checking token budget usage against tier-defined thresholds. Used by `session-save` and other memory operations to determine when archival is needed.
-### Parameters
-- `sidecar_path` — absolute path to the agent's sidecar directory
-- `tier_budget` — session token budget for this agent's tier (from `_memory/config.yaml`). If null or absent, the agent is untiered.
-- `current_usage` — current token count (own sidecar files + cross-refs loaded in this session)
-- `projected_addition` — token count of content about to be written/loaded
-### Threshold Definitions (config-driven)
-All thresholds are read from `_memory/config.yaml` `archival` block at runtime. Never hardcode these values.
-- `budget_warn_at` — fraction of budget at which a warning is returned (e.g., 0.8 = 80%)
-- `budget_alert_at` — fraction of budget at which an alert is returned (e.g., 0.9 = 90%)
-- `budget_archive_at` — fraction of budget at which archival is triggered (e.g., 1.0 = 100%)
-- `token_approximation` — characters per token for size estimation (e.g., 4)
-- `archive_subdir` — subdirectory name within sidecar for archived entries (e.g., "archive")
-### Procedure
-1. Read `_memory/config.yaml` archival block to get threshold values
-2. Calculate `projected_total = current_usage + projected_addition`
-3. Calculate `usage_ratio = projected_total / tier_budget`
-4. Return threshold status:
-   - `usage_ratio < budget_warn_at` → status: `ok` (no action needed)
-   - `usage_ratio >= budget_warn_at AND < budget_alert_at` → status: `warn` ("approaching budget limit")
-   - `usage_ratio >= budget_alert_at AND < budget_archive_at` → status: `alert` ("near budget limit")
-   - `usage_ratio >= budget_archive_at` → status: `archive_needed` ("budget exceeded, archival required")
-### Archival Protocol
-When status is `archive_needed`:
-1. Identify the oldest N entries in `decision-log.md` that would free sufficient tokens
-2. Move those entries to `{sidecar_path}/{archive_subdir}/` (e.g., `_memory/architect-sidecar/archive/`)
-3. Create the archive subdirectory if it does not exist (`mkdir -p`)
-4. Use atomic read-modify-write pattern: read full file, remove archived entries, write full file back
-5. Re-check budget after archival — if still over budget, archive more entries (up to 3 iterations)
-6. Archive subdirectory is gitignored — archived entries are historical, not version-controlled
-### Token Estimation
-Token count is approximated using the `token_approximation` value from `_memory/config.yaml` (default: 4 chars per token). No tokenizer library is used — this aligns with ADR-005 (zero runtime dependencies).
-Formula: `tokens = character_count / token_approximation`
-### Running Session Total
-Track cumulative token usage across the session:
-- Own sidecar files loaded via `session-load`
-- Cross-references loaded via `cross-reference-loading`
-- New entries being written via `session-save`
-The combined total is checked against the tier budget ceiling.
-### Untiered Agent Handling
-If `tier_budget` is null, absent, or the agent has no tier assignment in `_memory/config.yaml` (applies to 9 untiered agents: analyst, data-engineer, performance, ux-designer, brainstorming-coach, design-thinking-coach, innovation-strategist, presentation-designer, problem-solver):
-- Skip all budget enforcement — no threshold checks, no archival trigger
-- Return status: `no_budget` with no error
-- This is a no-op: the caller proceeds without any budget constraint
-- Never raise an error for untiered agents
-### Return Value
-```yaml
-status: ok | warn | alert | archive_needed | no_budget
-usage_ratio: 0.75          # current usage as fraction of budget (null if no_budget)
-tokens_used: 225000        # current token count (null if no_budget)
-tokens_budget: 300000      # tier budget ceiling (null if no_budget)
-tokens_projected: 230000   # projected total after addition (null if no_budget)
-message: "Budget at 75% — no action needed"
-```
-<!-- END SECTION -->