npm - ultimate-pi - Versions diffs - 0.1.2 → 0.1.4 - Mend

ultimate-pi 0.1.2 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (516)

package/vault/wiki/concepts/agent-search-enforcement.md ADDED

@@ -0,0 +1,126 @@
+---
+type: concept
+status: developing
+created: 2026-04-30
+updated: 2026-04-30
+tags:
+  - agentic-harness
+  - tool-enforcement
+  - semantic-search
+  - mcp
+related:
+  - "[[ck-tool]]"
+  - "[[mcp-tool-routing]]"
+  - "[[agentic-harness-context-enforcement]]"
+  - "[[Research: semantic code search tools]]"
+title: "agent search enforcement"
+---
+# agent search enforcement
+Strategies to force AI coding agents to use semantic code search tools (ck, vgrep) instead of raw `grep`, `cat`, and pipe commands.
+## Problem
+AI coding agents default to shell tools: `grep -r "pattern" .`, `cat file | grep foo`, `find . -name "*.py" | xargs grep bar`. These are:
+- **Lexical-only**: Miss conceptual matches, require exact keyword knowledge
+- **Noisy**: Return too many or too few results
+- **Token-inefficient**: Raw grep output wastes context window on irrelevant matches
+- **Non-indexed**: Every query scans the entire codebase (slow on large repos)
+Semantic tools (ck --sem) solve these problems but agents don't use them by default because they're not native tools.
+## Enforcement Strategies
+### 1. System Prompt Rules (Weak)
+Add to agent system prompt / CLAUDE.md:
+```markdown
+## Search Policy
+- NEVER use raw `grep` for codebase exploration.
+- ALWAYS use `ck --sem` or `ck --hybrid` for conceptual searches.
+- `grep` is permitted ONLY for exact literal string matching (e.g., finding a specific error message).
+- Before any grep, consider: "Can I express this as a ck query?"
+```
+**Effectiveness**: Low-Medium. Depends on model compliance: Claude 4 Opus follows rules well, while smaller models may ignore them. Requires no infrastructure.
+### 2. MCP Tool Registration (Medium)
+Register ck as an MCP tool:
+```bash
+claude mcp add ck-search -s user -- ck --serve
+```
+The agent sees `ck_search`, `ck_get`, `ck_info`, `ck_reindex` as first-class tools alongside `bash` and `read`. If the prompt emphasizes preferring MCP tools, the agent may route code searches through ck.
+**Effectiveness**: Medium. Agent still has `bash` available. Needs prompt reinforcement. Best when combined with Strategy 1.
+### 3. Shell Wrapper Interception (Medium-Strong)
+Create a wrapper script that intercepts grep and routes semantic-looking queries to ck:
+```bash
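+#!/bin/bash
+# ~/bin/grep (wrapper for agent's PATH)
+# Route to ck if the pattern looks conceptual (multi-word, no obvious regex)
+pattern=""
+for arg in "$@"; do
+  # Treat the first non-option argument as the search pattern
+  if [[ "$arg" != -* ]]; then pattern="$arg"; break; fi
+done
+regex_chars='[][^$.*\\]'
+if [[ "$pattern" =~ [[:space:]] ]] && [[ ! "$pattern" =~ $regex_chars ]]; then
+  if command -v ck &>/dev/null; then
+    # No exec here: exec replaces the shell, so a fallback after it would never run
+    ck --hybrid "$@" 2>/dev/null && exit 0
+  fi
+fi
+exec /usr/bin/grep "$@"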
+```
+Place this in the agent's PATH before `/usr/bin`.
+**Risks**:
+- False positives: `grep "TODO: fix this"` gets intercepted but should be lexical
+- Breaks scripts that parse grep output format
+- Adding `--hybrid` changes output format (score fields, different line format)
+- Hard to distinguish "the agent wants grep" from "the agent typed something that looks semantic"
+**Mitigation**: Only wrap for known agent users, not system-wide. Use an explicit env var: `CK_ENFORCE=1 grep ...`
+### 4. Harness-Level Tool Routing (Strong)
+Modify the agent harness (e.g., lean-ctx bash tool) to inspect every bash command before execution:
+```python
+import re
+import shlex
+# Optional path prefix, optional short flags, then pattern and path
+GREP_RE = re.compile(
+    r'^(?:/usr/bin/|/bin/)?grep\s+(?:-[a-zA-Z]+\s+)*["\']?([^"\']+)["\']?\s+(.*)'
+)
+def pre_exec_hook(command: str) -> str:
+    """Rewrite conceptual grep commands to ck; pass everything else through."""
+    match = GREP_RE.match(command)
+    if match:
+        # Extract pattern and path
+        pattern, path = match.groups()
+        # If pattern is multi-word (conceptual), route to ck
+        if ' ' in pattern and not re.search(r'[\^\$\.\*\[\]\\]', pattern):
+            # shlex.quote keeps quotes or `$` in the pattern from breaking the command
+            return f'ck --hybrid {shlex.quote(pattern)} {path}'
+    return command  # pass through unchanged
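+**Effectiveness**: Strong. Catches every `grep` command that passes through the harness's bash tool, though grep inside pipelines or `xargs` needs additional parsing. Can log/report non-compliance. Requires modifying harness code.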
+### 5. Post-Hoc Validation (Weak)
+A checker that scans agent action logs and flags grep usage. Reactive — doesn't prevent the bad behavior, only reports it.
+```bash
+# Check agent logs for grep usage
+grep -c '"command": "grep' agent-session.log
+```
+## Recommended Approach
+**Three-layer defense for the ultimate-pi harness:**
+1. **Layer 1 (immediate)**: System prompt rules in AGENTS.md + install ck + register MCP
+2. **Layer 2 (medium-term)**: Add pre-exec hook to lean-ctx bash tool that warns/logs grep usage and suggests ck
+3. **Layer 3 (optional)**: Shell wrapper for known agent sessions with `CK_ENFORCE` env var
+## Open Questions
+- [ ] How does Claude Code's native `Grep` tool interact with custom MCP tools? Does it prefer its own?
+- [ ] Can MCP tools be marked as "preferred" or given higher priority?
+- [ ] What's the false-positive rate of shell interception on real-world agent queries?
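
As an illustration of Layer 2 in the recommended approach above (not part of the package contents), here is a minimal sketch of a hook that only warns about conceptual `grep` usage instead of rewriting the command. The names `audit_command` and `grep_pattern` are hypothetical, the pipe splitting is deliberately naive, and the "conceptual" heuristic is the same multi-word, no-regex-characters rule used in the page's own examples.

```python
import re
import shlex
from typing import Optional

# Characters that suggest the pattern is a deliberate regex, not a concept.
REGEX_CHARS = re.compile(r'[\^$.*\[\]\\]')
GREP_NAMES = {"grep", "/usr/bin/grep", "/bin/grep"}


def grep_pattern(segment: str) -> Optional[str]:
    """Return the search pattern if this pipeline segment invokes grep."""
    try:
        tokens = shlex.split(segment)
    except ValueError:  # unbalanced quotes: leave the command alone
        return None
    if tokens and tokens[0] == "xargs":
        tokens = tokens[1:]  # naive: assumes xargs is called without its own flags
    if not tokens or tokens[0] not in GREP_NAMES:
        return None
    # First non-option token is treated as the pattern (ignores `-e PATTERN`).
    for token in tokens[1:]:
        if not token.startswith("-"):
            return token
    return None


def audit_command(command: str) -> Optional[str]:
    """Return a warning if any pipeline segment greps for a conceptual query."""
    # Splitting on "|" is naive: a quoted "|" inside a pattern is simply skipped.
    for segment in command.split("|"):
        pattern = grep_pattern(segment.strip())
        if pattern and " " in pattern and not REGEX_CHARS.search(pattern):
            return f'grep used for conceptual query "{pattern}"; consider ck --hybrid'
    return None
```

Because the hook only reports, a false positive such as an exact multi-word error message costs a log line rather than a changed result, which is one way to gather data for the false-positive question listed above.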

package/vault/wiki/concepts/agent-skills-ecosystem.md ADDED

@@ -0,0 +1,74 @@
+---
+type: concept
+status: developing
+created: 2026-05-05
+tags:
+  - agent-skills
+  - ecosystem
+  - open-standard
+  - progressive-disclosure
+related:
+  - "[[superpowers-methodology]]"
+  - "[[agent-skills-pattern]]"
+  - "[[skill-first-architecture]]"
+  - "[[policy-engine-pattern]]"
+---
+# Agent Skills Ecosystem
+## Definition
+The Agent Skills ecosystem is the open-standard marketplace and format for packaging reusable AI agent expertise as SKILL.md files. It was originally developed by Anthropic, released as an open standard in October 2025, and adopted by all major agent platforms within weeks. As of May 2026, there are 490K+ skills across multiple marketplaces.
+## The SKILL.md Open Standard
+Every skill is a directory containing a `SKILL.md` file with:
+- **YAML frontmatter**: `name` (lowercase-hyphenated, ≤64 chars), `description` (≤1024 chars — the trigger), optional `allowed-tools`, `metadata`, `license`
+- **Markdown instructions**: What the agent should do when the skill activates
+Progressive disclosure architecture:
+1. **Discovery** (always loaded): Name + description only (~100 tokens per skill)
+2. **Activation** (on-demand): Full SKILL.md body loaded when task matches description
+3. **Execution** (on-demand): Scripts, reference files, templates loaded as needed
+## Marketplaces
+| Marketplace | Skills | Key Differentiator |
+|-------------|--------|-------------------|
+| **Skills.sh** (Vercel) | 83K+ | Curated quality, CLI-native install, Snyk security scanning, leaderboard |
+| **SkillsMP** | 400K+ | Volume leader, GitHub crawl, AI-powered semantic search |
+| **ClawHub** (OpenClaw) | ~10K+ | Open platform, hit by ClawHavoc malware campaign |
+## Installation
+Universal: `npx skills add owner/repo`
+Per-agent paths:
+- Claude Code: `.claude/skills/` (project) or `~/.claude/skills/` (personal)
+- Codex CLI: `.agents/skills/` or `.codex/skills/`
+- Cursor: `.cursor/skills/`
+- Gemini CLI: `.gemini/skills/`
+- GitHub Copilot: `.github/skills/`
+- Windsurf: `.windsurf/skills/`
+## Two Skill Types
+1. **Capability Uplift** — Gives agent abilities it doesn't have. Before the skill, agent can't do the task. Examples: Firecrawl (web scraping), Document Skills (PDF/DOCX creation), Webapp Testing (Playwright).
+2. **Encoded Preference** — Agent already knows how, but the skill encodes your team's specific way. Examples: Code review checklists, commit message formats, API conventions.
+## Security Risks
+Snyk's ToxicSkills study (Feb 2026) scanned 3,984 skills:
+- 36.8% had at least one security flaw
+- 13.4% contained critical-level issues
+- 76 skills were confirmed malicious payloads
+- 91% of malicious skills combined prompt injection with traditional malware
+The ClawHavoc campaign (Jan-Feb 2026): 341 malicious skills on ClawHub distributing Atomic macOS Stealer.
+## Ecosystem Trajectory
+Zero to 490K skills in six months (Oct 2025 – Mar 2026). All major platforms adopted within weeks. The format's simplicity (anyone who can write Markdown can create a skill) drove adoption. Network effects accelerating: more skills → more agent users → more skill authors.
+## Relevance to Harness
+Our `.pi/skills/` system uses the same progressive disclosure pattern. The Agent Skills ecosystem validates that markdown-based skills are the right primitive — and that cross-agent portability is the winning strategy. We should consider SKILL.md compatibility for maximum reuse of the 490K+ ecosystem.
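
The frontmatter limits listed under "The SKILL.md Open Standard" above can be checked mechanically. The following is a minimal sketch, not part of the package and not an official validator: the function name is hypothetical, it expects the frontmatter already parsed into a dict, and allowing digits in `name` is an assumption beyond the "lowercase-hyphenated" wording.

```python
import re
from typing import Dict, List

# Limits taken from the frontmatter rules described in the page.
# Assumption: digits are allowed alongside lowercase letters.
NAME_RE = re.compile(r'^[a-z0-9]+(?:-[a-z0-9]+)*$')
MAX_NAME = 64
MAX_DESCRIPTION = 1024


def validate_skill_frontmatter(meta: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the metadata passes."""
    problems: List[str] = []
    name = meta.get("name", "")
    description = meta.get("description", "")

    if not name:
        problems.append("missing name")
    elif len(name) > MAX_NAME:
        problems.append("name too long")
    elif not NAME_RE.match(name):
        problems.append("name must be lowercase-hyphenated")

    if not description:
        problems.append("missing description")
    elif len(description) > MAX_DESCRIPTION:
        problems.append("description too long")

    return problems
```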

package/vault/wiki/concepts/agent-skills-pattern.md ADDED

@@ -0,0 +1,68 @@
+---
+type: concept
+title: "Agent Skills Pattern (Progressive Disclosure)"
+created: 2026-05-01
+updated: 2026-05-01
+status: developing
+tags:
+  - harness
+  - skills
+  - context-engineering
+  - gemini-cli
+related:
+  - "[[harness-engineering-first-principles]]"
+  - "[[gemini-cli-architecture]]"
+sources:
+  - "[[Source: Gemini CLI Changelogs]]"
+  - "[[Source: LangChain - Anatomy of Agent Harness]]"
+---
+# Agent Skills Pattern: Progressive Disclosure
+## What It Is
+Agent Skills is a harness-level primitive for **progressive disclosure**: skills are loaded on-demand via an activation mechanism rather than all at context start. This prevents context rot — the observed degradation in model performance as the context window fills with irrelevant tool definitions and instructions.
+## Why It Matters
+Too many tools or MCP servers loaded into context on agent start degrades performance _before_ the agent can start working. Skills solve this by loading only when needed:
+1. Agent starts with minimal context (core tools + system prompt)
+2. Agent analyzes task, determines which skills are relevant
+3. Agent calls `activate_skill` tool to load specific skill's instructions + tools
+4. Skill's context injected into current conversation
+5. Agent uses skill, then moves on (skill context may persist or be compacted)
+## Gemini CLI Implementation (v0.23+)
+- **v0.23 (Jan 2026)**: Experimental Agent Skills support via agentskills.io
+- **v0.24**: Built-in agent skills, `/skills install/uninstall`, `/agents refresh`
+- **v0.25**: `activate_skill` tool formalized, `pr-creator` skill, skills enabled by default
+- **v0.26**: `skill-creator` meta-skill (skills that create skills)
+- **v0.30**: SDK package enabling custom skills with dynamic system instructions
+- **v0.39**: `/memory inbox` for reviewing and patching skills extracted during sessions
+## Key Design Decisions
+1. **Frontmatter metadata**: Each skill has structured metadata describing when to activate
+2. **Activation tool**: Model decides when to call `activate_skill` based on task analysis
+3. **Skill inbox**: Extracted skills don't auto-install — human reviews first via `/memory inbox`
+4. **Skill-creator**: Meta-skill enables agent to create new skills from observed patterns
+## Ultimate-PI Current State
+We have a `.pi/skills/` directory with 16+ skills, but they all load at context start (no progressive disclosure). This follows the "delivery mechanism for context engineering" pattern but lacks the activation mechanism that prevents context rot.
+## Integration Path (P-F2)
+1. Add frontmatter to each skill: `activation_triggers`, `required_capabilities`, `token_budget`
+2. Add `activate_skill` tool to tool registry
+3. Implement skill registry that loads skills on-demand
+4. Add `/memory inbox` for reviewing AI-extracted patterns before they become permanent skills
+5. Implement skill-creator meta-skill for autonomous skill generation from observed failures
+## Relationship to Other Harness Primitives
+- **Context Compression**: Skills reduce the _need_ for compression by keeping context lean
+- **Subagents**: Skills can be loaded into subagents independently, each with relevant context
+- **Policy Engine**: Skill activation can be gated by policy (e.g., "never activate browser skill on production")
+- **Memory Systems**: Skills extracted from sessions feed into persistent memory (wiki in our case)
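
Steps 2 and 3 of the integration path above (an `activate_skill` tool backed by an on-demand registry) might look roughly like the sketch below. This is an illustration rather than package code: `SkillEntry`, `SkillRegistry`, and `discovery_index` are hypothetical names, and the loader callable stands in for reading a skill file from disk.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SkillEntry:
    name: str
    description: str              # always in context (discovery tier)
    load_body: Callable[[], str]  # called only on activation


class SkillRegistry:
    def __init__(self) -> None:
        self._skills: Dict[str, SkillEntry] = {}
        self._active: Dict[str, str] = {}

    def register(self, entry: SkillEntry) -> None:
        self._skills[entry.name] = entry

    def discovery_index(self) -> str:
        """Text injected at context start: names and descriptions only."""
        return "\n".join(
            f"- {s.name}: {s.description}" for s in self._skills.values()
        )

    def activate_skill(self, name: str) -> str:
        """Load a skill body on demand and cache it for the session."""
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        if name not in self._active:
            self._active[name] = self._skills[name].load_body()
        return self._active[name]
```

Only `discovery_index()` contributes to the starting context; a skill body costs tokens the first time `activate_skill` is called for it.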

package/vault/wiki/concepts/agentic-harness-context-enforcement.md ADDED

@@ -0,0 +1,91 @@
+---
+type: concept
+title: Agentic Harness Context Enforcement
+created: 2026-04-30
+updated: 2026-04-30
+tags:
+  - agentic-harness
+  - context-optimization
+  - enforcement
+status: developing
+related:
+  - "[[think-in-code]]"
+  - "[[context-mode]]"
+  - "[[lean-ctx]]"
+sources:
+  - "[[Research: context-mode vs lean-ctx]]"
+---
+# Agentic Harness Context Enforcement
+How to enforce context-efficient behavior ("think in code") in an agentic harness — the orchestration layer that manages AI coding agents.
+## Problem
+AI agents are profligate with context. They call `Read()` on 47 files when 1 script would suffice. They produce verbose pleasantries. They forget what they already read. The harness must enforce discipline because the agent won't do it voluntarily.
+## Enforcement Layers
+### Layer 1: System Prompt / Instructions (cheapest, least reliable)
+- Inject "Think in Code" rules into AGENTS.md or system prompt
+- Works with any agent without custom tools
+- Relies on agent compliance — can be ignored under pressure
+- Examples: context-mode injects rules into 14 platform configs
+### Layer 2: PreToolUse Interception (medium cost, high reliability)
+- Intercept tool calls before execution
+- Route large reads to sandbox execution instead
+- Block dangerous commands (curl, wget, rm -rf)
+- Requires MCP or hook support in the harness
+- Example: context-mode PreToolUse hook
+### Layer 3: PostToolUse Compression (medium cost, medium reliability)
+- After tool output enters context, compress it
+- Strip noise, keep signal
+- Store raw data in searchable index (FTS5)
+- Example: lean-ctx shell hook patterns
+### Layer 4: Tool Replacement (highest cost, highest reliability)
+- Replace native `Read()`, `Bash()`, `WebFetch()` with optimized versions
+- AST-based file reading (signatures only)
+- Shell output compression (pattern-matched)
+- Cached re-reads
+- Example: lean-ctx's 46 MCP tools
+### Layer 5: Governance & Monitoring (supplemental)
+- Profiles define what each agent can do
+- Budgets limit token/cost/shell usage
+- SLOs trigger throttling
+- Anomaly detection for runaway consumption
+- Analytics dashboard for human oversight
+- Example: lean-ctx governance features
+### Layer 6: TypeScript Execution Layer (emerging, high potential)
+- Replace ALL individual tool calls with a single "write TypeScript" tool
+- Agent writes TS code that orchestrates tools via typed API
+- Code executes in sandboxed runtime (Node.js VM, Deno, or Worker isolate)
+- Tool calls dispatch via typed RPC to harness for permission gating
+- Intermediate results stay in sandbox — only final output enters LLM context
+- 3-4x context reduction vs flat tool calling
+- ~20% higher multi-tool success rate (CodeAct, ICML 2024)
+- Validated by: Apple CodeAct, Cloudflare Code Mode, Executor (1.3K stars)
+- See [[ts-execution-layer]] and [[harness-implementation-plan]] (P43)
+## Recommendation for ultimate-pi Harness
+**Current state**: lean-ctx installed as MCP server + shell hook.
+**Gap**: No "Think in Code" enforcement. The harness relies on AGENTS.md rules (Layer 1 only).
+**Recommended additions**:
+1. **Add Think in Code to system prompt** (zero cost, immediate). Update AGENTS.md with the mandatory rule from context-mode's playbook.
+2. **Verify lean-ctx `ctx_execute` works** — lean-ctx has execution capabilities. Test if agent can write and run analysis scripts through lean-ctx tools.
+3. **Consider context-mode as complement** — the two tools solve different halves: context-mode excels at sandbox enforcement + Think in Code paradigm; lean-ctx excels at compression + governance. They could coexist if the MCP namespace doesn't conflict.
+4. **Add output compression rules** — context-mode's output compression (strip filler, fragments OK, short synonyms) can be added to AGENTS.md regardless of tool choice.
+5. **Monitor context usage** — lean-ctx's `gain` dashboard and `wrapped` reports provide visibility. Use them to measure effectiveness of any new enforcement.
+6. **Plan TypeScript Execution Layer (P43)** — the logical extension of Think-in-Code. Instead of enforcing code-over-data for analysis tasks, replace the entire flat tool list with a typed TypeScript API + sandboxed runtime. Agent writes TS code; sandbox executes; only results enter context. 3-4x context reduction, ~20% higher success rate. See [[ts-execution-layer]] and [[harness-implementation-plan]].
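
To make Layer 2 (PreToolUse interception) above concrete, here is a minimal sketch of a hook that blocks the commands the page names and routes large reads to a sandbox. It is not taken from context-mode or lean-ctx: the `Decision` type, the `size_bytes` argument, and the 50,000-byte threshold are all assumptions for illustration.

```python
import re
from dataclasses import dataclass

# Deny-list mirrors the examples given for Layer 2 (curl, wget, rm -rf).
BLOCKED = [
    re.compile(r'\bcurl\b'),
    re.compile(r'\bwget\b'),
    re.compile(r'\brm\s+-(?:rf|fr)\b'),
]
MAX_READ_BYTES = 50_000  # hypothetical threshold for a "large" read


@dataclass
class Decision:
    action: str       # "allow", "block", or "sandbox"
    reason: str = ""


def pre_tool_use(tool: str, args: dict) -> Decision:
    """Decide what to do with a tool call before it executes."""
    if tool == "Bash":
        command = args.get("command", "")
        for pattern in BLOCKED:
            if pattern.search(command):
                return Decision("block", f"matched {pattern.pattern}")
    if tool == "Read" and args.get("size_bytes", 0) > MAX_READ_BYTES:
        return Decision("sandbox", "large read routed to sandbox execution")
    return Decision("allow")
```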

package/vault/wiki/concepts/agentic-harness.md ADDED

@@ -0,0 +1,34 @@
+---
+type: concept
+title: "Agentic Harness"
+created: 2026-04-30
+updated: 2026-04-30
+status: seed
+tags: [concept, harness]
+related:
+  - "[[harness]]"
+  - "[[harness-implementation-plan]]"
+  - "[[harness-wiki-skill-mapping]]"
+---
+# Agentic Harness
+> [!stub] This is a stub page. See [[harness]] for the full module documentation.
+The agentic harness is the central execution pipeline in the ultimate-pi architecture. It enforces an 8-layer mandatory workflow where every task must flow through all layers without skipping.
+## What it does
+- Enforces structured execution (no ad-hoc coding)
+- Runs adversarial verification (critic agents attack, not review)
+- Maintains persistent memory via the wiki vault
+- Orchestrates multi-step plans with grounding checkpoints
+## Key pages
+- [[harness]] — full module documentation
+- [[harness-implementation-plan]] — build phases and token budgets
+- [[harness-wiki-pipeline]] — data flow between harness and wiki
+- [[adr-008]] — Spec-Only Black-Box QA decision
+- [[adr-009]] — Mode B persistent memory decision
+- [[adr-010]] — Harness-wiki tight-coupling contract

package/vault/wiki/concepts/agentic-orchestration-pipeline.md ADDED

@@ -0,0 +1,56 @@
+---
+type: concept
+tags:
+  - orchestration
+  - multi-agent
+  - pipeline
+  - agent-architecture
+related:
+  - "[[Agent Harness Architecture]]"
+  - "[[Multi-Agent Specialization]]"
+  - "[[sources/disler-pi-vs-claude-code]]"
+  - "[[sources/opendev-arxiv-2603.05344v1]]"
+---
+# Agentic Orchestration Pipeline
+A structured workflow where multiple specialized AI agents coordinate to complete complex software engineering tasks. The orchestrator decomposes work, routes to specialists, and assembles results.
+## Three Orchestration Patterns
+### 1. Subagent Delegation (Fan-out)
+A primary agent spawns isolated subagents for independent subtasks. Each subagent runs in its own context window with filtered tool access. Results are collected and synthesized by the primary agent.
+**Implementation**: Pi's `subagent-widget` extension (`/sub <task>`), OpenDev's `spawn_subagent` tool.
+**Best for**: Parallel exploration, isolated analysis, background tasks.
+### 2. Team Dispatch (Specialist Routing)
+A dispatcher agent reviews user requests and selects the most appropriate specialist from a predefined roster. Each specialist has a domain-specific system prompt and tool set.
+**Implementation**: Pi's `agent-team` extension, configured via `.pi/agents/teams.yaml`. The dispatcher uses a `dispatch_agent` tool.
+**Best for**: Work that benefits from domain expertise (frontend vs backend, planning vs execution).
+### 3. Sequential Chaining (Pipeline)
+Multiple agents execute in sequence where each step's output feeds into the next step's prompt. The `$INPUT` variable carries the previous step's output; `$ORIGINAL` always contains the initial user prompt.
+**Implementation**: Pi's `agent-chain` extension, defined in `.pi/agents/agent-chain.yaml` as a list of `steps` with `agent` and `prompt` fields.
+**Best for**: Multi-phase workflows (plan → build → review → fix → verify).
+## Design Principles
+1. **Schema-level isolation**: Subagents receive filtered tool schemas — they can't attempt actions they shouldn't perform. More robust than runtime permission checks.
+2. **Context isolation**: Each subagent runs with an independent conversation history. Only summaries return to the parent, preventing context pollution.
+3. **Explicit termination**: Subagents have clear stop conditions to prevent over-exploration.
+4. **Parallel execution**: Independent subagent calls auto-parallelize via thread pools.
+5. **Model specialization**: Different pipeline stages can use different models (e.g., Opus for planning, Sonnet for building, Haiku for reviewing).
+## Harness Implementation Path
+Our harness can adopt all three patterns as Pi extensions:
+1. Extend existing `Agent` tool with team dispatch via YAML config
+2. Add chain orchestration with `$INPUT` variable injection
+3. Implement context isolation per subagent (fresh conversation per spawn)
+4. Add progress dashboards (grid for teams, step tracker for chains)
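
The sequential chaining pattern above (each step's output becomes `$INPUT` for the next, while `$ORIGINAL` stays fixed) can be sketched in a few lines. This is an illustration, not Pi's `agent-chain` implementation: agents are modelled as plain callables, and using the user prompt as `$INPUT` for the first step is an assumption.

```python
import re
from typing import Callable, Dict, List

# Hypothetical agent interface: takes a prompt string, returns output text.
Agent = Callable[[str], str]


def run_chain(
    steps: List[Dict[str, str]],
    agents: Dict[str, Agent],
    user_prompt: str,
) -> str:
    """Run steps in order, substituting $INPUT and $ORIGINAL into each prompt."""
    current = user_prompt  # assumption: the first step's $INPUT is the user prompt
    for step in steps:
        values = {"INPUT": current, "ORIGINAL": user_prompt}
        # Single-pass substitution so text inside a value is never re-expanded.
        prompt = re.sub(
            r'\$(INPUT|ORIGINAL)',
            lambda m: values[m.group(1)],
            step["prompt"],
        )
        current = agents[step["agent"]](prompt)
    return current
```

The `steps` list has the same `agent` and `prompt` fields the page describes for the YAML config, so a parsed config could be passed in directly.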

package/vault/wiki/concepts/agentic-search-no-embeddings.md ADDED

@@ -0,0 +1,18 @@
+---
+type: concept
+status: stub
+created: 2026-05-02
+updated: 2026-05-02
+tags: [concept, search, agents]
+---
+# Agentic Search Without Embeddings
+Pattern used by Claude Code: agents search codebases by reading files directly (grep, find, AST traversal) rather than relying on pre-built embedding indexes. No vector database required.
+Contrasts with [[Semantic Codebase Indexing]] and [[hybrid-code-search]]. Relevant to the embedding-vs-agentic-search design tension in harness architecture.
+## References
+- [[claude-code-architecture-vila-lab-2026]]
+- [[agent-search-enforcement]]

package/vault/wiki/concepts/anthropic-context-engineering.md ADDED

@@ -0,0 +1,13 @@
+---
+type: concept
+status: stub
+created: 2026-05-02
+updated: 2026-05-02
+tags: [concept, context]
+---
+# Anthropic Context Engineering
+Anthropic's approach to context engineering for Claude agents. Encompasses prompt design, context window management, and tool output formatting.
+Referenced in: [[Research: Meta-Agent Context Drift Detection]]

package/vault/wiki/concepts/antigravity-agent-first-architecture.md ADDED

@@ -0,0 +1,61 @@
+---
+type: concept
+title: "Antigravity Agent-First Architecture"
+status: developing
+created: 2026-05-01
+updated: 2026-05-01
+tags:
+  - antigravity
+  - agent-architecture
+  - harness-design
+aliases: ["agent-first IDE", "Antigravity architecture"]
+related:
+  - "[[agentic-harness]]"
+  - "[[model-adaptive-harness]]"
+  - "[[harness-implementation-plan]]"
+sources:
+  - "[[google-antigravity-official-blog]]"
+  - "[[google-antigravity-wikipedia]]"
+  - "[[cursor-vs-antigravity-2026]]"
+---
+# Antigravity Agent-First Architecture
+Google Antigravity's foundational architectural shift: the IDE is not an AI-enhanced editor. It is a **control plane for autonomous coding agents**.
+## The Two-View Architecture
+### Editor View
+Traditional IDE interface (VS Code fork). Agent sidebar. Tab completions, inline commands. For hands-on synchronous workflows.
+### Manager View ("Mission Control")
+Dedicated orchestration interface. Spawn, supervise, and redirect multiple agents working asynchronously across different workspaces. The human shifts from coder to architect.
+## Core Innovation: The Inversion
+```
+Traditional: Human → IDE → Agent (agent as assistant in sidebar)
+Antigravity: Human → Manager View → Multiple Agents → Editor/Browser/Terminal
+```
+The Manager View inverts the relationship. The interface is embedded in the agent, not the other way around. Agents have direct access to editor, terminal, and browser as equal tool surfaces.
+## What This Means for Harness Design
+Our 8-layer harness is a **pipeline** (sequential, mandatory layers). Antigravity's is a **control plane** (parallel agents, asynchronous execution).
+These are complementary architectures:
+- **Pipeline**: Best for quality enforcement, correctness guarantees, drift detection
+- **Control Plane**: Best for parallelism, task delegation, human oversight
+The harness should adopt the control-plane model for its L7 orchestration layer while keeping the pipeline model for L1-L4 quality enforcement.
+## Four Design Tenets
+1. **Trust**: Artifacts replace raw tool logs. Agents prove work via verifiable deliverables.
+2. **Autonomy**: Agents have full control of multiple surfaces. No constant human prompts.
+3. **Feedback**: Google Docs-style commenting on artifacts. Asynchronous. No restart needed.
+4. **Self-Improvement**: Agents learn from past work. Knowledge base persists across projects.
+## Our Gap
+The harness has no Manager View equivalent. L7 (Schema Orchestration) is DAG-based sequential orchestration, not parallel agent dispatch. This is a design gap — but may be intentional: our harness targets CLI-level enforcement, not IDE-level.

package/vault/wiki/concepts/ast-compression.md ADDED

@@ -0,0 +1,19 @@
+---
+type: concept
+title: "ast-compression"
+created: 2026-04-30
+updated: 2026-04-30
+status: seed
+tags: [concept, lean-ctx, context-optimization]
+related:
+  - "[[lean-ctx]]"
+  - "[[ast-truncation]]"
+---
+# AST Compression
+> [!stub] See also: [[ast-truncation]] for the harness-specific implementation.
+lean-ctx's approach to code compression: use tree-sitter to parse code in 18 languages, extract only signatures, types, and logic bodies, and strip comments, whitespace, and non-essential syntax. Achieves 60-95% token reduction on source files.
+Differs from [[ast-truncation]] (which stubs function bodies) in that AST compression preserves logic but strips non-semantic elements, while AST truncation removes function bodies entirely for high-level structural views.

package/vault/wiki/concepts/ast-truncation.md ADDED

@@ -0,0 +1,66 @@
+---
+type: concept
+title: "AST Truncation"
+created: 2026-04-30
+updated: 2026-04-30
+tags:
+  - agent-context
+  - token-reduction
+  - tree-sitter
+  - context-window
+related:
+  - "[[repo-map-ranking]]"
+  - "[[progressive-disclosure-agents]]"
+  - "[[wozcode]]"
+  - "[[research-wozcode-token-reduction]]"
+status: developing
+---
+# AST Truncation
+AST truncation is a technique for reducing LLM input tokens during code exploration by returning function/method signatures while stubbing their bodies. Unlike file-level selection (choose which files to show), AST truncation operates at the syntax level: show the interface, hide the implementation.
+## How It Works
+1. Parse a source file with tree-sitter to produce a concrete syntax tree
+2. Identify all definition nodes: functions, methods, classes, type definitions
+3. For each definition: return the signature (name, parameters, return type, docstring)
+4. Replace the body with a stub: `{ /* ... N lines truncated ... */ }`
+5. The model can request full body expansion for specific definitions
+## Token Savings
+- A typical function signature is 3-10 lines; its body may be 50-500 lines
+- For files with many functions, AST truncation can reduce context by 70-90%
+- The model still sees the "map" (what exists, how things connect) without the "territory" (full implementation)
+## Relationship to Repo-Map Ranking
+[[repo-map-ranking]] selects *which files* to include. AST truncation selects *how much* of each file to include. Combined:
+| Level | Technique | What's Shown |
+|-------|-----------|-------------|
+| L0 | File list | Filenames only |
+| L1 | AST truncation | Signatures + stubs |
+| L2 | AST truncation + imports | Signatures, imports, cross-references |
+| L3 | Full content | Everything (on demand) |
+This maps to and extends our existing [[progressive-disclosure-agents]] model.
+## WOZCODE Implementation
+WOZCODE uses AST truncation as its primary input-reduction lever (Source: [[wozcode]]). Combined with ranked search results (not full-file grep dumps), it reduces input tokens on code exploration calls. Their architecture returns "what the model needs" rather than everything found.
+## Limitations
+- **Dynamic languages**: Python, JavaScript, Ruby — tree-sitter can parse syntax but not always resolve types or call targets statically. Truncation may hide important runtime behavior.
+- **Decorators/metaprogramming**: Code generation patterns (Python decorators, Ruby method_missing, JS proxies) create behavior not visible in AST signatures.
+- **Test files**: Often rely on implicit context (fixtures, before/after hooks). Truncation may hide critical setup.
+- **Parser availability**: Requires tree-sitter grammar for each language in the codebase.
+## Implementation Path for Our Harness
+1. Leverage existing [[repo-map-ranking]] tree-sitter infrastructure
+2. Add a `--truncate` flag to the `read` tool (L8 wiki-query-interface)
+3. Implement progressive expansion: model requests `read --expand funcName`
+4. Integrate with [[grounding-checkpoints]] (L3) for verification reads
+5. Language coverage: start with TypeScript/JavaScript, Python, then extend
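
The "How It Works" steps above can be demonstrated on Python source alone. The sketch below swaps in Python's built-in `ast` module for tree-sitter, so it illustrates the technique rather than the harness or WOZCODE implementation. The function name and stub format are assumptions; nested functions are hidden along with their parent, and one-line definitions are left untouched.

```python
import ast
from typing import List, Tuple


def _collect(nodes, cuts: List[Tuple[int, int, int]]) -> None:
    """Record (first_line, last_line, indent) of each function body to hide."""
    for node in nodes:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Keep the docstring as part of the visible signature.
            body = node.body[1:] if ast.get_docstring(node) is not None else node.body
            if body and body[0].lineno > node.lineno:
                cuts.append((body[0].lineno, node.end_lineno, body[0].col_offset))
        elif isinstance(node, ast.ClassDef):
            _collect(node.body, cuts)  # descend into classes to reach methods


def truncate_source(source: str) -> str:
    """Return the source with function bodies replaced by one-line stubs."""
    lines = source.splitlines()
    cuts: List[Tuple[int, int, int]] = []
    _collect(ast.parse(source).body, cuts)
    # Apply bottom-up so earlier line numbers stay valid.
    for start, end, indent in sorted(cuts, reverse=True):
        indent_text = lines[start - 1][:indent]
        stub = f"{indent_text}...  # {end - start + 1} lines truncated"
        lines[start - 1:end] = [stub]
    return "\n".join(lines)
```

The stub is a valid `...` expression followed by a comment, so the truncated output still parses as Python and can be handed to other tooling.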

package/vault/wiki/concepts/barrel-files.md ADDED

@@ -0,0 +1,37 @@
+---
+type: concept
+status: developing
+tags:
+  - typescript
+  - barrel-files
+  - code-organization
+  - performance
+related:
+  - "[[barrel-files-tkdodo]]"
+  - "[[Research: TypeScript Best Practices and Codebase Structure]]"
+created: 2026-05-02
+updated: 2026-05-02
+---
+# Barrel Files
+A barrel file is a module (typically `index.ts`) that does nothing but re-export symbols from other files in the same directory. It provides a single import entry point for consumers.
+## The Debate
+**Pro-barrel** (traditional view): Clean imports (`import { X, Y } from '@/dir'`), hides internal structure, simplifies refactoring.
+**Anti-barrel** (emerging consensus, 2024+): Causes circular imports, slows development servers, blocks bundler optimizations.
+## Known Problems
+1. **Circular imports**: When a module inside a directory imports from its own barrel, a circular dependency forms.
+2. **Dev server slowdown**: JavaScript loads and parses every module in the barrel synchronously. Real-world case: 11K → 3.5K modules (68% reduction) by removing barrels, cutting startup from 5-10 seconds.
+3. **Blocks `optimizePackageImports`**: Next.js optimization only works on "pure" re-export barrels with no side-effect code.
+## Current Best Practice (2024+)
+**Application code**: Avoid barrel files. Import directly from source files.
+**Library code**: Barrel files are appropriate as the public API entry point (specified in `package.json` `main` field).
+**Linting**: Enable `import/no-cycle` ESLint rule to catch circular imports from barrels.
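
The circular-import problem above comes down to cycle detection in a module dependency graph. The sketch below shows that check as a depth-first search. It is an illustration of the idea, not how the `import/no-cycle` rule is implemented, and the graph is a hand-written dict (hypothetical file names) rather than one extracted from real TypeScript sources.

```python
from typing import Dict, List, Optional


def find_import_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one import cycle as a list of modules, or None if there is none."""
    visiting: List[str] = []  # modules on the current DFS path
    done = set()              # modules fully explored without finding a cycle

    def visit(module: str) -> Optional[List[str]]:
        if module in done:
            return None
        if module in visiting:
            # Path from the first occurrence back to itself is the cycle.
            return visiting[visiting.index(module):] + [module]
        visiting.append(module)
        for dep in graph.get(module, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(module)
        return None

    for module in graph:
        cycle = visit(module)
        if cycle:
            return cycle
    return None
```

With a barrel `components/index.ts` that re-exports `Modal.ts`, and `Modal.ts` importing from the barrel, the function reports the loop `index.ts → Modal.ts → index.ts`.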