npm - ultimate-pi - Versions diffs - 0.1.2 → 0.1.3 - Mend

ultimate-pi 0.1.2 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (516)

package/vault/wiki/concepts/provider-native-prompting.md ADDED

@@ -0,0 +1,203 @@
+---
+type: concept
+title: "Provider-Native Prompting"
+aliases: ["provider-native harness", "native prompt generation"]
+created: 2026-05-01
+updated: 2026-05-01
+tags: [concept, prompting, harness-design, model-adaptive]
+status: developing
+related:
+  - "[[model-adaptive-harness]]"
+  - "[[harness-configuration-layers]]"
+  - "[[Research: Model-Specific Prompting Guides]]"
+  - "[[openai-prompt-guidance]]"
+  - "[[anthropic-prompt-best-practices]]"
+  - "[[gemini-3-prompting-guide]]"
+---
+# Provider-Native Prompting
+**Generate prompts optimized for each model provider's official conventions, not a single canonical format with strictness relaxations.**
+## Problem
+The current harness design principle is: "Write once for strictest model (GPT-safe defaults). Relax for forgiving models." This treats model differences as varying STRICTNESS of the same format.
+Official provider guidance reveals this is wrong. Each provider specifies fundamentally DIFFERENT prompting conventions:
+| Concern | OpenAI | Anthropic | Google |
+|---------|--------|-----------|--------|
+| Structure | XML-like sections | XML tags | Plain text sections |
+| Constraint order | FIRST | Flexible | LAST |
+| Density | Concise, outcome-first | General over prescriptive | Concise by default |
+| Verification | Pre-flight/post-flight loop | Self-check at end | Split-step verify→generate |
+| Grounding | Citation rules | Quote extraction | Explicit truth statement |
+| Thinking | reasoning_effort param | effort param + adaptive | thinking level LOW/HIGH |
+| Temperature | Unspecified | Removed from API | **1.0 mandatory** |
+These are not relaxations of the same format. They are different formats.
+## Solution: Semantic Spec → Native Renderer
+The harness should separate WHAT must be communicated (semantic spec) from HOW it's communicated (provider-native renderer).
+```
+┌─────────────────┐     ┌──────────────────┐     ┌──────────────┐
+│  Semantic Spec  │ ──► │  Prompt Renderer  │ ──► │  API Call     │
+│  (provider-      │     │  ┌──────────────┐ │     │              │
+│   agnostic)      │     │  │ openai       │ │     │              │
+│                  │     │  │ anthropic    │ │     │              │
+│  - Task          │     │  │ google       │ │     │              │
+│  - Constraints   │     │  │ fallback     │ │     │              │
+│  - Context       │     │  └──────────────┘ │     │              │
+│  - Output spec   │     └──────────────────┘     └──────────────┘
+│  - Verification  │
+│  - Tool spec     │
+└─────────────────┘
+```
+### Semantic Spec (Provider-Agnostic)
+The harness internally represents every task as a structured specification, not a prompt string:
+```yaml
+spec:
+  task: "Refactor the authentication module to use JWT"
+  constraints:
+    - "Must maintain backward compatibility"
+    - "Must not introduce new dependencies"
+    - "Test coverage must not decrease"
+  context:
+    - file: auth/__init__.py
+    - file: auth/tokens.py
+    - file: tests/test_auth.py
+  output:
+    format: diff
+    include_explanation: true
+  verification:
+    - run: pytest tests/test_auth.py
+    - check: git diff --stat
+  tools:
+    - read_file
+    - apply_patch  # openai native
+    - text_editor   # anthropic native
+    - bash
+  model: gpt-5.4
+```
+### Renderer (Provider-Native)
+Each renderer converts the semantic spec into the provider's recommended format:
+**OpenAI renderer** produces:
+```
+<code_editing_rules>
+- Before editing, read relevant file contents for complete context
+- Make small, testable, incremental changes
+- Run pytest tests/test_auth.py after each change
+- Do not add new dependencies
+- Maintain backward compatibility
+</code_editing_rules>
+# Task
+Refactor the authentication module to use JWT.
+# Success criteria
+- Backward compatibility maintained
+- No new dependencies introduced
+- Test coverage not decreased
+- All existing tests pass
+# Stop rules
+- Do not stop until all tests pass
+- If blocked, explain what's missing and ask
+```
+**Anthropic renderer** produces:
+```xml
+<instructions>
+Refactor the authentication module to use JWT.
+</instructions>
+<context>
+<file path="auth/__init__.py">...</file>
+<file path="auth/tokens.py">...</file>
+<file path="tests/test_auth.py">...</file>
+</context>
+<constraints>
+- Maintain backward compatibility. This is important because...
+- Do not introduce new dependencies.
+- Test coverage must not decrease. Run tests after each change.
+</constraints>
+<verification>
+Before you finish, run pytest tests/test_auth.py and verify all tests pass.
+Check git diff --stat to confirm only intended files changed.
+</verification>
+```
+**Google renderer** produces:
+```
+You are a senior software engineer. You are refactoring the authentication module
+to use JWT tokens. You are expected to perform calculations and logical deductions
+based strictly on the provided code.
+[Context files: auth/__init__.py, auth/tokens.py, tests/test_auth.py]
+Based on the entire codebase above, refactor the authentication module to use JWT.
+Synthesize all relevant information from the code that pertains to the refactoring.
+- Maintain backward compatibility
+- No new dependencies
+- Test coverage must not decrease
+- All existing tests must pass
+Verify with high confidence that all tests pass before declaring done. If you cannot
+verify, state what's blocking and STOP.
+```
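The spec-to-renderer split above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the harness's actual module: the `SemanticSpec` fields, the `Provider` union, and `renderPrompt` are hypothetical names, and only the constraint-ordering rule from the comparison table (first for OpenAI, last for Google) is modeled.

```typescript
// Hypothetical types: field names loosely mirror the YAML spec above.
interface SemanticSpec {
  task: string;
  constraints: string[];
  contextFiles: string[];
}

type Provider = "openai" | "google" | "fallback";
type Renderer = (spec: SemanticSpec) => string;

const bullets = (items: string[]): string =>
  items.map((item) => `- ${item}`).join("\n");

const renderers: Record<Provider, Renderer> = {
  // OpenAI column of the table: constraints come first.
  openai: (spec) =>
    `<code_editing_rules>\n${bullets(spec.constraints)}\n</code_editing_rules>\n# Task\n${spec.task}`,
  // Google column: plain text sections, constraints come last.
  google: (spec) =>
    `[Context files: ${spec.contextFiles.join(", ")}]\n${spec.task}\n${bullets(spec.constraints)}`,
  // Unknown models: conservative markdown.
  fallback: (spec) =>
    `## Task\n${spec.task}\n\n## Constraints\n${bullets(spec.constraints)}`,
};

function renderPrompt(provider: Provider, spec: SemanticSpec): string {
  return renderers[provider](spec);
}

const exampleSpec: SemanticSpec = {
  task: "Refactor the authentication module to use JWT",
  constraints: [
    "Must maintain backward compatibility",
    "Must not introduce new dependencies",
  ],
  contextFiles: ["auth/__init__.py", "auth/tokens.py"],
};

console.log(renderPrompt("openai", exampleSpec));
console.log(renderPrompt("google", exampleSpec));
```

The same `exampleSpec` yields two structurally different prompts, so only the spec needs versioning.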
+## Integration with Existing Harness Layers
+Each harness layer generates a semantic spec fragment. The renderer combines them per model.
+| Layer | Semantic Spec Fragment |
+|-------|----------------------|
+| L1 Spec Hardening | Task definition, acceptance criteria, ambiguity resolution |
+| L2 Structured Planning | Task DAG, subtask ordering, dependency graph |
+| L2.5 Drift Monitor | Detection strategy, thresholds, escalation rules |
+| L3 Grounding Checkpoints | Verification steps per checkpoint |
+| L4 Adversarial Verification | Attack vectors, falsifiable criteria |
+| L5 Observability | Required metrics, instrumentation points |
+| L6 Memory | Wiki read/write contract, relevant pages |
+| L7 Orchestration | Workflow DAG, approval gates |
+| L8 Wiki Query | Search query, context assembly strategy |
+The renderer combines all fragments into a single provider-native prompt.
+## Design Principles
+1. **Official guidance over empirical observation.** Provider docs are the primary source. Empirical failure modes are layered on top, not the foundation.
+2. **Provider-native, not strictness-relaxation.** Generate different prompts for different providers. Never generate one canonical prompt and relax.
+3. **Semantic spec is the source of truth.** Prompt text is ephemeral. The spec is what gets versioned, tested, and audited.
+4. **Renderer is pluggable.** New providers/models get new renderers. The spec stays stable.
+5. **Validate against provider conventions.** Lint prompts against known provider rules before sending (e.g., "Gemini temperature != 1.0" = block).
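Principle 5 could look roughly like the lint below. It is a sketch under assumptions: `OutgoingRequest` and `lintRequest` are hypothetical names, only the Gemini temperature rule quoted above is encoded, and an unset temperature is assumed to mean "use the provider default" and is allowed through.

```typescript
// Hypothetical request shape; the real harness types are not shown in the source.
interface OutgoingRequest {
  provider: "openai" | "anthropic" | "google";
  temperature?: number;
}

type LintResult = { ok: true } | { ok: false; reason: string };

function lintRequest(request: OutgoingRequest): LintResult {
  // Rule from the text: a Gemini request with temperature != 1.0 is blocked.
  // Assumption: leaving temperature unset is treated as acceptable.
  if (
    request.provider === "google" &&
    request.temperature !== undefined &&
    request.temperature !== 1.0
  ) {
    return {
      ok: false,
      reason: `Gemini temperature must be 1.0, got ${request.temperature}`,
    };
  }
  return { ok: true };
}

console.log(lintRequest({ provider: "google", temperature: 0.2 }));
console.log(lintRequest({ provider: "google", temperature: 1.0 }));
```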
+## Implementation Notes
+- **Phase P22b** in [[harness-implementation-plan]] adds the prompt renderer module
+- Each renderer is a standalone module: `lib/renderers/openai.ts`, `lib/renderers/anthropic.ts`, `lib/renderers/google.ts`
+- Fallback renderer for unknown models: conservative markdown format
+- Renderer config stored in `.pi/harness/renderers.json`
+- Model-to-renderer mapping in `.pi/harness/model-profiles.json`
+## Source
+Derived from official provider documentation:
+- [[openai-prompt-guidance]]
+- [[anthropic-prompt-best-practices]]
+- [[gemini-3-prompting-guide]]

package/vault/wiki/concepts/quality-signal-sentrux.md ADDED

@@ -0,0 +1,37 @@
+---
+type: concept
+title: "Quality Signal (sentrux)"
+created: 2026-05-03
+tags:
+  - sentrux
+  - code-quality
+  - metrics
+related:
+  - "[[Five Root Cause Metrics (sentrux)]]"
+  - "[[sentrux]]"
+sources:
+  - "[[sentrux-docs-quality-signal]]"
+  - "[[sentrux-docs-root-cause-metrics]]"
+---
+# Quality Signal (sentrux)
+A single continuous scalar score (0–10,000) computed as the geometric mean of 5 normalized root cause metrics:
+```
+quality_signal = (modularity × acyclicity × depth × equality × redundancy)^(1/5) × 10000
+```
+## Theoretical Basis
+The geometric mean is motivated by Nash's bargaining theorem (1950): maximizing the product of the metrics (equivalent to maximizing their geometric mean) is the unique aggregation rule satisfying Pareto optimality, symmetry, scale invariance, and independence of irrelevant alternatives.
+## Properties
+- **Ungameable:** No metric can be improved in isolation without genuine structural improvement. Adding useless edges decreases modularity. Removing files without restructuring doesn't affect modularity.
+- **Language-agnostic:** Uses graph-theoretic properties computed from tree-sitter parse trees. Works identically across 52 languages.
+- **Monotonic convergence:** Designed to improve monotonically under correct agent action — like gradient descent.
+## Why Not Letter Grades?
+Letter grades (A-F) collapse information. A codebase with a D in acyclicity but an A in everything else can receive the same overall grade as one with a B across the board. A continuous score preserves the resolution needed to track incremental improvement.
+## Normalization
+Each metric is normalized to [0, 1] before aggregation, so every metric carries equal weight regardless of its raw units.
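The formula is easy to check numerically. The sketch below assumes the five metrics are already normalized to [0, 1]; the `RootCauseMetrics` type, the function name, and the rounding to an integer are illustrative choices, not sentrux's API.

```typescript
// Hypothetical container for the five normalized root cause metrics.
interface RootCauseMetrics {
  modularity: number;
  acyclicity: number;
  depth: number;
  equality: number;
  redundancy: number;
}

function qualitySignal(m: RootCauseMetrics): number {
  const values = [m.modularity, m.acyclicity, m.depth, m.equality, m.redundancy];
  for (const v of values) {
    if (!(v >= 0 && v <= 1)) {
      throw new RangeError(`metric out of [0, 1]: ${v}`);
    }
  }
  // Geometric mean: fifth root of the product, scaled to 0..10000.
  const product = values.reduce((acc, v) => acc * v, 1);
  // Rounding is an assumption made here for readability.
  return Math.round(Math.pow(product, 1 / values.length) * 10000);
}

// One weak metric drags the whole score down: 0.1^(1/5) is about 0.631,
// whereas the arithmetic mean of the same inputs would be 0.82.
console.log(
  qualitySignal({ modularity: 1, acyclicity: 0.1, depth: 1, equality: 1, redundancy: 1 }),
);
```

A single metric at zero forces the signal to zero, which is the property that makes improving one metric at the expense of another unattractive.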

package/vault/wiki/concepts/repo-map-ranking.md ADDED

@@ -0,0 +1,42 @@
+---
+type: concept
+title: "Repo Map Ranking"
+created: 2026-04-30
+updated: 2026-04-30
+tags:
+  - agent-context
+  - graph-algorithms
+  - context-window
+related:
+  - "[[aider-repomap-tree-sitter]]"
+  - "[[progressive-disclosure-agents]]"
+status: developing
+---
+# Repo Map Ranking
+A graph-based algorithm for selecting the most important portions of a codebase to present to an agent within a token budget.
+## Algorithm (from Aider)
+1. Parse every source file with tree-sitter to extract AST
+2. Identify all symbol definitions (classes, functions, methods, variables, types)
+3. Identify all cross-references (where each symbol is used)
+4. Build a dependency graph: nodes = files, edges = cross-file references
+5. Rank nodes by a centrality measure (most-referenced symbols = most important)
+6. Select top-ranked nodes that fit within the token budget
+7. For each selected node, include the symbol signatures (not full implementation)
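Steps 4 to 6 can be sketched without tree-sitter by assuming the cross-file references have already been extracted. The sketch swaps in plain in-degree (number of distinct files referencing a file) for the centrality measure, which is a deliberately simplified stand-in rather than Aider's ranking; `FileNode`, `rankFiles`, and `selectWithinBudget` are hypothetical names.

```typescript
// Hypothetical pre-parsed input: one entry per source file.
interface FileNode {
  path: string;
  signatureTokens: number; // token cost of including this file's signatures
  references: string[]; // paths of files whose symbols this file uses
}

// Score each file by how many other files reference it (in-degree).
function rankFiles(files: FileNode[]): Map<string, number> {
  const score = new Map<string, number>();
  for (const file of files) score.set(file.path, 0);
  for (const file of files) {
    const distinct = file.references.filter((t, i, all) => all.indexOf(t) === i);
    for (const target of distinct) {
      if (target !== file.path && score.has(target)) {
        score.set(target, (score.get(target) ?? 0) + 1);
      }
    }
  }
  return score;
}

// Greedily take the highest-ranked files that still fit the token budget.
function selectWithinBudget(files: FileNode[], tokenBudget: number): string[] {
  const score = rankFiles(files);
  const ranked = files
    .slice()
    .sort((a, b) => (score.get(b.path) ?? 0) - (score.get(a.path) ?? 0));
  const selected: string[] = [];
  let used = 0;
  for (const file of ranked) {
    if (used + file.signatureTokens <= tokenBudget) {
      selected.push(file.path);
      used += file.signatureTokens;
    }
  }
  return selected;
}

const repo: FileNode[] = [
  { path: "a.ts", signatureTokens: 50, references: ["b.ts", "c.ts"] },
  { path: "b.ts", signatureTokens: 40, references: ["c.ts"] },
  { path: "c.ts", signatureTokens: 30, references: [] },
  { path: "d.ts", signatureTokens: 20, references: ["c.ts"] },
];
console.log(selectWithinBudget(repo, 60));
```

A PageRank-style iteration would replace `rankFiles` without changing the budgeted selection step.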
+## Properties
+- **PageRank-inspired**: importance flows through the graph via references
+- **Language-aware**: tree-sitter provides accurate AST parsing per language
+- **Token-budgeted**: always fits in the context window
+- **Dynamic**: can recompute for sub-trees when working on specific areas
+## Why Ranking Matters
+Without ranking, agents either:
+- Get nothing beyond the current file (miss cross-file dependencies)
+- Get everything (blows context window on large repos)
+Ranking provides the "Goldilocks" middle: the most important context that fits.

package/vault/wiki/concepts/result-monad-error-handling.md ADDED

@@ -0,0 +1,47 @@
+---
+type: concept
+status: developing
+tags:
+  - typescript
+  - error-handling
+  - functional-programming
+  - result-monad
+related:
+  - "[[ts-result-error-handling-kkalamarski]]"
+  - "[[Research: TypeScript Best Practices and Codebase Structure]]"
+created: 2026-05-02
+updated: 2026-05-02
+---
+# Result Monad Error Handling
+A functional programming pattern for TypeScript that treats errors as **values** rather than exceptions. Instead of throwing and catching, functions return a `Result<Ok, Err>` type that is either a success (`Ok`) or failure (`Err`).
+## The Pattern
+```typescript
+type Result<Ok, Err> = {
+  map<O>(f: (v: Ok) => O): Result<O, Err>;
+  flatMap<O>(f: (v: Ok) => Result<O, Err>): Result<O, Err>;
+  match<O>(handlers: { Ok: (v: Ok) => O; Err: (e: Err) => O }): O;
+};
+```
+## Core Principle: Wrap Early, Unwrap Late
+1. **Wrap**: Convert fallible operations to `Result` immediately using `Ok(value)` or `Err(error)`
+2. **Compose**: Chain transformations with `.map()` and `.flatMap()`. Operations are only applied to `Ok` values — errors propagate automatically
+3. **Unwrap**: Only at the boundary (UI, API response) use `.match()` to handle both cases
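To make wrap, compose, and unwrap concrete, here is one hand-rolled way to implement the `Result` shape shown earlier. It is a sketch, not the API of neverthrow, ts-results, fp-ts, or effect-ts; the `Ok`/`Err` constructors and the `parsePort` example are written for illustration only.

```typescript
type Result<T, E> = {
  map<O>(f: (v: T) => O): Result<O, E>;
  flatMap<O>(f: (v: T) => Result<O, E>): Result<O, E>;
  match<O>(handlers: { Ok: (v: T) => O; Err: (e: E) => O }): O;
};

// Success case: every operation is applied to the wrapped value.
function Ok<T, E = never>(value: T): Result<T, E> {
  return {
    map: <O>(f: (v: T) => O): Result<O, E> => Ok<O, E>(f(value)),
    flatMap: <O>(f: (v: T) => Result<O, E>): Result<O, E> => f(value),
    match: <O>(handlers: { Ok: (v: T) => O; Err: (e: E) => O }): O =>
      handlers.Ok(value),
  };
}

// Failure case: map and flatMap skip their callback and carry the error along.
function Err<T = never, E = unknown>(error: E): Result<T, E> {
  return {
    map: <O>(_f: (v: T) => O): Result<O, E> => Err<O, E>(error),
    flatMap: <O>(_f: (v: T) => Result<O, E>): Result<O, E> => Err<O, E>(error),
    match: <O>(handlers: { Ok: (v: T) => O; Err: (e: E) => O }): O =>
      handlers.Err(error),
  };
}

// Wrap: convert a fallible parse into a Result immediately.
const parsePort = (raw: string): Result<number, string> => {
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 && n < 65536
    ? Ok<number, string>(n)
    : Err<number, string>(`invalid port: ${raw}`);
};

// Compose with map, then unwrap once at the boundary with match.
const describeNextPort = (raw: string): string =>
  parsePort(raw)
    .map((port) => port + 1)
    .match({
      Ok: (port) => `next port ${port}`,
      Err: (message) => `error: ${message}`,
    });

console.log(describeNextPort("8080")); // next port 8081
console.log(describeNextPort("abc")); // error: invalid port: abc
```

Because the value is only reachable through `match`, a caller cannot read the success value without also supplying an error handler.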
+## Benefits
+- No scattered try-catch blocks
+- Error handling is explicit in the type signature
+- Impossible to forget to handle errors — the compiler enforces it
+- Error information can be carried through the call chain without additional parameters
+## Trade-offs
+- Not idiomatic TypeScript (language uses exceptions natively)
+- Requires team buy-in and consistency
+- Library support: neverthrow, ts-results, fp-ts, effect-ts
+- Adds abstraction overhead for simple cases

package/vault/wiki/concepts/safety-defense-in-depth.md ADDED

@@ -0,0 +1,83 @@
+---
+type: concept
+tags:
+  - safety
+  - security
+  - architecture
+  - agent-governance
+related:
+  - "[[Agent Harness Architecture]]"
+  - "[[sources/opendev-arxiv-2603.05344v1]]"
+  - "[[sources/disler-pi-vs-claude-code]]"
+---
+# Safety Defense-in-Depth
+A multi-layer safety architecture where each layer independently prevents a class of harm. No single bypass compromises the system. The key principle: make unsafe operations structurally impossible rather than relying on runtime permission checks that the agent can probe or argue against.
+## Five Safety Layers
+### Layer 1: Prompt-Level Guardrails
+System prompt encodes security policy, action safety rules, git workflow requirements, error recovery patterns. The first line of defense — guides model reasoning toward safe behavior.
+### Layer 2: Schema-Level Tool Restrictions
+Tools absent from the agent's schema cannot be invoked, argued about, or probed. This is more robust than runtime checks:
+- **Plan mode**: Only read-only tools in schema. Write tools don't exist from the model's perspective.
+- **Subagent filtering**: Each subagent receives only tools relevant to its role.
+- **MCP discovery gating**: External tools appear only after explicit search + selection.
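Schema-level gating amounts to filtering the tool list before it is ever sent to the model. A minimal sketch, assuming each tool carries a `readOnly` flag (the `ToolDef` shape, the `Mode` union, and `toolsForMode` are hypothetical):

```typescript
// Hypothetical tool descriptor; real schemas carry parameters as well.
interface ToolDef {
  name: string;
  readOnly: boolean;
}

type Mode = "plan" | "execute";

// In plan mode, write tools are dropped from the schema entirely,
// so the model never sees them and cannot attempt to call them.
function toolsForMode(allTools: ToolDef[], mode: Mode): ToolDef[] {
  return mode === "plan" ? allTools.filter((tool) => tool.readOnly) : allTools.slice();
}

const registry: ToolDef[] = [
  { name: "read_file", readOnly: true },
  { name: "apply_patch", readOnly: false },
  { name: "bash", readOnly: false },
];

console.log(toolsForMode(registry, "plan").map((tool) => tool.name));
```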
+### Layer 3: Runtime Approval System
+Three autonomy levels:
+- **Manual**: Every tool call requires explicit approval
+- **Semi-Auto**: Read-only commands auto-approved; writes prompt for confirmation
+- **Auto**: All operations approved for trusted workflows
+Rule types evaluated in priority order:
+- **Danger**: Regex match with auto-deny (cannot be overridden). Default rules: `rm -rf /`, `rm -rf *`, `chmod 777`
+- **Pattern**: Regex match against command string
+- **Command**: Exact match
+- **Prefix**: Prefix match (e.g., `git` matches `git push`)
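The priority ordering of rule types can be sketched as a two-level loop: rule kinds in priority order, then the rules of that kind. The `Rule` union, the `Decision` values, the example regexes, and the fallback to `"ask"` for unmatched commands are all assumptions made for illustration.

```typescript
type Decision = "allow" | "ask" | "deny";

// Hypothetical rule shapes for the four rule types listed above.
type Rule =
  | { kind: "danger"; regex: RegExp }
  | { kind: "pattern"; regex: RegExp; decision: Decision }
  | { kind: "command"; exact: string; decision: Decision }
  | { kind: "prefix"; prefix: string; decision: Decision };

const PRIORITY: Rule["kind"][] = ["danger", "pattern", "command", "prefix"];

function matches(rule: Rule, command: string): boolean {
  switch (rule.kind) {
    case "danger":
    case "pattern":
      return rule.regex.test(command);
    case "command":
      return rule.exact === command;
    case "prefix":
      return command === rule.prefix || command.indexOf(rule.prefix + " ") === 0;
  }
}

function evaluate(rules: Rule[], command: string): Decision {
  for (const kind of PRIORITY) {
    for (const rule of rules) {
      if (rule.kind !== kind || !matches(rule, command)) continue;
      // Danger rules always deny and cannot be overridden by later rules.
      return rule.kind === "danger" ? "deny" : rule.decision;
    }
  }
  return "ask"; // assumption: unmatched commands require confirmation
}

const rules: Rule[] = [
  { kind: "prefix", prefix: "rm", decision: "allow" },
  { kind: "danger", regex: /\brm\s+-rf\s+(\/|\*)(\s|$)/ },
  { kind: "danger", regex: /\bchmod\s+777\b/ },
  { kind: "prefix", prefix: "git", decision: "allow" },
];

console.log(evaluate(rules, "rm -rf /"));
console.log(evaluate(rules, "git push"));
```

Even though a permissive `rm` prefix rule appears first in the list, the danger rule wins because its kind is evaluated first.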
+### Layer 4: Tool-Level Validation
+Per-tool safety checks executed during tool handling:
+- **DANGEROUS_PATTERNS blocklist**: `rm -rf`, `sudo`, fork bombs, `curl|bash` pipes, `dd` to devices
+- **Stale-read detection**: Rejects edits if file was modified since last read
+- **Output truncation**: Caps at 30,000 chars with head-tail strategy
+- **Timeouts**: 60s idle timeout, 600s absolute timeout
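The head-tail truncation strategy keeps the beginning and end of a long output and drops the middle. The sketch below assumes an even split between head and tail and a fixed marker string; neither detail is specified above, and `truncateHeadTail` is a hypothetical name.

```typescript
const TRUNCATION_MARKER = "\n[... output truncated ...]\n";

// Cap output at maxChars, keeping the head and the tail around a marker.
function truncateHeadTail(output: string, maxChars = 30000): string {
  if (output.length <= maxChars) return output;
  const budget = maxChars - TRUNCATION_MARKER.length;
  if (budget <= 0) return output.slice(0, maxChars);
  const headLength = Math.ceil(budget / 2); // assumption: even head/tail split
  const tailLength = budget - headLength;
  const head = output.slice(0, headLength);
  const tail = output.slice(output.length - tailLength);
  return head + TRUNCATION_MARKER + tail;
}

const longOutput = "A".repeat(60) + "Z".repeat(40);
console.log(truncateHeadTail(longOutput, 50));
```

Keeping the tail matters for command output because errors and exit summaries usually appear at the end.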
+### Layer 5: Lifecycle Hooks
+External scripts register for lifecycle events (PreToolUse, PostToolUse, SessionStart, Stop). Hooks can:
+- **Block** (exit code 2): Prevent tool execution entirely
+- **Mutate**: Modify tool arguments before execution (e.g., inject `--dry-run`)
+- **Observe**: Async logging/auditing after execution
+## Key Safety Patterns
+### Schema Gating > Permission Checks
+Removing tools from the agent's schema eliminates the entire class of attack — the model cannot reason about capabilities it doesn't know exist. Runtime permission checks let the agent argue for exceptions.
+### Approval Persistence
+Approval rules persist to disk across sessions. Without persistence, users re-approve same operations every session, leading to approval fatigue and blanket auto-approval.
+### Doom-Loop Detection
+An MD5 fingerprint of `(tool_name, tool_args)` is tracked in a sliding window of the last 20 calls. The same fingerprint appearing 3 or more times triggers a warning, then escalates to an approval pause.
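A sliding-window detector along these lines might look as follows. To stay dependency-free, the sketch uses the serialized `(tool_name, tool_args)` string itself as the fingerprint instead of an MD5 hash; `DoomLoopDetector` and `record` are hypothetical names.

```typescript
class DoomLoopDetector {
  private readonly recent: string[] = [];
  private readonly windowSize: number;
  private readonly threshold: number;

  constructor(windowSize = 20, threshold = 3) {
    this.windowSize = windowSize;
    this.threshold = threshold;
  }

  // Records a call and returns true when it has repeated too often in the window.
  record(toolName: string, toolArgs: unknown): boolean {
    // Stand-in for the MD5 fingerprint; note that key order in args affects equality.
    const fingerprint = JSON.stringify([toolName, toolArgs]);
    this.recent.push(fingerprint);
    if (this.recent.length > this.windowSize) this.recent.shift();
    const repeats = this.recent.filter((f) => f === fingerprint).length;
    return repeats >= this.threshold;
  }
}

const detector = new DoomLoopDetector();
const call = { command: "npm test" };
console.log(detector.record("bash", call)); // false
console.log(detector.record("bash", call)); // false
console.log(detector.record("bash", call)); // true
```

The caller decides what a `true` result means, for example warning first and pausing for approval on the next repeat.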
+### Resource Bounding
+Every resource that grows with session length must have a cap: iteration limits, nudge budgets, undo history (50 ops), concurrent tool calls (5 max).
+## Pi's Damage Control Extension
+disler's `damage-control` extension maps to Layers 3-4:
+- **Dangerous Commands**: Regex patterns with `ask: true` or strict block
+- **Zero Access Paths**: `.env`, `~/.ssh/`, `*.pem`
+- **Read-Only Paths**: `package-lock.json`, `/etc/`
+- **No-Delete Paths**: `.git/`, `Dockerfile`, `README.md`
+## Relevance to Our Harness
+Current state: No safety defense-in-depth. We need:
+- Schema-level filtering for subagents (already have `allowed-tools` concept)
+- Runtime approval system for dangerous commands
+- DANGEROUS_PATTERNS blocklist
+- Stale-read detection for file edits
+- Doom-loop detection for repeated tool calls

package/vault/wiki/concepts/sandbox-os-enforcement.md ADDED

@@ -0,0 +1,18 @@
+---
+type: concept
+status: stub
+created: 2026-05-02
+updated: 2026-05-02
+tags: [concept, sandbox, security]
+---
+# Sandbox OS Enforcement
+OS-level sandboxing for agent tool execution. Codex uses this as its foundation, with permissions layered on top as policy (First Principle #16). It ensures tool calls cannot escape their sandbox environment.
+This contrasts with application-level sandboxing: OS-level enforcement provides stronger isolation guarantees.
+## References
+- [[codex-harness-innovations]]
+- [[codex-open-source-agent-2026]]

package/vault/wiki/concepts/selective-debate-routing.md ADDED

@@ -0,0 +1,70 @@
+---
+type: concept
+title: "Selective Debate Routing"
+created: 2026-04-30
+updated: 2026-04-30
+status: active
+tags:
+  - debate
+  - consensus
+  - token-efficiency
+  - multi-agent
+related:
+  - "[[consensus-debate]]"
+  - "[[adr-011]]"
+  - "[[harness-implementation-plan]]"
+sources:
+  - "[[fan2025-imad]]"
+---
+# Selective Debate Routing
+The practice of triggering multi-agent debate only when likely to be beneficial, rather than for every query. From iMAD (Fan et al., AAAI 2026).
+## Core Insight
+> Multi-Agent Debate can overturn correct single-agent answers. Always-on debate wastes tokens AND can reduce accuracy.
+## iMAD Mechanism
+1. Single agent produces structured self-critique response
+2. Extract 41 linguistic/semantic features (hesitation cues):
+   - Uncertainty markers ("might", "could be", "I think")
+   - Contradictory statements
+   - Missing evidence references
+   - Low confidence indicators
+3. Lightweight classifier (FocusCal loss) → debate or skip
+4. Generalizes across datasets without per-task tuning
+## Results
+- **92% token reduction** vs always-debate
+- **13.5% accuracy improvement** vs single-agent
+- Works across 6 QA datasets, 5 baselines
+## Impact on Our Consensus Debate (ADR-011)
+Current ADR-011 design assumes:
+- Debate always beneficial
+- Always worth the ~13,000 token cost per subtask
+- Always improves over single-pass review
+iMAD suggests we should:
+1. Add a **pre-debate gate**: single agent self-critiques first
+2. If confidence is high + no hesitation cues → skip debate, save tokens
+3. If uncertainty detected → trigger debate
+4. This could cut debate token cost substantially: iMAD reports a 92% reduction versus always-on debate, though our savings would depend on how often the gate skips debate
+## Implementation Sketch
+```
+Task → Single agent self-critique → Extract hesitation features
+  ├─ High confidence → Skip debate, proceed
+  └─ Uncertainty detected → Trigger consensus debate (per ADR-011)
+       └─ Consensus reached → File winning position to wiki/consensus/ (mandatory, per [[consensus-debate]])
+```
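The gate in the sketch above could start as something as crude as counting uncertainty markers in the self-critique. This is a heavily simplified stand-in for iMAD's trained classifier over 41 features: it uses only the three example markers quoted earlier, and `countHesitationCues`, `shouldDebate`, and the `maxCues` threshold are hypothetical.

```typescript
// Only the example markers listed above; the real feature set is far larger.
const UNCERTAINTY_MARKERS = ["might", "could be", "i think"];

// Counts case-insensitive substring occurrences of each marker.
function countHesitationCues(selfCritique: string): number {
  const text = selfCritique.toLowerCase();
  let count = 0;
  for (const marker of UNCERTAINTY_MARKERS) {
    let from = text.indexOf(marker);
    while (from !== -1) {
      count += 1;
      from = text.indexOf(marker, from + marker.length);
    }
  }
  return count;
}

// Assumption: any cue above the threshold routes the task to debate.
function shouldDebate(selfCritique: string, maxCues = 0): boolean {
  return countHesitationCues(selfCritique) > maxCues;
}

console.log(shouldDebate("The fix is correct and all tests pass."));
console.log(shouldDebate("I think this might break the legacy login flow."));
```

A keyword count like this will misfire on words such as "mighty"; it only illustrates where the gate sits in the flow, not how the classifier decides.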
+## Open Questions
+- Do hesitation cues in code review differ from QA tasks?
+- Can a single classifier work across L1 (spec), L2 (plan), L4 (code) debates?
+- Should the classifier be model-specific (different models show different hesitation patterns)?

package/vault/wiki/concepts/self-evolving-harness.md ADDED

@@ -0,0 +1,60 @@
+---
+type: concept
+title: "Self-Evolving Harness"
+created: 2026-04-30
+updated: 2026-04-30
+status: seed
+tags:
+  - harness
+  - auto-evolution
+  - meta-learning
+related:
+  - "[[agentic-harness]]"
+  - "[[harness-implementation-plan]]"
+  - "[[model-adaptive-harness]]"
+sources:
+  - "[[lou2026-autoharness]]"
+  - "[[lee2026-meta-harness]]"
+  - "[[meng2026-agent-harness-survey]]"
+---
+# Self-Evolving Harness
+The concept that a harness can automatically improve itself through iterative refinement, execution traces, and outer-loop optimization — rather than requiring manual human engineering.
+## Two Approaches
+### AutoHarness (Lou et al., 2026)
+- Small model generates harness code iteratively
+- Environment provides feedback (illegal move detection, scores)
+- LLM refines harness from failure signals
+- Result: synthesized harness + small model beats large model bare
+### Meta-Harness (Lee et al., 2026)
+- Outer-loop system searches over harness code
+- Agentic proposer accesses ALL prior candidates (code, traces, scores)
+- Filesystem-based memory of experiments
+- Surpasses hand-engineered baselines on TerminalBench-2
+## What Can Evolve
+From the Self-Evolving Agents Survey (Gao et al., 2026):
+- **Models**: Fine-tuning from agent experience
+- **Memory**: What to store, how to retrieve
+- **Tools**: Which tools to use, how to invoke them
+- **Architecture**: Agent topology, routing decisions
+## Relevance to Our Harness
+Our harness is currently static — manually designed skill files, schemas, gates. Self-evolution suggests:
+1. **Token budget auto-tuning**: Actual vs. budgeted token usage per layer → adjust budgets automatically
+2. **Gate threshold auto-tuning**: Which gates catch real issues vs. false positives → remove unnecessary gates
+3. **Model profile auto-learning**: Instead of hand-coding model profiles (opus/gpt/gemini), learn from execution traces
+4. **Debate routing**: Auto-decide whether a spec/plan/implementation needs debate (cf. iMAD selective debate)
+## Risks
+- Self-modifying harnesses can introduce bugs
+- Auto-removing gates may remove safety-critical checks
+- Meta-harness optimization may overfit to specific benchmarks
+- Needs safety bounds: which components can self-evolve vs. must remain manually controlled

package/vault/wiki/concepts/sentrux-mcp-integration.md ADDED

@@ -0,0 +1,36 @@
+---
+type: concept
+title: "sentrux MCP Integration"
+created: 2026-05-03
+tags:
+  - sentrux
+  - mcp
+  - ai-agents
+related:
+  - "[[sentrux]]"
+  - "[[Quality Signal (sentrux)]]"
+sources:
+  - "[[sentrux-github-repo]]"
+---
+# sentrux MCP Integration
+sentrux runs as a Model Context Protocol (MCP) server, giving AI coding agents real-time access to architectural health data.
+## Setup
+- **Claude Code:** `/plugin marketplace add sentrux/sentrux` then `/plugin install sentrux`
+- **Other clients (Cursor, Windsurf, OpenCode):** Add to MCP config: `{"command": "sentrux", "args": ["--mcp"]}`
+## Agent Workflow
+```
+scan("/project")       → quality_signal: 7342, files: 139, bottleneck: "modularity"
+session_start()        → Baseline saved
+... agent writes code ...
+session_end()          → pass: false, before: 7342, after: 6891
+```
+## Available Tools (9 total)
+`scan`, `health`, `session_start`, `session_end`, `rescan`, `check_rules`, `evolution`, `dsm`, `test_gaps`
+## Key Design
+The feedback loop closes automatically: sentrux provides the sensor (structural health), rules provide the spec (what "good" looks like), and the AI agent is the actuator (makes changes). No human intervention needed for routine quality checks.

package/vault/wiki/concepts/sentrux-rules-engine.md ADDED

@@ -0,0 +1,49 @@
+---
+type: concept
+title: "sentrux Rules Engine"
+created: 2026-05-03
+tags:
+  - sentrux
+  - architecture-governance
+  - ci
+related:
+  - "[[sentrux]]"
+  - "[[sentrux MCP Integration]]"
+sources:
+  - "[[sentrux-docs-rules-engine]]"
+---
+# sentrux Rules Engine
+A TOML-based constraint system that defines and enforces architectural rules. Configured via `.sentrux/rules.toml` at the project root.
+## Capabilities
+- **Global constraints:** max cycles, max coupling grade, max cyclomatic complexity, god file detection
+- **Layer definitions:** ordered dependency hierarchy (lower order = more foundational)
+- **Boundary rules:** block specific dependency paths with human-readable reasons
+## Execution Modes
+1. **CLI:** `sentrux check .` — exit 0 (pass) or 1 (fail), CI-friendly
+2. **MCP:** Agent calls `check_rules()` — gets structured violation list, can self-correct before human sees
+## Example
+```toml
+[constraints]
+max_cycles = 0
+max_coupling = "B"
+max_cc = 25
+no_god_files = true
+[[layers]]
+name = "core"
+paths = ["src/core/*"]
+order = 0
+[[boundaries]]
+from = "src/app/*"
+to = "src/core/internal/*"
+reason = "App must not depend on core internals"
+```
+## Integration
+Works in CI (GitHub Actions), as pre-merge gate, and as MCP tool for AI agents. Agents receive structured violation data and can fix before committing.

package/vault/wiki/concepts/shell-pattern-compression.md ADDED

@@ -0,0 +1,24 @@
+---
+type: concept
+title: "shell-pattern-compression"
+created: 2026-04-30
+updated: 2026-04-30
+status: seed
+tags: [concept, lean-ctx, context-optimization]
+related:
+  - "[[lean-ctx]]"
+  - "[[Research: context-mode vs lean-ctx]]"
+---
+# shell-pattern-compression
+> [!stub] This is a stub page. See [[lean-ctx]] and [[leanctx-website]] for details.
+A lean-ctx feature that recognizes 90+ command patterns (git, npm, cargo, docker, kubectl, etc.) and intelligently compresses their output. Instead of passing raw terminal output to the LLM, lean-ctx strips irrelevant lines, summarizes, and formats output for maximum information density per token.
+Part of lean-ctx's broader context compression strategy alongside AST-based code compression.
+## Key pages
+- [[lean-ctx]] — Context Runtime for AI Agents
+- [[leanctx-website]] — source documentation