npm - maestro-bundle - Versions diffs - 1.3.0 → 1.4.0 - Mend

maestro-bundle 1.3.0 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (118) hide show

package/templates/bundle-ai-agents/skills/context-engineering/SKILL.md CHANGED Viewed

@@ -1,48 +1,72 @@
 ---
 name: context-engineering
-description: Implementar as 4 estratégias de context engineering (Write, Select, Compress, Isolate) para agentes. Use quando precisar gerenciar a janela de contexto, otimizar o que o agente recebe, ou reduzir custos de tokens.
+description: Implement the 4 context engineering strategies (Write, Select, Compress, Isolate) for AI agents. Use when managing context windows, optimizing what an agent receives, or reducing token costs.
+version: 1.0.0
+author: Maestro
 ---
 # Context Engineering
-As 4 estratégias conforme Anthropic:
+Apply the four context engineering strategies -- Write, Select, Compress, Isolate -- to maximize agent effectiveness while minimizing token costs.
-## 1. Write Context — Memória Persistente
+## When to Use
+- Designing what context an agent receives before execution
+- Optimizing a system prompt that is too long or unfocused
+- Reducing token costs on expensive LLM calls
+- Setting up context isolation between agents in a multi-agent system
+- Debugging an agent that "forgets" instructions or loses focus
-O que o agente "sabe" antes de começar.
+## Available Operations
+1. Write persistent context (CLAUDE.md, agents.md, skills)
+2. Select relevant context via retrieval
+3. Compress context to reduce token usage
+4. Isolate context per agent scope
+5. Budget context allocation across the window
+## Multi-Step Workflow
+### Step 1: Write Context -- Persistent Memory
+Define what the agent "knows" before any task begins. This is your baseline context layer.
 ```
-CLAUDE.md          → Padrões do projeto, arquitetura, decisões
-agents.md          → Comportamento específico do agente
-skills/SKILL.md    → Capacidades on-demand
-memory/            → Aprendizados de execuções anteriores
+CLAUDE.md          -> Project standards, architecture, decisions
+agents.md          -> Agent-specific behavior and role definition
+skills/SKILL.md    -> On-demand capabilities loaded when needed
+memory/            -> Learnings from previous executions
+```
+Check your CLAUDE.md token count:
+```bash
+wc -w CLAUDE.md  # Should be under ~1500 words (~2000 tokens)
 ```
-**Regra:** CLAUDE.md deve ter no máximo 2000 tokens. Se precisar de mais, mover para skills que são carregadas on-demand.
+**Rule**: CLAUDE.md must stay under 2000 tokens. If it grows beyond that, move details into skills that are loaded on-demand.
-## 2. Select Context — Retrieval Inteligente
+### Step 2: Select Context -- Retrieval for the Current Task
-Injetar apenas o contexto relevante para a task atual.
+Inject only the context relevant to the current task. Never dump everything.
 ```python
 def select_context(task: Task, retriever) -> str:
-    # Buscar skills relevantes para a task
+    # Retrieve skills relevant to the task
     relevant_skills = retriever.invoke(task.description)
-    # Buscar código relacionado no repo
+    # Search for related code in the repository
     related_code = code_search(task.description, worktree_path)
-    # Buscar decisões anteriores similares
+    # Find similar past decisions
     past_decisions = memory_store.search(task.description, k=3)
     return format_context(relevant_skills, related_code, past_decisions)
 ```
-**Regra:** Nunca injetar mais de 30% da janela de contexto com contexto selecionado. Deixar espaço para o agente raciocinar.
+**Rule**: Never inject more than 30% of the context window with selected context. Leave space for the agent to reason.
-## 3. Compress Context — Resumo Eficiente
+### Step 3: Compress Context -- Reduce Without Losing Essentials
-Reduzir informação sem perder o essencial.
+When context exceeds budget, compress it while preserving critical information.
 ```python
 async def compress_code(code: str, max_tokens: int = 2000) -> str:
@@ -50,23 +74,23 @@ async def compress_code(code: str, max_tokens: int = 2000) -> str:
         return code
     summary = await llm.ainvoke(f"""
-    Resuma este código mantendo:
-    - Assinaturas de funções/classes
-    - Tipos de entrada/saída
-    - Lógica principal (sem detalhes de implementação)
-    - Imports relevantes
+    Summarize this code keeping:
+    - Function/class signatures
+    - Input/output types
+    - Main logic (without implementation details)
+    - Relevant imports
-    Código:
+    Code:
     {code}
     """)
     return summary.content
 ```
-**Regra:** Comprimir apenas quando necessário. Código que o agente vai modificar deve estar completo, não comprimido.
+**Rule**: Never compress code the agent is about to modify. That code must remain complete and uncompressed.
-## 4. Isolate Context — Escopo por Agente
+### Step 4: Isolate Context -- Scope Per Agent
-Cada agente vê apenas o que precisa.
+Each agent should see only what it needs. No shared context windows.
 ```python
 agent_contexts = {
@@ -83,16 +107,60 @@ agent_contexts = {
 }
 ```
-**Regra:** Agentes nunca compartilham janela de contexto. Comunicação via mensagens estruturadas, não via contexto compartilhado.
+**Rule**: Agents never share a context window. Communication happens via structured messages, not shared context.
+### Step 5: Budget Context Allocation
+For a 200k token model, allocate the context window as follows:
-## Budget de contexto
+| Component | % | Tokens | Description |
+|---|---|---|---|
+| System prompt + CLAUDE.md | 5% | 10k | Identity, rules, format |
+| Loaded skills | 10% | 20k | On-demand capabilities |
+| Retrieved code/docs (Select) | 25% | 50k | Task-relevant context |
+| Conversation history | 15% | 30k | Previous messages |
+| **Reasoning space** | **45%** | **90k** | Model's working memory |
-Para um modelo com 200k tokens:
+Verify current usage:
+```bash
+python -m context.budget --prompt system_prompt.md --skills skills/ --history conversation.json
+```
-| Componente | % | Tokens |
-|---|---|---|
-| System prompt + CLAUDE.md | 5% | 10k |
-| Skills carregadas | 10% | 20k |
-| Código relevante (Select) | 25% | 50k |
-| Histórico de conversa | 15% | 30k |
-| **Espaço para raciocínio** | **45%** | **90k** |
+## Resources
+- `references/context-budget-calculator.md` - Formulas and guidelines for calculating context budgets
+- `references/compression-techniques.md` - Techniques for compressing different content types
+## Examples
+### Example 1: Set Up Context for a New Agent
+User asks: "Configure the context strategy for our new QA agent."
+Response approach:
+1. Write: Create agents/qa.md with QA agent identity, rules, and test standards
+2. Select: Configure retriever to pull test files and coverage reports relevant to the task
+3. Isolate: Limit visibility to `tests/`, `src/` (read-only), and CI config files
+4. Budget: Allocate 5% system prompt, 20% test context, 20% source code, 55% reasoning
+### Example 2: Fix an Agent That Loses Focus
+User asks: "Our backend agent keeps forgetting to follow Clean Architecture halfway through long tasks."
+Response approach:
+1. Check CLAUDE.md size -- if over 2000 tokens, move details to skills
+2. Move Clean Architecture rules into a skill that gets loaded per-task
+3. Add a "checkpoint" mechanism: after every 5 tool calls, re-inject key rules
+4. Reduce conversation history to last 10 messages to free reasoning space
+### Example 3: Reduce Token Costs
+User asks: "Our agent costs are too high. Optimize the context usage."
+Response approach:
+1. Audit current context window usage with the budget calculator
+2. Compress code summaries for files the agent reads but does not modify
+3. Reduce retrieved context from k=20 to k=5 with re-ranking
+4. Shorten system prompt by moving examples to a skill file
+5. Target: 40% reduction in input tokens per invocation
+## Notes
+- The 45% reasoning space is sacred -- never fill more than 55% of the window with context
+- CLAUDE.md changes affect all agents -- be deliberate about what goes there
+- Skills are the pressure release valve: move detailed instructions there for on-demand loading
+- Monitor token usage per agent invocation to catch context bloat early
+- Code the agent will modify must always be included uncompressed

package/templates/bundle-ai-agents/skills/context-engineering/references/compression-techniques.md ADDED Viewed

@@ -0,0 +1,76 @@
+# Compression Techniques Reference
+## Technique 1: Code Skeleton
+Keep signatures and types, remove implementation details.
+**Before** (450 tokens):
+```python
+class DemandRepository:
+    def __init__(self, session: Session):
+        self._session = session
+    def find_by_id(self, id: DemandId) -> Demand:
+        model = self._session.query(DemandModel).filter_by(id=str(id)).first()
+        if not model:
+            raise DemandNotFoundException(id)
+        return Demand(
+            id=DemandId(model.id),
+            description=model.description,
+            status=DemandStatus(model.status),
+        )
+    def save(self, demand: Demand) -> None:
+        model = DemandModel(
+            id=str(demand.id),
+            description=demand.description,
+            status=demand.status.value,
+        )
+        self._session.merge(model)
+        self._session.commit()
+```
+**After** (120 tokens):
+```python
+class DemandRepository:
+    def __init__(self, session: Session): ...
+    def find_by_id(self, id: DemandId) -> Demand: ...  # raises DemandNotFoundException
+    def save(self, demand: Demand) -> None: ...  # merges and commits
+```
+## Technique 2: Summary with Key Facts
+For documentation, extract bullet points instead of full text.
+**Before**: 500-word architecture document.
+**After**: 5 bullet points with the critical decisions.
+## Technique 3: History Trimming
+Keep only the last N messages, or summarize older messages.
+```python
+def trim_history(messages: list, max_messages: int = 10) -> list:
+    if len(messages) <= max_messages:
+        return messages
+    # Keep system message + last N messages
+    return [messages[0]] + messages[-max_messages:]
+```
+## Technique 4: Selective File Loading
+Load only the files the agent will interact with, not the entire directory.
+```python
+def select_files(task_description: str, file_index: dict) -> list[str]:
+    # Use embedding similarity to find relevant files
+    relevant = retriever.invoke(task_description)
+    return [f.metadata["path"] for f in relevant[:5]]
+```
+## When NOT to Compress
+- Code the agent is about to modify (needs full context)
+- Error messages and stack traces (details matter)
+- Test files being debugged
+- Configuration files being updated

package/templates/bundle-ai-agents/skills/context-engineering/references/context-budget-calculator.md ADDED Viewed

@@ -0,0 +1,45 @@
+# Context Budget Calculator Reference
+## Budget Formula
+```
+total_available = model_context_window
+reasoning_space = total_available * 0.45  # NEVER reduce below 40%
+usable_context = total_available - reasoning_space
+system_prompt_budget = usable_context * 0.09    # ~5% of total
+skills_budget = usable_context * 0.18           # ~10% of total
+retrieved_context_budget = usable_context * 0.45 # ~25% of total
+history_budget = usable_context * 0.27          # ~15% of total
+```
+## Budget by Model
+| Model | Context Window | System + Skills | Retrieved | History | Reasoning |
+|---|---|---|---|---|---|
+| Claude Sonnet (200k) | 200,000 | 30,000 | 50,000 | 30,000 | 90,000 |
+| Claude Haiku (200k) | 200,000 | 30,000 | 50,000 | 30,000 | 90,000 |
+| GPT-4 Turbo (128k) | 128,000 | 19,200 | 32,000 | 19,200 | 57,600 |
+| GPT-4o (128k) | 128,000 | 19,200 | 32,000 | 19,200 | 57,600 |
+## Warning Signs of Context Bloat
+1. Agent starts ignoring rules mentioned in the system prompt -> system prompt too far back
+2. Agent repeats itself -> conversation history too long, compress or trim
+3. Agent hallucinates code structure -> retrieved context is stale or missing
+4. Agent is slow and expensive -> too much context being sent per invocation
+## Quick Check
+```python
+import tiktoken
+def check_budget(system_prompt, skills, retrieved, history, model_limit=200000):
+    enc = tiktoken.encoding_for_model("gpt-4")  # approximate
+    total = sum(len(enc.encode(t)) for t in [system_prompt, skills, retrieved, history])
+    usage_pct = total / model_limit * 100
+    reasoning_pct = (model_limit - total) / model_limit * 100
+    print(f"Context usage: {usage_pct:.1f}% | Reasoning space: {reasoning_pct:.1f}%")
+    if reasoning_pct < 40:
+        print("WARNING: Reasoning space below 40%. Reduce context.")
+```

package/templates/bundle-ai-agents/skills/database-modeling/SKILL.md CHANGED Viewed

@@ -1,19 +1,55 @@
 ---
 name: database-modeling
-description: Modelar banco de dados PostgreSQL com migrations, índices, e pgvector. Use quando for criar tabelas, definir schema, criar migrations, ou otimizar queries.
+description: Model PostgreSQL databases with migrations, indexes, and pgvector. Use when creating tables, defining schemas, writing migrations, or optimizing queries.
+version: 1.0.0
+author: Maestro
 ---
-# Modelagem de Banco — PostgreSQL
+# Database Modeling
-## Convenções
+Design PostgreSQL schemas with proper conventions, Alembic migrations, performant indexes, pgvector for semantic search, and full-text search capabilities.
-- Nomes de tabelas: `snake_case`, plural (`demands`, `tasks`, `agents`)
-- PKs: `id UUID DEFAULT gen_random_uuid()`
-- FKs: `<tabela_singular>_id` (ex: `demand_id`)
-- Timestamps: `created_at`, `updated_at` com default `NOW()`
+## When to Use
+- Creating new database tables or modifying existing schemas
+- Writing Alembic migrations
+- Adding indexes to optimize slow queries
+- Setting up pgvector for embedding storage
+- Configuring full-text search
+- Reviewing schema design for anti-patterns
+## Available Operations
+1. Design tables following naming conventions
+2. Create Alembic migrations (upgrade and downgrade)
+3. Add indexes for query optimization
+4. Set up pgvector for semantic search
+5. Configure full-text search with tsvector
+6. Analyze and optimize slow queries with EXPLAIN
+## Multi-Step Workflow
+### Step 1: Design the Table Schema
+Follow these naming conventions strictly:
+- Table names: `snake_case`, plural (`demands`, `tasks`, `agents`)
+- Primary keys: `id UUID DEFAULT gen_random_uuid()`
+- Foreign keys: `<singular_table>_id` (e.g., `demand_id`)
+- Timestamps: `created_at`, `updated_at` with default `NOW()`
 - Soft delete: `deleted_at TIMESTAMP NULL`
-## Migration com Alembic
+### Step 2: Create the Alembic Migration
+Generate and write the migration file.
+```bash
+# Generate a new migration
+alembic revision --autogenerate -m "create_demands_table"
+# Or create manually
+alembic revision -m "create_demands_table"
+```
+Write the migration:
 ```python
 # alembic/versions/001_create_demands.py
@@ -32,28 +68,119 @@ def downgrade():
     op.drop_table('demands')
 ```
-## Índices
+Run the migration:
+```bash
+# Apply migration
+alembic upgrade head
+# Verify the table was created
+psql $DATABASE_URL -c "\d demands"
+# Check migration history
+alembic history
+```
+### Step 3: Add Performance Indexes
+Index columns used in WHERE clauses, JOINs, and ORDER BY.
 ```sql
--- Queries frequentes devem ter índice
+-- Partial index for active records (most queries filter by status)
 CREATE INDEX idx_tasks_status ON tasks(status) WHERE status != 'completed';
+-- Foreign key index for JOIN performance
 CREATE INDEX idx_tasks_demand ON tasks(demand_id);
+-- Composite index for common query patterns
 CREATE INDEX idx_tracking_events_demand_agent ON tracking_events(demand_id, agent_id, created_at DESC);
+```
--- pgvector para busca semântica
+### Step 4: Set Up pgvector for Semantic Search
+```bash
+# Enable pgvector extension
+psql $DATABASE_URL -c "CREATE EXTENSION IF NOT EXISTS vector;"
+```
+```sql
+-- Create embeddings table
+CREATE TABLE bundle_embeddings (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    content TEXT NOT NULL,
+    embedding vector(1536) NOT NULL,
+    metadata JSONB,
+    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
+);
+-- HNSW index for fast cosine similarity search
 CREATE INDEX idx_embeddings_vector ON bundle_embeddings
   USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 200);
+```
+### Step 5: Configure Full-Text Search
--- Full-text search
+```sql
+-- Add generated tsvector column
 ALTER TABLE bundles ADD COLUMN search_vector tsvector
-  GENERATED ALWAYS AS (to_tsvector('portuguese', name || ' ' || description)) STORED;
+  GENERATED ALWAYS AS (to_tsvector('english', name || ' ' || description)) STORED;
+-- GIN index for full-text search
 CREATE INDEX idx_bundles_search ON bundles USING GIN(search_vector);
 ```
-## Anti-patterns a evitar
+### Step 6: Analyze and Optimize Queries
+```bash
+# Run EXPLAIN ANALYZE on slow queries
+psql $DATABASE_URL -c "EXPLAIN ANALYZE SELECT * FROM tasks WHERE demand_id = 'abc-123' AND status = 'pending';"
+# Check for missing indexes
+psql $DATABASE_URL -c "SELECT schemaname, tablename, indexname FROM pg_indexes WHERE tablename = 'tasks';"
+# Check table sizes
+psql $DATABASE_URL -c "SELECT pg_size_pretty(pg_total_relation_size('tasks'));"
+```
+## Resources
+- `references/naming-conventions.md` - Complete naming conventions for PostgreSQL schemas
+- `references/index-strategies.md` - When and how to create indexes for different query patterns
+## Examples
+### Example 1: Create a New Feature Table
+User asks: "Create a tasks table with status tracking and assignment to agents."
+Response approach:
+1. Design columns: id (UUID), title, description, status, demand_id (FK), agent_id (FK), timestamps
+2. Write Alembic migration with proper types and defaults
+3. Add foreign key constraints with ON DELETE behavior
+4. Create indexes on demand_id, agent_id, and status
+5. Run migration: `alembic upgrade head`
+6. Verify: `psql $DATABASE_URL -c "\d tasks"`
+### Example 2: Optimize a Slow Query
+User asks: "The demands list endpoint is slow when filtering by status."
+Response approach:
+1. Run `EXPLAIN ANALYZE` on the slow query
+2. Check if an index exists on the status column
+3. Create a partial index: `CREATE INDEX idx_demands_status ON demands(status) WHERE deleted_at IS NULL;`
+4. Re-run `EXPLAIN ANALYZE` and compare execution times
+5. Verify the index is being used (look for "Index Scan" in the plan)
+### Example 3: Add Soft Delete
+User asks: "Implement soft delete for the demands table."
+Response approach:
+1. Write migration to add `deleted_at TIMESTAMP NULL` column
+2. Update partial indexes to exclude soft-deleted rows: `WHERE deleted_at IS NULL`
+3. Update repository queries to filter `WHERE deleted_at IS NULL` by default
+4. Create a `restore` method that sets `deleted_at = NULL`
+5. Run: `alembic upgrade head`
-- Não usar CASCADE DELETE sem pensar nas consequências
-- Não criar índice em toda coluna (custo de escrita)
-- Não fazer SELECT * em produção
-- Não ignorar EXPLAIN ANALYZE para queries lentas
-- Não alterar schema sem migration
+## Notes
+- Never alter schema without a migration -- even in development
+- Always write both `upgrade()` and `downgrade()` functions
+- Do not create indexes on every column (increases write cost)
+- Do not use `SELECT *` in production code -- select only needed columns
+- Always run `EXPLAIN ANALYZE` before and after adding indexes to verify improvement
+- Use `CASCADE DELETE` with extreme caution -- prefer soft delete or application-level cleanup
+- Use `TIMESTAMP WITH TIME ZONE` for all timestamp columns

package/templates/bundle-ai-agents/skills/database-modeling/references/index-strategies.md ADDED Viewed

@@ -0,0 +1,48 @@
+# Index Strategies Reference
+## When to Create Indexes
+| Query Pattern | Index Type | Example |
+|---|---|---|
+| `WHERE column = value` | B-tree (default) | `CREATE INDEX idx_tasks_status ON tasks(status)` |
+| `WHERE col1 = X AND col2 = Y` | Composite | `CREATE INDEX idx_tasks_demand_status ON tasks(demand_id, status)` |
+| `WHERE status != 'completed'` | Partial | `CREATE INDEX idx_tasks_active ON tasks(status) WHERE status != 'completed'` |
+| `WHERE deleted_at IS NULL` | Partial | `CREATE INDEX idx_demands_live ON demands(id) WHERE deleted_at IS NULL` |
+| Full-text search | GIN on tsvector | `CREATE INDEX idx_search ON docs USING GIN(search_vector)` |
+| Vector similarity | HNSW | `CREATE INDEX idx_vec ON embeddings USING hnsw(embedding vector_cosine_ops)` |
+| JSONB queries | GIN | `CREATE INDEX idx_meta ON docs USING GIN(metadata)` |
+## When NOT to Create Indexes
+- Tables with fewer than 1,000 rows (sequential scan is faster)
+- Columns with very low cardinality (e.g., boolean with 50/50 distribution)
+- Columns that are rarely used in WHERE, JOIN, or ORDER BY
+- Write-heavy tables where index maintenance cost exceeds read benefit
+## Composite Index Column Order
+Put the most selective column first (the one that filters out the most rows).
+```sql
+-- Good: demand_id is more selective than status
+CREATE INDEX idx_tasks_demand_status ON tasks(demand_id, status);
+-- Bad: status has few distinct values, less selective
+CREATE INDEX idx_tasks_status_demand ON tasks(status, demand_id);
+```
+## Monitoring Index Usage
+```sql
+-- Find unused indexes
+SELECT indexrelname, idx_scan, idx_tup_read
+FROM pg_stat_user_indexes
+WHERE idx_scan = 0
+ORDER BY pg_relation_size(indexrelid) DESC;
+-- Find missing indexes (sequential scans on large tables)
+SELECT relname, seq_scan, idx_scan, n_live_tup
+FROM pg_stat_user_tables
+WHERE seq_scan > idx_scan AND n_live_tup > 10000
+ORDER BY seq_scan - idx_scan DESC;
+```

package/templates/bundle-ai-agents/skills/database-modeling/references/naming-conventions.md ADDED Viewed

@@ -0,0 +1,27 @@
+# Naming Conventions Reference
+## Tables
+- `snake_case`, plural: `demands`, `tasks`, `tracking_events`
+- Junction tables: `<table1>_<table2>` alphabetically: `agents_tasks`
+## Columns
+- Primary key: `id` (always UUID with `gen_random_uuid()`)
+- Foreign key: `<singular_table>_id`: `demand_id`, `agent_id`
+- Boolean: `is_<adjective>`: `is_active`, `is_verified`
+- Timestamps: `created_at`, `updated_at`, `deleted_at`
+- Status: `status VARCHAR(20)` with constrained values
+## Indexes
+- Format: `idx_<table>_<columns>`: `idx_tasks_status`, `idx_tasks_demand_id`
+- Unique: `uq_<table>_<columns>`: `uq_users_email`
+- Partial: suffix with purpose: `idx_tasks_status_active`
+## Constraints
+- Primary key: `pk_<table>`: `pk_demands`
+- Foreign key: `fk_<table>_<referenced>`: `fk_tasks_demands`
+- Unique: `uq_<table>_<columns>`: `uq_users_email`
+- Check: `ck_<table>_<rule>`: `ck_demands_status_valid`
+## Migrations
+- Format: `NNN_<action>_<table>.py`: `001_create_demands.py`, `002_add_tasks_status_index.py`
+- Actions: `create`, `add`, `alter`, `drop`, `rename`