maestro-bundle 1.3.1 → 1.4.0

Files changed (116)
  1. package/package.json +1 -1
  2. package/templates/bundle-ai-agents/skills/agent-orchestration/SKILL.md +107 -41
  3. package/templates/bundle-ai-agents/skills/agent-orchestration/references/graph-patterns.md +50 -0
  4. package/templates/bundle-ai-agents/skills/agent-orchestration/references/routing-strategies.md +47 -0
  5. package/templates/bundle-ai-agents/skills/api-design/SKILL.md +125 -16
  6. package/templates/bundle-ai-agents/skills/api-design/references/pydantic-patterns.md +72 -0
  7. package/templates/bundle-ai-agents/skills/api-design/references/rest-conventions.md +51 -0
  8. package/templates/bundle-ai-agents/skills/clean-architecture/SKILL.md +113 -21
  9. package/templates/bundle-ai-agents/skills/clean-architecture/references/dependency-injection.md +60 -0
  10. package/templates/bundle-ai-agents/skills/clean-architecture/references/layer-rules.md +56 -0
  11. package/templates/bundle-ai-agents/skills/context-engineering/SKILL.md +104 -36
  12. package/templates/bundle-ai-agents/skills/context-engineering/references/compression-techniques.md +76 -0
  13. package/templates/bundle-ai-agents/skills/context-engineering/references/context-budget-calculator.md +45 -0
  14. package/templates/bundle-ai-agents/skills/database-modeling/SKILL.md +146 -19
  15. package/templates/bundle-ai-agents/skills/database-modeling/references/index-strategies.md +48 -0
  16. package/templates/bundle-ai-agents/skills/database-modeling/references/naming-conventions.md +27 -0
  17. package/templates/bundle-ai-agents/skills/docker-containerization/SKILL.md +124 -15
  18. package/templates/bundle-ai-agents/skills/docker-containerization/references/compose-patterns.md +97 -0
  19. package/templates/bundle-ai-agents/skills/docker-containerization/references/dockerfile-checklist.md +37 -0
  20. package/templates/bundle-ai-agents/skills/eval-testing/SKILL.md +113 -25
  21. package/templates/bundle-ai-agents/skills/eval-testing/references/eval-types.md +52 -0
  22. package/templates/bundle-ai-agents/skills/eval-testing/references/golden-dataset-template.md +59 -0
  23. package/templates/bundle-ai-agents/skills/memory-management/SKILL.md +112 -28
  24. package/templates/bundle-ai-agents/skills/memory-management/references/memory-tiers.md +41 -0
  25. package/templates/bundle-ai-agents/skills/memory-management/references/namespace-conventions.md +41 -0
  26. package/templates/bundle-ai-agents/skills/prompt-engineering/SKILL.md +139 -47
  27. package/templates/bundle-ai-agents/skills/prompt-engineering/references/anti-patterns.md +59 -0
  28. package/templates/bundle-ai-agents/skills/prompt-engineering/references/prompt-templates.md +75 -0
  29. package/templates/bundle-ai-agents/skills/rag-pipeline/SKILL.md +104 -27
  30. package/templates/bundle-ai-agents/skills/rag-pipeline/references/chunking-strategies.md +27 -0
  31. package/templates/bundle-ai-agents/skills/rag-pipeline/references/embedding-models.md +31 -0
  32. package/templates/bundle-ai-agents/skills/rag-pipeline/references/rag-evaluation.md +39 -0
  33. package/templates/bundle-ai-agents/skills/testing-strategy/SKILL.md +127 -18
  34. package/templates/bundle-ai-agents/skills/testing-strategy/references/fixture-patterns.md +81 -0
  35. package/templates/bundle-ai-agents/skills/testing-strategy/references/naming-conventions.md +69 -0
  36. package/templates/bundle-base/skills/branch-strategy/SKILL.md +134 -21
  37. package/templates/bundle-base/skills/branch-strategy/references/branch-rules.md +40 -0
  38. package/templates/bundle-base/skills/code-review/SKILL.md +123 -38
  39. package/templates/bundle-base/skills/code-review/references/review-checklist.md +45 -0
  40. package/templates/bundle-base/skills/commit-pattern/SKILL.md +98 -39
  41. package/templates/bundle-base/skills/commit-pattern/references/conventional-commits.md +40 -0
  42. package/templates/bundle-data-pipeline/skills/data-preprocessing/SKILL.md +110 -19
  43. package/templates/bundle-data-pipeline/skills/data-preprocessing/references/pandas-cheatsheet.md +63 -0
  44. package/templates/bundle-data-pipeline/skills/data-preprocessing/references/pandera-schemas.md +44 -0
  45. package/templates/bundle-data-pipeline/skills/docker-containerization/SKILL.md +132 -16
  46. package/templates/bundle-data-pipeline/skills/docker-containerization/references/compose-patterns.md +82 -0
  47. package/templates/bundle-data-pipeline/skills/docker-containerization/references/dockerfile-best-practices.md +57 -0
  48. package/templates/bundle-data-pipeline/skills/feature-engineering/SKILL.md +143 -45
  49. package/templates/bundle-data-pipeline/skills/feature-engineering/references/encoding-guide.md +41 -0
  50. package/templates/bundle-data-pipeline/skills/feature-engineering/references/scaling-guide.md +38 -0
  51. package/templates/bundle-data-pipeline/skills/mlops-pipeline/SKILL.md +156 -37
  52. package/templates/bundle-data-pipeline/skills/mlops-pipeline/references/mlflow-commands.md +69 -0
  53. package/templates/bundle-data-pipeline/skills/model-training/SKILL.md +152 -33
  54. package/templates/bundle-data-pipeline/skills/model-training/references/evaluation-metrics.md +52 -0
  55. package/templates/bundle-data-pipeline/skills/model-training/references/model-selection-guide.md +41 -0
  56. package/templates/bundle-data-pipeline/skills/rag-pipeline/SKILL.md +127 -39
  57. package/templates/bundle-data-pipeline/skills/rag-pipeline/references/chunking-strategies.md +51 -0
  58. package/templates/bundle-data-pipeline/skills/rag-pipeline/references/embedding-models.md +49 -0
  59. package/templates/bundle-frontend-spa/skills/authentication/SKILL.md +196 -13
  60. package/templates/bundle-frontend-spa/skills/authentication/references/jwt-security.md +41 -0
  61. package/templates/bundle-frontend-spa/skills/component-design/SKILL.md +191 -41
  62. package/templates/bundle-frontend-spa/skills/component-design/references/accessibility-checklist.md +41 -0
  63. package/templates/bundle-frontend-spa/skills/component-design/references/tailwind-patterns.md +65 -0
  64. package/templates/bundle-frontend-spa/skills/e2e-testing/SKILL.md +241 -79
  65. package/templates/bundle-frontend-spa/skills/e2e-testing/references/playwright-selectors.md +66 -0
  66. package/templates/bundle-frontend-spa/skills/e2e-testing/references/test-patterns.md +82 -0
  67. package/templates/bundle-frontend-spa/skills/integration-api/SKILL.md +221 -31
  68. package/templates/bundle-frontend-spa/skills/integration-api/references/api-patterns.md +81 -0
  69. package/templates/bundle-frontend-spa/skills/react-patterns/SKILL.md +195 -70
  70. package/templates/bundle-frontend-spa/skills/react-patterns/references/component-checklist.md +22 -0
  71. package/templates/bundle-frontend-spa/skills/react-patterns/references/hook-patterns.md +63 -0
  72. package/templates/bundle-frontend-spa/skills/responsive-layout/SKILL.md +162 -22
  73. package/templates/bundle-frontend-spa/skills/responsive-layout/references/breakpoint-guide.md +63 -0
  74. package/templates/bundle-frontend-spa/skills/state-management/SKILL.md +158 -30
  75. package/templates/bundle-frontend-spa/skills/state-management/references/react-query-config.md +64 -0
  76. package/templates/bundle-frontend-spa/skills/state-management/references/state-patterns.md +78 -0
  77. package/templates/bundle-jhipster-microservices/skills/ci-cd-pipeline/SKILL.md +135 -45
  78. package/templates/bundle-jhipster-microservices/skills/ci-cd-pipeline/references/gitlab-ci-templates.md +93 -0
  79. package/templates/bundle-jhipster-microservices/skills/clean-architecture/SKILL.md +87 -21
  80. package/templates/bundle-jhipster-microservices/skills/clean-architecture/references/layer-rules.md +78 -0
  81. package/templates/bundle-jhipster-microservices/skills/ddd-tactical/SKILL.md +94 -25
  82. package/templates/bundle-jhipster-microservices/skills/ddd-tactical/references/ddd-patterns.md +48 -0
  83. package/templates/bundle-jhipster-microservices/skills/jhipster-angular/SKILL.md +63 -21
  84. package/templates/bundle-jhipster-microservices/skills/jhipster-angular/references/angular-microservices.md +40 -0
  85. package/templates/bundle-jhipster-microservices/skills/jhipster-angular/references/angular-structure.md +59 -0
  86. package/templates/bundle-jhipster-microservices/skills/jhipster-docker-k8s/SKILL.md +125 -91
  87. package/templates/bundle-jhipster-microservices/skills/jhipster-docker-k8s/references/docker-k8s-commands.md +68 -0
  88. package/templates/bundle-jhipster-microservices/skills/jhipster-entities/SKILL.md +72 -20
  89. package/templates/bundle-jhipster-microservices/skills/jhipster-entities/references/cross-service-entities.md +36 -0
  90. package/templates/bundle-jhipster-microservices/skills/jhipster-entities/references/jdl-types.md +56 -0
  91. package/templates/bundle-jhipster-microservices/skills/jhipster-gateway/SKILL.md +80 -8
  92. package/templates/bundle-jhipster-microservices/skills/jhipster-gateway/references/gateway-config.md +43 -0
  93. package/templates/bundle-jhipster-microservices/skills/jhipster-kafka/SKILL.md +115 -22
  94. package/templates/bundle-jhipster-microservices/skills/jhipster-kafka/references/kafka-events.md +39 -0
  95. package/templates/bundle-jhipster-microservices/skills/jhipster-registry/SKILL.md +92 -23
  96. package/templates/bundle-jhipster-microservices/skills/jhipster-registry/references/consul-config.md +61 -0
  97. package/templates/bundle-jhipster-microservices/skills/jhipster-service/SKILL.md +81 -18
  98. package/templates/bundle-jhipster-microservices/skills/jhipster-service/references/service-patterns.md +40 -0
  99. package/templates/bundle-jhipster-microservices/skills/testing-strategy/SKILL.md +101 -20
  100. package/templates/bundle-jhipster-microservices/skills/testing-strategy/references/test-naming.md +55 -0
  101. package/templates/bundle-jhipster-monorepo/skills/clean-architecture/SKILL.md +87 -21
  102. package/templates/bundle-jhipster-monorepo/skills/clean-architecture/references/layer-rules.md +78 -0
  103. package/templates/bundle-jhipster-monorepo/skills/ddd-tactical/SKILL.md +94 -25
  104. package/templates/bundle-jhipster-monorepo/skills/ddd-tactical/references/ddd-patterns.md +48 -0
  105. package/templates/bundle-jhipster-monorepo/skills/jhipster-angular/SKILL.md +99 -52
  106. package/templates/bundle-jhipster-monorepo/skills/jhipster-angular/references/angular-structure.md +59 -0
  107. package/templates/bundle-jhipster-monorepo/skills/jhipster-entities/SKILL.md +89 -36
  108. package/templates/bundle-jhipster-monorepo/skills/jhipster-entities/references/jdl-types.md +56 -0
  109. package/templates/bundle-jhipster-monorepo/skills/jhipster-liquibase/SKILL.md +123 -23
  110. package/templates/bundle-jhipster-monorepo/skills/jhipster-liquibase/references/liquibase-operations.md +95 -0
  111. package/templates/bundle-jhipster-monorepo/skills/jhipster-security/SKILL.md +106 -19
  112. package/templates/bundle-jhipster-monorepo/skills/jhipster-security/references/security-checklist.md +47 -0
  113. package/templates/bundle-jhipster-monorepo/skills/jhipster-spring/SKILL.md +84 -16
  114. package/templates/bundle-jhipster-monorepo/skills/jhipster-spring/references/spring-layers.md +41 -0
  115. package/templates/bundle-jhipster-monorepo/skills/testing-strategy/SKILL.md +101 -20
  116. package/templates/bundle-jhipster-monorepo/skills/testing-strategy/references/test-naming.md +55 -0
@@ -1,66 +1,158 @@
1
1
  ---
2
2
  name: prompt-engineering
3
- description: Criar e otimizar system prompts para agentes seguindo melhores práticas de context engineering. Use quando precisar escrever prompts, melhorar prompts existentes, ou criar instruções para agentes.
3
+ description: Create and optimize system prompts for AI agents following context engineering best practices. Use when writing prompts, improving existing prompts, or creating agent instructions.
4
+ version: 1.0.0
5
+ author: Maestro
4
6
  ---
5
7
 
6
- # Prompt Engineering para Agentes
8
+ # Prompt Engineering
7
9
 
8
- ## Estrutura de System Prompt
10
+ Craft effective system prompts for AI agents using structured templates, best practices, and iterative refinement.
11
+
12
+ ## When to Use
13
+ - Writing a new system prompt for an agent
14
+ - Improving an underperforming agent's instructions
15
+ - Creating role-specific prompts for multi-agent systems
16
+ - Reviewing prompts for anti-patterns and clarity issues
17
+ - Optimizing prompts to reduce token usage without losing quality
18
+
19
+ ## Available Operations
20
+ 1. Write a structured system prompt from scratch
21
+ 2. Audit an existing prompt for anti-patterns
22
+ 3. Refine a prompt based on agent evaluation results
23
+ 4. Create few-shot examples for a prompt
24
+ 5. Optimize prompt token count
25
+
26
+ ## Multi-Step Workflow
27
+
28
+ ### Step 1: Define the Prompt Structure
29
+
30
+ Every agent system prompt should follow this 6-part structure:
9
31
 
10
32
  ```
11
- 1. IDENTIDADE Quem o agente é
12
- 2. OBJETIVO O que ele deve alcançar
13
- 3. FERRAMENTAS O que tem disponível
14
- 4. REGRAS Limites inegociáveis
15
- 5. FORMATO Como estruturar a saída
16
- 6. EXEMPLOS Demonstrações concretas
33
+ 1. IDENTITY -- Who the agent is
34
+ 2. OBJECTIVE -- What it must achieve
35
+ 3. TOOLS -- What it has available
36
+ 4. RULES -- Non-negotiable constraints
37
+ 5. FORMAT -- How to structure output
38
+ 6. EXAMPLES -- Concrete demonstrations
17
39
  ```
18
40
 
19
- ## Template
41
+ ### Step 2: Write the System Prompt
42
+
43
+ Use the template below, filling in each section with specific details.
20
44
 
21
45
  ```python
22
46
  SYSTEM_PROMPT = """
23
- ## Identidade
24
- Você é {role}, especializado em {especialidade}.
25
-
26
- ## Objetivo
27
- Sua missão é {objetivo_principal}. Você trabalha dentro do Maestro,
28
- uma plataforma de governança de desenvolvimento.
29
-
30
- ## Ferramentas disponíveis
31
- {lista_de_tools_com_descrição}
32
-
33
- ## Regras
34
- 1. Sempre seguir o bundle {bundle_name} para padrões de código
35
- 2. Todo commit deve referenciar a task: {task_id}
36
- 3. Trabalhar apenas na worktree designada: {worktree_path}
37
- 4. Reportar progresso a cada etapa significativa
38
- 5. Solicitar human review para operações destrutivas
39
-
40
- ## Formato de resposta
41
- - Para código: blocos com linguagem especificada
42
- - Para decisões: justificar com "porquê"
43
- - Para erros: incluir contexto e sugestão de fix
44
-
45
- ## Exemplo
46
- Task: "Criar endpoint GET /api/v1/demands"
47
- Ação: Criar controller, use case, repository seguindo Clean Architecture
47
+ ## Identity
48
+ You are {role}, specialized in {specialty}.
49
+
50
+ ## Objective
51
+ Your mission is {primary_objective}. You work within Maestro,
52
+ a development governance platform.
53
+
54
+ ## Available Tools
55
+ {list_of_tools_with_descriptions}
56
+
57
+ ## Rules
58
+ 1. Always follow the {bundle_name} bundle for code standards
59
+ 2. Every commit must reference the task: {task_id}
60
+ 3. Work only in the designated worktree: {worktree_path}
61
+ 4. Report progress at every significant step
62
+ 5. Request human review for destructive operations
63
+
64
+ ## Response Format
65
+ - For code: use fenced code blocks with language specified
66
+ - For decisions: justify with "why"
67
+ - For errors: include context and suggested fix
68
+
69
+ ## Example
70
+ Task: "Create endpoint GET /api/v1/demands"
71
+ Action: Create controller, use case, repository following Clean Architecture
48
72
  Branch: feature/backend-{task_id}
49
73
  """
50
74
  ```
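A minimal sketch of how a template like the one above might be filled in, assuming the placeholders are substituted with `str.format`. The helper name `render_prompt`, the shortened template, and the sample values are illustrative assumptions and not part of the package.

```python
import string

# Shortened stand-in for the SYSTEM_PROMPT template (illustrative only).
TEMPLATE = """## Identity
You are {role}, specialized in {specialty}.

## Rules
1. Every commit must reference the task: {task_id}
"""


def render_prompt(template: str, **values: str) -> str:
    """Fill every placeholder, failing loudly if one is missing."""
    # Formatter.parse yields (literal, field_name, format_spec, conversion).
    fields = {name for _, name, _, _ in string.Formatter().parse(template) if name}
    missing = sorted(fields - set(values))
    if missing:
        raise KeyError(f"unfilled placeholders: {missing}")
    return template.format(**values)


# Sample values are hypothetical.
prompt = render_prompt(
    TEMPLATE,
    role="Backend Engineer Agent",
    specialty="FastAPI and Clean Architecture",
    task_id="TASK-123",
)
```

Checking for missing fields before formatting gives one error that names every unfilled placeholder, which may be easier to act on than the single-key `KeyError` that `str.format` raises on its own.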
51
75
 
52
- ## Boas práticas
76
+ ### Step 3: Apply Best Practices
77
+
78
+ Review the prompt against these rules:
79
+
80
+ 1. **Be specific** -- "Create a REST API with FastAPI" > "Create an API"
81
+ 2. **Explain why** -- "Use Value Objects because they enforce validation at construction" > "Use Value Objects"
82
+ 3. **Give examples** -- One good example is worth 10 lines of instruction
83
+ 4. **Avoid negatives** -- "Keep functions under 20 lines" > "Don't write long functions"
84
+ 5. **Prioritize** -- Put the most important rules first (models pay more attention to early content)
85
+ 6. **Test** -- Run the prompt with real scenarios before deploying
86
+
87
+ ### Step 4: Check for Anti-Patterns
88
+
89
+ Audit the prompt for these common problems:
90
+
91
+ ```bash
92
+ # Count NEVER/ALWAYS occurrences (too many weaken their impact)
93
+ grep -o -i -w -E "never|always" prompt.md | wc -l
94
+
95
+ # Check prompt length (over 5000 words causes focus loss)
96
+ wc -w prompt.md
97
+ ```
98
+
99
+ Anti-patterns to fix:
100
+ - Excessive NEVER/ALWAYS (loses emphasis when overused)
101
+ - Contradictory instructions (e.g., "be concise" + "explain everything in detail")
102
+ - Prompts over 5000 words (agent loses focus on key instructions)
103
+ - Rules without justification (agent cannot reason about when to flex)
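The two countable anti-patterns above could also be checked from Python. The following is a rough sketch, not part of the package: the function name `audit_prompt` is hypothetical, the 5000-word limit is taken from the text, and the match is deliberately case-sensitive so that only emphatic uppercase NEVER/ALWAYS are counted.

```python
import re

# Threshold taken from the skill text; treat it as a guideline, not a hard limit.
MAX_WORDS = 5000


def audit_prompt(text: str) -> dict:
    """Return simple counts that hint at two of the anti-patterns above."""
    # Whole-word, case-sensitive match so only emphatic NEVER/ALWAYS count.
    emphatic = re.findall(r"\b(?:NEVER|ALWAYS)\b", text)
    words = text.split()
    return {
        "emphatic_count": len(emphatic),
        "word_count": len(words),
        "too_long": len(words) > MAX_WORDS,
    }


sample = "NEVER use var. ALWAYS use const. Never mind the rest."
report = audit_prompt(sample)
```

Contradictory instructions and unjustified rules are not detectable this way and still need a manual read.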
104
+
105
+ ### Step 5: Test and Iterate
106
+
107
+ Run the prompt through evaluation scenarios to measure effectiveness.
108
+
109
+ ```bash
110
+ python -m evals.run_prompt_eval --prompt prompts/agent_backend.md --scenarios evals/prompt_scenarios.json
111
+ ```
112
+
113
+ Compare scores across prompt versions:
114
+
115
+ ```bash
116
+ python -m evals.compare_prompts --v1 prompts/v1.md --v2 prompts/v2.md --scenarios evals/prompt_scenarios.json
117
+ ```
118
+
119
+ ## Resources
120
+ - `references/prompt-templates.md` - Ready-to-use prompt templates for common agent roles
121
+ - `references/anti-patterns.md` - Detailed anti-pattern catalog with fix examples
122
+
123
+ ## Examples
124
+
125
+ ### Example 1: Write a Backend Agent Prompt
126
+ User asks: "Create a system prompt for our backend agent that builds FastAPI endpoints."
127
+ Response approach:
128
+ 1. Set identity to "Backend Engineer Agent, specialized in FastAPI and Clean Architecture"
129
+ 2. Define objective: "Build production-ready REST API endpoints following bundle standards"
130
+ 3. List tools: file operations, git commands, test runner, linter
131
+ 4. Set rules: follow clean-architecture skill, enforce test coverage > 80%, use typed DTOs
132
+ 5. Add format section for code blocks and error reporting
133
+ 6. Include a concrete example of building a CRUD endpoint
53
134
 
54
- 1. **Seja específico** "Crie uma API REST com FastAPI" > "Crie uma API"
55
- 2. **Explique o porquê** — "Usar Value Objects porque garantem validação no construtor" > "Usar Value Objects"
56
- 3. **Dê exemplos** — Um bom exemplo vale mais que 10 linhas de instrução
57
- 4. **Evite negativos** "Mantenha funções com até 20 linhas" > "Não crie funções longas"
58
- 5. **Priorize** Coloque as regras mais importantes primeiro
59
- 6. **Teste** Rode o prompt com cenários reais antes de deployar
135
+ ### Example 2: Fix an Underperforming Prompt
136
+ User asks: "Our agent keeps ignoring the coding standards. Fix the prompt."
137
+ Response approach:
138
+ 1. Read the current prompt and check for vague instructions
139
+ 2. Look for missing justifications on rules (agent doesn't understand importance)
140
+ 3. Move coding standards rules higher in the prompt (priority by position)
141
+ 4. Add a concrete example showing compliant vs non-compliant code
142
+ 5. Add a "Common Mistakes" section with specific things to avoid
60
143
 
61
- ## Anti-patterns
144
+ ### Example 3: Reduce Prompt Token Count
145
+ User asks: "The system prompt is too long, cut it down without losing quality."
146
+ Response approach:
147
+ 1. Count current tokens with `tiktoken`
148
+ 2. Move detailed reference material to skill files loaded on-demand
149
+ 3. Replace verbose explanations with concise bullet points
150
+ 4. Keep examples (they're high-value) but trim redundant ones
151
+ 5. Target: system prompt under 2000 tokens, details in skills
62
152
 
63
- - NEVER/ALWAYS em excesso (perde a força)
64
- - Instruções contraditórias
65
- - Prompts com 5000+ palavras (o agente se perde)
66
- - Regras sem justificativa (o agente não sabe quando flexibilizar)
153
+ ## Notes
154
+ - System prompts should stay under 2000 tokens; move details to on-demand skills
155
+ - Test every prompt change with at least 5 real-world scenarios
156
+ - Version your prompts (v1, v2, etc.) and track performance across versions
157
+ - Content near the start of a prompt tends to get the most attention from the model, so put key rules there
158
+ - Few-shot examples are often among the most effective prompting techniques
@@ -0,0 +1,59 @@
1
+ # Prompt Anti-Patterns Reference
2
+
3
+ ## 1. NEVER/ALWAYS Overuse
4
+
5
+ **Problem**: Using NEVER and ALWAYS too frequently dilutes their impact.
6
+
7
+ **Bad**:
8
+ ```
9
+ NEVER use var. ALWAYS use const. NEVER use any. ALWAYS type everything.
10
+ NEVER skip tests. ALWAYS write docs. NEVER commit without review.
11
+ ```
12
+
13
+ **Good**:
14
+ ```
15
+ Use const by default, let when reassignment is needed.
16
+ Type all function parameters and return values.
17
+ Critical: NEVER commit secrets or credentials to the repository.
18
+ ```
19
+
20
+ **Fix**: Reserve NEVER/ALWAYS for truly critical rules (security, data integrity). Use softer language for preferences.
21
+
22
+ ## 2. Contradictory Instructions
23
+
24
+ **Problem**: Instructions that conflict cause unpredictable behavior.
25
+
26
+ **Bad**:
27
+ ```
28
+ Be concise in your responses.
29
+ Explain every decision in detail with full justification.
30
+ ```
31
+
32
+ **Good**:
33
+ ```
34
+ Be concise by default. When making architectural decisions, explain the reasoning.
35
+ ```
36
+
37
+ **Fix**: Qualify when each instruction applies.
38
+
39
+ ## 3. Excessive Length (> 5000 words)
40
+
41
+ **Problem**: The agent loses focus on key instructions when the prompt is too long.
42
+
43
+ **Fix**: Move reference material to skill files. Keep the system prompt under 2000 tokens. Load details on-demand.
44
+
45
+ ## 4. Rules Without Justification
46
+
47
+ **Problem**: Without knowing why, the agent cannot reason about edge cases.
48
+
49
+ **Bad**: "Use Value Objects for all domain primitives."
50
+
51
+ **Good**: "Use Value Objects for domain primitives because they enforce validation at construction time and make the domain model self-documenting."
52
+
53
+ ## 5. Vague Instructions
54
+
55
+ **Problem**: Ambiguity leads to inconsistent agent behavior.
56
+
57
+ **Bad**: "Write good code."
58
+
59
+ **Good**: "Write code that follows Clean Architecture: separate controllers, use cases, and repositories. Keep functions under 20 lines. Include type hints on all function signatures."
@@ -0,0 +1,75 @@
1
+ # Prompt Templates Reference
2
+
3
+ ## Backend Agent Template
4
+
5
+ ```
6
+ ## Identity
7
+ You are a Backend Engineer Agent, specialized in building REST APIs with FastAPI and Clean Architecture.
8
+
9
+ ## Objective
10
+ Build production-ready API endpoints that follow the project's bundle standards, including proper error handling, validation, pagination, and test coverage.
11
+
12
+ ## Tools
13
+ - File read/write operations
14
+ - Git commands (commit, branch, push)
15
+ - pytest for running tests
16
+ - ruff for linting
17
+
18
+ ## Rules
19
+ 1. Follow Clean Architecture: controller -> use case -> repository
20
+ 2. Every endpoint must have input validation with Pydantic models
21
+ 3. Test coverage must be >= 80% for new code
22
+ 4. Use typed DTOs for all API responses
23
+ 5. Handle errors with standardized ErrorResponse format
24
+
25
+ ## Response Format
26
+ - Code in fenced blocks with language specified
27
+ - Explain architectural decisions with "why"
28
+ - Report test results after implementation
29
+ ```
30
+
31
+ ## Frontend Agent Template
32
+
33
+ ```
34
+ ## Identity
35
+ You are a Frontend Engineer Agent, specialized in React with TypeScript.
36
+
37
+ ## Objective
38
+ Build responsive, accessible UI components following the project's design system and bundle standards.
39
+
40
+ ## Tools
41
+ - File read/write operations
42
+ - npm/pnpm commands
43
+ - Jest/Vitest for testing
44
+ - ESLint for linting
45
+
46
+ ## Rules
47
+ 1. Use functional components with hooks
48
+ 2. All props must be typed with TypeScript interfaces
49
+ 3. Components must be accessible (ARIA labels, keyboard navigation)
50
+ 4. Write unit tests for business logic, integration tests for user flows
51
+ 5. Use the project's design tokens for spacing, colors, typography
52
+ ```
53
+
54
+ ## DevOps Agent Template
55
+
56
+ ```
57
+ ## Identity
58
+ You are a DevOps Engineer Agent, specialized in Docker, CI/CD, and infrastructure.
59
+
60
+ ## Objective
61
+ Containerize applications, configure CI/CD pipelines, and manage deployment infrastructure.
62
+
63
+ ## Tools
64
+ - Docker CLI (build, push, compose)
65
+ - Git commands
66
+ - kubectl for Kubernetes
67
+ - Terraform for infrastructure
68
+
69
+ ## Rules
70
+ 1. All Dockerfiles must use multi-stage builds
71
+ 2. Never run containers as root
72
+ 3. Include health checks in all service containers
73
+ 4. Secrets must come from environment variables, never hardcoded
74
+ 5. CI pipelines must include lint, test, build, and security scan stages
75
+ ```
@@ -1,23 +1,54 @@
1
1
  ---
2
2
  name: rag-pipeline
3
- description: Construir pipeline RAG completo com ingestão, chunking, embedding, indexação e retrieval usando LangChain + pgvector. Use sempre que precisar implementar busca semântica, responder perguntas sobre documentos, ou criar um sistema de retrieval.
3
+ description: Build complete RAG pipelines with ingestion, chunking, embedding, indexing, and retrieval using LangChain + pgvector. Use when implementing semantic search, answering questions over documents, or creating retrieval systems.
4
+ version: 1.0.0
5
+ author: Maestro
4
6
  ---
5
7
 
6
8
  # RAG Pipeline
7
9
 
8
- ## Pipeline completo
10
+ Build production-ready Retrieval-Augmented Generation pipelines with hybrid search, re-ranking, and quality evaluation.
9
11
 
12
+ ## When to Use
13
+ - Building a semantic search system over documents
14
+ - Answering questions from a knowledge base (PDFs, Markdown, code)
15
+ - Creating a retrieval layer for an AI agent
16
+ - Indexing project documentation, skills, or bundles into a vector store
17
+ - Improving an existing RAG pipeline's accuracy or performance
18
+
19
+ ## Available Operations
20
+ 1. Ingest documents (load, split, enrich with metadata)
21
+ 2. Generate embeddings and index into pgvector
22
+ 3. Configure hybrid retrieval (semantic + keyword BM25)
23
+ 4. Add re-ranking for precision
24
+ 5. Build a query chain with an LLM
25
+ 6. Evaluate retrieval quality with golden datasets
26
+
27
+ ## Multi-Step Workflow
28
+
29
+ ### Step 1: Set Up Environment
30
+
31
+ Install required dependencies and verify database connectivity.
32
+
33
+ ```bash
34
+ pip install langchain langchain-openai langchain-postgres langchain-community langchain-cohere pgvector rank-bm25
10
35
  ```
11
- Documentos → Loader → Splitter → Embeddings → pgvector → Retriever → Re-ranker → LLM
36
+
37
+ Verify pgvector is available by enabling the extension (a no-op if it already exists):
38
+
39
+ ```bash
40
+ psql "$DATABASE_URL" -c "CREATE EXTENSION IF NOT EXISTS vector;"
12
41
  ```
13
42
 
14
- ## 1. Ingestão
43
+ ### Step 2: Ingest Documents
44
+
45
+ Load documents from the target directory and split into chunks with appropriate overlap.
15
46
 
16
47
  ```python
17
48
  from langchain_community.document_loaders import DirectoryLoader, UnstructuredMarkdownLoader
18
49
  from langchain.text_splitter import RecursiveCharacterTextSplitter
19
50
 
20
- # Loader por tipo de documento
51
+ # Load documents by type
21
52
  loader = DirectoryLoader(
22
53
  "./documents/",
23
54
  glob="**/*.md",
@@ -25,7 +56,7 @@ loader = DirectoryLoader(
25
56
  )
26
57
  docs = loader.load()
27
58
 
28
- # Splitter com separadores Markdown
59
+ # Split with Markdown-aware separators
29
60
  splitter = RecursiveCharacterTextSplitter(
30
61
  chunk_size=1000,
31
62
  chunk_overlap=200,
@@ -34,10 +65,13 @@ splitter = RecursiveCharacterTextSplitter(
34
65
  chunks = splitter.split_documents(docs)
35
66
  ```
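To make the effect of `chunk_size=1000` and `chunk_overlap=200` concrete, here is a simplified fixed-width splitter. It is an illustration only: `RecursiveCharacterTextSplitter` prefers to break on separators, which this sketch does not attempt, and the function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Naive fixed-width splitter: each chunk repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# 2500 characters of synthetic text.
text = "".join(str(i % 10) for i in range(2500))
chunks = split_with_overlap(text)
```

With these settings the window advances 800 characters at a time, so the last 200 characters of one chunk reappear at the start of the next. That repetition helps keep a sentence that straddles a boundary retrievable from at least one chunk.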
36
67
 
37
- ## 2. Metadados obrigatórios
68
+ ### Step 3: Enrich Chunks with Metadata
69
+
70
+ Every chunk must carry metadata for filtering and traceability.
38
71
 
39
- Cada chunk deve ter:
40
72
  ```python
73
+ from datetime import datetime
74
+
41
75
  for chunk in chunks:
42
76
  chunk.metadata.update({
43
77
  "source": chunk.metadata.get("source", "unknown"),
@@ -47,7 +81,7 @@ for chunk in chunks:
47
81
  })
48
82
  ```
49
83
 
50
- ## 3. Embedding + Indexação
84
+ ### Step 4: Embed and Index into pgvector
51
85
 
52
86
  ```python
53
87
  from langchain_openai import OpenAIEmbeddings
@@ -63,26 +97,26 @@ vectorstore = PGVector(
63
97
  vectorstore.add_documents(chunks)
64
98
  ```
65
99
 
66
- ## 4. Retrieval Híbrido
100
+ ### Step 5: Configure Hybrid Retrieval
101
+
102
+ Combine semantic search with keyword-based BM25 using Reciprocal Rank Fusion.
67
103
 
68
104
  ```python
69
105
  from langchain.retrievers import EnsembleRetriever
70
106
  from langchain_community.retrievers import BM25Retriever
71
107
 
72
- # Semântico
73
108
  semantic_retriever = vectorstore.as_retriever(search_kwargs={"k": 20})
74
-
75
- # Keyword
76
109
  bm25_retriever = BM25Retriever.from_documents(chunks, k=20)
77
110
 
78
- # Ensemble com RRF
79
111
  hybrid_retriever = EnsembleRetriever(
80
112
  retrievers=[semantic_retriever, bm25_retriever],
81
113
  weights=[0.6, 0.4]
82
114
  )
83
115
  ```
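For intuition about how the two rankings are merged, the sketch below implements weighted Reciprocal Rank Fusion in plain Python. It is not the library's code: the function name `weighted_rrf` and the document IDs are hypothetical, and the constant `c = 60` is the value commonly used for RRF, assumed here rather than taken from the diff.

```python
from collections import defaultdict


def weighted_rrf(rankings: list[list[str]], weights: list[float], c: int = 60) -> list[str]:
    """Fuse ranked lists of document IDs; higher fused score ranks first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes weight / (c + rank) for every document it returns.
            scores[doc_id] += weight / (c + rank)
    return sorted(scores, key=scores.get, reverse=True)


semantic = ["doc_a", "doc_b", "doc_c"]  # hypothetical semantic ranking
keyword = ["doc_c", "doc_a", "doc_d"]   # hypothetical BM25 ranking
fused = weighted_rrf([semantic, keyword], weights=[0.6, 0.4])
```

Because only ranks enter the formula, the raw similarity and BM25 scores never need to be on a comparable scale. A document that appears in both lists, such as `doc_a` or `doc_c`, tends to outrank one that appears in only one.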
84
116
 
85
- ## 5. Re-ranking
117
+ ### Step 6: Add Re-Ranking
118
+
119
+ Use the Cohere re-ranker to refine the top-k results for higher precision.
86
120
 
87
121
  ```python
88
122
  from langchain.retrievers import ContextualCompressionRetriever
@@ -95,18 +129,19 @@ final_retriever = ContextualCompressionRetriever(
95
129
  )
96
130
  ```
97
131
 
98
- ## 6. Query Chain
132
+ ### Step 7: Build Query Chain
99
133
 
100
134
  ```python
101
135
  from langchain_core.prompts import ChatPromptTemplate
102
136
  from langchain_core.output_parsers import StrOutputParser
137
+ from langchain_core.runnables import RunnablePassthrough
103
138
 
104
139
  prompt = ChatPromptTemplate.from_template("""
105
- Responda a pergunta baseado apenas no contexto fornecido.
106
- Se a resposta não estiver no contexto, diga "Não encontrei essa informação".
140
+ Answer the question based only on the provided context.
141
+ If the answer is not in the context, say "I could not find that information."
107
142
 
108
- Contexto: {context}
109
- Pergunta: {question}
143
+ Context: {context}
144
+ Question: {question}
110
145
  """)
111
146
 
112
147
  chain = (
@@ -116,13 +151,55 @@ chain = (
116
151
  | StrOutputParser()
117
152
  )
118
153
 
119
- result = chain.invoke("Qual skill usar para criar componentes React?")
154
+ result = chain.invoke("Which skill should I use to create React components?")
120
155
  ```
121
156
 
122
- ## Checklist de qualidade
157
+ ### Step 8: Evaluate Retrieval Quality
158
+
159
+ Run evaluation against a golden dataset to measure retrieval accuracy.
160
+
161
+ ```bash
162
+ python -m evals.run_rag_eval --dataset evals/rag_golden_dataset.json --min-score 0.8
163
+ ```
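The internals of `evals.run_rag_eval` are not shown in this diff. As a hedged illustration of what a golden-dataset check could compute, the sketch below measures hit rate at k. The dataset shape, the `hit_rate` and `fake_retrieve` names, and the file paths are all assumptions.

```python
from typing import Callable

# Hypothetical golden dataset: each entry pairs a question with the source
# that a correct retrieval should surface.
GOLDEN = [
    {"question": "How do I split Markdown?", "expected_source": "docs/chunking.md"},
    {"question": "Which embedding model?", "expected_source": "docs/embeddings.md"},
]


def hit_rate(retrieve: Callable[[str], list[str]], dataset: list[dict], k: int = 5) -> float:
    """Fraction of questions whose expected source appears in the top-k results."""
    if not dataset:
        return 0.0
    hits = sum(
        1 for row in dataset
        if row["expected_source"] in retrieve(row["question"])[:k]
    )
    return hits / len(dataset)


def fake_retrieve(question: str) -> list[str]:
    """Stand-in retriever that always returns the same sources."""
    return ["docs/chunking.md", "docs/intro.md"]


score = hit_rate(fake_retrieve, GOLDEN)
passed = score >= 0.8  # mirrors the --min-score 0.8 threshold above
```

In a real run, `fake_retrieve` would be replaced by a function that calls the retriever and returns the `source` metadata of each result.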
123
164
 
124
- - [ ] Chunks testados com perguntas reais
125
- - [ ] Metadados completos em todos os chunks
126
- - [ ] Retrieval quality medido com golden dataset
127
- - [ ] Re-ranking ativo para refinar top-k
128
- - [ ] Fallback para quando retrieval não encontra nada
165
+ ## Resources
166
+ - `references/chunking-strategies.md` - Guidance on chunk sizes and overlap for different document types
167
+ - `references/embedding-models.md` - Comparison of embedding models and their trade-offs
168
+ - `references/rag-evaluation.md` - How to build golden datasets and measure RAG quality
169
+
170
+ ## Examples
171
+
172
+ ### Example 1: Index Project Documentation
173
+ User asks: "Set up RAG for our project docs so the agent can answer questions about our architecture."
174
+ Response approach:
175
+ 1. Scan the `docs/` directory for Markdown files
176
+ 2. Split with Markdown-aware separators (chunk_size=1000, overlap=200)
177
+ 3. Enrich metadata with doc_type and source path
178
+ 4. Embed with text-embedding-3-large and index into pgvector
179
+ 5. Configure hybrid retrieval with BM25 fallback
180
+ 6. Wire up a query chain and test with sample questions
181
+
182
+ ### Example 2: Improve Retrieval Accuracy
183
+ User asks: "Our RAG is returning irrelevant results for technical queries."
184
+ Response approach:
185
+ 1. Check current chunk sizes -- may be too large or too small
186
+ 2. Verify metadata filtering is applied for doc_type
187
+ 3. Add Cohere re-ranking to refine top-k
188
+ 4. Adjust ensemble weights (increase semantic weight for technical content)
189
+ 5. Build a golden dataset of 10-20 question/answer pairs and run evals
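
Editor's sketch for step 4 of Example 2: one common way to combine a keyword ranking and a semantic ranking is weighted reciprocal rank fusion. The snippet below is a dependency-free illustration only. The function name `weighted_rrf`, the constant `k=60`, and the document IDs are assumptions, and whether the package's ensemble retriever fuses results exactly this way is not confirmed by the diff.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of document IDs with weighted reciprocal rank fusion.

    rankings: one ranked list (best first) per retriever.
    weights:  one weight per retriever; a larger weight gives it more influence.
    k:        smoothing constant (60 is a commonly used value, assumed here).
    """
    if len(rankings) != len(weights):
        raise ValueError("need exactly one weight per ranking")
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from a BM25 retriever and a semantic retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
semantic_ranking = ["doc_c", "doc_a", "doc_d"]

balanced = weighted_rrf([bm25_ranking, semantic_ranking], weights=[0.5, 0.5])
semantic_heavy = weighted_rrf([bm25_ranking, semantic_ranking], weights=[0.2, 0.8])
```

With equal weights `doc_a` ranks first. Shifting weight toward the semantic retriever moves `doc_c` to the top and lifts `doc_d` above `doc_b`, which is the effect the "increase semantic weight" advice is after.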
+
+ ### Example 3: Add a New Document Source
+ User asks: "Add our API specs (OpenAPI YAML) to the RAG pipeline."
+ Response approach:
+ 1. Add a YAML loader to the ingestion pipeline
+ 2. Configure appropriate splitter for structured YAML content
+ 3. Set doc_type metadata to "api_spec"
+ 4. Re-index and test retrieval with API-related queries
+
+ ## Notes
+ - Always test chunks with real questions before deploying
+ - Keep metadata complete on all chunks for filtering and traceability
+ - Measure retrieval quality with a golden dataset (minimum 10 question/answer pairs)
+ - Re-ranking is critical for precision -- always enable it in production
+ - Implement a fallback response when retrieval returns no relevant results
+ - Monitor token costs: embedding large document sets can be expensive
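
Editor's sketch for the fallback note above: a minimal guard that skips generation when no retrieved chunk is relevant enough. The fallback sentence is taken from the Step 7 prompt. `ScoredChunk`, `answer_with_fallback`, the `min_score=0.5` cutoff, and the `generate` callable are hypothetical names introduced for illustration, not part of the package.

```python
from dataclasses import dataclass

FALLBACK_MESSAGE = "I could not find that information."


@dataclass
class ScoredChunk:
    text: str
    score: float  # relevance score in [0, 1]; higher means more relevant


def answer_with_fallback(question, chunks, generate, min_score=0.5):
    """Call `generate` only when at least one chunk clears `min_score`."""
    relevant = [chunk for chunk in chunks if chunk.score >= min_score]
    if not relevant:
        # Nothing usable was retrieved, so do not let the model guess.
        return FALLBACK_MESSAGE
    # Most relevant chunks first, separated by blank lines.
    ordered = sorted(relevant, key=lambda chunk: chunk.score, reverse=True)
    context = "\n\n".join(chunk.text for chunk in ordered)
    return generate(context=context, question=question)


def echo_generate(context, question):
    """Stand-in for a real LLM call; it only reports what it was given."""
    return f"answered {question!r} from {len(context)} context characters"
```

Keeping the cutoff outside the prompt means the "no answer" path costs no LLM tokens and behaves the same regardless of model.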
@@ -0,0 +1,27 @@
+ # Chunking Strategies Reference
+
+ ## Recommended Chunk Sizes by Document Type
+
+ | Document Type | Chunk Size | Overlap | Separators |
+ |---|---|---|---|
+ | Markdown docs | 1000 | 200 | `\n## `, `\n### `, `\n\n`, `\n` |
+ | Source code | 1500 | 300 | `\nclass `, `\ndef `, `\n\n`, `\n` |
+ | API specs (YAML/JSON) | 800 | 100 | `\n- `, `\n `, `\n\n` |
+ | PDF documents | 1200 | 250 | `\n\n`, `\n`, `. ` |
+ | Plain text | 1000 | 200 | `\n\n`, `\n`, `. `, ` ` |
+
+ ## Guidelines
+
+ - **Too small** (< 500 tokens): Loses context, retrieval finds fragments without meaning.
+ - **Too large** (> 2000 tokens): Dilutes relevance, wastes context window space.
+ - **Overlap**: 15-25% of chunk_size prevents information loss at boundaries.
+ - **Separators**: Order matters -- RecursiveCharacterTextSplitter tries separators in order, falling back to the next one.
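
Editor's sketch: the overlap guideline can be checked mechanically against the table. The sizes below are copied from the table. The dictionary keys and the helper names `overlap_ratio` and `outside_guideline` are assumptions made for illustration.

```python
# Chunk size and overlap per document type, copied from the table above.
CHUNKING_PROFILES = {
    "markdown": {"chunk_size": 1000, "overlap": 200},
    "source_code": {"chunk_size": 1500, "overlap": 300},
    "api_spec": {"chunk_size": 800, "overlap": 100},
    "pdf": {"chunk_size": 1200, "overlap": 250},
    "plain_text": {"chunk_size": 1000, "overlap": 200},
}


def overlap_ratio(profile):
    """Overlap expressed as a fraction of the chunk size."""
    return profile["overlap"] / profile["chunk_size"]


def outside_guideline(profiles, low=0.15, high=0.25):
    """Return profiles whose overlap ratio falls outside the 15-25% band."""
    return {
        name: round(overlap_ratio(profile), 3)
        for name, profile in profiles.items()
        if not low <= overlap_ratio(profile) <= high
    }


flagged = outside_guideline(CHUNKING_PROFILES)
```

Run against the table, only the API spec row is flagged, at 12.5% overlap. That may be deliberate for structured content, but it sits below the stated band.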
19
+
20
+ ## Metadata to Attach
21
+
22
+ Every chunk should carry:
23
+ - `source` -- file path or URL
24
+ - `doc_type` -- classification (skill, prd, code, api_spec, etc.)
25
+ - `language` -- content language for multilingual pipelines
26
+ - `created_at` -- timestamp for freshness filtering
27
+ - `section_title` -- nearest heading for context
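
Editor's sketch: a small helper can build this metadata in one place and report incomplete chunks before indexing. The five field names come from the list above. The function names, the ISO 8601 timestamp format, and the sample values are assumptions.

```python
from datetime import datetime, timezone

# The five fields the reference says every chunk should carry.
REQUIRED_FIELDS = ("source", "doc_type", "language", "created_at", "section_title")


def build_chunk_metadata(source, doc_type, language, section_title, created_at=None):
    """Return a metadata dict with every required field populated."""
    timestamp = created_at or datetime.now(timezone.utc)
    return {
        "source": source,
        "doc_type": doc_type,
        "language": language,
        "created_at": timestamp.isoformat(),  # ISO 8601 string (assumed format)
        "section_title": section_title,
    }


def missing_fields(metadata):
    """List required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


example = build_chunk_metadata(
    source="skills/react-patterns/SKILL.md",
    doc_type="skill",
    language="en",
    section_title="Examples",
    created_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

Calling `missing_fields` on each chunk during ingestion is one way to enforce the "keep metadata complete" note without inspecting the vector store afterwards.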
@@ -0,0 +1,31 @@
+ # Embedding Models Reference
+
+ ## Model Comparison
+
+ | Model | Dimensions | Max Tokens | Cost | Quality |
+ |---|---|---|---|---|
+ | text-embedding-3-large | 3072 (or 1536) | 8191 | $0.13/1M tokens | Best |
+ | text-embedding-3-small | 1536 | 8191 | $0.02/1M tokens | Good |
+ | text-embedding-ada-002 | 1536 | 8191 | $0.10/1M tokens | Legacy |
+
+ ## Recommendations
+
+ - **Production**: Use `text-embedding-3-large` with `dimensions=1536` for best quality/cost balance.
+ - **Development/Prototyping**: Use `text-embedding-3-small` to reduce costs.
+ - **Consistency**: Never mix embedding models in the same collection -- re-embed everything if you switch.
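
Editor's sketch: the cost column turns into a budget estimate with one multiplication. Prices are copied from the table above and may be out of date. The function name and the 50M-token corpus size are hypothetical.

```python
# USD per one million tokens, copied from the comparison table (may be outdated).
PRICE_PER_MILLION_TOKENS = {
    "text-embedding-3-large": 0.13,
    "text-embedding-3-small": 0.02,
    "text-embedding-ada-002": 0.10,
}


def estimate_embedding_cost(total_tokens, model):
    """Estimated USD cost of embedding `total_tokens` tokens once."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS[model]


# Hypothetical corpus of 50 million tokens.
corpus_tokens = 50_000_000
large_cost = estimate_embedding_cost(corpus_tokens, "text-embedding-3-large")
small_cost = estimate_embedding_cost(corpus_tokens, "text-embedding-3-small")
```

For this corpus the estimate is about $6.50 for the large model and $1.00 for the small one. Switching models means re-embedding everything, so the full cost is paid again.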
+
+ ## pgvector Index Types
+
+ | Index | Build Speed | Query Speed | Recall | Use Case |
+ |---|---|---|---|---|
+ | HNSW | Slow | Fast | High | Production (< 10M vectors) |
+ | IVFFlat | Fast | Medium | Medium | Large datasets, quick setup |
+ | None (brute force) | N/A | Slow | Perfect | Small datasets (< 50k) |
+
+ ### Recommended HNSW Settings
+
+ ```sql
+ CREATE INDEX idx_embeddings ON documents
+ USING hnsw (embedding vector_cosine_ops)
+ WITH (m = 16, ef_construction = 200);
+ ```
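
Editor's sketch: `vector_cosine_ops` indexes cosine distance, and the "None (brute force)" row is the exact ranking that HNSW and IVFFlat approximate. The pure-Python version below shows that exact ranking on toy data. The two-dimensional vectors and function names are assumptions for illustration, not how pgvector is implemented.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_search(query, vectors, top_k=3):
    """Exact nearest neighbours: score every stored vector, then sort."""
    ranked = sorted(vectors.items(), key=lambda item: cosine_distance(query, item[1]))
    return [doc_id for doc_id, _ in ranked[:top_k]]


# Toy two-dimensional "embeddings" keyed by hypothetical document IDs.
VECTORS = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
nearest = brute_force_search([1.0, 0.1], VECTORS, top_k=2)
```

Every query touches every vector, which is why the table limits this option to small datasets. An index trades some recall for not having to do that scan.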
@@ -0,0 +1,39 @@
+ # RAG Evaluation Reference
+
+ ## Golden Dataset Format
+
+ ```json
+ {
+   "evals": [
+     {
+       "id": "rag-eval-001",
+       "question": "Which skill handles React component creation?",
+       "expected_answer": "The react-patterns skill covers component creation.",
+       "expected_sources": ["skills/react-patterns/SKILL.md"],
+       "relevance_threshold": 0.8
+     }
+   ]
+ }
+ ```
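
Editor's sketch: a loader that validates this format before an evaluation run catches malformed entries early. The keys match the JSON above. The function name `load_golden_dataset` and the specific checks are assumptions, and the package's own `evals.run_rag_eval` module may validate differently.

```python
import json

REQUIRED_KEYS = {
    "id",
    "question",
    "expected_answer",
    "expected_sources",
    "relevance_threshold",
}


def load_golden_dataset(raw_json):
    """Parse a golden dataset string and validate every entry."""
    entries = json.loads(raw_json)["evals"]
    for entry in entries:
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"{entry.get('id', '?')}: missing {sorted(missing)}")
        if not entry["expected_sources"]:
            raise ValueError(f"{entry['id']}: expected_sources is empty")
        if not 0.0 <= entry["relevance_threshold"] <= 1.0:
            raise ValueError(f"{entry['id']}: relevance_threshold outside [0, 1]")
    return entries


# The sample entry from the reference, inlined for a self-contained example.
SAMPLE = """
{
  "evals": [
    {
      "id": "rag-eval-001",
      "question": "Which skill handles React component creation?",
      "expected_answer": "The react-patterns skill covers component creation.",
      "expected_sources": ["skills/react-patterns/SKILL.md"],
      "relevance_threshold": 0.8
    }
  ]
}
"""

golden = load_golden_dataset(SAMPLE)
```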
+
+ ## Key Metrics
+
+ | Metric | What It Measures | Target |
+ |---|---|---|
+ | Retrieval Precision | % of retrieved docs that are relevant | > 0.7 |
+ | Retrieval Recall | % of relevant docs that are retrieved | > 0.8 |
+ | Faithfulness | Answer grounded in retrieved context | > 0.9 |
+ | Answer Relevancy | Answer addresses the question | > 0.85 |
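
Editor's sketch: the two retrieval metrics reduce to set arithmetic over document identifiers. The definitions follow the table. The function names and the sample paths (other than the react-patterns path used in the golden dataset) are hypothetical.

```python
def retrieval_precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    retrieved_set = set(retrieved)
    if not retrieved_set:
        return 0.0
    return len(retrieved_set & set(relevant)) / len(retrieved_set)


def retrieval_recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    return len(set(retrieved) & relevant_set) / len(relevant_set)


# Hypothetical retrieval result for one golden question.
retrieved = [
    "skills/react-patterns/SKILL.md",
    "skills/api-design/SKILL.md",
    "skills/clean-architecture/SKILL.md",
    "skills/database-modeling/SKILL.md",
]
relevant = ["skills/react-patterns/SKILL.md", "docs/frontend-guide.md"]

precision = retrieval_precision(retrieved, relevant)  # 1 of 4 retrieved is relevant
recall = retrieval_recall(retrieved, relevant)        # 1 of 2 relevant was retrieved
```

Here precision is 0.25 and recall is 0.5, so both miss their targets. Faithfulness and answer relevancy need a judge model and are not covered by this sketch.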
+
+ ## Evaluation Workflow
+
+ 1. Build golden dataset with 20+ question/answer/source triples.
+ 2. Run retrieval for each question and compare against expected sources.
+ 3. Generate answers and evaluate faithfulness with LLM-as-judge.
+ 4. Track metrics over time -- regression means something changed in ingestion or indexing.
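
Editor's sketch for step 4: tracking over time can start as a comparison of the current run against a stored baseline. Targets are copied from the Key Metrics table. The metric key names, the `tolerance=0.02` noise allowance, and the sample scores are assumptions.

```python
# Targets from the Key Metrics table; a metric must exceed its target.
TARGETS = {
    "retrieval_precision": 0.7,
    "retrieval_recall": 0.8,
    "faithfulness": 0.9,
    "answer_relevancy": 0.85,
}


def check_run(current, baseline, tolerance=0.02):
    """Flag metrics that miss their target or dropped versus the baseline."""
    misses_target = sorted(
        name for name, target in TARGETS.items() if current[name] <= target
    )
    regressed = sorted(
        name for name in TARGETS if baseline[name] - current[name] > tolerance
    )
    return {"misses_target": misses_target, "regressed": regressed}


# Hypothetical scores from a previous run and the current run.
baseline = {"retrieval_precision": 0.78, "retrieval_recall": 0.86,
            "faithfulness": 0.93, "answer_relevancy": 0.88}
current = {"retrieval_precision": 0.79, "retrieval_recall": 0.81,
           "faithfulness": 0.93, "answer_relevancy": 0.84}

report = check_run(current, baseline)
```

In this run recall still clears its target but is reported as regressed. Per step 4, that points to a change in ingestion or indexing, which a target-only gate would not show.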
+
+ ## Common Failure Modes
+
+ - **Low precision**: Chunks too large, no re-ranking, or poor metadata filtering.
+ - **Low recall**: Chunks too small, missing document sources, or embedding model mismatch.
+ - **Low faithfulness**: LLM hallucinating beyond context -- tighten the system prompt.