rag-skills 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (36)
  1. package/.claude-plugin/marketplace.json +14 -0
  2. package/.claude-plugin/plugin.json +8 -0
  3. package/CONTRIBUTING.md +210 -0
  4. package/LICENSE +21 -0
  5. package/README.md +148 -0
  6. package/examples/foundational-rag-pipeline.md +104 -0
  7. package/examples/multi-agent-rag.md +111 -0
  8. package/examples/production-rag-setup.md +133 -0
  9. package/package.json +22 -0
  10. package/scripts/generate-index.py +276 -0
  11. package/scripts/validate-skills.py +214 -0
  12. package/skills/chunking/choosing-a-chunking-framework.md +186 -0
  13. package/skills/chunking/contextual-chunk-headers.md +106 -0
  14. package/skills/chunking/hierarchical-chunking.md +77 -0
  15. package/skills/chunking/semantic-chunking.md +78 -0
  16. package/skills/chunking/sliding-window-chunking.md +82 -0
  17. package/skills/data-type-handling/rag-for-code-documentation.md +83 -0
  18. package/skills/data-type-handling/rag-for-multimodal-content.md +83 -0
  19. package/skills/performance-optimization/optimize-retrieval-latency.md +88 -0
  20. package/skills/retrieval-strategies/adaptive-retrieval.md +102 -0
  21. package/skills/retrieval-strategies/context-enrichment-window.md +99 -0
  22. package/skills/retrieval-strategies/crag-corrective-rag.md +108 -0
  23. package/skills/retrieval-strategies/explainable-retrieval.md +106 -0
  24. package/skills/retrieval-strategies/graph-rag.md +107 -0
  25. package/skills/retrieval-strategies/hybrid-search-bm25-dense.md +81 -0
  26. package/skills/retrieval-strategies/hyde-hypothetical-document-embeddings.md +91 -0
  27. package/skills/retrieval-strategies/hype-hypothetical-prompt-embeddings.md +98 -0
  28. package/skills/retrieval-strategies/multi-pass-retrieval-with-reranking.md +82 -0
  29. package/skills/retrieval-strategies/query-transformation-strategies.md +93 -0
  30. package/skills/retrieval-strategies/raptor-hierarchical-retrieval.md +106 -0
  31. package/skills/retrieval-strategies/self-rag.md +108 -0
  32. package/skills/vector-databases/choosing-vector-db-by-datatype.md +112 -0
  33. package/skills/vector-databases/qdrant-for-production-rag.md +88 -0
  34. package/skills/vector-databases/qdrant-setup-rag.md +86 -0
  35. package/templates/skill-template.md +53 -0
  36. package/templates/workflow-template.md +67 -0
package/.claude-plugin/marketplace.json ADDED
@@ -0,0 +1,14 @@
{
  "name": "rag-skills",
  "owner": {
    "name": "Mohamed Arbi",
    "url": "https://github.com/Goodnight77"
  },
  "plugins": [
    {
      "name": "rag-skills",
      "source": "./",
      "description": "Agent skills for RAG: chunking strategies, retrieval methods, vector databases, and performance optimization"
    }
  ]
}
package/.claude-plugin/plugin.json ADDED
@@ -0,0 +1,8 @@
{
  "name": "rag-skills",
  "description": "Agent skills for RAG (Retrieval Augmented Generation): chunking strategies (sliding window, semantic, hierarchical), retrieval strategies (HyDE, CRAG, Self-RAG, Graph RAG, adaptive, multi-pass), vector database setup (Qdrant), data type handling (code, multimodal), and performance optimization",
  "version": "1.0.0",
  "author": {
    "name": "Mohamed Arbi"
  }
}
package/CONTRIBUTING.md ADDED
@@ -0,0 +1,210 @@
# Contributing to rag-skills

Thank you for your interest in contributing to rag-skills! This document provides guidelines for submitting new skills, reviewing existing skills, and maintaining the repository.

## Table of Contents

- [Code of Conduct](#code-of-conduct)
- [Getting Started](#getting-started)
- [Submitting New Skills](#submitting-new-skills)
- [Skill Review Criteria](#skill-review-criteria)
- [Naming Conventions](#naming-conventions)
- [Testing and Validation](#testing-and-validation)
- [Credit and Attribution](#credit-and-attribution)
- [Reporting Issues](#reporting-issues)

## Code of Conduct

- Be respectful and inclusive
- Provide constructive feedback
- Focus on what is best for the community
- Show empathy towards other community members

## Getting Started

### Development Setup

Fork and clone the repository. The helper scripts under `scripts/` are written in Python 3; no build step is required to author skills.

### Running Validation

Run `python scripts/validate-skills.py` from the repository root and fix any reported errors before opening a pull request.

## Submitting New Skills

### Step 1: Choose Your Topic

Before creating a new skill, check that:

1. The topic is not already covered by an existing skill
2. The topic is relevant to RAG systems
3. You have practical experience with the topic

### Step 2: Use the Template

Start with [templates/skill-template.md](templates/skill-template.md). Use it as a lightweight reference format: brief illustrative text, no long runnable code, and 3-5 external implementation links folded into the relevant sections instead of a separate `References` block.

### Step 3: Place Your Skill

Organize skills by category under the `skills/` directory. If your category doesn't exist, create a new directory for it under `skills/`.

### Step 4: Validate Your Skill

Run the validation script (`python scripts/validate-skills.py`) and fix any errors before submitting.

### Step 5: Submit a Pull Request

1. Commit your changes
2. Push to your fork
3. Open a pull request with a descriptive title

Example PR title: `Add skill: Semantic Chunking for Markdown Documents`

## Skill Review Criteria

When reviewing skills, maintainers evaluate them against these criteria:

### Clarity

- [ ] Is the problem statement clear and specific?
- [ ] Are key concepts well-defined?
- [ ] Are implementation steps logically ordered?
- [ ] Is the writing free of ambiguity?

### Accuracy

- [ ] Are the technical statements correct?
- [ ] Do code examples run as expected?
- [ ] Are references valid and current?
- [ ] Are the metrics/success criteria realistic?

### Completeness

- [ ] All required sections are present
- [ ] Code examples are brief and illustrative, not full implementations
- [ ] Related skills are properly linked
- [ ] Both use cases and anti-patterns are covered

### Practicality

- [ ] The skill addresses a real-world problem
- [ ] The approach is production-viable
- [ ] The complexity matches the difficulty level
- [ ] Dependencies are reasonable and well-known

### Code Quality

- [ ] Code follows Python conventions (PEP 8)
- [ ] Code includes comments for complex logic
- [ ] Code handles errors appropriately
- [ ] Code examples stay lightweight and defer to external implementations

## Naming Conventions

### Skill Files

- Use kebab-case: `semantic-chunking.md`
- Be descriptive: `hybrid-search-bm25-dense.md`
- Keep names under 50 characters

### Categories

Use existing category names:
- `chunking`
- `vector-databases`
- `retrieval-strategies`
- `data-type-handling`
- `performance-optimization`
- `evaluation-metrics`
- `rag-agents`
- `deployment`

### Levels

Use one of:
- `beginner` — Basic concepts, minimal dependencies
- `intermediate` — Some experience required, moderate complexity
- `advanced` — Expert knowledge, complex implementations

If you keep the level field in a skill file, treat it as metadata for filtering rather than as a required browsing path in the README.

### Tags

- Use 3-5 relevant tags per skill
- Use lowercase: `["semantic", "nlp", "context"]`
- Avoid overly specific tags
- Focus on searchable terms

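A minimal frontmatter sketch that pulls the naming, level, and tag conventions above together (the field names follow the example files in this repo; the values themselves are illustrative):

```yaml
---
title: "Semantic Chunking"
category: "chunking"
level: "intermediate"
tags: ["semantic", "nlp", "context"]
author: "Your Name"
last_updated: "2026-04-07"
---
```
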
## Testing and Validation

### Local Validation

Before submitting, ensure your skill passes validation by running `python scripts/validate-skills.py`.

Strict mode treats warnings as errors.

### Link Validation

Verify that all internal links resolve and that external implementation links are placed inline in the relevant step or sentence.

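As a sketch of what link checking can look like, the following stdlib-only Python verifies that relative markdown links resolve on disk (the helper name and regex are illustrative, not part of this repo's scripts):

```python
import re
from pathlib import Path

# Matches [text](target) markdown links and captures the target
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)]+)\)")

def broken_internal_links(md_path: Path) -> list[str]:
    """Return relative link targets in a markdown file that do not resolve on disk."""
    broken = []
    for target in LINK_RE.findall(md_path.read_text(encoding="utf-8")):
        if target.startswith(("http://", "https://", "#")):
            continue  # external links and in-page anchors are out of scope here
        # Resolve relative to the file's own directory, dropping any #fragment
        resolved = (md_path.parent / target.split("#")[0]).resolve()
        if not resolved.exists():
            broken.append(target)
    return broken
```

Run it over `skills/**/*.md` before submitting; an empty list means every relative link points at a real file.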
## Credit and Attribution

### Author Field

Include your name in the `author` field of the skill frontmatter. For organizational contributions, use the organization's name instead.

### Last Updated

Update the `last_updated` field with the current date whenever you add or modify a skill.

### Co-Authors

For substantial contributions from multiple people, list all co-authors in the pull request description.

## Reporting Issues

When reporting issues, include:

- A clear title
- Steps to reproduce
- Expected vs actual behavior
- Environment information
- Screenshots if applicable

### Issue Templates

#### Bug Report

Use this template for defects in skills, scripts, or documentation.

#### Feature Request

Use this template to propose new skills or improvements to existing ones.

## Community Guidelines

### Discussions

- Use GitHub Discussions for questions and ideas
- Be specific in your questions
- Share code snippets when helpful
- Follow up on responses

### Code Review

- Be constructive in your reviews
- Explain the reasoning for suggested changes
- Acknowledge good work
- Be patient with maintainers' time

### Maintainer Response Time

Maintainers aim to respond to:
- Pull requests: within 7 days
- Issues: within 7 days
- Discussions: within 3 days

## Additional Resources

- [Project README](README.md)
- [Skill Template](templates/skill-template.md)
- [GitHub Community Guidelines](https://docs.github.com/en/site-policy/github-terms/github-community-guidelines)

Thank you for contributing to rag-skills!
package/LICENSE ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 rag-skills contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,148 @@
# Rag-skills

<p>
<code>agent routing</code> <code>RAG skills</code> <code>markdown</code>
</p>

A modular collection of best-practice guides and skill definitions for building Retrieval-Augmented Generation (RAG) systems. Designed for AI coding agents, agent frameworks, and teams that want a structured way to route RAG work to the right strategy.

## Overview

rag-skills consolidates actionable skills that help AI agents and builders improve RAG performance, choose appropriate vector databases, implement effective chunking strategies, optimize retrieval quality, and orchestrate multi-step RAG workflows.

## Installation

### Via Claude Code (Local)

Clone and use as a plugin:

```bash
# Clone to your plugins directory
git clone https://github.com/Goodnight77/Rag-skills.git ~/.claude/plugins/rag-skills

# Or use from current directory
claude --plugin-dir .
```

### Via npm

```bash
npx skills add @goodnight/rag-skills
```

### Manual Usage

Clone or fork the repository and reference skill files directly in your agent workflows.

## Skills by Decision Area

This repo is organized as a routing layer for RAG work. Agents can use the category and metadata in each skill file to decide which path to follow for a given problem, instead of treating the repo like a generic reference manual.

### Chunking
Use these when the main problem is how to split source material into retrievable units.
- [Choosing a Chunking Framework](skills/chunking/choosing-a-chunking-framework.md) - Selecting a chunking approach for your corpus
- [Semantic Chunking](skills/chunking/semantic-chunking.md) - Chunk documents based on semantic boundaries
- [Hierarchical Chunking](skills/chunking/hierarchical-chunking.md) - Multi-level chunking for nested structures
- [Sliding Window Chunking](skills/chunking/sliding-window-chunking.md) - Overlap-based chunking for context preservation
- [Contextual Chunk Headers](skills/chunking/contextual-chunk-headers.md) - Adding higher-level context to chunks

### Vector Databases
Use these when the main problem is choosing or operating the storage layer for embeddings and metadata.
- [Qdrant Setup for RAG](skills/vector-databases/qdrant-setup-rag.md) - Setting up Qdrant for RAG
- [Qdrant for Production RAG](skills/vector-databases/qdrant-for-production-rag.md) - Scaling RAG with Qdrant
- [Choosing Vector DB by Datatype](skills/vector-databases/choosing-vector-db-by-datatype.md) - Database selection guide

### Retrieval Strategies
Use these when the main problem is search quality, ranking, recall, or combining search methods.
- [Hybrid Search BM25 Dense](skills/retrieval-strategies/hybrid-search-bm25-dense.md) - Combining keyword and semantic search
- [Multi-Pass Retrieval with Reranking](skills/retrieval-strategies/multi-pass-retrieval-with-reranking.md) - Two-pass retrieval with cross-encoder reranking
- [Query Transformation Strategies](skills/retrieval-strategies/query-transformation-strategies.md) - Query rewriting, step-back prompting, sub-query decomposition
- [HyDE - Hypothetical Document Embeddings](skills/retrieval-strategies/hyde-hypothetical-document-embeddings.md) - Query expansion with LLM-generated documents
- [HyPE - Hypothetical Prompt Embeddings](skills/retrieval-strategies/hype-hypothetical-prompt-embeddings.md) - Precomputed question embeddings at indexing time
- [Self-RAG](skills/retrieval-strategies/self-rag.md) - Self-reflective retrieval with relevance evaluation
- [RAPTOR - Hierarchical Retrieval](skills/retrieval-strategies/raptor-hierarchical-retrieval.md) - Multi-level tree of document summaries
- [Context Enrichment Window](skills/retrieval-strategies/context-enrichment-window.md) - Adding surrounding chunks to retrieved results
- [Adaptive Retrieval](skills/retrieval-strategies/adaptive-retrieval.md) - Dynamic strategy selection based on query type
- [Explainable Retrieval with Citations](skills/retrieval-strategies/explainable-retrieval.md) - Traceability and source attribution
- [CRAG - Corrective RAG](skills/retrieval-strategies/crag-corrective-rag.md) - Dynamic correction with web search
- [Graph RAG](skills/retrieval-strategies/graph-rag.md) - Knowledge graph-based retrieval

### Data Type Handling
Use these when the source content is code, APIs, diagrams, tables, or mixed media.
- [RAG for Code Documentation](skills/data-type-handling/rag-for-code-documentation.md) - Special handling for code and technical docs
- [RAG for Multimodal Content](skills/data-type-handling/rag-for-multimodal-content.md) - Images, tables, and mixed media

### Performance Optimization
Use these when the problem is latency, throughput, cache behavior, or production efficiency.
- [Optimize Retrieval Latency](skills/performance-optimization/optimize-retrieval-latency.md) - Caching, indexing, and query optimization

### RAG Agents
Use these when the problem is orchestration, delegation, or multi-step workflows.
- *See [Multi-Agent RAG](examples/multi-agent-rag.md) for multi-agent workflows*

### Deployment
Use these when the problem is production rollout, reliability, or operationalization.
- *See [Production RAG Setup](examples/production-rag-setup.md)*

### Evaluation Metrics
Use these when the problem is measurement, regression detection, or retrieval benchmarking.
- *Coming soon*

## Quick Start

### For AI Agents

Read the frontmatter metadata, then route to the skill that best matches the user's problem. Treat the repo as a decision tree for RAG tasks: chunking, retrieval, vector store choice, embeddings, performance, and workflow orchestration.

### For Framework Integration

Build a lightweight index from the markdown frontmatter and use it to filter by category, tags, and task type. The goal is not to mirror all content in code, but to point an agent to the right skill or external implementation quickly.

Keep examples in the repo lightweight and point readers to external implementations instead of embedding long code samples.
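
A minimal sketch of such an index builder, assuming the simple `key: value` frontmatter used by the skill files here (the function names and parsing rules are illustrative; a production version might use a YAML parser instead):

```python
import re
from pathlib import Path

# Frontmatter is the block between the leading '---' fences
FRONTMATTER_RE = re.compile(r"^---\n(.*?)\n---", re.DOTALL)

def parse_frontmatter(text: str) -> dict:
    """Parse simple key: value / key: [a, b] frontmatter into a dict."""
    match = FRONTMATTER_RE.match(text)
    if not match:
        return {}
    meta = {}
    for line in match.group(1).splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip().strip('"')
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as tags: ["semantic", "nlp"]
            value = [v.strip().strip('"') for v in value[1:-1].split(",") if v.strip()]
        meta[key.strip()] = value
    return meta

def build_index(root: Path) -> list[dict]:
    """Collect metadata for every skill file so an agent can filter by category or tag."""
    index = []
    for path in sorted(root.glob("skills/**/*.md")):
        meta = parse_frontmatter(path.read_text(encoding="utf-8"))
        meta["path"] = str(path.relative_to(root))
        index.append(meta)
    return index
```

An agent or framework can then filter the resulting list by `category` or `tags` to pick the right skill file without loading every document.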

## Examples

Complete walkthroughs and reference implementations:

- [Foundational RAG Pipeline Example](examples/foundational-rag-pipeline.md) - A guided RAG build path for agents and builders
- [Multi-Agent RAG](examples/multi-agent-rag.md) - An orchestration pattern for specialized agents
- [Production RAG Setup](examples/production-rag-setup.md) - A deployment-oriented route for production systems

## Contributing

We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

### Quick Contribution Steps

1. Fork the repository
2. Create a new skill file using [templates/skill-template.md](templates/skill-template.md)
3. Ensure your skill follows the required structure
4. Run validation: `python scripts/validate-skills.py`
5. Submit a pull request

## Skill File Format

Each skill follows a consistent structure with a short illustrative snippet, not a full implementation. See the template in [templates/skill-template.md](templates/skill-template.md).

## Scripts

- `validate-skills.py` — Validate all skill files for format compliance
- `generate-index.py` — Generate browsable INDEX.md and SKILLS.json

## Project Status

This is an active open-source project. Skills are continuously added and updated as RAG best practices evolve.

Current statistics:
- **Total Skills**: 23
- **Categories**: 5
- **Examples**: 3

*Run `python scripts/generate-index.py` for current statistics.*

## Acknowledgments

Built for the RAG community. Special thanks to contributors and the open-source RAG ecosystem.

## License

MIT License — see [LICENSE](LICENSE) for details.
package/examples/foundational-rag-pipeline.md ADDED
@@ -0,0 +1,104 @@
---
title: "Foundational RAG Pipeline Example"
category: "rag-agents"
tags: ["workflow", "quickstart", "pipeline"]
author: "rag-skills"
---

## Overview
This example shows the minimum viable shape of a RAG pipeline without turning the repository into an implementation repo. Use it as a reference workflow for agents and developers who need to understand the sequence of stages, the key decisions, and where to look for production-ready code.

## Goal
Build a simple question-answering workflow that:
1. Loads source documents
2. Chunks and embeds them
3. Stores them in a vector database
4. Retrieves relevant context at query time
5. Generates an answer grounded in retrieved passages

## Recommended Workflow

### Stage 1: Ingestion
- Load source documents from files, docs platforms, or APIs
- Normalize text and preserve useful metadata such as title, URL, section, and timestamp
- Apply a chunking strategy that matches the corpus

### Stage 2: Indexing
- Generate embeddings for each chunk
- Store vectors, chunk text, and metadata in a vector database
- Validate that filtering and retrieval work before moving on

### Stage 3: Retrieval
- Embed the user query
- Retrieve the top candidate chunks
- Optionally rerank or filter results before generation

### Stage 4: Answer Generation
- Pass the query and retrieved context to the model
- Require grounded answers and source attribution
- Return both answer text and the supporting chunks

## Reference Architecture

```text
Documents -> Loader -> Chunker -> Embedder -> Vector DB
User Query -> Query Embedder -> Retriever -> Prompt Builder -> LLM -> Answer + Sources
```

## Pseudocode

```text
load documents
chunk documents with metadata preserved
embed chunks
store chunks + embeddings in vector database

for each user query:
    embed query
    retrieve top-k chunks
    optionally rerank
    generate answer from retrieved context
    return answer with sources
```
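
The pseudocode above can be sketched in a few lines of Python, with a toy bag-of-words embedder standing in for a real embedding model and a plain list standing in for the vector database (every name here is illustrative, not a prescribed API):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedder: a bag-of-words Counter stands in for a dense vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def index_chunks(chunks: list[dict]) -> list[dict]:
    """Stage 2: store each chunk with its embedding and metadata (the 'vector DB')."""
    return [{**c, "vector": embed(c["text"])} for c in chunks]

def retrieve(index: list[dict], query: str, k: int = 2) -> list[dict]:
    """Stage 3: embed the query and return the top-k most similar chunks."""
    qv = embed(query)
    return sorted(index, key=lambda c: cosine(qv, c["vector"]), reverse=True)[:k]

def answer(index: list[dict], query: str) -> dict:
    """Stage 4: a real system would call an LLM on the context; here we just return it."""
    hits = retrieve(index, query)
    context = " ".join(h["text"] for h in hits)
    return {"context": context, "sources": [h["source"] for h in hits]}
```

Swapping `embed` for a real embedding model and `index_chunks`/`retrieve` for a vector database client preserves the same four-stage shape.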

## Design Choices

### Keep It Simple First
- Start with one loader, one chunking strategy, one embedding model, and one vector store
- Avoid hybrid retrieval and agent orchestration until the baseline is working

### Preserve Metadata Early
- Store source, section, and document identifiers during ingestion
- This enables filtering, citations, debugging, and later migration

### Measure Before Optimizing
- Check whether the retrieved chunks are actually relevant before tuning prompts
- Most early RAG failures are indexing or chunking problems, not generation problems

## Common Failure Modes
- Chunk size is too large, so retrieval returns noisy context
- Chunk size is too small, so meaning gets fragmented
- Metadata is missing, so debugging and citations are weak
- Retrieval is evaluated only by final answer quality instead of retrieval quality
- The pipeline uses synthetic examples that are too easy and hide failure cases

## When to Use This Example
- As the first pipeline shape for a new RAG application
- As the baseline architecture before adding reranking or agents
- As a teaching document for junior contributors or coding agents

## When NOT to Use This Example
- When the system already requires hybrid retrieval or multimodal ingestion
- When the main problem is orchestration across multiple tools or agents
- When the corpus depends heavily on layout parsing, code structure, or hierarchical retrieval

## External Implementations
- [LangChain RAG guide](https://docs.langchain.com/oss/python/langchain/rag)
- [LlamaIndex Introduction to RAG](https://docs.llamaindex.ai/en/stable/understanding/rag/)
- [Qdrant quickstart](https://qdrant.tech/documentation/quickstart/)
- [OpenAI cookbook RAG examples](https://github.com/openai/openai-cookbook/tree/main/examples)

## Related Skills
- [Semantic Chunking](../skills/chunking/semantic-chunking.md)
- [Qdrant Setup for RAG](../skills/vector-databases/qdrant-setup-rag.md)
- [Hybrid Search BM25 Dense](../skills/retrieval-strategies/hybrid-search-bm25-dense.md)
package/examples/multi-agent-rag.md ADDED
@@ -0,0 +1,111 @@
---
title: "Multi-Agent RAG System"
category: "rag-agents"
level: "advanced"
tags: ["multi-agent", "orchestration", "workflow", "collaboration"]
author: "rag-skills"
last_updated: "2026-04-07"
---

## Overview
This example describes how a multi-agent RAG system should be structured without embedding a large implementation directly in the repository. Use it when a single retrieve-then-answer flow is no longer sufficient and different agent roles need to collaborate.

## Goal
Coordinate specialized agents so that query understanding, retrieval, reranking, synthesis, and validation can be separated into distinct responsibilities.

## Reference Architecture

```text
User Query
  -> Orchestrator
    -> Query Analysis Agent
    -> Retrieval Agent
    -> Reranking Agent
    -> Synthesis Agent
    -> Optional Critique / Verification Agent
  -> Final Answer + Sources + Execution Trace
```

## Agent Responsibilities

### Query Analysis Agent
- Classify the query type
- Decide whether the request needs plain retrieval, comparison, troubleshooting, or decomposition
- Select the workflow path

### Retrieval Agent
- Execute first-pass retrieval
- Return candidate chunks plus retrieval metadata
- Prefer recall over precision at this stage

### Reranking Agent
- Reorder the candidate set using stronger relevance logic
- Drop low-value chunks before generation
- Improve precision without forcing the retriever to be overly strict

### Synthesis Agent
- Build the final grounded answer
- Cite the chunks or documents used
- Surface uncertainty when evidence is weak

### Optional Critique Agent
- Check grounding, coverage, and contradiction risk
- Flag weak evidence or missing context
- Trigger another retrieval pass when needed

## Pseudocode

```text
analyze query
choose workflow
retrieve broad candidate set
rerank candidates
synthesize grounded answer
optionally critique and retry
return answer, sources, and trace
```
+ ```
67
+
68
+ ## When Multi-Agent RAG Is Worth It
69
+ - Queries vary widely in type and complexity
70
+ - The cost of retrieval mistakes is high
71
+ - The system needs traceability across steps
72
+ - Different tools or retrieval paths need explicit coordination
73
+ - The team wants modular components that can evolve independently
74
+
75
+ ## When It Is Not Worth It
76
+ - The workload is simple FAQ or narrow-domain Q&A
77
+ - The main bottleneck is chunking or indexing quality
78
+ - Latency budgets are tight and multiple stages are too expensive
79
+ - The team cannot monitor or debug a more complex workflow
80
+
81
+ ## Operational Guidance
82
+
83
+ ### Start With a Single-Agent Baseline
84
+ - Prove that the corpus, chunking, and retrieval are sound first
85
+ - Add extra agents only after identifying a real failure pattern
86
+
87
+ ### Keep Agent Boundaries Clear
88
+ - Each agent should own one job
89
+ - Avoid agents that retrieve, reason, rerank, and answer all at once
90
+
91
+ ### Log the Workflow
92
+ - Record which agent ran, what it produced, and why the next step happened
93
+ - Multi-agent systems are only useful if they stay debuggable
94
+
95
+ ## Common Failure Modes
96
+ - Too many agents with overlapping roles
97
+ - Orchestrator logic that is harder to understand than the retrieval problem
98
+ - Repeated retrieval loops without clear stopping rules
99
+ - Reranking and synthesis both trying to decide relevance independently
100
+ - No workflow trace, making failure analysis impossible
101
+
102
+ ## External Implementations
103
+ - [LangGraph agentic RAG guide](https://docs.langchain.com/oss/python/langgraph/agentic-rag)
104
+ - [LangChain multi-agent docs](https://docs.langchain.com/oss/python/langchain/multi-agent)
105
+ - [LlamaIndex multi-agent patterns](https://developers.llamaindex.ai/python/framework/understanding/putting_it_all_together/agents/)
106
+ - [Microsoft AutoGen documentation](https://microsoft.github.io/autogen/stable/)
107
+
108
+ ## Related Skills
109
+ - [Multi-Pass Retrieval with Reranking](../skills/retrieval-strategies/multi-pass-retrieval-with-reranking.md)
110
+ - [Optimize Retrieval Latency](../skills/performance-optimization/optimize-retrieval-latency.md)
111
+ - [RAG for Code Documentation](../skills/data-type-handling/rag-for-code-documentation.md)