rag-skills 1.0.0 → 1.0.1

Files changed (84)
  1. package/.agents/skills/rag-skills/CONTRIBUTING.md +211 -0
  2. package/.agents/skills/rag-skills/INDEX.md +113 -0
  3. package/.agents/skills/rag-skills/LICENSE +21 -0
  4. package/.agents/skills/rag-skills/README.md +169 -0
  5. package/.agents/skills/rag-skills/SKILL.md +92 -0
  6. package/.agents/skills/rag-skills/SKILLS.json +442 -0
  7. package/.agents/skills/rag-skills/examples/foundational-rag-pipeline.md +104 -0
  8. package/.agents/skills/rag-skills/examples/multi-agent-rag.md +111 -0
  9. package/.agents/skills/rag-skills/examples/production-rag-setup.md +133 -0
  10. package/.agents/skills/rag-skills/package.json +22 -0
  11. package/.agents/skills/rag-skills/scripts/generate-index.py +255 -0
  12. package/.agents/skills/rag-skills/scripts/validate-skills.py +217 -0
  13. package/.agents/skills/rag-skills/skills/chunking/SKILL.md +58 -0
  14. package/{skills/chunking/choosing-a-chunking-framework.md → .agents/skills/rag-skills/skills/chunking/choosing-a-chunking-framework/SKILL.md} +187 -186
  15. package/{skills/chunking/contextual-chunk-headers.md → .agents/skills/rag-skills/skills/chunking/contextual-chunk-headers/SKILL.md} +107 -106
  16. package/{skills/chunking/hierarchical-chunking.md → .agents/skills/rag-skills/skills/chunking/hierarchical-chunking/SKILL.md} +11 -10
  17. package/{skills/chunking/semantic-chunking.md → .agents/skills/rag-skills/skills/chunking/semantic-chunking/SKILL.md} +11 -10
  18. package/{skills/chunking/sliding-window-chunking.md → .agents/skills/rag-skills/skills/chunking/sliding-window-chunking/SKILL.md} +11 -10
  19. package/.agents/skills/rag-skills/skills/data-type-handling/SKILL.md +55 -0
  20. package/{skills/data-type-handling/rag-for-code-documentation.md → .agents/skills/rag-skills/skills/data-type-handling/rag-for-code-documentation/SKILL.md} +11 -10
  21. package/{skills/data-type-handling/rag-for-multimodal-content.md → .agents/skills/rag-skills/skills/data-type-handling/rag-for-multimodal-content/SKILL.md} +11 -10
  22. package/.agents/skills/rag-skills/skills/performance-optimization/SKILL.md +56 -0
  23. package/{skills/performance-optimization/optimize-retrieval-latency.md → .agents/skills/rag-skills/skills/performance-optimization/optimize-retrieval-latency/SKILL.md} +5 -4
  24. package/.agents/skills/rag-skills/skills/retrieval-strategies/SKILL.md +58 -0
  25. package/{skills/retrieval-strategies/adaptive-retrieval.md → .agents/skills/rag-skills/skills/retrieval-strategies/adaptive-retrieval/SKILL.md} +103 -102
  26. package/{skills/retrieval-strategies/context-enrichment-window.md → .agents/skills/rag-skills/skills/retrieval-strategies/context-enrichment-window/SKILL.md} +100 -99
  27. package/{skills/retrieval-strategies/crag-corrective-rag.md → .agents/skills/rag-skills/skills/retrieval-strategies/crag-corrective-rag/SKILL.md} +109 -108
  28. package/{skills/retrieval-strategies/explainable-retrieval.md → .agents/skills/rag-skills/skills/retrieval-strategies/explainable-retrieval/SKILL.md} +106 -105
  29. package/{skills/retrieval-strategies/graph-rag.md → .agents/skills/rag-skills/skills/retrieval-strategies/graph-rag/SKILL.md} +108 -107
  30. package/{skills/retrieval-strategies/hybrid-search-bm25-dense.md → .agents/skills/rag-skills/skills/retrieval-strategies/hybrid-search-bm25-dense/SKILL.md} +11 -10
  31. package/{skills/retrieval-strategies/hyde-hypothetical-document-embeddings.md → .agents/skills/rag-skills/skills/retrieval-strategies/hyde-hypothetical-document-embeddings/SKILL.md} +92 -91
  32. package/{skills/retrieval-strategies/hype-hypothetical-prompt-embeddings.md → .agents/skills/rag-skills/skills/retrieval-strategies/hype-hypothetical-prompt-embeddings/SKILL.md} +98 -97
  33. package/{skills/retrieval-strategies/multi-pass-retrieval-with-reranking.md → .agents/skills/rag-skills/skills/retrieval-strategies/multi-pass-retrieval-with-reranking/SKILL.md} +11 -10
  34. package/{skills/retrieval-strategies/query-transformation-strategies.md → .agents/skills/rag-skills/skills/retrieval-strategies/query-transformation-strategies/SKILL.md} +94 -93
  35. package/{skills/retrieval-strategies/raptor-hierarchical-retrieval.md → .agents/skills/rag-skills/skills/retrieval-strategies/raptor-hierarchical-retrieval/SKILL.md} +107 -106
  36. package/{skills/retrieval-strategies/self-rag.md → .agents/skills/rag-skills/skills/retrieval-strategies/self-rag/SKILL.md} +109 -108
  37. package/.agents/skills/rag-skills/skills/vector-databases/SKILL.md +56 -0
  38. package/{skills/vector-databases/choosing-vector-db-by-datatype.md → .agents/skills/rag-skills/skills/vector-databases/choosing-vector-db-by-datatype/SKILL.md} +24 -23
  39. package/.agents/skills/rag-skills/skills/vector-databases/qdrant-for-production-rag/SKILL.md +89 -0
  40. package/{skills/vector-databases/qdrant-setup-rag.md → .agents/skills/rag-skills/skills/vector-databases/qdrant-setup-rag/SKILL.md} +11 -10
  41. package/.agents/skills/rag-skills/templates/skill-template.md +54 -0
  42. package/.agents/skills/rag-skills/templates/workflow-template.md +67 -0
  43. package/CONTRIBUTING.md +21 -20
  44. package/INDEX.md +113 -0
  45. package/README.md +162 -141
  46. package/SKILL.md +92 -0
  47. package/SKILLS.json +442 -0
  48. package/examples/foundational-rag-pipeline.md +104 -104
  49. package/examples/multi-agent-rag.md +111 -111
  50. package/examples/production-rag-setup.md +133 -133
  51. package/package.json +2 -2
  52. package/scripts/generate-index.py +16 -37
  53. package/scripts/validate-skills.py +7 -4
  54. package/skills/chunking/SKILL.md +58 -0
  55. package/skills/chunking/choosing-a-chunking-framework/SKILL.md +187 -0
  56. package/skills/chunking/contextual-chunk-headers/SKILL.md +107 -0
  57. package/skills/chunking/hierarchical-chunking/SKILL.md +78 -0
  58. package/skills/chunking/semantic-chunking/SKILL.md +79 -0
  59. package/skills/chunking/sliding-window-chunking/SKILL.md +83 -0
  60. package/skills/data-type-handling/SKILL.md +55 -0
  61. package/skills/data-type-handling/rag-for-code-documentation/SKILL.md +84 -0
  62. package/skills/data-type-handling/rag-for-multimodal-content/SKILL.md +84 -0
  63. package/skills/performance-optimization/SKILL.md +56 -0
  64. package/skills/performance-optimization/optimize-retrieval-latency/SKILL.md +89 -0
  65. package/skills/retrieval-strategies/SKILL.md +58 -0
  66. package/skills/retrieval-strategies/adaptive-retrieval/SKILL.md +103 -0
  67. package/skills/retrieval-strategies/context-enrichment-window/SKILL.md +100 -0
  68. package/skills/retrieval-strategies/crag-corrective-rag/SKILL.md +109 -0
  69. package/skills/retrieval-strategies/explainable-retrieval/SKILL.md +107 -0
  70. package/skills/retrieval-strategies/graph-rag/SKILL.md +108 -0
  71. package/skills/retrieval-strategies/hybrid-search-bm25-dense/SKILL.md +82 -0
  72. package/skills/retrieval-strategies/hyde-hypothetical-document-embeddings/SKILL.md +92 -0
  73. package/skills/retrieval-strategies/hype-hypothetical-prompt-embeddings/SKILL.md +99 -0
  74. package/skills/retrieval-strategies/multi-pass-retrieval-with-reranking/SKILL.md +83 -0
  75. package/skills/retrieval-strategies/query-transformation-strategies/SKILL.md +94 -0
  76. package/skills/retrieval-strategies/raptor-hierarchical-retrieval/SKILL.md +107 -0
  77. package/skills/retrieval-strategies/self-rag/SKILL.md +109 -0
  78. package/skills/vector-databases/SKILL.md +56 -0
  79. package/skills/vector-databases/choosing-vector-db-by-datatype/SKILL.md +113 -0
  80. package/skills/vector-databases/{qdrant-for-production-rag.md → qdrant-for-production-rag/SKILL.md} +4 -3
  81. package/skills/vector-databases/qdrant-setup-rag/SKILL.md +87 -0
  82. package/skills-lock.json +10 -0
  83. package/templates/skill-template.md +12 -11
  84. package/templates/workflow-template.md +5 -5
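Most of the renames listed above follow one pattern: a flat `skills/<category>/<name>.md` file becomes `skills/<category>/<name>/SKILL.md`. The sketch below illustrates that mapping only; the function name `nested_skill_path` is hypothetical and is not part of the package's scripts.

```python
from pathlib import PurePosixPath


def nested_skill_path(flat_path: str) -> str:
    """Map 'skills/<category>/<name>.md' to 'skills/<category>/<name>/SKILL.md'.

    Hypothetical helper inferred from the rename pattern in the file list.
    """
    path = PurePosixPath(flat_path)
    # Reject anything that is not a Markdown file or is already a SKILL.md.
    if path.suffix != ".md" or path.name == "SKILL.md":
        raise ValueError(f"not a flat skill file: {flat_path}")
    # Drop the '.md' suffix to get the directory, then add SKILL.md inside it.
    return str(path.with_suffix("") / "SKILL.md")


print(nested_skill_path("skills/chunking/semantic-chunking.md"))
```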
@@ -0,0 +1,211 @@
1
+ # Contributing to rag-skills
2
+
3
+ Thank you for your interest in contributing to rag-skills! This document provides guidelines for submitting new skills, reviewing existing skills, and maintaining the repository.
4
+
5
+ ## Table of Contents
6
+
7
+ - [Code of Conduct](#code-of-conduct)
8
+ - [Getting Started](#getting-started)
9
+ - [Submitting New Skills](#submitting-new-skills)
10
+ - [Skill Review Criteria](#skill-review-criteria)
11
+ - [Naming Conventions](#naming-conventions)
12
+ - [Testing and Validation](#testing-and-validation)
13
+ - [Credit and Attribution](#credit-and-attribution)
14
+ - [Reporting Issues](#reporting-issues)
15
+
16
+ ## Code of Conduct
17
+
18
+ - Be respectful and inclusive
19
+ - Provide constructive feedback
20
+ - Focus on what is best for the community
21
+ - Show empathy towards other community members
22
+
23
+ ## Getting Started
24
+
25
+ ### Development Setup
26
+
27
+ ### Running Validation
28
+
29
+ ## Submitting New Skills
30
+
31
+ ### Step 1: Choose Your Topic
32
+
33
+ Before creating a new skill, check that:
34
+
35
+ 1. The topic is not already covered by an existing skill
36
+ 2. The topic is relevant to RAG systems
37
+ 3. You have practical experience with the topic
38
+
39
+ ### Step 2: Use the Template
40
+
41
+ Start with [templates/skill-template.md](templates/skill-template.md):
42
+ Use the template as a lightweight reference format: brief illustrative text, no long runnable code, and 3-5 external implementation links folded into the relevant sections instead of a separate `References` block.
43
+
44
+ ### Step 3: Place Your Skill
45
+
46
+ Organize skills by category:
47
+
48
+ ```text
49
+ skills/<category>/<skill-name>/SKILL.md
50
+ ```
51
+
52
+ For example:
53
+
54
+ ```text
55
+ skills/chunking/semantic-chunking/SKILL.md
56
+ ```
57
+
58
+ If your category doesn't exist, create it in the appropriate location.
59
+
60
+ ### Step 4: Validate Your Skill
61
+
62
+ Run the validation script and fix any errors before submitting.
63
+
64
+ ### Step 5: Submit a Pull Request
65
+
66
+ 1. Commit your changes
67
+ 2. Push to your fork
68
+ 3. Open a pull request with a descriptive title
69
+
70
+ Example PR title: `Add skill: Semantic Chunking for Markdown Documents`
71
+
72
+ ## Skill Review Criteria
73
+
74
+ When reviewing skills, maintainers evaluate them against these criteria:
75
+
76
+ ### Clarity
77
+
78
+ - [ ] Is the problem statement clear and specific?
79
+ - [ ] Are key concepts well-defined?
80
+ - [ ] Are implementation steps logically ordered?
81
+ - [ ] Is the writing free of ambiguity?
82
+
83
+ ### Accuracy
84
+
85
+ - [ ] Are the technical statements correct?
86
+ - [ ] Do code examples run as expected?
87
+ - [ ] Are references valid and current?
88
+ - [ ] Are the metrics/success criteria realistic?
89
+
90
+ ### Completeness
91
+
92
+ - [ ] All required sections are present
93
+ - [ ] Code examples are brief and illustrative, not full implementations
94
+ - [ ] Related skills are properly linked
95
+ - [ ] Both use cases and anti-patterns are covered
96
+
97
+ ### Practicality
98
+
99
+ - [ ] The skill addresses a real-world problem
100
+ - [ ] The approach is production-viable
101
+ - [ ] The complexity matches the difficulty level
102
+ - [ ] Dependencies are reasonable and well-known
103
+
104
+ ### Code Quality
105
+
106
+ - [ ] Code follows Python conventions (PEP 8)
107
+ - [ ] Code includes comments for complex logic
108
+ - [ ] Code handles errors appropriately
109
+ - [ ] Code examples stay lightweight and defer to external implementations
110
+
111
+ ## Naming Conventions
112
+
113
+ ### Skill Files
114
+
115
+ - Use kebab-case directory names: `semantic-chunking/SKILL.md`
116
+ - Be descriptive: `hybrid-search-bm25-dense/SKILL.md`
117
+ - Keep names under 50 characters
118
+
119
+ ### Categories
120
+
121
+ Use existing category names:
122
+ - `chunking`
123
+ - `vector-databases`
124
+ - `retrieval-strategies`
125
+ - `data-type-handling`
126
+ - `performance-optimization`
127
+ - `evaluation-metrics`
128
+ - `rag-agents`
129
+ - `deployment`
130
+
131
+ ### Tags
132
+
133
+ - Use 3-5 relevant tags per skill
134
+ - Use lowercase: `["semantic", "nlp", "context"]`
135
+ - Avoid overly specific tags
136
+ - Focus on searchable terms
137
+
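The naming rules above (kebab-case directory names under 50 characters, the listed categories, 3-5 lowercase tags) can be checked mechanically. The following is a minimal sketch under those stated rules, not the logic of `scripts/validate-skills.py`; `check_skill`, `KNOWN_CATEGORIES`, and `KEBAB_CASE` are hypothetical names, and treating the 50-character limit as applying to the skill directory name is an assumption.

```python
import re

# Category names copied from the "Categories" list above.
KNOWN_CATEGORIES = {
    "chunking", "vector-databases", "retrieval-strategies", "data-type-handling",
    "performance-optimization", "evaluation-metrics", "rag-agents", "deployment",
}
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def check_skill(path: str, tags: list) -> list:
    """Return a list of convention problems; an empty list means none found."""
    parts = path.split("/")
    if len(parts) != 4 or parts[0] != "skills" or parts[3] != "SKILL.md":
        return [f"expected skills/<category>/<skill-name>/SKILL.md, got {path}"]
    category, name = parts[1], parts[2]
    problems = []
    if category not in KNOWN_CATEGORIES:
        # New categories are allowed, so this is only a warning.
        problems.append(f"warning: category not in existing list: {category}")
    if not KEBAB_CASE.match(name):
        problems.append(f"name is not kebab-case: {name}")
    if len(name) >= 50:
        problems.append(f"name is not under 50 characters: {name}")
    if not 3 <= len(tags) <= 5:
        problems.append(f"expected 3-5 tags, got {len(tags)}")
    for tag in tags:
        if tag != tag.lower():
            problems.append(f"tag is not lowercase: {tag}")
    return problems


print(check_skill("skills/chunking/semantic-chunking/SKILL.md",
                  ["semantic", "nlp", "context"]))
```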
138
+ ## Testing and Validation
139
+
140
+ ### Local Validation
141
+
142
+ Before submitting, ensure your skill passes validation:
143
+
144
+ Strict mode treats warnings as errors.
145
+
146
+ ### Link Validation
147
+
148
+ Verify all internal links work and that external implementation links are placed inline in the relevant step or sentence:
149
+
150
+ ## Credit and Attribution
151
+
152
+ ### Author Field
153
+
154
+ Include your name in the `author` field:
155
+
156
+ For organizational contributions:
157
+
158
+ ### Last Updated
159
+
160
+ Update the `last_updated` field with the current date:
161
+
162
+ ### Co-Authors
163
+
164
+ For substantial contributions from multiple people, list them in the pull request description:
165
+
166
+ ## Reporting Issues
167
+
168
+ When reporting issues, include:
169
+
170
+ - A clear title
171
+ - Steps to reproduce
172
+ - Expected vs actual behavior
173
+ - Environment information
174
+ - Screenshots if applicable
175
+
176
+ ### Issue Templates
177
+
178
+ #### Bug Report
179
+
180
+ #### Feature Request
181
+
182
+ ## Community Guidelines
183
+
184
+ ### Discussions
185
+
186
+ - Use GitHub Discussions for questions and ideas
187
+ - Be specific in your questions
188
+ - Share code snippets when helpful
189
+ - Follow up on responses
190
+
191
+ ### Code Review
192
+
193
+ - Be constructive in your reviews
194
+ - Explain the reasoning for suggested changes
195
+ - Acknowledge good work
196
+ - Be patient with maintainers' time
197
+
198
+ ### Maintainer Response Time
199
+
200
+ Maintainers aim to respond to:
201
+ - Pull requests: Within 7 days
202
+ - Issues: Within 7 days
203
+ - Discussions: Within 3 days
204
+
205
+ ## Additional Resources
206
+
207
+ - [Project README](README.md)
208
+ - [Skill Template](templates/skill-template.md)
209
+ - [GitHub Community Guidelines](https://docs.github.com/en/site-policy/github-terms/github-community-guidelines)
210
+
211
+ Thank you for contributing to rag-skills!
@@ -0,0 +1,113 @@
1
+ # RAG Skills Index
2
+
3
+ Generated: 2026-04-11 03:11:54
4
+
5
+ Total Skills: 28
6
+
7
+ ---
8
+
9
+ ## Browse by Category
10
+
11
+ ### Chunking
12
+
13
+ - [Choosing a Chunking Framework](skills/chunking/choosing-a-chunking-framework/SKILL.md)
14
+ - *Chunking quality depends as much on the framework as on the strategy itself.*
15
+ - [Contextual Chunk Headers](skills/chunking/contextual-chunk-headers/SKILL.md)
16
+ - *Contextual chunk headers (CCH) enhance retrieval by prepending higher-level context (document title, section headers, summaries) to each chunk before embedding.*
17
+ - [Hierarchical Chunking](skills/chunking/hierarchical-chunking/SKILL.md)
18
+ - *Hierarchical chunking creates multi-level chunk structures that preserve document hierarchies (chapters, sections, subsections).*
19
+ - [Semantic Chunking](skills/chunking/semantic-chunking/SKILL.md)
20
+ - *Semantic chunking divides documents into segments based on natural language boundaries and semantic meaning rather than fixed character counts.*
21
+ - [Chunking](skills/chunking/SKILL.md)
22
+ - *Use this parent skill when the main RAG problem is how to split source material into retrievable units.*
23
+ - [Sliding Window Chunking](skills/chunking/sliding-window-chunking/SKILL.md)
24
+ - *Sliding window chunking creates overlapping chunks where each chunk shares content with adjacent chunks.*
25
+
26
+ ### Data Type Handling
27
+
28
+ - [RAG for Code Documentation](skills/data-type-handling/rag-for-code-documentation/SKILL.md)
29
+ - *RAG for code documentation requires specialized handling due to code's structured nature, syntax-specific patterns, and the importance of preserving function signatures, imports, and contextual relationships.*
30
+ - [RAG for Multimodal Content](skills/data-type-handling/rag-for-multimodal-content/SKILL.md)
31
+ - *Multimodal RAG extends retrieval to include images, videos, audio, and mixed media content alongside text.*
32
+ - [Data Type Handling](skills/data-type-handling/SKILL.md)
33
+ - *Use this parent skill when source material is not plain prose or when different data types need different parsing, metadata, chunking, and retrieval strategies.*
34
+
35
+ ### Performance Optimization
36
+
37
+ - [Optimize Retrieval Latency](skills/performance-optimization/optimize-retrieval-latency/SKILL.md)
38
+ - *Optimizing RAG retrieval latency is critical for production applications where user experience depends on fast response times.*
39
+ - [Performance Optimization](skills/performance-optimization/SKILL.md)
40
+ - *Use this parent skill when the RAG system works functionally but is too slow, expensive, or unstable under expected traffic.*
41
+
42
+ ### Retrieval Strategies
43
+
44
+ - [Adaptive Retrieval](skills/retrieval-strategies/adaptive-retrieval/SKILL.md)
45
+ - *Adaptive retrieval classifies queries into types (factual, analytical, opinion, contextual) and applies different retrieval strategies optimized for each type.*
46
+ - [Context Enrichment Window](skills/retrieval-strategies/context-enrichment-window/SKILL.md)
47
+ - *Context enrichment window expands retrieved chunks by including neighboring text from the original document.*
48
+ - [CRAG - Corrective RAG](skills/retrieval-strategies/crag-corrective-rag/SKILL.md)
49
+ - *Corrective RAG (CRAG) extends standard retrieval by dynamically evaluating document relevance and correcting the retrieval process when needed.*
50
+ - [Explainable Retrieval with Citations](skills/retrieval-strategies/explainable-retrieval/SKILL.md)
51
+ - *Explainable retrieval adds citations, source attribution, and traceability to RAG systems.*
52
+ - [Graph RAG - Knowledge Graph Retrieval](skills/retrieval-strategies/graph-rag/SKILL.md)
53
+ - *Graph RAG enhances traditional retrieval by constructing knowledge graphs from documents, identifying communities of related entities, and using these structures to improve retrieval.*
54
+ - [Hybrid Search: BM25 + Dense](skills/retrieval-strategies/hybrid-search-bm25-dense/SKILL.md)
55
+ - *Hybrid search combines BM25 (keyword search) with dense vector embeddings (semantic search) to leverage both exact term matching and semantic understanding.*
56
+ - [HyDE - Hypothetical Document Embeddings](skills/retrieval-strategies/hyde-hypothetical-document-embeddings/SKILL.md)
57
+ - *HyDE (Hypothetical Document Embeddings) is a query expansion technique that generates a hypothetical document answering the user's query, then uses this synthetic document as the query for vector search.*
58
+ - [HyPE - Hypothetical Prompt Embeddings](skills/retrieval-strategies/hype-hypothetical-prompt-embeddings/SKILL.md)
59
+ - *HyPE (Hypothetical Prompt Embeddings) transforms retrieval from query-document matching to question-question matching by generating multiple hypothetical questions for each document chunk during the indexing phase.*
60
+ - [Multi-Pass Retrieval with Reranking](skills/retrieval-strategies/multi-pass-retrieval-with-reranking/SKILL.md)
61
+ - *Multi-pass retrieval with reranking is a two-stage approach that first retrieves a broad set of candidates using fast bi-encoder search, then refines them using a more accurate but slower cross-encoder reranker.*
62
+ - [Query Transformation Strategies](skills/retrieval-strategies/query-transformation-strategies/SKILL.md)
63
+ - *Query transformation strategies modify or expand user queries before retrieval to bridge the gap between natural language queries and document representations.*
64
+ - [RAPTOR - Hierarchical Abstractive Retrieval](skills/retrieval-strategies/raptor-hierarchical-retrieval/SKILL.md)
65
+ - *RAPTOR (Recursive Abstractive Processing and Tree-Organized Retrieval) creates a hierarchical tree of document summaries, allowing retrieval at multiple levels of abstraction.*
66
+ - [Self-RAG - Self-Reflective Retrieval](skills/retrieval-strategies/self-rag/SKILL.md)
67
+ - *Self-RAG is a reflective framework that decides whether to retrieve information, evaluates the relevance of retrieved documents, assesses response support, and rates output utility.*
68
+ - [Retrieval Strategies](skills/retrieval-strategies/SKILL.md)
69
+ - *Use this parent skill when the main RAG problem is search quality, ranking, recall, context selection, or evidence traceability.*
70
+
71
+ ### Vector Databases
72
+
73
+ - [Choosing Vector Database by Data Type](skills/vector-databases/choosing-vector-db-by-datatype/SKILL.md)
74
+ - *Selecting the right vector database depends heavily on your data type (text, images, code, multimodal) and use case requirements.*
75
+ - [Qdrant for Production RAG](skills/vector-databases/qdrant-for-production-rag/SKILL.md)
76
+ - *Productionizing a RAG system with Qdrant requires considerations beyond basic setup: horizontal scaling, high availability, performance optimization, monitoring, and cost management.*
77
+ - [Qdrant Setup for RAG](skills/vector-databases/qdrant-setup-rag/SKILL.md)
78
+ - *Qdrant is an open-source vector similarity search engine designed for high-performance RAG applications.*
79
+ - [Vector Databases](skills/vector-databases/SKILL.md)
80
+ - *Use this parent skill when the main RAG problem is choosing, configuring, or operating the vector storage layer.*
81
+
82
+ ## All Skills
83
+
84
+ | Title | Category | Tags |
85
+ |-------|----------|------|
86
+ | [Adaptive Retrieval](skills/retrieval-strategies/adaptive-retrieval/SKILL.md) | retrieval-strategies | adaptive, query-classification, dynamic-strategy (+1) |
87
+ | [CRAG - Corrective RAG](skills/retrieval-strategies/crag-corrective-rag/SKILL.md) | retrieval-strategies | crag, corrective, web-search (+2) |
88
+ | [Choosing Vector Database by Data Type](skills/vector-databases/choosing-vector-db-by-datatype/SKILL.md) | vector-databases | selection, text, multimodal (+2) |
89
+ | [Choosing a Chunking Framework](skills/chunking/choosing-a-chunking-framework/SKILL.md) | chunking | framework-selection, chonkie, langchain (+3) |
90
+ | [Chunking](skills/chunking/SKILL.md) | chunking | chunking, routing, rag (+1) |
91
+ | [Context Enrichment Window](skills/retrieval-strategies/context-enrichment-window/SKILL.md) | retrieval-strategies | context-enrichment, surrounding-context, window (+1) |
92
+ | [Contextual Chunk Headers](skills/chunking/contextual-chunk-headers/SKILL.md) | chunking | contextual-headers, metadata, chunk-enhancement (+1) |
93
+ | [Data Type Handling](skills/data-type-handling/SKILL.md) | data-type-handling | data-types, code, multimodal (+1) |
94
+ | [Explainable Retrieval with Citations](skills/retrieval-strategies/explainable-retrieval/SKILL.md) | retrieval-strategies | explainability, citations, traceability (+1) |
95
+ | [Graph RAG - Knowledge Graph Retrieval](skills/retrieval-strategies/graph-rag/SKILL.md) | retrieval-strategies | graph-rag, knowledge-graph, entity-extraction (+1) |
96
+ | [Hierarchical Chunking](skills/chunking/hierarchical-chunking/SKILL.md) | chunking | nested, multi-level, document-structure (+1) |
97
+ | [HyDE - Hypothetical Document Embeddings](skills/retrieval-strategies/hyde-hypothetical-document-embeddings/SKILL.md) | retrieval-strategies | hyde, query-expansion, llm-generation (+1) |
98
+ | [HyPE - Hypothetical Prompt Embeddings](skills/retrieval-strategies/hype-hypothetical-prompt-embeddings/SKILL.md) | retrieval-strategies | hype, precomputed-queries, indexing-time (+1) |
99
+ | [Hybrid Search: BM25 + Dense](skills/retrieval-strategies/hybrid-search-bm25-dense/SKILL.md) | retrieval-strategies | hybrid, bm25, dense (+2) |
100
+ | [Multi-Pass Retrieval with Reranking](skills/retrieval-strategies/multi-pass-retrieval-with-reranking/SKILL.md) | retrieval-strategies | reranking, cross-encoder, two-stage (+1) |
101
+ | [Optimize Retrieval Latency](skills/performance-optimization/optimize-retrieval-latency/SKILL.md) | performance-optimization | latency, performance, caching (+2) |
102
+ | [Performance Optimization](skills/performance-optimization/SKILL.md) | performance-optimization | latency, performance, caching (+1) |
103
+ | [Qdrant Setup for RAG](skills/vector-databases/qdrant-setup-rag/SKILL.md) | vector-databases | qdrant, setup, ingestion (+1) |
104
+ | [Qdrant for Production RAG](skills/vector-databases/qdrant-for-production-rag/SKILL.md) | vector-databases | production, scaling, optimization (+1) |
105
+ | [Query Transformation Strategies](skills/retrieval-strategies/query-transformation-strategies/SKILL.md) | retrieval-strategies | query-expansion, step-back, sub-query (+1) |
106
+ | [RAG for Code Documentation](skills/data-type-handling/rag-for-code-documentation/SKILL.md) | data-type-handling | code, programming, syntax (+2) |
107
+ | [RAG for Multimodal Content](skills/data-type-handling/rag-for-multimodal-content/SKILL.md) | data-type-handling | multimodal, images, text (+2) |
108
+ | [RAPTOR - Hierarchical Abstractive Retrieval](skills/retrieval-strategies/raptor-hierarchical-retrieval/SKILL.md) | retrieval-strategies | raptor, hierarchical, clustering (+2) |
109
+ | [Retrieval Strategies](skills/retrieval-strategies/SKILL.md) | retrieval-strategies | retrieval, ranking, hybrid-search (+1) |
110
+ | [Self-RAG - Self-Reflective Retrieval](skills/retrieval-strategies/self-rag/SKILL.md) | retrieval-strategies | self-rag, reflection, retrieval-decision (+1) |
111
+ | [Semantic Chunking](skills/chunking/semantic-chunking/SKILL.md) | chunking | semantic, nlp, sentence-boundary (+1) |
112
+ | [Sliding Window Chunking](skills/chunking/sliding-window-chunking/SKILL.md) | chunking | overlap, context-preservation, window (+1) |
113
+ | [Vector Databases](skills/vector-databases/SKILL.md) | vector-databases | vector-database, qdrant, metadata (+1) |
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 rag-skills contributors
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,169 @@
1
+ # Rag-skills
2
+
3
+ <p>
4
+ <code>agent routing</code> <code>RAG skills</code> <code>markdown</code>
5
+ </p>
6
+
7
+ A modular collection of best-practice guides and skill definitions for building Retrieval-Augmented Generation (RAG) systems. Designed for AI coding agents, agent frameworks, and teams that want a structured way to route RAG work to the right strategy.
8
+
9
+ ## Overview
10
+
11
+ RAG-skills consolidates actionable skills that help AI agents and builders improve RAG performance, choose appropriate vector databases, implement effective chunking strategies, optimize retrieval quality, and orchestrate multi-step RAG workflows.
12
+
13
+ ## Installation
14
+
15
+ ### Claude Code
16
+
17
+ Add this repository as a Claude Code plugin marketplace:
18
+
19
+ ```text
20
+ /plugin marketplace add Goodnight77/rag-skills
21
+ ```
22
+
23
+ Then install the RAG skills plugin:
24
+
25
+ ```text
26
+ /plugin install rag-skills@rag-skills
27
+ ```
28
+
29
+ Restart Claude Code after installation.
30
+
31
+ ### Skills CLI
32
+
33
+ Install with the Skills CLI:
34
+
35
+ ```bash
36
+ npx skills add Goodnight77/rag-skills
37
+ ```
38
+
39
+ This installs the root [`SKILL.md`](SKILL.md) plus the native skill tree under
40
+ [`skills/`](skills/). Claude Code can discover category skills such as
41
+ `/chunking` and specific skills such as `/semantic-chunking`.
42
+
43
+ ### Manual Usage
44
+
45
+ You can also clone the repository and reference the Markdown skills directly:
46
+
47
+ ```bash
48
+ git clone https://github.com/Goodnight77/rag-skills.git
49
+ ```
50
+
51
+ Then point your agent or coding assistant to the `skills/` directory.
52
+
53
+ > Note: This repository follows the Claude Code/Qdrant-style structure: category routers live at paths like `skills/chunking/SKILL.md`, and specific skills live at paths like `skills/chunking/semantic-chunking/SKILL.md`.
54
+
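As a rough illustration of the layout described in the note, a path under `skills/` can be classified by its depth: three components for a category router, four for a specific skill. This is a sketch only; the `classify` function and the labels `"category-router"` and `"skill"` are hypothetical and do not come from the package.

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple


def classify(path: str) -> Tuple[str, str, Optional[str]]:
    """Return (kind, category, skill_name) for a SKILL.md path under skills/."""
    parts = PurePosixPath(path).parts
    # skills/<category>/SKILL.md is a category router.
    if len(parts) == 3 and parts[0] == "skills" and parts[2] == "SKILL.md":
        return ("category-router", parts[1], None)
    # skills/<category>/<skill-name>/SKILL.md is a specific skill.
    if len(parts) == 4 and parts[0] == "skills" and parts[3] == "SKILL.md":
        return ("skill", parts[1], parts[2])
    raise ValueError(f"unrecognized skill path: {path}")


print(classify("skills/chunking/SKILL.md"))
print(classify("skills/chunking/semantic-chunking/SKILL.md"))
```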
55
+ ## Skills by Decision Area
56
+
57
+ This repo is organized as a routing layer for RAG work. Agents can use the category and metadata in each skill file to decide which path to follow for a given problem, instead of treating the repo like a generic reference manual.
58
+
59
+ ### Chunking
60
+ Use these when the main problem is how to split source material into retrievable units.
61
+ - [Semantic Chunking](skills/chunking/semantic-chunking/SKILL.md) - Chunk documents based on semantic boundaries
62
+ - [Hierarchical Chunking](skills/chunking/hierarchical-chunking/SKILL.md) - Multi-level chunking for nested structures
63
+ - [Sliding Window Chunking](skills/chunking/sliding-window-chunking/SKILL.md) - Overlap-based chunking for context preservation
64
+ - [Contextual Chunk Headers](skills/chunking/contextual-chunk-headers/SKILL.md) - Adding higher-level context to chunks
65
+
66
+ ### Vector Databases
67
+ Use these when the main problem is choosing or operating the storage layer for embeddings and metadata.
68
+ - [Qdrant Setup for RAG](skills/vector-databases/qdrant-setup-rag/SKILL.md) - Setting up Qdrant for RAG
69
+ - [Qdrant for Production RAG](skills/vector-databases/qdrant-for-production-rag/SKILL.md) - Scaling RAG with Qdrant
70
+ - [Choosing Vector DB by Datatype](skills/vector-databases/choosing-vector-db-by-datatype/SKILL.md) - Database selection guide
71
+
72
+ ### Retrieval Strategies
73
+ Use these when the main problem is search quality, ranking, recall, or combining search methods.
74
+ - [Hybrid Search BM25 Dense](skills/retrieval-strategies/hybrid-search-bm25-dense/SKILL.md) - Combining keyword and semantic search
75
+ - [Multi-Pass Retrieval with Reranking](skills/retrieval-strategies/multi-pass-retrieval-with-reranking/SKILL.md) - Two-pass retrieval with cross-encoder reranking
76
+ - [Query Transformation Strategies](skills/retrieval-strategies/query-transformation-strategies/SKILL.md) - Query rewriting, step-back prompting, sub-query decomposition
77
+ - [HyDE - Hypothetical Document Embeddings](skills/retrieval-strategies/hyde-hypothetical-document-embeddings/SKILL.md) - Query expansion with LLM-generated documents
78
+ - [HyPE - Hypothetical Prompt Embeddings](skills/retrieval-strategies/hype-hypothetical-prompt-embeddings/SKILL.md) - Precomputed question embeddings at indexing time
79
+ - [Self-RAG](skills/retrieval-strategies/self-rag/SKILL.md) - Self-reflective retrieval with relevance evaluation
80
+ - [RAPTOR - Hierarchical Retrieval](skills/retrieval-strategies/raptor-hierarchical-retrieval/SKILL.md) - Multi-level tree of document summaries
81
+ - [Context Enrichment Window](skills/retrieval-strategies/context-enrichment-window/SKILL.md) - Adding surrounding chunks to retrieved results
82
+ - [Adaptive Retrieval](skills/retrieval-strategies/adaptive-retrieval/SKILL.md) - Dynamic strategy selection based on query type
83
+ - [Explainable Retrieval with Citations](skills/retrieval-strategies/explainable-retrieval/SKILL.md) - Traceability and source attribution
84
+ - [CRAG - Corrective RAG](skills/retrieval-strategies/crag-corrective-rag/SKILL.md) - Dynamic correction with web search
85
+ - [Graph RAG](skills/retrieval-strategies/graph-rag/SKILL.md) - Knowledge graph-based retrieval
86
+
87
+ ### Data Type Handling
88
+ Use these when the source content is code, APIs, diagrams, tables, or mixed media.
89
+ - [RAG for Code Documentation](skills/data-type-handling/rag-for-code-documentation/SKILL.md) - Special handling for code and technical docs
90
+ - [RAG for Multimodal Content](skills/data-type-handling/rag-for-multimodal-content/SKILL.md) - Images, tables, and mixed media
91
+
92
+ ### Performance Optimization
93
+ Use these when the problem is latency, throughput, cache behavior, or production efficiency.
94
+ - [Optimize Retrieval Latency](skills/performance-optimization/optimize-retrieval-latency/SKILL.md) - Caching, indexing, and query optimization
95
+
96
+ ### RAG Agents
97
+ Use these when the problem is orchestration, delegation, or multi-step workflows.
98
+ - *See [Examples](#examples) for multi-agent workflows*
99
+
100
+ ### Deployment
101
+ Use these when the problem is production rollout, reliability, or operationalization.
102
+ - *See [Production RAG Setup](examples/production-rag-setup.md)*
103
+
104
+ ### Evaluation Metrics
105
+ Use these when the problem is measurement, regression detection, or retrieval benchmarking.
106
+ - *Coming soon*
107
+
108
+ ## Quick Start
109
+
110
+ ### For AI Agents
111
+
112
+ Read each skill’s frontmatter metadata, then route to the skill that best matches the user’s problem. Treat the repo as a decision tree for RAG tasks: chunking, retrieval, vector store choice, embeddings, performance, and workflow orchestration.
113
+
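The routing step above can be sketched as a keyword lookup. This is a minimal illustration, not the repo's own router: the category directories are the ones listed in this README, while the `ROUTES` table, its keywords, and the `route` function are hypothetical names chosen for the example.

```python
from typing import Optional

# Category directories come from this README; the keywords are
# illustrative assumptions and would need tuning for real use.
ROUTES = {
    "chunking": ("chunk", "split", "overlap"),
    "vector-databases": ("qdrant", "vector db", "vector store"),
    "retrieval-strategies": ("rerank", "hybrid", "recall", "ranking"),
    "data-type-handling": ("code", "image", "table", "multimodal"),
    "performance-optimization": ("latency", "cache", "throughput"),
}


def route(problem: str) -> Optional[str]:
    """Return the category directory whose keywords best match the problem."""
    text = problem.lower()
    # Score each category by how many of its keywords appear in the text.
    scores = {
        category: sum(keyword in text for keyword in keywords)
        for category, keywords in ROUTES.items()
    }
    best = max(scores, key=scores.get)
    # No keyword matched at all: signal that no route applies.
    return f"skills/{best}" if scores[best] > 0 else None
```

An agent would more plausibly match against each skill's frontmatter description than a fixed keyword table; the table only keeps the sketch self-contained.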
114
+ ### For Framework Integration
115
+
116
+ Build a lightweight index from the markdown frontmatter and use it to filter by category, tags, and task type. The goal is not to mirror all content in code, but to point an agent to the right skill or external implementation quickly.
117
+
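The index described above might be built along these lines. This is a sketch under stated assumptions: it handles only flat `key: value` frontmatter (no YAML lists or nesting), and the function names `parse_frontmatter`, `build_index`, and `filter_by` are hypothetical. The frontmatter shown in this diff contains `name` and `description`; fields such as `category` or `tags` are assumed to exist in individual skill files.

```python
from pathlib import Path


def parse_frontmatter(text: str) -> dict:
    """Parse flat `key: value` pairs between the leading `---` fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing fence: treat as missing frontmatter


def build_index(root: str) -> list:
    """Collect one entry per SKILL.md found under `root`."""
    index = []
    for path in sorted(Path(root).rglob("SKILL.md")):
        meta = parse_frontmatter(path.read_text(encoding="utf-8"))
        if meta:
            index.append({"path": path.as_posix(), **meta})
    return index


def filter_by(index: list, field: str, value: str) -> list:
    """Keep entries whose `field` equals `value` (e.g. an assumed `category`)."""
    return [entry for entry in index if entry.get(field) == value]
```

For example, `filter_by(build_index("."), "category", "chunking")` would return the chunking entries, provided the skill files carry a `category` field. A real integration would likely use a YAML parser so that list-valued fields such as tags are read correctly.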
118
+ Keep examples in the repo lightweight and point readers to external implementations instead of embedding long code samples.
119
+
120
+ ## Examples
121
+
122
+ Complete walkthroughs and reference implementations:
123
+
124
+ - [Foundational RAG Pipeline Example](examples/foundational-rag-pipeline.md) - A guided RAG build path for agents and builders
125
+ - [Multi-Agent RAG](examples/multi-agent-rag.md) - An orchestration pattern for specialized agents
126
+ - [Production RAG Setup](examples/production-rag-setup.md) - A deployment-oriented route for production systems
127
+
128
+ ## Contributing
129
+
130
+ We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
131
+
132
+ ### Quick Contribution Steps
133
+
134
+ 1. Fork the repository
135
+ 2. Create a new skill file using [templates/skill-template.md](templates/skill-template.md)
136
+ 3. Ensure your skill follows the required structure
137
+ 4. Run validation: `python scripts/validate-skills.py`
138
+ 5. Submit a pull request
139
+
140
+ ## Skill File Format
141
+
142
+ Each skill follows a consistent structure with a short illustrative snippet, not a full implementation. See the template in [templates/skill-template.md](templates/skill-template.md).
143
+
144
+ ## Scripts
145
+
146
+ - `validate-skills.py` — Validate all skill files for format compliance
147
+ - `generate-index.py` — Generate browsable INDEX.md and SKILLS.json
148
+
149
+ ## Project Status
150
+
151
+ This is an active open-source project. Skills are continuously added and updated as RAG best practices evolve.
152
+
153
+ Current statistics:
154
+ - **Native Skills**: 28
155
+ - **Guide Skills**: 23
156
+ - **Category Router Skills**: 5
157
+ - **Categories**: 5
158
+ - **Examples**: 3
159
+
160
+ *Run `python scripts/generate-index.py` for current statistics.*
161
+
162
+ ## Acknowledgments
163
+
164
+ Built for the RAG community. Special thanks to contributors and the open-source RAG ecosystem.
165
+
166
+
167
+ ## License
168
+
169
+ MIT License — see [LICENSE](LICENSE) for details.
@@ -0,0 +1,92 @@
1
+ ---
2
+ name: rag-skills
3
+ description: Use this skill when building, debugging, or improving Retrieval-Augmented Generation systems, including chunking, vector database selection, hybrid search, reranking, multimodal RAG, code documentation RAG, retrieval latency, and production RAG architecture.
4
+ ---
5
+
6
+ # RAG Skills
7
+
8
+ This skill routes RAG implementation work to the right guide in this repository.
9
+ Use it when a user asks for help designing, implementing, improving, evaluating,
10
+ or operating a Retrieval-Augmented Generation pipeline.
11
+
12
+ ## How to Use This Skill
13
+
14
+ 1. Identify the main RAG problem: chunking, vector storage, retrieval quality,
15
+ data type handling, latency, evaluation, agents, or deployment.
16
+ 2. Open the most relevant guide under `skills/`.
17
+ 3. Follow the guide's decision criteria, implementation notes, references, and
18
+ success metrics.
19
+ 4. Prefer lightweight examples in this repo, then use the linked external
20
+ implementations for production code patterns.
21
+
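Step 2 above can be sketched as a directory listing, assuming the `skills/<category>/<skill>/SKILL.md` layout used by the routes in this file. The helper name `list_guides` is hypothetical.

```python
from pathlib import Path


def list_guides(repo_root: str, category: str) -> list:
    """Return guide paths under skills/<category>/<skill>/SKILL.md, sorted."""
    category_dir = Path(repo_root) / "skills" / category
    if not category_dir.is_dir():
        return []
    # "*/SKILL.md" matches one level down only, so a category-level
    # skills/<category>/SKILL.md router file is not included.
    return sorted(
        path.relative_to(repo_root).as_posix()
        for path in category_dir.glob("*/SKILL.md")
    )
```

Called as `list_guides(".", "chunking")` from the repository root, this would return the chunking guides as repo-relative paths, which an agent can then open in turn.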
22
+ ## Skill Routes
23
+
24
+ ### Chunking
25
+
26
+ - `skills/chunking/semantic-chunking/SKILL.md`: Chunk by semantic boundaries instead
27
+ of fixed token windows.
28
+ - `skills/chunking/hierarchical-chunking/SKILL.md`: Preserve document hierarchy across
29
+ sections, headings, and nested structures.
30
+ - `skills/chunking/sliding-window-chunking/SKILL.md`: Add overlap to preserve context
31
+ near chunk boundaries.
32
+ - `skills/chunking/contextual-chunk-headers/SKILL.md`: Add inherited section context
33
+ to chunks.
34
+ - `skills/chunking/choosing-a-chunking-framework/SKILL.md`: Select chunking libraries
35
+ and frameworks.
36
+
37
+ ### Vector Databases
38
+
39
+ - `skills/vector-databases/qdrant-setup-rag/SKILL.md`: Set up Qdrant for RAG with
40
+ metadata and filtering.
41
+ - `skills/vector-databases/qdrant-for-production-rag/SKILL.md`: Operate Qdrant in
42
+ production RAG systems.
43
+ - `skills/vector-databases/choosing-vector-db-by-datatype/SKILL.md`: Choose a vector
44
+ database for text, code, multimodal, and structured data.
45
+
46
+ ### Retrieval Strategies
47
+
48
+ - `skills/retrieval-strategies/hybrid-search-bm25-dense/SKILL.md`: Combine keyword
49
+ and dense vector retrieval.
50
+ - `skills/retrieval-strategies/multi-pass-retrieval-with-reranking/SKILL.md`: Retrieve
51
+ broadly, then rerank with a stronger model.
52
+ - `skills/retrieval-strategies/query-transformation-strategies/SKILL.md`: Rewrite,
53
+ decompose, or expand queries before retrieval.
54
+ - `skills/retrieval-strategies/hyde-hypothetical-document-embeddings/SKILL.md`: Use
55
+ hypothetical answer documents to improve query embeddings.
56
+ - `skills/retrieval-strategies/hype-hypothetical-prompt-embeddings/SKILL.md`: Index
57
+ likely prompts or questions alongside source content.
58
+ - `skills/retrieval-strategies/self-rag/SKILL.md`: Add self-reflection and retrieval
59
+ validation to generation workflows.
60
+ - `skills/retrieval-strategies/raptor-hierarchical-retrieval/SKILL.md`: Retrieve over
61
+ hierarchical summaries and source chunks.
62
+ - `skills/retrieval-strategies/context-enrichment-window/SKILL.md`: Expand retrieved
63
+ chunks with neighboring context.
64
+ - `skills/retrieval-strategies/adaptive-retrieval/SKILL.md`: Choose retrieval strategy
65
+ dynamically based on query type.
66
+ - `skills/retrieval-strategies/explainable-retrieval/SKILL.md`: Improve traceability
67
+ with source attribution and citations.
68
+ - `skills/retrieval-strategies/crag-corrective-rag/SKILL.md`: Correct weak retrieval
69
+ with validation and fallback search.
70
+ - `skills/retrieval-strategies/graph-rag/SKILL.md`: Use graph structure and entity
71
+ relationships for retrieval.
72
+
73
+ ### Data Type Handling
74
+
75
+ - `skills/data-type-handling/rag-for-code-documentation/SKILL.md`: Handle code,
76
+ APIs, examples, and technical documentation.
77
+ - `skills/data-type-handling/rag-for-multimodal-content/SKILL.md`: Handle images,
78
+ tables, diagrams, and mixed media.
79
+
80
+ ### Performance Optimization
81
+
82
+ - `skills/performance-optimization/optimize-retrieval-latency/SKILL.md`: Reduce
83
+ retrieval latency with indexing, caching, and query optimization.
84
+
85
+ ## Success Criteria
86
+
87
+ - The selected RAG pattern matches the user's actual bottleneck.
88
+ - Retrieval quality improves without adding unnecessary architecture.
89
+ - The implementation keeps metadata, evaluation, and production constraints in
90
+ view from the start.
91
+ - External references are used for real implementation details instead of
92
+ copying large code blocks into this skill.