qualia-framework 2.1.0

Files changed (261)
  1. package/README.md +50 -0
  2. package/bin/cli.js +519 -0
  3. package/framework/agents/architecture-strategist.md +53 -0
  4. package/framework/agents/backend-agent.md +150 -0
  5. package/framework/agents/code-simplicity-reviewer.md +86 -0
  6. package/framework/agents/frontend-agent.md +111 -0
  7. package/framework/agents/kieran-typescript-reviewer.md +96 -0
  8. package/framework/agents/performance-oracle.md +111 -0
  9. package/framework/agents/qualia-codebase-mapper.md +760 -0
  10. package/framework/agents/qualia-debugger.md +1203 -0
  11. package/framework/agents/qualia-executor.md +881 -0
  12. package/framework/agents/qualia-integration-checker.md +423 -0
  13. package/framework/agents/qualia-phase-researcher.md +453 -0
  14. package/framework/agents/qualia-plan-checker.md +699 -0
  15. package/framework/agents/qualia-planner.md +1241 -0
  16. package/framework/agents/qualia-project-researcher.md +602 -0
  17. package/framework/agents/qualia-research-synthesizer.md +236 -0
  18. package/framework/agents/qualia-roadmapper.md +605 -0
  19. package/framework/agents/qualia-verifier.md +685 -0
  20. package/framework/agents/team-orchestrator.md +228 -0
  21. package/framework/agents/teams/full-stack-team.md +48 -0
  22. package/framework/agents/teams/optimize-team.md +53 -0
  23. package/framework/agents/teams/review-team.md +62 -0
  24. package/framework/agents/teams/ship-team.md +86 -0
  25. package/framework/agents/test-agent.md +182 -0
  26. package/framework/askpass.sh +2 -0
  27. package/framework/commands/design.md +53 -0
  28. package/framework/commands/quick-db.md +22 -0
  29. package/framework/config/retention.json +35 -0
  30. package/framework/core/PRINCIPLES.md +77 -0
  31. package/framework/hooks/auto-format.sh +45 -0
  32. package/framework/hooks/block-env-edit.sh +42 -0
  33. package/framework/hooks/branch-guard.sh +46 -0
  34. package/framework/hooks/confirm-delete.sh +56 -0
  35. package/framework/hooks/migration-validate.sh +68 -0
  36. package/framework/hooks/notification-speak.sh +15 -0
  37. package/framework/hooks/pre-commit.sh +80 -0
  38. package/framework/hooks/pre-compact.sh +55 -0
  39. package/framework/hooks/pre-deploy-gate.sh +151 -0
  40. package/framework/hooks/qualia-colors.sh +32 -0
  41. package/framework/hooks/retention-cleanup.sh +43 -0
  42. package/framework/hooks/save-session-state.sh +153 -0
  43. package/framework/hooks/session-context-loader.sh +28 -0
  44. package/framework/hooks/session-learn.sh +30 -0
  45. package/framework/knowledge/claudecode-bible.md +1384 -0
  46. package/framework/knowledge/client-prefs.md +22 -0
  47. package/framework/knowledge/common-fixes.md +25 -0
  48. package/framework/knowledge/deployment-map.md +35 -0
  49. package/framework/knowledge/email-signature.html +1 -0
  50. package/framework/knowledge/employees.md +8 -0
  51. package/framework/knowledge/learned-patterns.md +51 -0
  52. package/framework/knowledge/optimization-research-2026.md +137 -0
  53. package/framework/knowledge/qualia-context.md +67 -0
  54. package/framework/knowledge/supabase-patterns.md +50 -0
  55. package/framework/knowledge/voice-agent-patterns.md +46 -0
  56. package/framework/qualia-engine/VERSION +1 -0
  57. package/framework/qualia-engine/bin/qualia-tools.js +2160 -0
  58. package/framework/qualia-engine/bin/qualia-tools.test.js +1054 -0
  59. package/framework/qualia-engine/references/checkpoints.md +775 -0
  60. package/framework/qualia-engine/references/continuation-format.md +249 -0
  61. package/framework/qualia-engine/references/decimal-phase-calculation.md +65 -0
  62. package/framework/qualia-engine/references/design-quality.md +56 -0
  63. package/framework/qualia-engine/references/git-integration.md +254 -0
  64. package/framework/qualia-engine/references/git-planning-commit.md +50 -0
  65. package/framework/qualia-engine/references/model-profile-resolution.md +32 -0
  66. package/framework/qualia-engine/references/model-profiles.md +73 -0
  67. package/framework/qualia-engine/references/phase-argument-parsing.md +61 -0
  68. package/framework/qualia-engine/references/planning-config.md +195 -0
  69. package/framework/qualia-engine/references/questioning.md +141 -0
  70. package/framework/qualia-engine/references/tdd.md +263 -0
  71. package/framework/qualia-engine/references/ui-brand.md +160 -0
  72. package/framework/qualia-engine/references/verification-patterns.md +612 -0
  73. package/framework/qualia-engine/templates/DEBUG.md +159 -0
  74. package/framework/qualia-engine/templates/DESIGN.md +81 -0
  75. package/framework/qualia-engine/templates/UAT.md +247 -0
  76. package/framework/qualia-engine/templates/codebase/architecture.md +255 -0
  77. package/framework/qualia-engine/templates/codebase/concerns.md +310 -0
  78. package/framework/qualia-engine/templates/codebase/conventions.md +307 -0
  79. package/framework/qualia-engine/templates/codebase/integrations.md +280 -0
  80. package/framework/qualia-engine/templates/codebase/stack.md +186 -0
  81. package/framework/qualia-engine/templates/codebase/structure.md +285 -0
  82. package/framework/qualia-engine/templates/codebase/testing.md +480 -0
  83. package/framework/qualia-engine/templates/config.json +35 -0
  84. package/framework/qualia-engine/templates/context.md +283 -0
  85. package/framework/qualia-engine/templates/continue-here.md +78 -0
  86. package/framework/qualia-engine/templates/debug-subagent-prompt.md +91 -0
  87. package/framework/qualia-engine/templates/discovery.md +146 -0
  88. package/framework/qualia-engine/templates/milestone-archive.md +123 -0
  89. package/framework/qualia-engine/templates/milestone.md +115 -0
  90. package/framework/qualia-engine/templates/phase-prompt.md +567 -0
  91. package/framework/qualia-engine/templates/planner-subagent-prompt.md +117 -0
  92. package/framework/qualia-engine/templates/project.md +184 -0
  93. package/framework/qualia-engine/templates/projects/ai-agent.md +156 -0
  94. package/framework/qualia-engine/templates/projects/mobile-app.md +181 -0
  95. package/framework/qualia-engine/templates/projects/voice-agent.md +134 -0
  96. package/framework/qualia-engine/templates/projects/website.md +137 -0
  97. package/framework/qualia-engine/templates/requirements.md +231 -0
  98. package/framework/qualia-engine/templates/research-project/ARCHITECTURE.md +204 -0
  99. package/framework/qualia-engine/templates/research-project/FEATURES.md +147 -0
  100. package/framework/qualia-engine/templates/research-project/PITFALLS.md +200 -0
  101. package/framework/qualia-engine/templates/research-project/STACK.md +120 -0
  102. package/framework/qualia-engine/templates/research-project/SUMMARY.md +170 -0
  103. package/framework/qualia-engine/templates/research.md +552 -0
  104. package/framework/qualia-engine/templates/roadmap.md +202 -0
  105. package/framework/qualia-engine/templates/state.md +176 -0
  106. package/framework/qualia-engine/templates/summary-complex.md +59 -0
  107. package/framework/qualia-engine/templates/summary-minimal.md +41 -0
  108. package/framework/qualia-engine/templates/summary-standard.md +48 -0
  109. package/framework/qualia-engine/templates/summary.md +246 -0
  110. package/framework/qualia-engine/templates/user-setup.md +311 -0
  111. package/framework/qualia-engine/templates/verification-report.md +322 -0
  112. package/framework/qualia-engine/workflows/add-phase.md +179 -0
  113. package/framework/qualia-engine/workflows/add-todo.md +157 -0
  114. package/framework/qualia-engine/workflows/audit-milestone.md +241 -0
  115. package/framework/qualia-engine/workflows/check-todos.md +176 -0
  116. package/framework/qualia-engine/workflows/complete-milestone.md +858 -0
  117. package/framework/qualia-engine/workflows/diagnose-issues.md +219 -0
  118. package/framework/qualia-engine/workflows/discovery-phase.md +289 -0
  119. package/framework/qualia-engine/workflows/discuss-phase.md +534 -0
  120. package/framework/qualia-engine/workflows/execute-phase.md +559 -0
  121. package/framework/qualia-engine/workflows/execute-plan.md +438 -0
  122. package/framework/qualia-engine/workflows/help.md +470 -0
  123. package/framework/qualia-engine/workflows/insert-phase.md +220 -0
  124. package/framework/qualia-engine/workflows/list-phase-assumptions.md +178 -0
  125. package/framework/qualia-engine/workflows/map-codebase.md +327 -0
  126. package/framework/qualia-engine/workflows/new-milestone.md +363 -0
  127. package/framework/qualia-engine/workflows/new-project.md +1037 -0
  128. package/framework/qualia-engine/workflows/pause-work.md +122 -0
  129. package/framework/qualia-engine/workflows/plan-milestone-gaps.md +256 -0
  130. package/framework/qualia-engine/workflows/plan-phase.md +422 -0
  131. package/framework/qualia-engine/workflows/progress.md +354 -0
  132. package/framework/qualia-engine/workflows/quick.md +252 -0
  133. package/framework/qualia-engine/workflows/remove-phase.md +326 -0
  134. package/framework/qualia-engine/workflows/research-phase.md +74 -0
  135. package/framework/qualia-engine/workflows/resume-project.md +306 -0
  136. package/framework/qualia-engine/workflows/set-profile.md +80 -0
  137. package/framework/qualia-engine/workflows/settings.md +145 -0
  138. package/framework/qualia-engine/workflows/transition.md +556 -0
  139. package/framework/qualia-engine/workflows/update.md +197 -0
  140. package/framework/qualia-engine/workflows/verify-phase.md +195 -0
  141. package/framework/qualia-engine/workflows/verify-work.md +625 -0
  142. package/framework/rules/context7.md +11 -0
  143. package/framework/rules/deployment.md +29 -0
  144. package/framework/rules/frontend.md +33 -0
  145. package/framework/rules/security.md +12 -0
  146. package/framework/rules/speed.md +20 -0
  147. package/framework/scripts/__pycache__/say.cpython-314.pyc +0 -0
  148. package/framework/scripts/apply-retention.sh +120 -0
  149. package/framework/scripts/bootstrap-pop-os.sh +354 -0
  150. package/framework/scripts/claude-voice +13 -0
  151. package/framework/scripts/cleanup.sh +131 -0
  152. package/framework/scripts/cowork-mode.sh +141 -0
  153. package/framework/scripts/generate-project-claude-md.sh +153 -0
  154. package/framework/scripts/load-test-webhook.js +172 -0
  155. package/framework/scripts/say.py +236 -0
  156. package/framework/scripts/showcase-video-recorder/ffmpeg-builder.js +167 -0
  157. package/framework/scripts/showcase-video-recorder/playwright-helpers.js +216 -0
  158. package/framework/scripts/speak.py +55 -0
  159. package/framework/scripts/speak.sh +18 -0
  160. package/framework/scripts/status.sh +138 -0
  161. package/framework/scripts/sync-to-framework.sh +65 -0
  162. package/framework/scripts/voice-hotkey.py +227 -0
  163. package/framework/scripts/voice-input.sh +51 -0
  164. package/framework/skills/animate/SKILL.md +202 -0
  165. package/framework/skills/bolder/SKILL.md +144 -0
  166. package/framework/skills/browser-qa/SKILL.md +536 -0
  167. package/framework/skills/clarify/SKILL.md +179 -0
  168. package/framework/skills/colorize/SKILL.md +170 -0
  169. package/framework/skills/critique/SKILL.md +126 -0
  170. package/framework/skills/deep-research/SKILL.md +271 -0
  171. package/framework/skills/delight/SKILL.md +329 -0
  172. package/framework/skills/deploy/SKILL.md +261 -0
  173. package/framework/skills/deploy-verify/SKILL.md +377 -0
  174. package/framework/skills/deploy-verify/scripts/canary-check.sh +206 -0
  175. package/framework/skills/deploy-verify/scripts/check-console-errors.js +147 -0
  176. package/framework/skills/deploy-verify/scripts/check-cwv.js +139 -0
  177. package/framework/skills/deploy-verify/scripts/project-detect.sh +84 -0
  178. package/framework/skills/deploy-verify/scripts/verify.sh +548 -0
  179. package/framework/skills/design-quieter/SKILL.md +130 -0
  180. package/framework/skills/distill/SKILL.md +149 -0
  181. package/framework/skills/docs-lookup/SKILL.md +78 -0
  182. package/framework/skills/fcm-notifications/SKILL.md +125 -0
  183. package/framework/skills/financial-ledger/SKILL.md +1039 -0
  184. package/framework/skills/frontend-master/NOTICE.md +4 -0
  185. package/framework/skills/frontend-master/SKILL.md +127 -0
  186. package/framework/skills/frontend-master/reference/color-and-contrast.md +132 -0
  187. package/framework/skills/frontend-master/reference/interaction-design.md +123 -0
  188. package/framework/skills/frontend-master/reference/motion-design.md +99 -0
  189. package/framework/skills/frontend-master/reference/responsive-design.md +114 -0
  190. package/framework/skills/frontend-master/reference/spatial-design.md +100 -0
  191. package/framework/skills/frontend-master/reference/typography.md +131 -0
  192. package/framework/skills/frontend-master/reference/ux-writing.md +107 -0
  193. package/framework/skills/harden/SKILL.md +357 -0
  194. package/framework/skills/i18n-rtl/SKILL.md +752 -0
  195. package/framework/skills/learn/SKILL.md +71 -0
  196. package/framework/skills/memory/SKILL.md +50 -0
  197. package/framework/skills/mobile-expo/SKILL.md +864 -0
  198. package/framework/skills/mobile-expo/references/store-checklist.md +550 -0
  199. package/framework/skills/nestjs-backend/README.md +73 -0
  200. package/framework/skills/nestjs-backend/SKILL.md +446 -0
  201. package/framework/skills/nestjs-backend/references/templates.md +1173 -0
  202. package/framework/skills/normalize/SKILL.md +79 -0
  203. package/framework/skills/onboard/SKILL.md +242 -0
  204. package/framework/skills/polish/SKILL.md +209 -0
  205. package/framework/skills/pr/SKILL.md +66 -0
  206. package/framework/skills/qualia/SKILL.md +153 -0
  207. package/framework/skills/qualia-add-todo/SKILL.md +68 -0
  208. package/framework/skills/qualia-audit-milestone/SKILL.md +92 -0
  209. package/framework/skills/qualia-check-todos/SKILL.md +55 -0
  210. package/framework/skills/qualia-complete-milestone/SKILL.md +108 -0
  211. package/framework/skills/qualia-debug/SKILL.md +149 -0
  212. package/framework/skills/qualia-design/SKILL.md +203 -0
  213. package/framework/skills/qualia-discuss-phase/SKILL.md +72 -0
  214. package/framework/skills/qualia-execute-phase/SKILL.md +86 -0
  215. package/framework/skills/qualia-help/SKILL.md +67 -0
  216. package/framework/skills/qualia-idk/SKILL.md +352 -0
  217. package/framework/skills/qualia-list-phase-assumptions/SKILL.md +67 -0
  218. package/framework/skills/qualia-new-milestone/SKILL.md +72 -0
  219. package/framework/skills/qualia-new-project/SKILL.md +92 -0
  220. package/framework/skills/qualia-optimize/SKILL.md +417 -0
  221. package/framework/skills/qualia-pause-work/SKILL.md +96 -0
  222. package/framework/skills/qualia-plan-milestone-gaps/SKILL.md +57 -0
  223. package/framework/skills/qualia-plan-phase/SKILL.md +101 -0
  224. package/framework/skills/qualia-progress/SKILL.md +53 -0
  225. package/framework/skills/qualia-quick/SKILL.md +89 -0
  226. package/framework/skills/qualia-research-phase/SKILL.md +88 -0
  227. package/framework/skills/qualia-resume-work/SKILL.md +62 -0
  228. package/framework/skills/qualia-review/SKILL.md +263 -0
  229. package/framework/skills/qualia-start/SKILL.md +182 -0
  230. package/framework/skills/qualia-verify-work/SKILL.md +105 -0
  231. package/framework/skills/qualia-workflow/SKILL.md +130 -0
  232. package/framework/skills/rag/SKILL.md +750 -0
  233. package/framework/skills/responsive/SKILL.md +231 -0
  234. package/framework/skills/retro/SKILL.md +284 -0
  235. package/framework/skills/sakani-conventions/SKILL.md +136 -0
  236. package/framework/skills/sakani-conventions/evals/evals.json +23 -0
  237. package/framework/skills/sakani-conventions/references/entities.md +365 -0
  238. package/framework/skills/sakani-conventions/references/error-codes.md +95 -0
  239. package/framework/skills/seo-master/SKILL.md +490 -0
  240. package/framework/skills/seo-master/references/checklist.md +199 -0
  241. package/framework/skills/seo-master/references/structured-data.md +609 -0
  242. package/framework/skills/ship/SKILL.md +202 -0
  243. package/framework/skills/stack-researcher/SKILL.md +215 -0
  244. package/framework/skills/status/SKILL.md +154 -0
  245. package/framework/skills/status/scripts/health-check.sh +562 -0
  246. package/framework/skills/subscription-payments/SKILL.md +250 -0
  247. package/framework/skills/supabase/SKILL.md +973 -0
  248. package/framework/skills/supabase/references/templates.md +159 -0
  249. package/framework/skills/team/SKILL.md +67 -0
  250. package/framework/skills/test-runner/SKILL.md +202 -0
  251. package/framework/skills/voice-agent/SKILL.md +407 -0
  252. package/framework/skills/zoho-workflow/SKILL.md +51 -0
  253. package/framework/statusline-command.sh +117 -0
  254. package/package.json +24 -0
  255. package/profiles/fawzi.json +16 -0
  256. package/profiles/hasan.json +16 -0
  257. package/profiles/moayad.json +16 -0
  258. package/templates/CLAUDE-owner.md +52 -0
  259. package/templates/CLAUDE.md.hbs +58 -0
  260. package/templates/env.claude.template +12 -0
  261. package/templates/settings.json +141 -0
@@ -0,0 +1,750 @@
1
+ ---
2
+ name: rag
3
+ description: Build production RAG (Retrieval Augmented Generation) systems — Supabase pgvector setup, document chunking, embedding pipelines, retrieval + reranking, Claude generation with context injection. Full Next.js API route wiring.
4
+ tags: [rag, embeddings, pgvector, supabase, claude, ai, vector-search]
5
+ ---
6
+
7
+ # RAG Builder
8
+
9
+ Build production-grade RAG systems on your stack: Supabase pgvector + Claude API + Next.js.
10
+
11
+ **Announce at start:** "Activating RAG builder. Let me set up your retrieval-augmented generation pipeline."
12
+
13
+ ## Phase 1: Database Setup (Supabase pgvector)
14
+
15
+ ### Enable pgvector Extension
16
+
17
+ ```sql
18
+ -- Migration: supabase/migrations/YYYYMMDD_enable_pgvector.sql
19
+ CREATE EXTENSION IF NOT EXISTS vector WITH SCHEMA extensions;
20
+ ```
21
+
22
+ ### Documents Table
23
+
24
+ ```sql
25
+ CREATE TABLE documents (
26
+ id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
27
+ title TEXT NOT NULL,
28
+ source_url TEXT,
29
+ source_type TEXT NOT NULL DEFAULT 'manual', -- 'manual', 'web', 'pdf', 'api'
30
+ metadata JSONB DEFAULT '{}',
31
+ created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
32
+ updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
33
+ );
34
+
35
+ ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
36
+ ```
37
+
38
+ ### Chunks Table (with embeddings)
39
+
40
+ ```sql
41
+ CREATE TABLE document_chunks (
42
+ id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
43
+ document_id UUID NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
44
+ content TEXT NOT NULL,
45
+ chunk_index INT NOT NULL,
46
+ token_count INT,
47
+ embedding vector(1024), -- Voyage 4-lite default (adjust per provider, see Phase 3)
48
+ metadata JSONB DEFAULT '{}',
49
+ created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
50
+
51
+ UNIQUE(document_id, chunk_index)
52
+ );
53
+
54
+ ALTER TABLE document_chunks ENABLE ROW LEVEL SECURITY;
55
+
56
+ -- HNSW index for fast similarity search (better than ivfflat for < 1M rows)
57
+ CREATE INDEX idx_chunks_embedding ON document_chunks
58
+ USING hnsw (embedding vector_cosine_ops)
59
+ WITH (m = 16, ef_construction = 64);
60
+
61
+ -- For filtering by document
62
+ CREATE INDEX idx_chunks_document ON document_chunks(document_id);
63
+ ```
64
+
65
+ ### Match Function (RPC)
66
+
67
+ ```sql
68
+ CREATE OR REPLACE FUNCTION match_documents(
69
+ query_embedding vector(1024),
70
+ match_threshold FLOAT DEFAULT 0.7,
71
+ match_count INT DEFAULT 5,
72
+ filter_metadata JSONB DEFAULT '{}'
73
+ )
74
+ RETURNS TABLE (
75
+ id UUID,
76
+ document_id UUID,
77
+ content TEXT,
78
+ metadata JSONB,
79
+ similarity FLOAT
80
+ )
81
+ LANGUAGE plpgsql
82
+ AS $$
83
+ BEGIN
84
+ RETURN QUERY
85
+ SELECT
86
+ dc.id,
87
+ dc.document_id,
88
+ dc.content,
89
+ dc.metadata,
90
+ 1 - (dc.embedding <=> query_embedding) AS similarity
91
+ FROM document_chunks dc
92
+ WHERE 1 - (dc.embedding <=> query_embedding) > match_threshold
93
+ AND (filter_metadata = '{}' OR dc.metadata @> filter_metadata)
94
+ ORDER BY dc.embedding <=> query_embedding
95
+ LIMIT match_count;
96
+ END;
97
+ $$;
98
+ ```
99
+
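As a reading aid for `match_documents`: pgvector's `<=>` operator returns cosine distance, so `1 - (a <=> b)` is cosine similarity. The sketch below reproduces that arithmetic in TypeScript so a threshold such as `0.7` can be sanity-checked locally. `cosineSimilarity` and `passesThreshold` are hypothetical helper names, not part of the skill, and returning `0` for zero vectors is a convention chosen for this sketch only.

```typescript
// Cosine similarity between two equal-length vectors.
// pgvector's `<=>` operator yields cosine distance, i.e. 1 minus this value.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Convention for this sketch only: a zero vector is similar to nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Mirrors the WHERE clause of match_documents: similarity > match_threshold.
export function passesThreshold(
  a: number[],
  b: number[],
  matchThreshold = 0.7
): boolean {
  return cosineSimilarity(a, b) > matchThreshold;
}
```

Two vectors 45 degrees apart have a similarity of about 0.707, so they only just clear the default threshold. That is a useful reminder that `0.7` is fairly strict and may need tuning per embedding model.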
100
+ ### Multi-tenant variant (add user_id or org_id scoping)
101
+
102
+ ```sql
103
+ -- Add user_id to documents for RLS
104
+ ALTER TABLE documents ADD COLUMN user_id UUID REFERENCES auth.users(id);
105
+
106
+ CREATE POLICY "users_read_own_docs" ON documents
107
+ FOR SELECT USING (user_id = auth.uid());
108
+
109
+ CREATE POLICY "users_read_own_chunks" ON document_chunks
110
+ FOR SELECT USING (
111
+ EXISTS (
112
+ SELECT 1 FROM documents d
113
+ WHERE d.id = document_chunks.document_id
114
+ AND d.user_id = auth.uid()
115
+ )
116
+ );
117
+ ```
118
+
119
+ ## Phase 2: Document Chunking
120
+
121
+ ### Chunking Strategy
122
+
123
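+ Use **recursive character splitting** with overlap. It is a solid general-purpose default: it tries paragraph breaks first, then line breaks, sentences, and words, and only falls back to hard character cuts when nothing else fits.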
124
+
125
+ ```typescript
126
+ // lib/rag/chunker.ts
127
+
128
+ interface ChunkOptions {
129
+ maxTokens?: number; // default 512
130
+ overlapTokens?: number; // default 50
131
+ separators?: string[];
132
+ }
133
+
134
+ const DEFAULT_SEPARATORS = ['\n\n', '\n', '. ', ', ', ' ', ''];
135
+
136
+ export function chunkText(
137
+ text: string,
138
+ options: ChunkOptions = {}
139
+ ): string[] {
140
+ const {
141
+ maxTokens = 512,
142
+ overlapTokens = 50,
143
+ separators = DEFAULT_SEPARATORS,
144
+ } = options;
145
+
146
+ // Rough token estimate: 1 token ~ 4 chars
147
+ const maxChars = maxTokens * 4;
148
+ const overlapChars = overlapTokens * 4;
149
+
150
+ return recursiveSplit(text, maxChars, overlapChars, separators);
151
+ }
152
+
153
+ function recursiveSplit(
154
+ text: string,
155
+ maxChars: number,
156
+ overlapChars: number,
157
+ separators: string[]
158
+ ): string[] {
159
+ if (text.length <= maxChars) return [text.trim()].filter(Boolean);
160
+
161
+ const sep = separators.find(s => text.includes(s)) ?? '';
162
+ const parts = text.split(sep);
163
+ const chunks: string[] = [];
164
+ let current = '';
165
+
166
+ for (const part of parts) {
167
+ const candidate = current ? current + sep + part : part;
168
+ if (candidate.length > maxChars && current) {
169
+ chunks.push(current.trim());
170
+ // Keep overlap from end of previous chunk
171
+ const overlapText = current.slice(-overlapChars);
172
+ current = overlapText + sep + part;
173
+ } else {
174
+ current = candidate;
175
+ }
176
+ }
177
+ if (current.trim()) chunks.push(current.trim());
178
+
179
+ // Recursively split any chunks that are still too large
180
+ return chunks.flatMap(chunk =>
181
+ chunk.length > maxChars
182
+ ? recursiveSplit(chunk, maxChars, overlapChars, separators.slice(separators.indexOf(sep) + 1))
183
+ : [chunk]
184
+ );
185
+ }
186
+
187
+ // Estimate token count (use tiktoken for exact counts if needed)
188
+ export function estimateTokens(text: string): number {
189
+ return Math.ceil(text.length / 4);
190
+ }
191
+ ```
192
+
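To see what the overlap setting does in isolation, here is a deliberately simpler technique: a fixed-size sliding window, not the recursive splitter above. `slidingWindowChunks` is a hypothetical helper introduced only for illustration. It ignores separators entirely and works in characters, using the same rough 4-characters-per-token assumption.

```typescript
// Fixed-size sliding window: each chunk starts (maxChars - overlapChars)
// characters after the previous one, so neighbours share overlapChars chars.
export function slidingWindowChunks(
  text: string,
  maxChars: number,
  overlapChars: number
): string[] {
  if (maxChars <= 0) throw new Error('maxChars must be positive');
  if (overlapChars < 0 || overlapChars >= maxChars) {
    throw new Error('overlapChars must be in [0, maxChars)');
  }
  const step = maxChars - overlapChars;
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxChars));
    // Stop once the current window already reaches the end of the text.
    if (start + maxChars >= text.length) break;
  }
  return chunks;
}

// 'abcdefghij' with a window of 4 and an overlap of 1
const demoChunks = slidingWindowChunks('abcdefghij', 4, 1);
// demoChunks: ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of its predecessor, which is the same idea as `overlapTokens` above. The guard on `overlapChars >= maxChars` matters because an overlap that large would keep the window from ever advancing.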
193
+ ### Specialized chunkers
194
+
195
+ ```typescript
196
+ // Markdown-aware chunking (respects headings)
197
+ export function chunkMarkdown(markdown: string, maxTokens = 512): string[] {
198
+ const sections = markdown.split(/(?=^#{1,3} )/m);
199
+ return sections.flatMap(section =>
200
+ chunkText(section, { maxTokens, separators: ['\n\n', '\n', '. ', ' ', ''] })
201
+ );
202
+ }
203
+
204
+ // Code-aware chunking (respects function boundaries)
205
+ export function chunkCode(code: string, maxTokens = 512): string[] {
206
+ const functionPattern = /(?=(?:export\s+)?(?:async\s+)?(?:function|const|class)\s)/;
207
+ const sections = code.split(functionPattern);
208
+ return sections.flatMap(section =>
209
+ chunkText(section, { maxTokens, separators: ['\n\n', '\n', ' ', ''] })
210
+ );
211
+ }
212
+ ```
213
+
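The heading split in `chunkMarkdown` relies on a zero-width lookahead, so each heading stays attached to the section it introduces. A minimal sketch of that behaviour follows. `splitOnHeadings` and the sample string are hypothetical and exist only to show what the regular expression does.

```typescript
// Same pattern as chunkMarkdown: split *before* any line starting with 1-3 '#'
// followed by a space. The lookahead consumes nothing, so headings are kept.
const HEADING_BOUNDARY = /(?=^#{1,3} )/m;

export function splitOnHeadings(markdown: string): string[] {
  return markdown
    .split(HEADING_BOUNDARY)
    .filter(section => section.trim().length > 0);
}

const sample = 'intro\n# A\ntext\n## B\nmore';
const sections = splitOnHeadings(sample);
// sections: ['intro\n', '# A\ntext\n', '## B\nmore']

// Headings deeper than level 3 do not start a new section:
const deep = splitOnHeadings('#### D\nbody');
// deep: ['#### D\nbody']
```

Because only levels 1 to 3 are boundaries, deeper headings stay inside their parent section and are left for `chunkText` to break up if the section is too long.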
214
+ ## Embedding Model Landscape (March 2026)
215
+
216
+ Pick based on your needs:
217
+
218
+ | Model | Provider | Dims | Price/MTok | Best For |
219
+ |-------|----------|------|------------|----------|
220
+ | **voyage-4-large** | Voyage AI (MongoDB) | 1024 | ~$0.12 | Best retrieval quality (MoE arch) |
221
+ | **voyage-4** | Voyage AI | 1024 | ~$0.06 | Great quality, mid-size cost |
222
+ | **voyage-4-lite** | Voyage AI | 1024 | ~$0.02 | Production sweet spot (quality/cost) |
223
+ | **gemini-embedding-001** | Google | 3072 | Free tier / $0.01 | Highest MTEB score, free quota |
224
+ | **Gemini Embedding 2** | Google | 3072 | Preview | Multimodal (text+image+video+audio) |
225
+ | **text-embedding-3-small** | OpenAI | 1536 | $0.02 | Reliable, mature ecosystem |
226
+ | **text-embedding-3-large** | OpenAI | 3072 | $0.13 | Higher quality OpenAI option |
227
+ | **Cohere Embed v4** | Cohere | 1536 | $0.12 | Multimodal (text+image), 128k ctx |
228
+ | **Qwen3-Embedding-8B** | Qwen (open) | 4096 | Self-host | #1 MTEB multilingual, 32k ctx |
229
+ | **e5-small** | Microsoft (open) | 384 | Self-host | Fastest (<30ms), 100% Top-5 |
230
+
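The Price/MTok column translates into a one-off indexing cost with simple arithmetic: total tokens divided by one million, times the price. The sketch below assumes the approximate prices from the table above. `estimateEmbeddingCostUsd` is a hypothetical helper, and the corpus size is made up for illustration.

```typescript
// Cost in USD for embedding `totalTokens` tokens at a given price per
// million tokens (the "Price/MTok" column).
export function estimateEmbeddingCostUsd(
  totalTokens: number,
  pricePerMTokUsd: number
): number {
  if (totalTokens < 0 || pricePerMTokUsd < 0) {
    throw new Error('tokens and price must be non-negative');
  }
  return (totalTokens / 1000000) * pricePerMTokUsd;
}

// Illustrative corpus: 10,000 chunks of roughly 512 tokens each.
const corpusTokens = 10000 * 512; // 5,120,000 tokens

const liteCost = estimateEmbeddingCostUsd(corpusTokens, 0.02);  // about $0.10
const largeCost = estimateEmbeddingCostUsd(corpusTokens, 0.12); // about $0.61
```

At this scale the gap between tiers is well under a dollar for the initial index. The price difference mostly matters for query traffic and for corpora that are re-embedded often.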
231
+ **Key insights:**
232
+ - **Voyage 4 series** has shared embedding space — you can embed docs with `voyage-4-large` and query with `voyage-4-lite` (asymmetric retrieval, saves cost)
233
+ - **Google deprecated `text-embedding-004`** in Jan 2026 — use `gemini-embedding-001` instead
234
+ - **Gemini Embedding 2** (preview) is the first production multimodal embedding — text, images, video, audio in one vector space
235
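+ - Most recent models (including the Voyage 4, Gemini, and OpenAI `text-embedding-3` families) support **Matryoshka embeddings**: you can reduce dims (e.g. 1024 -> 256) with a small quality loss. Check the model card before truncating, since not every model is trained for it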
236
+
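Where a provider does not expose a dimension parameter, Matryoshka-style reduction can be done client-side by keeping the leading dimensions and re-normalizing to unit length. This is a minimal sketch, assuming the model was actually trained with Matryoshka representation learning. `truncateEmbedding` is a hypothetical helper, and a provider-side option such as `output_dimension` is preferable when it exists.

```typescript
// Keep the first `dims` components and rescale to unit length so cosine
// and dot-product comparisons stay meaningful after truncation.
export function truncateEmbedding(embedding: number[], dims: number): number[] {
  if (!Number.isInteger(dims) || dims <= 0 || dims > embedding.length) {
    throw new Error(`dims must be an integer in [1, ${embedding.length}]`);
  }
  const head = embedding.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, v) => sum + v * v, 0));
  if (norm === 0) return head; // nothing to rescale
  return head.map(v => v / norm);
}

// Toy 4-dim vector reduced to 2 dims: [3, 4] has length 5.
const reduced = truncateEmbedding([3, 4, 12, 0], 2);
// reduced is approximately [0.6, 0.8]
```

Documents and queries must be truncated to the same size, and the `vector(n)` column and the `match_documents` signature have to use that same `n`.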
237
+ ### Recommended default: Voyage 4-lite (best value) or Gemini Embedding 001 (free tier)
238
+
239
+ ## Phase 3: Embedding Pipeline
240
+
241
+ ### Option A: Voyage AI (recommended for retrieval quality)
242
+
243
+ ```typescript
244
+ // lib/rag/embeddings.ts
245
+
246
+ export async function generateEmbedding(
247
+ text: string,
248
+ inputType: 'query' | 'document' = 'query'
249
+ ): Promise<number[]> {
250
+ const response = await fetch('https://api.voyageai.com/v1/embeddings', {
251
+ method: 'POST',
252
+ headers: {
253
+ 'Content-Type': 'application/json',
254
+ 'Authorization': `Bearer ${process.env.VOYAGE_API_KEY}`,
255
+ },
256
+ body: JSON.stringify({
257
+ model: 'voyage-4-lite', // 1024 dims, fast + cheap
258
+ input: [text],
259
+ input_type: inputType, // 'document' when indexing, 'query' when searching
260
+ output_dimension: 1024, // Can reduce to 512/256 via Matryoshka
261
+ }),
262
+ });
263
+ const data = await response.json();
264
+ return data.data[0].embedding;
265
+ }
266
+
267
+ export async function generateEmbeddings(
268
+ texts: string[],
269
+ inputType: 'query' | 'document' = 'document'
270
+ ): Promise<number[][]> {
271
+ // Voyage supports batch embedding
272
+ const response = await fetch('https://api.voyageai.com/v1/embeddings', {
273
+ method: 'POST',
274
+ headers: {
275
+ 'Content-Type': 'application/json',
276
+ 'Authorization': `Bearer ${process.env.VOYAGE_API_KEY}`,
277
+ },
278
+ body: JSON.stringify({
279
+ model: 'voyage-4-lite',
280
+ input: texts,
281
+ input_type: inputType,
282
+ output_dimension: 1024,
283
+ }),
284
+ });
285
+ const data = await response.json();
286
+ return data.data.map((d: { embedding: number[] }) => d.embedding);
287
+ }
288
+ ```
289
+
290
+ ### Option B: Google Gemini Embedding (free tier, highest MTEB)
291
+
292
+ ```typescript
293
+ // lib/rag/embeddings-gemini.ts
294
+ import { GoogleGenAI } from '@google/genai';
295
+
296
+ const genai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
297
+
298
+ export async function generateEmbedding(text: string): Promise<number[]> {
299
+ const response = await genai.models.embedContent({
300
+ model: 'gemini-embedding-001',
301
+ contents: text,
302
+ config: { outputDimensionality: 1536 }, // Default 3072, can reduce to 1536/768
303
+ });
304
+ return response.embeddings![0].values!;
305
+ }
306
+
307
+ export async function generateEmbeddings(texts: string[]): Promise<number[][]> {
308
+ // Not a single batched call: one request per text, issued in parallel
309
+ const results = await Promise.all(
310
+ texts.map(text =>
311
+ genai.models.embedContent({
312
+ model: 'gemini-embedding-001',
313
+ contents: text,
314
+ config: { outputDimensionality: 1536 },
315
+ })
316
+ )
317
+ );
318
+ return results.map(r => r.embeddings![0].values!);
319
+ }
320
+ ```
321
+
322
+ ### Option C: OpenAI (mature, wide ecosystem support)
323
+
324
+ ```typescript
325
+ // lib/rag/embeddings-openai.ts
326
+ import OpenAI from 'openai';
327
+
328
+ const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
329
+
330
+ export async function generateEmbedding(text: string): Promise<number[]> {
331
+ const response = await openai.embeddings.create({
332
+ model: 'text-embedding-3-small', // $0.02/MTok, 1536 dims
333
+ input: text,
334
+ });
335
+ return response.data[0].embedding;
336
+ }
337
+
338
+ export async function generateEmbeddings(texts: string[]): Promise<number[][]> {
339
+ const response = await openai.embeddings.create({
340
+ model: 'text-embedding-3-small',
341
+ input: texts, // Up to 2048 inputs per batch
342
+ });
343
+ return response.data.map(d => d.embedding);
344
+ }
345
+ ```
346
+
347
+ ### Vector dimension by provider
348
+
349
+ ```sql
350
+ -- Match your chosen provider:
351
+ embedding vector(1024) -- Voyage 4 series (default 1024)
352
+ embedding vector(1536) -- OpenAI text-embedding-3-small / Gemini (reduced) / Cohere Embed v4
353
+ embedding vector(3072) -- OpenAI text-embedding-3-large / Gemini (full); above pgvector's 2,000-dim index limit for vector, so use halfvec or reduce dims
354
+ embedding vector(384) -- e5-small (self-hosted, fastest)
355
+ ```
356
+
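A mismatch between the column dimension and the provider's output only shows up as a database error at insert time. One way to fail earlier is to check the vector length before writing. The sketch below uses the dimensions listed above. `EMBEDDING_DIMS` and `assertEmbeddingDims` are hypothetical names, and the map covers only a few of the models as an example.

```typescript
// Expected output sizes, taken from the provider list above.
const EMBEDDING_DIMS = {
  'voyage-4-lite': 1024,
  'text-embedding-3-small': 1536,
  'text-embedding-3-large': 3072,
} as const;

type KnownModel = keyof typeof EMBEDDING_DIMS;

// Throws if the embedding length does not match what the model (and
// therefore the vector(n) column) is expected to have.
export function assertEmbeddingDims(model: KnownModel, embedding: number[]): void {
  const expected = EMBEDDING_DIMS[model];
  if (embedding.length !== expected) {
    throw new Error(
      `${model}: expected ${expected} dims, got ${embedding.length}`
    );
  }
}

// Example: a 1024-dim vector is accepted for voyage-4-lite.
assertEmbeddingDims('voyage-4-lite', Array.from({ length: 1024 }, () => 0));
```

Calling this once per batch in the ingest path is usually enough, since a provider returns the same size for every input in a request.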
357
+ ### Ingest Pipeline
358
+
359
+ ```typescript
360
+ // lib/rag/ingest.ts
361
+ import { createClient } from '@/lib/supabase/server';
362
+ import { chunkText, estimateTokens } from './chunker';
363
+ import { generateEmbeddings } from './embeddings';
364
+
365
+ export async function ingestDocument(
366
+ title: string,
367
+ content: string,
368
+ metadata: Record<string, unknown> = {}
369
+ ) {
370
+ const supabase = await createClient();
371
+
372
+ // 1. Create document record
373
+ const { data: doc, error: docError } = await supabase
374
+ .from('documents')
375
+ .insert({ title, metadata, source_type: 'manual' })
376
+ .select('id')
377
+ .single();
378
+
379
+ if (docError) throw docError;
380
+
381
+ // 2. Chunk the content
382
+ const chunks = chunkText(content, { maxTokens: 512, overlapTokens: 50 });
383
+
384
+ // 3. Generate embeddings in batches
385
+ const batchSize = 100;
386
+ for (let i = 0; i < chunks.length; i += batchSize) {
387
+ const batch = chunks.slice(i, i + batchSize);
388
+ const embeddings = await generateEmbeddings(batch);
389
+
390
+ // 4. Insert chunks with embeddings
391
+ const rows = batch.map((chunk, j) => ({
392
+ document_id: doc.id,
393
+ content: chunk,
394
+ chunk_index: i + j,
395
+ token_count: estimateTokens(chunk),
396
+ embedding: JSON.stringify(embeddings[j]),
397
+ }));
398
+
399
+ const { error } = await supabase.from('document_chunks').insert(rows);
400
+ if (error) throw error;
401
+ }
402
+
403
+ return doc.id;
404
+ }
405
+ ```
406
+
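The batching loop in `ingestDocument` depends on `chunk_index: i + j` staying unique across batches, which is what the `UNIQUE(document_id, chunk_index)` constraint enforces. The sketch below isolates that indexing so it can be checked without a database. `toBatches` is a hypothetical helper, not part of the skill.

```typescript
// Split items into consecutive batches, remembering each batch's offset so
// that a global index can be rebuilt as offset + position-in-batch.
export function toBatches<T>(
  items: T[],
  batchSize: number
): Array<{ offset: number; items: T[] }> {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new Error('batchSize must be a positive integer');
  }
  const batches: Array<{ offset: number; items: T[] }> = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push({ offset: i, items: items.slice(i, i + batchSize) });
  }
  return batches;
}

// 250 chunks with a batch size of 100 gives batches of 100, 100 and 50.
const fakeChunks = Array.from({ length: 250 }, (_, n) => `chunk ${n}`);
const batches = toBatches(fakeChunks, 100);

// Rebuild the chunk_index values exactly as the ingest loop does (i + j).
const chunkIndexes = batches.flatMap(b => b.items.map((_, j) => b.offset + j));
// chunkIndexes runs from 0 to 249 with no gaps or repeats
```

If a batch fails midway, the chunks already inserted remain in the table, so a retry of the same document needs either a cleanup step or an upsert on `(document_id, chunk_index)`.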
407
+ ## Phase 4: Retrieval
408
+
409
+ ### Basic Similarity Search
410
+
411
+ ```typescript
412
+ // lib/rag/retrieve.ts
413
+ import { createClient } from '@/lib/supabase/server';
414
+ import { generateEmbedding } from './embeddings';
415
+
416
+ export async function retrieveContext(
417
+ query: string,
418
+ options: {
419
+ matchThreshold?: number;
420
+ matchCount?: number;
421
+ filterMetadata?: Record<string, unknown>;
422
+ } = {}
423
+ ) {
424
+ const {
425
+ matchThreshold = 0.7,
426
+ matchCount = 5,
427
+ filterMetadata = {},
428
+ } = options;
429
+
430
+ const supabase = await createClient();
431
+ const queryEmbedding = await generateEmbedding(query);
432
+
433
+ const { data, error } = await supabase.rpc('match_documents', {
434
+ query_embedding: JSON.stringify(queryEmbedding),
435
+ match_threshold: matchThreshold,
436
+ match_count: matchCount,
437
+ filter_metadata: filterMetadata,
438
+ });
439
+
440
+ if (error) throw error;
441
+ return (data ?? []) as Array<{
442
+ id: string;
443
+ document_id: string;
444
+ content: string;
445
+ metadata: Record<string, unknown>;
446
+ similarity: number;
447
+ }>;
448
+ }
449
+ ```
450
+
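The effect of `match_threshold` and `match_count` is easier to see with an in-memory analogue. The sketch below assumes `match_documents` (defined outside this section) scores by cosine similarity, the same measure `hybrid_search` derives from `1 - (embedding <=> query_embedding)`. `cosineSimilarity`, `matchInMemory`, and `StoredChunk` are hypothetical names used only for illustration.

```typescript
// Cosine similarity between two equal-length vectors; 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface StoredChunk {
  id: string;
  embedding: number[];
}

// Keep chunks above the threshold, best first, capped at matchCount.
function matchInMemory(
  query: number[],
  chunks: StoredChunk[],
  matchThreshold = 0.7,
  matchCount = 5
): Array<{ id: string; similarity: number }> {
  return chunks
    .map(c => ({ id: c.id, similarity: cosineSimilarity(query, c.embedding) }))
    .filter(c => c.similarity > matchThreshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, matchCount);
}

const demoMatches = matchInMemory([1, 0], [
  { id: 'orthogonal', embedding: [0, 1] }, // similarity 0, filtered out
  { id: 'close', embedding: [1, 1] },      // similarity about 0.707
  { id: 'same', embedding: [2, 0] },       // similarity 1 (magnitude is ignored)
]);
```

With the default threshold of 0.7, the orthogonal chunk is dropped and the other two are returned best first.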
451
+ ### Hybrid Search (vector + full-text)
452
+
453
+ ```sql
454
+ -- Add full-text search column
455
+ ALTER TABLE document_chunks ADD COLUMN fts tsvector
456
+ GENERATED ALWAYS AS (to_tsvector('english', content)) STORED;
457
+
458
+ CREATE INDEX idx_chunks_fts ON document_chunks USING gin(fts);
459
+
460
+ -- Hybrid search function
461
+ CREATE OR REPLACE FUNCTION hybrid_search(
462
+ query_text TEXT,
463
+ query_embedding vector(1024),
464
+ match_count INT DEFAULT 5,
465
+ keyword_weight FLOAT DEFAULT 0.3,
466
+ semantic_weight FLOAT DEFAULT 0.7
467
+ )
468
+ RETURNS TABLE (
469
+ id UUID,
470
+ document_id UUID,
471
+ content TEXT,
472
+ metadata JSONB,
473
+ score FLOAT
474
+ )
475
+ LANGUAGE plpgsql
476
+ AS $$
477
+ BEGIN
478
+ RETURN QUERY
479
+ WITH semantic AS (
480
+ SELECT dc.id, 1 - (dc.embedding <=> query_embedding) AS sim
481
+ FROM document_chunks dc
482
+ ORDER BY dc.embedding <=> query_embedding
483
+ LIMIT match_count * 2
484
+ ),
485
+ keyword AS (
486
+ SELECT dc.id, ts_rank(dc.fts, websearch_to_tsquery('english', query_text)) AS rank
487
+ FROM document_chunks dc
488
+ WHERE dc.fts @@ websearch_to_tsquery('english', query_text)
489
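+ ORDER BY rank DESC LIMIT match_count * 2 -- rank before limiting so the best keyword hits are kept, not an arbitrary subset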
490
+ ),
491
+ combined AS (
492
+ SELECT
493
+ COALESCE(s.id, k.id) AS chunk_id,
494
+ (COALESCE(s.sim, 0) * semantic_weight + COALESCE(k.rank, 0) * keyword_weight) AS combined_score
495
+ FROM semantic s
496
+ FULL OUTER JOIN keyword k ON s.id = k.id
497
+ )
498
+ SELECT dc.id, dc.document_id, dc.content, dc.metadata, c.combined_score AS score
499
+ FROM combined c
500
+ JOIN document_chunks dc ON dc.id = c.chunk_id
501
+ ORDER BY c.combined_score DESC
502
+ LIMIT match_count;
503
+ END;
504
+ $$;
505
+ ```
506
+
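The weighting done in the `combined` CTE can be sketched in memory to make the arithmetic concrete. This is an illustration under assumptions, not a replacement for the SQL function: `combineScores` and `Scored` are hypothetical names, and a missing score on either side counts as 0, mirroring `COALESCE(..., 0)` over the `FULL OUTER JOIN`.

```typescript
interface Scored {
  id: string;
  score: number;
}

// Weighted sum of semantic and keyword scores per chunk id.
function combineScores(
  semantic: Scored[],
  keyword: Scored[],
  semanticWeight = 0.7,
  keywordWeight = 0.3,
  matchCount = 5
): Scored[] {
  const combined = new Map<string, number>();
  for (const s of semantic) {
    combined.set(s.id, (combined.get(s.id) ?? 0) + s.score * semanticWeight);
  }
  for (const k of keyword) {
    combined.set(k.id, (combined.get(k.id) ?? 0) + k.score * keywordWeight);
  }
  return Array.from(combined.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, matchCount);
}

// 'b' appears in both lists, so it overtakes 'a' despite a lower semantic score.
const demoCombined = combineScores(
  [{ id: 'a', score: 0.9 }, { id: 'b', score: 0.8 }],
  [{ id: 'b', score: 0.5 }, { id: 'c', score: 0.4 }]
);
```

One caveat worth keeping in mind: `ts_rank` values are not normalized to the same range as cosine similarity, so the default weights may need tuning against real queries.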
507
+ ### Reranking (optional, improves quality)
508
+
509
+ ```typescript
510
+ // lib/rag/rerank.ts
511
+
512
+ export async function rerankResults(
513
+ query: string,
514
+ results: Array<{ content: string; [key: string]: unknown }>
515
+ ): Promise<typeof results> {
516
+ // Cohere Rerank API
517
+ const response = await fetch('https://api.cohere.com/v2/rerank', {
518
+ method: 'POST',
519
+ headers: {
520
+ 'Content-Type': 'application/json',
521
+ 'Authorization': `Bearer ${process.env.COHERE_API_KEY}`,
522
+ },
523
+ body: JSON.stringify({
524
+ model: 'rerank-v3.5',
525
+ query,
526
+ documents: results.map(r => r.content),
527
+ top_n: Math.min(results.length, 5),
528
+ }),
529
+ });
530
+ if (!response.ok) throw new Error(`Cohere rerank request failed with status ${response.status}`);
531
+ const data = await response.json();
532
+ return data.results.map((r: { index: number }) => results[r.index]);
533
+ }
534
+ ```
535
+
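The final mapping step in `rerankResults` discards the relevance score. A minimal sketch of a variant that keeps it, assuming each entry of the rerank response's `results` array carries `index` and `relevance_score`; `applyRerank`, `RerankHit`, and the `rerankScore` field are hypothetical names.

```typescript
interface RerankHit {
  index: number;
  relevance_score: number;
}

// Reorder the original results by the reranker's ranking and attach its score.
function applyRerank<T extends object>(
  results: T[],
  hits: RerankHit[]
): Array<T & { rerankScore: number }> {
  const reranked: Array<T & { rerankScore: number }> = [];
  for (const hit of hits) {
    const original = results[hit.index];
    if (original === undefined) continue; // defensive: ignore out-of-range indexes
    reranked.push({ ...original, rerankScore: hit.relevance_score });
  }
  return reranked;
}

const demoReranked = applyRerank(
  [{ content: 'first' }, { content: 'second' }, { content: 'third' }],
  [
    { index: 2, relevance_score: 0.91 },
    { index: 0, relevance_score: 0.42 },
    { index: 7, relevance_score: 0.1 }, // out of range, skipped
  ]
);
```

Keeping the score makes it possible to drop weak matches after reranking instead of always passing the top five to the model.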
536
+ ## Phase 5: Generation (Claude)
537
+
538
+ ### RAG Query with Claude
539
+
540
+ ```typescript
541
+ // lib/rag/generate.ts
542
+ import Anthropic from '@anthropic-ai/sdk';
543
+ import { retrieveContext } from './retrieve';
544
+
545
+ const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
546
+
547
+ export async function ragQuery(
548
+ userQuery: string,
549
+ options: {
550
+ systemPrompt?: string;
551
+ matchCount?: number;
552
+ matchThreshold?: number;
553
+ } = {}
554
+ ) {
555
+ const {
556
+ systemPrompt = 'You are a helpful assistant. Answer questions based on the provided context. If the context does not contain the answer, say so clearly.',
557
+ matchCount = 5,
558
+ matchThreshold = 0.7,
559
+ } = options;
560
+
561
+ // 1. Retrieve relevant context
562
+ const context = await retrieveContext(userQuery, { matchCount, matchThreshold });
563
+
564
+ if (context.length === 0) {
565
+ return {
566
+ answer: 'I could not find relevant information to answer your question.',
567
+ sources: [],
568
+ };
569
+ }
570
+
571
+ // 2. Build context block
572
+ const contextBlock = context
573
+ .map((c, i) => `[Source ${i + 1}] (similarity: ${c.similarity.toFixed(3)})\n${c.content}`)
574
+ .join('\n\n---\n\n');
575
+
576
+ // 3. Generate with Claude
577
+ const message = await anthropic.messages.create({
578
+ model: 'claude-sonnet-4-6',
579
+ max_tokens: 1024,
580
+ system: `${systemPrompt}\n\n<context>\n${contextBlock}\n</context>`,
581
+ messages: [{ role: 'user', content: userQuery }],
582
+ });
583
+
584
+ const answer = message.content.map(block => (block.type === 'text' ? block.text : '')).join('');
585
+
586
+ return {
587
+ answer,
588
+ sources: context.map(c => ({
589
+ content: c.content.slice(0, 200),
590
+ similarity: c.similarity,
591
+ document_id: c.document_id,
592
+ })),
593
+ usage: message.usage,
594
+ };
595
+ }
596
+ ```
597
+
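Step 2 of `ragQuery` concatenates every retrieved chunk, so the prompt grows with `matchCount` and chunk size. A minimal sketch of a budgeted variant follows. `buildContextBlock` and `maxChars` are hypothetical, the character budget is only a rough stand-in for a token budget, and the sketch assumes the chunks arrive sorted best first.

```typescript
interface RetrievedChunk {
  content: string;
  similarity: number;
}

// Build the context block in the same format as ragQuery, stopping once the
// character budget would be exceeded.
function buildContextBlock(chunks: RetrievedChunk[], maxChars = 8000): string {
  const separator = '\n\n---\n\n';
  const parts: string[] = [];
  let used = 0;
  for (let i = 0; i < chunks.length; i++) {
    const part = `[Source ${i + 1}] (similarity: ${chunks[i].similarity.toFixed(3)})\n${chunks[i].content}`;
    const cost = part.length + (parts.length > 0 ? separator.length : 0);
    if (used + cost > maxChars) break; // assumes best-first order, so the tail is dropped
    parts.push(part);
    used += cost;
  }
  return parts.join(separator);
}

const demoChunks = [
  { content: 'alpha', similarity: 0.91234 },
  { content: 'beta', similarity: 0.8 },
];
const demoContextTight = buildContextBlock(demoChunks, 45); // only the first source fits
const demoContextFull = buildContextBlock(demoChunks);      // both sources fit
```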
598
+ ### Streaming variant
599
+
600
+ ```typescript
601
+ // lib/rag/generate-stream.ts
602
+ import Anthropic from '@anthropic-ai/sdk';
603
+ import { retrieveContext } from './retrieve';
604
+
605
+ const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
606
+
607
+ export async function ragQueryStream(
608
+ userQuery: string,
609
+ systemPrompt = 'Answer based on the provided context.',
610
+ ) {
611
+ const context = await retrieveContext(userQuery, { matchCount: 5 });
612
+
613
+ const contextBlock = context
614
+ .map((c, i) => `[Source ${i + 1}]\n${c.content}`)
615
+ .join('\n\n---\n\n');
616
+
617
+ return anthropic.messages.stream({
618
+ model: 'claude-sonnet-4-6',
619
+ max_tokens: 1024,
620
+ system: `${systemPrompt}\n\n<context>\n${contextBlock}\n</context>`,
621
+ messages: [{ role: 'user', content: userQuery }],
622
+ });
623
+ }
624
+ ```
625
+
626
+ ## Phase 6: Next.js API Routes
627
+
628
+ ### Chat endpoint (streaming)
629
+
630
+ ```typescript
631
+ // app/api/chat/route.ts
632
+ import { ragQueryStream } from '@/lib/rag/generate-stream';
633
+
634
+ export async function POST(req: Request) {
635
+ const { message } = await req.json();
636
+
637
+ if (!message || typeof message !== 'string') {
638
+ return Response.json({ error: 'Message is required' }, { status: 400 });
639
+ }
640
+
641
+ const stream = await ragQueryStream(message);
642
+
643
+ return new Response(stream.toReadableStream(), {
644
+ headers: { 'Content-Type': 'application/x-ndjson' }, // toReadableStream() emits newline-delimited JSON events, not SSE frames
645
+ });
646
+ }
647
+ ```
648
+
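On the client, the streamed body has to be turned back into text. A minimal sketch, assuming the body is newline-delimited JSON with one Anthropic stream event per line (which is how the SDK's `toReadableStream()` serializes events, as far as this sketch relies on it). `createTextAccumulator` and `StreamEvent` are hypothetical names, and only `content_block_delta` events with a `text_delta` payload are used.

```typescript
// Minimal shape of the events this sketch reads; all other fields are ignored.
interface StreamEvent {
  type: string;
  delta?: { type: string; text?: string };
}

// Incremental parser: feed decoded chunks in arrival order, get the text so far.
function createTextAccumulator() {
  let buffer = '';
  let text = '';
  return {
    push(chunk: string): string {
      buffer += chunk;
      const lines = buffer.split('\n');
      buffer = lines.pop() ?? ''; // keep a trailing partial line for the next chunk
      for (const line of lines) {
        if (line.trim() === '') continue;
        const event = JSON.parse(line) as StreamEvent;
        if (event.type === 'content_block_delta' && event.delta?.type === 'text_delta') {
          text += event.delta.text ?? '';
        }
      }
      return text;
    },
  };
}

// Chunks can split an event in the middle of a line; the buffer handles that.
const demoAccumulator = createTextAccumulator();
demoAccumulator.push('{"type":"content_block_delta","delta":{"type":"text_delta","text":"Hel"}}\n{"type":"content_blo');
const demoStreamText = demoAccumulator.push('ck_delta","delta":{"type":"text_delta","text":"lo"}}\n{"type":"message_stop"}\n');
```

In a browser, the chunks would come from reading `response.body` and decoding each `Uint8Array` with a `TextDecoder` in streaming mode before calling `push`.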
649
+ ### Ingest endpoint
650
+
651
+ ```typescript
652
+ // app/api/ingest/route.ts
653
+ import { ingestDocument } from '@/lib/rag/ingest';
654
+ import { z } from 'zod';
655
+
656
+ const IngestSchema = z.object({
657
+ title: z.string().min(1),
658
+ content: z.string().min(1),
659
+ metadata: z.record(z.string(), z.unknown()).optional(), // explicit key schema works in both Zod 3 and Zod 4
660
+ });
661
+
662
+ export async function POST(req: Request) {
663
+ const body = await req.json();
664
+ const parsed = IngestSchema.safeParse(body);
665
+
666
+ if (!parsed.success) {
667
+ return Response.json({ error: parsed.error.flatten() }, { status: 400 });
668
+ }
669
+
670
+ const docId = await ingestDocument(
671
+ parsed.data.title,
672
+ parsed.data.content,
673
+ parsed.data.metadata
674
+ );
675
+
676
+ return Response.json({ documentId: docId });
677
+ }
678
+ ```
679
+
680
+ ## Quick Start Checklist
681
+
682
+ When the user asks to build a RAG pipeline, follow this order:
683
+
684
+ 1. **Database**: Run pgvector migration (Phase 1)
685
+ 2. **Chunker**: Create `lib/rag/chunker.ts` (Phase 2)
686
+ 3. **Embeddings**: Create `lib/rag/embeddings.ts` with chosen provider (Phase 3)
687
+ 4. **Ingest**: Create `lib/rag/ingest.ts` (Phase 3)
688
+ 5. **Retrieve**: Create `lib/rag/retrieve.ts` (Phase 4)
689
+ 6. **Generate**: Create `lib/rag/generate.ts` (Phase 5)
690
+ 7. **API Routes**: Wire up endpoints (Phase 6)
691
+ 8. **Test**: Ingest a sample doc, query it, verify results
692
+
693
+ ## Key Decisions to Ask the User
694
+
695
+ - **Embedding provider**: Voyage 4-lite (best value), Gemini Embedding 001 (free tier, highest MTEB), or OpenAI (mature ecosystem)?
696
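+ - **Vector dimensions**: 1024 (Voyage 4), 1536 (OpenAI/Gemini reduced), or 3072 (Gemini/OpenAI full)? Note that pgvector can only index `vector` columns up to 2,000 dimensions, so 3072 needs `halfvec` or an unindexed column.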
697
+ - **Hybrid search**: Pure vector or vector + full-text keyword? (hybrid recommended for production)
698
+ - **Reranking**: Add a rerank step? If so, pair Voyage embeddings with the Voyage reranker, or use Cohere rerank-v3.5?
699
+ - **Multi-tenant**: Scope documents per user/org?
700
+ - **Generation model**: Claude Sonnet 4.6 (fast + cheap) vs Opus 4.6 (highest quality)?
701
+ - **Multimodal**: Need image/video/audio embeddings? Use Gemini Embedding 2 or voyage-multimodal-3.5
702
+ - **Asymmetric retrieval**: Voyage 4 shared space lets you embed docs with large model, query with lite (saves cost)
703
+
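The answers to these questions can be captured in one typed object so later phases read from a single place. This is a sketch only: `RagDecisions` and `columnTypeFor` are hypothetical names, not part of the framework, and the `halfvec` fallback assumes a pgvector version that ships that type (0.7.0 or later).

```typescript
// Hypothetical record of the decisions above; field names are illustrative.
interface RagDecisions {
  embeddingDimensions: 1024 | 1536 | 3072;
  hybridSearch: boolean;
  rerank: boolean;
  multiTenant: boolean;
}

// pgvector indexes `vector` columns up to 2,000 dimensions; larger embeddings
// need `halfvec` (indexable up to 4,000 dimensions) to stay indexable.
function columnTypeFor(dimensions: number): string {
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new RangeError('dimensions must be a positive integer');
  }
  return dimensions <= 2000 ? `vector(${dimensions})` : `halfvec(${dimensions})`;
}

const demoDecisions: RagDecisions = {
  embeddingDimensions: 1024,
  hybridSearch: true,
  rerank: false,
  multiTenant: true,
};
const demoColumnType = columnTypeFor(demoDecisions.embeddingDimensions);
```

The column type chosen here has to match the `vector(1024)` parameter type used by the search functions, so changing the dimension means updating both.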
704
+ ## Environment Variables Needed
705
+
706
+ ```env
707
+ # Embeddings (pick one)
708
+ VOYAGE_API_KEY=pa-... # Voyage 4 series (recommended)
709
+ GEMINI_API_KEY=... # Google Gemini Embedding
710
+ OPENAI_API_KEY=sk-... # OpenAI text-embedding-3
711
+
712
+ # Generation
713
+ ANTHROPIC_API_KEY=sk-ant-...
714
+
715
+ # Optional: Reranking
716
+ COHERE_API_KEY=...
717
+ ```
718
+
719
+ ## Do You Need Pinecone?
720
+
721
+ **No.** Supabase pgvector handles everything for typical RAG workloads:
722
+ - HNSW indexes for fast similarity search
723
+ - Hybrid search (vector + full-text) via SQL
724
+ - RLS for multi-tenant isolation
725
+ - No extra service, no extra cost, no vendor lock-in
726
+
727
+ **When Pinecone makes sense** (rare for Qualia projects):
728
+ - 10M+ vectors where pgvector HNSW gets slow
729
+ - Need serverless auto-scaling with zero ops
730
+ - Multi-region replication requirements
731
+ - Already paying for Pinecone in another system
732
+
733
+ For your scale (< 1M vectors per project), Supabase pgvector is the right call.
734
+
735
+ ## Performance Tips
736
+
737
+ - **Chunk size**: 512 tokens is a good starting point. Smaller chunks lose surrounding context and add noise; larger chunks dilute the relevant passage.
738
+ - **Overlap**: 50-100 tokens prevents splitting context at chunk boundaries.
739
+ - **HNSW index**: Use `ef_construction=64, m=16` for < 1M rows. Increase for larger datasets.
740
+ - **Batch embeddings**: Always batch (up to 2048 per OpenAI call). Never embed one at a time.
741
+ - **Cache embeddings**: Store query embeddings for repeated queries.
742
+ - **Threshold tuning**: Start at 0.7, lower to 0.5 if too few results, raise to 0.8 if too noisy.
743
+
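The "cache embeddings" tip can be sketched as a small in-memory LRU cache keyed by normalized query text. `QueryEmbeddingCache` is a hypothetical class, not part of the original modules, and an in-process cache like this only helps within one server instance; a shared store would be needed across serverless invocations.

```typescript
// Hypothetical LRU cache for query embeddings. A Map preserves insertion
// order, so the first key is always the least recently used entry.
class QueryEmbeddingCache {
  private readonly entries = new Map<string, number[]>();
  private readonly maxEntries: number;

  constructor(maxEntries = 500) {
    this.maxEntries = maxEntries;
  }

  // Treat queries that differ only in case or whitespace as the same key.
  private static normalize(query: string): string {
    return query.trim().toLowerCase().replace(/\s+/g, ' ');
  }

  get(query: string): number[] | undefined {
    const key = QueryEmbeddingCache.normalize(query);
    const hit = this.entries.get(key);
    if (hit !== undefined) {
      this.entries.delete(key); // re-insert to mark as most recently used
      this.entries.set(key, hit);
    }
    return hit;
  }

  set(query: string, embedding: number[]): void {
    const key = QueryEmbeddingCache.normalize(query);
    this.entries.delete(key);
    this.entries.set(key, embedding);
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

const demoCache = new QueryEmbeddingCache(2);
demoCache.set('What is RAG?', [0.1, 0.2]);
demoCache.set('pgvector', [0.3]);
demoCache.get('  what is   rag? '); // hit after normalization; now most recently used
demoCache.set('hnsw', [0.4]);       // over capacity: evicts 'pgvector'
```

In `retrieveContext`, the cache would be checked before calling `generateEmbedding(query)` and filled after a miss.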
744
+ ## Trigger Phrases
745
+
746
+ - "build RAG" / "set up RAG" / "create RAG pipeline"
747
+ - "vector search" / "semantic search" / "pgvector"
748
+ - "embed documents" / "embedding pipeline"
749
+ - "knowledge base" / "document Q&A" / "chat with docs"
750
+ - "retrieval augmented generation"