ccbot-cli 2.0.0 → 2.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/bin/adapters/claude.js +150 -0
- package/bin/adapters/codex.js +439 -0
- package/bin/install.js +509 -349
- package/bin/lib/ccline.js +82 -0
- package/bin/lib/utils.js +87 -34
- package/bin/uninstall.js +48 -0
- package/config/AGENTS.md +630 -0
- package/config/CLAUDE.md +229 -20
- package/config/ccline/config.toml +161 -0
- package/config/codex-config.example.toml +22 -0
- package/config/settings.example.json +32 -0
- package/output-styles/abyss-cultivator.md +399 -0
- package/package.json +14 -5
- package/skills/SKILL.md +159 -0
- package/skills/domains/ai/SKILL.md +34 -0
- package/skills/domains/ai/agent-dev.md +242 -0
- package/skills/domains/ai/llm-security.md +288 -0
- package/skills/domains/ai/prompt-and-eval.md +279 -0
- package/skills/domains/ai/rag-system.md +542 -0
- package/skills/domains/architecture/SKILL.md +42 -0
- package/skills/domains/architecture/api-design.md +225 -0
- package/skills/domains/architecture/caching.md +299 -0
- package/skills/domains/architecture/cloud-native.md +285 -0
- package/skills/domains/architecture/message-queue.md +329 -0
- package/skills/domains/architecture/security-arch.md +297 -0
- package/skills/domains/data-engineering/SKILL.md +207 -0
- package/skills/domains/development/SKILL.md +46 -0
- package/skills/domains/development/cpp.md +246 -0
- package/skills/domains/development/go.md +323 -0
- package/skills/domains/development/java.md +277 -0
- package/skills/domains/development/python.md +288 -0
- package/skills/domains/development/rust.md +313 -0
- package/skills/domains/development/shell.md +313 -0
- package/skills/domains/development/typescript.md +277 -0
- package/skills/domains/devops/SKILL.md +39 -0
- package/skills/domains/devops/cost-optimization.md +272 -0
- package/skills/domains/devops/database.md +217 -0
- package/skills/domains/devops/devsecops.md +198 -0
- package/skills/domains/devops/git-workflow.md +181 -0
- package/skills/domains/devops/observability.md +280 -0
- package/skills/domains/devops/performance.md +336 -0
- package/skills/domains/devops/testing.md +283 -0
- package/skills/domains/frontend-design/SKILL.md +38 -0
- package/skills/domains/frontend-design/claymorphism/SKILL.md +119 -0
- package/skills/domains/frontend-design/claymorphism/references/tokens.css +52 -0
- package/skills/domains/frontend-design/component-patterns.md +202 -0
- package/skills/domains/frontend-design/engineering.md +287 -0
- package/skills/domains/frontend-design/glassmorphism/SKILL.md +140 -0
- package/skills/domains/frontend-design/glassmorphism/references/tokens.css +32 -0
- package/skills/domains/frontend-design/liquid-glass/SKILL.md +137 -0
- package/skills/domains/frontend-design/liquid-glass/references/tokens.css +81 -0
- package/skills/domains/frontend-design/neubrutalism/SKILL.md +143 -0
- package/skills/domains/frontend-design/neubrutalism/references/tokens.css +44 -0
- package/skills/domains/frontend-design/state-management.md +680 -0
- package/skills/domains/frontend-design/ui-aesthetics.md +110 -0
- package/skills/domains/frontend-design/ux-principles.md +156 -0
- package/skills/domains/infrastructure/SKILL.md +200 -0
- package/skills/domains/mobile/SKILL.md +224 -0
- package/skills/domains/orchestration/SKILL.md +29 -0
- package/skills/domains/orchestration/multi-agent.md +263 -0
- package/skills/domains/security/SKILL.md +54 -0
- package/skills/domains/security/blue-team.md +436 -0
- package/skills/domains/security/code-audit.md +265 -0
- package/skills/domains/security/pentest.md +226 -0
- package/skills/domains/security/red-team.md +375 -0
- package/skills/domains/security/threat-intel.md +372 -0
- package/skills/domains/security/vuln-research.md +369 -0
- package/skills/orchestration/multi-agent/SKILL.md +493 -0
- package/skills/run_skill.js +129 -0
- package/skills/tools/gen-docs/SKILL.md +116 -0
- package/skills/tools/gen-docs/scripts/doc_generator.js +435 -0
- package/skills/tools/lib/shared.js +98 -0
- package/skills/tools/verify-change/SKILL.md +140 -0
- package/skills/tools/verify-change/scripts/change_analyzer.js +289 -0
- package/skills/tools/verify-module/SKILL.md +127 -0
- package/skills/tools/verify-module/scripts/module_scanner.js +171 -0
- package/skills/tools/verify-quality/SKILL.md +160 -0
- package/skills/tools/verify-quality/scripts/quality_checker.js +337 -0
- package/skills/tools/verify-security/SKILL.md +143 -0
- package/skills/tools/verify-security/scripts/security_scanner.js +283 -0
- package/bin/lib/registry.js +0 -61
- package/config/.claudeignore +0 -11
@@ -0,0 +1,542 @@
---
name: rag-system
description: RAG (retrieval-augmented generation) architecture. Vector databases, embeddings, retrieval strategies, reranking algorithms, hybrid retrieval. Use when the user mentions RAG, retrieval augmentation, vector databases, embeddings, reranking, LangChain, or LlamaIndex.
---

# 🔮 Alchemy Codex · RAG System (Retrieval-Augmented Generation)

## RAG Architecture

```
Query → Embedding → Vector search → Rerank → Context injection → LLM generation
  │         │            │             │            │                 │
  └ Rewrite ┴─ Hybrid ───┴─ Relevance ─┴─ Compress ─┴─ Answer + citations
```

### Core pipeline
```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# 1. Load and split documents
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = TextLoader("docs.txt")
documents = loader.load()

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    separators=["\n\n", "\n", "。", ".", " "]
)
chunks = splitter.split_documents(documents)

# 2. Embed and store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(chunks, embeddings)

# 3. Retrieve and generate
llm = ChatOpenAI(model="gpt-4", temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever(search_kwargs={"k": 5}),
    return_source_documents=True
)

result = qa_chain({"query": "What is RAG?"})
print(result["result"])
```

## Vector Database Comparison

| Database | Type | Index algorithm | Best for | Deployment |
|----------|------|-----------------|----------|------------|
| Pinecone | Managed | HNSW | Production, high concurrency | Cloud |
| Weaviate | Open source | HNSW | Multimodal, GraphQL | Self-hosted / cloud |
| Qdrant | Open source | HNSW | High performance, filtering | Self-hosted / cloud |
| Chroma | Open source | HNSW | Rapid prototyping, local | Local / in-memory |
| Milvus | Open source | IVF/HNSW | Large scale, distributed | Self-hosted |
| Faiss | Library | IVF/PQ | Research, offline | Local |
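
All of these engines accelerate the same primitive: nearest-neighbor search over embedding vectors. As a point of reference — this is an illustrative sketch, not tied to any library above — the exact brute-force baseline that HNSW/IVF indexes approximate looks like this:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query: list[float], corpus: dict[str, list[float]], k: int = 2):
    """Exact top-k search: O(N·d) per query, which is what ANN indexes avoid."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
results = brute_force_search([1.0, 0.0, 0.0], corpus, k=2)
```

Exact search is fine up to roughly a few hundred thousand vectors; beyond that the approximate indexes in the table trade a little recall for orders of magnitude in speed.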

### Pinecone example
```python
import pinecone
from langchain.vectorstores import Pinecone

pinecone.init(api_key="YOUR_KEY", environment="us-west1-gcp")

index_name = "rag-index"
if index_name not in pinecone.list_indexes():
    pinecone.create_index(
        name=index_name,
        dimension=1536,  # OpenAI ada-002
        metric="cosine"
    )

vectorstore = Pinecone.from_documents(
    documents=chunks,
    embedding=embeddings,
    index_name=index_name
)
```

### Qdrant example
```python
from qdrant_client import QdrantClient
from langchain.vectorstores import Qdrant

client = QdrantClient(host="localhost", port=6333)

vectorstore = Qdrant.from_documents(
    documents=chunks,
    embedding=embeddings,
    collection_name="knowledge_base",
    client=client
)

# Retrieval with metadata filtering
results = vectorstore.similarity_search(
    query="RAG architecture",
    k=5,
    filter={"source": "technical_docs"}
)
```

## Choosing an Embedding Model

### Model comparison
| Model | Dimensions | Quality | Cost | Best for |
|-------|------------|---------|------|----------|
| OpenAI ada-002 | 1536 | High | Medium | General purpose, multilingual |
| Cohere embed-v3 | 1024 | High | Medium | Multilingual, compression |
| BGE-large-zh | 1024 | High | Free | Optimized for Chinese |
| E5-large-v2 | 1024 | Medium | Free | Open source, general purpose |
| text2vec-base | 768 | Medium | Free | Chinese, lightweight |

### Local embeddings
```python
from langchain.embeddings import HuggingFaceEmbeddings

# BGE Chinese model
embeddings = HuggingFaceEmbeddings(
    model_name="BAAI/bge-large-zh-v1.5",
    model_kwargs={'device': 'cuda'},
    encode_kwargs={'normalize_embeddings': True}
)

# Batch encoding
texts = ["Document 1", "Document 2", "Document 3"]
vectors = embeddings.embed_documents(texts)

# Query encoding (with instruction)
query_vector = embeddings.embed_query("Generate a representation for this sentence")
```

### Multimodal embeddings
```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor
from langchain.embeddings import OpenAIEmbeddings

# CLIP for joint image/text representation
class MultiModalEmbedding:
    def __init__(self):
        self.text_model = OpenAIEmbeddings()
        self.image_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
        self.processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    def embed_image(self, image_path: str):
        image = Image.open(image_path)
        inputs = self.processor(images=image, return_tensors="pt")
        return self.image_model.get_image_features(**inputs)

    def embed_text(self, text: str):
        return self.text_model.embed_query(text)
```

## Retrieval Strategies

### Dense retrieval (vectors)
```python
# Cosine-similarity retrieval
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 5}
)

# MMR (maximal marginal relevance) - diversity
retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 5, "fetch_k": 20, "lambda_mult": 0.5}
)

# Similarity-score threshold filtering
retriever = vectorstore.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"score_threshold": 0.8, "k": 5}
)
```

### Sparse retrieval (BM25)
```python
from langchain.retrievers import BM25Retriever

# BM25 keyword retrieval
bm25_retriever = BM25Retriever.from_documents(chunks)
bm25_retriever.k = 5

results = bm25_retriever.get_relevant_documents("RAG system")
```

### Hybrid retrieval
```python
from langchain.retrievers import EnsembleRetriever

# Vector + BM25 fusion
ensemble_retriever = EnsembleRetriever(
    retrievers=[vectorstore.as_retriever(), bm25_retriever],
    weights=[0.6, 0.4]  # 60% vector, 40% BM25
)

results = ensemble_retriever.get_relevant_documents("query")
```
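
Under the hood, this kind of weighted fusion can be expressed as weighted reciprocal rank fusion (RRF): each document scores the weighted sum of `1/(rank + c)` across retrievers. A hand-rolled sketch (not LangChain's internal implementation) makes the mechanics concrete:

```python
def weighted_rrf(rankings: list[list[str]], weights: list[float], c: int = 60) -> list[str]:
    """Fuse several ranked doc-id lists via weighted reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking):
            # rank is 0-based; c=60 is the conventional RRF damping constant
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rank + 1 + c)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["d1", "d2", "d3"]   # ranked by cosine similarity
bm25_hits = ["d2", "d4", "d1"]     # ranked by BM25 score
fused = weighted_rrf([vector_hits, bm25_hits], weights=[0.6, 0.4])
```

Rank fusion avoids comparing raw scores across retrievers, which live on incompatible scales (cosine similarity vs. BM25).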

### Multi-channel recall
```python
class MultiRecallRetriever:
    def __init__(self, vector_store, bm25_retriever, graph_retriever):
        self.retrievers = {
            "vector": vector_store.as_retriever(search_kwargs={"k": 10}),
            "bm25": bm25_retriever,
            "graph": graph_retriever
        }

    def retrieve(self, query: str, top_k: int = 5):
        all_docs = []
        for name, retriever in self.retrievers.items():
            docs = retriever.get_relevant_documents(query)
            all_docs.extend([(doc, name) for doc in docs])

        # Deduplicate, then rerank (left as implementation details)
        unique_docs = self._deduplicate(all_docs)
        return self._rerank(unique_docs, query)[:top_k]
```

## Reranking Algorithms

### Cross-encoder reranking
```python
from sentence_transformers import CrossEncoder

class Reranker:
    def __init__(self):
        self.model = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

    def rerank(self, query: str, documents: list, top_k: int = 5):
        pairs = [[query, doc.page_content] for doc in documents]
        scores = self.model.predict(pairs)

        # Sort by score, highest first
        ranked = sorted(zip(documents, scores), key=lambda x: x[1], reverse=True)
        return [doc for doc, score in ranked[:top_k]]

# Usage
reranker = Reranker()
initial_docs = vectorstore.similarity_search(query, k=20)
final_docs = reranker.rerank(query, initial_docs, top_k=5)
```

### Cohere Rerank API
```python
import cohere

co = cohere.Client("YOUR_API_KEY")

def cohere_rerank(query: str, documents: list, top_k: int = 5):
    results = co.rerank(
        query=query,
        documents=[doc.page_content for doc in documents],
        top_n=top_k,
        model="rerank-multilingual-v2.0"
    )

    return [documents[r.index] for r in results]
```

### LLM reranking
```python
from langchain.chat_models import ChatOpenAI

def llm_rerank(query: str, documents: list, top_k: int = 3):
    llm = ChatOpenAI(model="gpt-4", temperature=0)

    prompt = f"""Given a query and a list of documents, order them by relevance (1 = most relevant).

Query: {query}

Documents:
{chr(10).join([f"{i+1}. {doc.page_content[:200]}" for i, doc in enumerate(documents)])}

Output format: 1,3,2,5,4 (digits and commas only)"""

    ranking = llm.predict(prompt).strip().split(',')
    return [documents[int(i) - 1] for i in ranking[:top_k]]
```

## Document Chunking Strategies

### Recursive splitting
```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
    separators=["\n\n", "\n", "。", ".", " ", ""]
)
```

### Semantic chunking
```python
# SemanticChunker ships in langchain_experimental, not langchain core
from langchain_experimental.text_splitter import SemanticChunker

semantic_splitter = SemanticChunker(
    embeddings=embeddings,
    breakpoint_threshold_type="percentile",  # or "standard_deviation"
    breakpoint_threshold_amount=95
)

chunks = semantic_splitter.split_text(long_text)
```

### Markdown structure-aware splitting
```python
from langchain.text_splitter import MarkdownHeaderTextSplitter

headers_to_split_on = [
    ("#", "Header 1"),
    ("##", "Header 2"),
    ("###", "Header 3"),
]

markdown_splitter = MarkdownHeaderTextSplitter(headers_to_split_on)
chunks = markdown_splitter.split_text(markdown_text)
```

## Query Optimization

### Query rewriting
```python
from langchain.prompts import ChatPromptTemplate

query_rewrite_prompt = ChatPromptTemplate.from_template("""
Rewrite the user query into a form better suited for retrieval.

Original query: {query}

Rewriting requirements:
1. Fill in elided information
2. Expand synonyms
3. Split compound questions

Rewritten query:""")

def rewrite_query(query: str):
    chain = query_rewrite_prompt | llm
    return chain.invoke({"query": query}).content
```

### Multi-query generation
```python
from langchain.retrievers.multi_query import MultiQueryRetriever

multi_query_retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),
    llm=llm
)

# Automatically generates 3-5 query variants
results = multi_query_retriever.get_relevant_documents("What is RAG?")
```

### HyDE (hypothetical document embeddings)
```python
def hyde_retrieval(query: str):
    # 1. Have the LLM generate a hypothetical answer
    hyde_prompt = f"Answer in detail: {query}"
    hypothetical_doc = llm.predict(hyde_prompt)

    # 2. Retrieve using the hypothetical answer instead of the raw query
    results = vectorstore.similarity_search(hypothetical_doc, k=5)
    return results
```

## Context Compression

### LLM compressor
```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

compressor = LLMChainExtractor.from_llm(llm)

compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vectorstore.as_retriever(search_kwargs={"k": 10})
)

# Retrieve 10 documents, compress to the most relevant snippets
compressed_docs = compression_retriever.get_relevant_documents(query)
```

### Embeddings filter
```python
from langchain.retrievers.document_compressors import EmbeddingsFilter

embeddings_filter = EmbeddingsFilter(
    embeddings=embeddings,
    similarity_threshold=0.76
)

compression_retriever = ContextualCompressionRetriever(
    base_compressor=embeddings_filter,
    base_retriever=vectorstore.as_retriever(search_kwargs={"k": 20})
)
```

## Complete RAG Pipeline

### LangChain implementation
```python
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# Conversation memory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="answer"
)

# Conversational RAG
qa_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(search_kwargs={"k": 5}),
    memory=memory,
    return_source_documents=True,
    verbose=True
)

# Multi-turn conversation
result1 = qa_chain({"question": "What is RAG?"})
result2 = qa_chain({"question": "What are its advantages?"})  # resolved against chat history
```

### LlamaIndex implementation
```python
from llama_index import VectorStoreIndex, ServiceContext
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding

# Service context
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-4", temperature=0),
    embed_model=OpenAIEmbedding()
)

# Build the index
index = VectorStoreIndex.from_documents(
    documents,
    service_context=service_context
)

# Query engine
query_engine = index.as_query_engine(
    similarity_top_k=5,
    response_mode="compact"  # or "tree_summarize", "refine"
)

response = query_engine.query("What is RAG?")
print(response.response)
print(response.source_nodes)  # cited sources
```

## Advanced RAG Patterns

### Self-RAG (self-reflection)
```python
class SelfRAG:
    def __init__(self, llm, retriever):
        self.llm = llm
        self.retriever = retriever

    def query(self, question: str):
        # 1. Decide whether retrieval is needed
        need_retrieval = self._check_retrieval_need(question)

        if not need_retrieval:
            return self.llm.predict(question)

        # 2. Retrieve
        docs = self.retriever.get_relevant_documents(question)

        # 3. Generate an answer grounded in the documents
        answer = self._generate_with_docs(question, docs)

        # 4. Self-assess
        if self._verify_answer(question, answer, docs):
            return answer
        else:
            # Re-retrieve or regenerate
            return self._fallback_generate(question)
```

### RAPTOR (recursive summarization)
```python
from langchain.chains.summarize import load_summarize_chain

summarize_chain = load_summarize_chain(llm, chain_type="map_reduce")

def raptor_indexing(documents, levels=3):
    current_docs = documents
    all_summaries = []

    for level in range(levels):
        # Cluster (cluster_documents is a placeholder helper)
        clusters = cluster_documents(current_docs, n_clusters=10)

        # Summarize each cluster
        summaries = []
        for cluster in clusters:
            summary = summarize_chain.run(cluster)
            summaries.append(summary)

        all_summaries.extend(summaries)
        current_docs = summaries

    # Index the original documents plus every summary level
    vectorstore.add_documents(documents + all_summaries)
```

## Tools and Frameworks

| Tool | Type | Highlights |
|------|------|------------|
| LangChain | Framework | Rich ecosystem, composable |
| LlamaIndex | Framework | Index optimization, query engines |
| Haystack | Framework | Production-grade, pipelines |
| Pinecone | Vector DB | Managed, high performance |
| Qdrant | Vector DB | Open source, strong filtering |
| Weaviate | Vector DB | Multimodal, GraphQL |
| Cohere | API | Embedding + Rerank |

## Best Practices

- ✅ Chunking: chunk_size 500-1500, overlap 10-20%
- ✅ Retrieval counts: recall 10-20 initially, keep 3-5 after reranking
- ✅ Hybrid retrieval: vector + BM25 weighted 6:4 or 7:3
- ✅ Metadata filtering: time, source, type
- ✅ Cite sources: return source_documents
- ✅ Caching: cache results for identical queries
- ✅ Monitoring: retrieval latency, relevance, answer quality
- ❌ Avoid: chunks too large/small, skipping reranking, skipping compression
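
What the chunk_size/overlap recommendation buys can be seen with a minimal character-window splitter (a hypothetical helper for illustration, not the LangChain splitter): the overlap region repeats at the seam of adjacent chunks, so a sentence straddling a boundary still appears whole in at least one chunk.

```python
def split_with_overlap(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Slide a fixed-size window; `overlap` characters are shared between neighbors."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

# With chunk_size=1000 and overlap=200 (20%), each window advances 800 characters.
text = "".join(chr(97 + i % 26) for i in range(2400))
chunks = split_with_overlap(text, chunk_size=1000, overlap=200)
```

Larger overlap improves boundary recall but inflates index size and retrieval cost, which is why 10-20% is the usual compromise.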

---
@@ -0,0 +1,42 @@
---
name: architecture
description: Architecture design capability index. API design, security architecture, cloud native, data security. Route here when the user mentions architecture, design, APIs, or cloud native.
license: MIT
user-invocable: false
disable-model-invocation: false
---

# 🏗 Formation Codex · Architecture Design Hub

## Capability Matrix

| Skill | Focus | Core capabilities |
|-------|-------|-------------------|
| [api-design](api-design.md) | API design | RESTful, GraphQL, OpenAPI |
| [security-arch](security-arch.md) | Security architecture | Zero trust, IAM, threat modeling, data security, compliance auditing |
| [cloud-native](cloud-native.md) | Cloud native | Containers, K8s, Serverless |
| [message-queue](message-queue.md) | Message queues | Kafka, RabbitMQ, event-driven |
| [caching](caching.md) | Caching strategy | Redis, CDN, cache consistency |

## Architecture Principles

```yaml
SOLID:
  - S: Single responsibility
  - O: Open/closed
  - L: Liskov substitution
  - I: Interface segregation
  - D: Dependency inversion

Distributed:
  - CAP theorem
  - BASE theory
  - Eventual consistency

Security:
  - Defense in depth
  - Least privilege
  - Zero trust
```