kc-beta 0.1.2 → 0.3.0
- package/bin/kc-beta.js +14 -2
- package/package.json +1 -1
- package/src/agent/context-window.js +151 -0
- package/src/agent/context.js +8 -4
- package/src/agent/engine.js +261 -8
- package/src/agent/event-log.js +111 -0
- package/src/agent/llm-client.js +352 -59
- package/src/agent/pipelines/base.js +6 -0
- package/src/agent/pipelines/distillation.js +18 -0
- package/src/agent/pipelines/extraction.js +21 -0
- package/src/agent/pipelines/initializer.js +75 -14
- package/src/agent/pipelines/production-qc.js +19 -0
- package/src/agent/pipelines/skill-authoring.js +14 -0
- package/src/agent/pipelines/skill-testing.js +20 -0
- package/src/agent/retry.js +83 -0
- package/src/agent/session-state.js +79 -0
- package/src/agent/skill-loader.js +13 -1
- package/src/agent/token-counter.js +62 -0
- package/src/agent/tools/document-parse.js +104 -21
- package/src/agent/tools/document-search.js +24 -8
- package/src/agent/tools/sandbox-exec.js +16 -5
- package/src/agent/tools/web-search.js +107 -0
- package/src/agent/tools/worker-llm-call.js +14 -5
- package/src/agent/tools/workspace-file.js +47 -20
- package/src/agent/workspace.js +24 -1
- package/src/cli/components.js +24 -5
- package/src/cli/config.js +340 -0
- package/src/cli/index.js +113 -11
- package/src/cli/onboard.js +216 -53
- package/src/config.js +63 -10
- package/src/model-tiers.json +153 -0
- package/src/providers.js +367 -0
- package/template/AGENT.md +20 -0
- package/template/skills/en/meta/compliance-judgment/SKILL.md +10 -42
- package/template/skills/en/meta/document-chunking/SKILL.md +32 -0
- package/template/skills/en/meta/document-parsing/SKILL.md +11 -18
- package/template/skills/en/meta/entity-extraction/SKILL.md +13 -28
- package/template/skills/en/meta/tree-processing/SKILL.md +19 -1
- package/template/skills/en/meta-meta/auto-model-selection/SKILL.md +53 -0
- package/template/skills/en/meta-meta/pdf-review-dashboard/SKILL.md +57 -0
- package/template/skills/en/meta-meta/pdf-review-dashboard/scripts/generate_review.js +262 -0
- package/template/skills/en/meta-meta/rule-extraction/SKILL.md +24 -1
- package/template/skills/en/meta-meta/skill-authoring/SKILL.md +6 -0
- package/template/skills/en/meta-meta/skill-to-workflow/SKILL.md +4 -0
- package/template/skills/zh/meta/compliance-judgment/SKILL.md +41 -262
- package/template/skills/zh/meta/document-chunking/SKILL.md +32 -0
- package/template/skills/zh/meta/document-parsing/SKILL.md +65 -132
- package/template/skills/zh/meta/entity-extraction/SKILL.md +68 -230
- package/template/skills/zh/meta/tree-processing/SKILL.md +82 -194
- package/template/skills/zh/meta-meta/auto-model-selection/SKILL.md +51 -0
- package/template/skills/zh/meta-meta/pdf-review-dashboard/SKILL.md +55 -0
- package/template/skills/zh/meta-meta/pdf-review-dashboard/scripts/generate_review.js +262 -0
- package/template/skills/zh/meta-meta/rule-extraction/SKILL.md +79 -164
- package/template/skills/zh/meta-meta/skill-authoring/SKILL.md +64 -185
- package/template/skills/zh/meta-meta/skill-to-workflow/SKILL.md +95 -216
|
@@ -3,273 +3,152 @@ name: skill-to-workflow
|
|
|
3
3
|
description: Distill a proven verification skill into a Python workflow with worker LLM prompts. Use when a rule skill has been tested and reaches the SKILL_ACCURACY threshold defined in .env. Covers the decision of what to implement as code vs LLM calls, prompt engineering for small context windows, model tier selection and progressive downgrade, and testing workflows against the coding agent's own results as ground truth. Also use when optimizing existing workflows for cost or speed.
|
|
4
4
|
---
|
|
5
5
|
|
|
6
|
-
#
|
|
6
|
+
# Skill to Workflow
|
|
7
7
|
|
|
8
|
-
|
|
8
|
+
The skill is the ground truth. The workflow is a cheaper, faster approximation. Your job is to make the approximation as good as the original while being as cheap as possible.
|
|
9
9
|
|
|
10
|
-
|
|
10
|
+
## Engineering Goal
|
|
11
11
|
|
|
12
|
-
|
|
12
|
+
Optimize the full chain: **shortest workflow** (fewest nodes) → **smallest model per node** (cheapest tier that meets accuracy) → **shortest prompt per model** (minimum tokens). This is the engineering objective — not prompt template sophistication or framework compliance.
|
|
13
13
|
|
|
14
|
-
|
|
14
|
+
## When to Start
|
|
15
15
|
|
|
16
|
-
|
|
16
|
+
A skill is ready for workflow distillation when:
|
|
17
|
+
- It has been tested on all documents in Samples/.
|
|
18
|
+
- Its accuracy meets or exceeds the SKILL_ACCURACY threshold in `.env`.
|
|
19
|
+
- Edge cases are documented in the skill's `assets/corner_cases.json`.
|
|
20
|
+
- You understand the rule well enough to explain exactly how you verify it.
|
|
17
21
|
|
|
18
|
-
|
|
22
|
+
If any of these are not true, go back and iterate on the skill first.
|
|
19
23
|
|
|
20
|
-
|
|
24
|
+
## The Distillation Decision
|
|
21
25
|
|
|
22
|
-
|
|
23
|
-
2. **Edge cases are thoroughly documented**: at least the known major exceptions are covered
|
|
24
|
-
3. **The judgment logic is stable**: the last two iterations made no changes to the core judgment logic
|
|
26
|
+
For each step in your skill-based verification process, ask:
|
|
25
27
|
|
|
26
|
-
|
|
28
|
+
### Can this be done with regex or Python? (Cost: zero)
|
|
29
|
+
- Date extraction with known formats → regex
|
|
30
|
+
- Numeric comparison against threshold → Python arithmetic
|
|
31
|
+
- Chinese numeral conversion → Python lookup table
|
|
32
|
+
- Format validation (ID numbers, codes) → regex
|
|
33
|
+
- Table cell extraction from structured markdown → string manipulation
|
|
27
34
|
|
|
28
|
-
|
|
35
|
+
If yes, write it as code. These are free, fast, and deterministic.
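The zero-cost steps listed above can be sketched as follows. This is a minimal illustration, not part of the package: the function names, the date pattern, and the numeral tables are all assumptions, and the numeral converter only handles simple values below 10,000.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical lookup tables for simple Chinese numerals (illustrative only).
CN_DIGITS = {"零": 0, "一": 1, "二": 2, "三": 3, "四": 4,
             "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
CN_UNITS = {"十": 10, "百": 100, "千": 1000}

# Assumed known formats: 2024-03-15, 2024/3/15, 2024年3月15日.
DATE_RE = re.compile(r"(\d{4})[-/年](\d{1,2})[-/月](\d{1,2})日?")


def extract_date(text: str) -> Optional[date]:
    """Return the first date in a known format, or None if absent or invalid."""
    m = DATE_RE.search(text)
    if m is None:
        return None
    year, month, day = (int(g) for g in m.groups())
    try:
        return date(year, month, day)
    except ValueError:
        return None


def cn_to_int(numeral: str) -> int:
    """Convert a simple Chinese numeral below 10,000 (e.g. 三百二十) to int."""
    total, current = 0, 0
    for ch in numeral:
        if ch in CN_DIGITS:
            current = CN_DIGITS[ch]
        elif ch in CN_UNITS:
            # A bare unit such as 十 means 1 x 10.
            total += (current or 1) * CN_UNITS[ch]
            current = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + current


def meets_threshold(value: float, threshold: float) -> bool:
    """Plain numeric comparison; no model call needed."""
    return value >= threshold
```

Because these helpers are deterministic, they can be unit-tested once and reused across rules without any accuracy re-assessment.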
|
|
29
36
|
|
|
30
|
-
|
|
37
|
+
### Does this require language understanding? (Cost: worker LLM call)
|
|
38
|
+
- Finding the relevant section in a document → LLM
|
|
39
|
+
- Extracting an entity described in natural language → LLM
|
|
40
|
+
- Judging semantic adequacy ("adequate risk disclosure") → LLM
|
|
41
|
+
- Resolving ambiguous references → LLM
|
|
31
42
|
|
|
32
|
-
|
|
43
|
+
If yes, design a worker LLM prompt. Use the smallest model tier that maintains accuracy.
|
|
33
44
|
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
- Field presence checks
|
|
37
|
-
- Format normalization (letter case, date format conversion)
|
|
38
|
-
- Enumerated value validation (currency codes, country codes)
|
|
39
|
-
- Arithmetic (total including tax = amount excluding tax × (1 + tax rate))
|
|
45
|
+
### The hybrid approach (most common)
|
|
46
|
+
Most rules are a mix: regex extracts the number, Python compares it to the threshold, LLM handles the exceptional cases. Design the workflow as a pipeline where cheap steps run first and expensive steps run only when needed.
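A minimal sketch of such a pipeline, assuming a hypothetical capital adequacy rule: the regex, the function name `check_ratio`, and the `llm_extract` callback are illustrative stand-ins, not names from the package. The worker LLM is only consulted when the cheap extraction finds nothing.

```python
import re
from typing import Callable, Optional

# Hypothetical pattern for the common, well-formatted case.
RATIO_RE = re.compile(
    r"capital adequacy ratio[^\d]{0,40}(\d+(?:\.\d+)?)\s*%", re.IGNORECASE
)


def check_ratio(
    text: str,
    threshold: float,
    llm_extract: Callable[[str], Optional[float]],
) -> dict:
    """Run cheap steps first; call the worker LLM only when regex finds nothing."""
    llm_calls = 0
    match = RATIO_RE.search(text)
    if match:
        value: Optional[float] = float(match.group(1))  # free and deterministic
    else:
        value = llm_extract(text)  # expensive fallback for exceptional cases
        llm_calls = 1
    if value is None:
        return {"result": "missing", "extracted_value": None, "llm_calls": llm_calls}
    result = "pass" if value >= threshold else "fail"
    return {"result": result, "extracted_value": value, "llm_calls": llm_calls}
```

Passing the LLM step in as a callable keeps the pipeline testable with a stub and makes the number of model calls easy to count.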
|
|
40
47
|
|
|
41
|
-
|
|
48
|
+
## Workflow Structure
|
|
42
49
|
|
|
43
|
-
|
|
44
|
-
- Understanding the business meaning of natural-language descriptions
|
|
45
|
-
- Judging whether two passages are semantically consistent
|
|
46
|
-
- Recognizing and parsing complex table structures
|
|
47
|
-
- Classification judgments (e.g., which account an expense belongs to)
|
|
48
|
-
|
|
49
|
-
### Hybrid approach (recommended)
|
|
50
|
-
|
|
51
|
-
The optimal implementation for most verification rules is a hybrid:
|
|
52
|
-
|
|
53
|
-
```
|
|
54
|
-
1. Python preprocessing (formatting, extracting structured fields)
|
|
55
|
-
2. LLM call (semantic understanding, unstructured information extraction)
|
|
56
|
-
3. Python postprocessing (logic checks, calculation, output formatting)
|
|
57
|
-
```
|
|
58
|
-
|
|
59
|
-
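Sandwich the LLM call in the middle and use code to constrain its input scope and output format. This exploits the LLM's semantic ability while code guarantees determinism.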
|
|
60
|
-
|
|
61
|
-
## Workflow File Structure
|
|
50
|
+
A workflow is a Python file (or small set of files) in `workflows/`:
|
|
62
51
|
|
|
63
52
|
```
|
|
64
|
-
workflows/
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
53
|
+
workflows/
|
|
54
|
+
rule_001_capital_adequacy/
|
|
55
|
+
workflow_v1.py # The main workflow script
|
|
56
|
+
prompts/
|
|
57
|
+
extract.txt # Worker LLM prompt for extraction
|
|
58
|
+
judge.txt # Worker LLM prompt for judgment (if needed)
|
|
59
|
+
config.json # Model assignments, thresholds
|
|
71
60
|
```
|
|
72
61
|
|
|
73
|
-
|
|
62
|
+
The workflow file should have a clear entry point:
|
|
74
63
|
|
|
75
64
|
```python
|
|
76
|
-
|
|
77
|
-
R001 - Invoice date validity verification workflow
|
|
78
|
-
Distilled from: rule-skills/R001-invoice-date-validity/
|
|
79
|
-
Skill accuracy: 95%
|
|
80
|
-
Distillation date: 2025-04-01
|
|
81
|
-
"""
|
|
82
|
-
|
|
83
|
-
import json
|
|
84
|
-
import os
|
|
85
|
-
from pathlib import Path
|
|
86
|
-
|
|
87
|
-
def run_verification(document_data: dict, config: dict) -> dict:
|
|
65
|
+
def verify(document_text: str, config: dict) -> dict:
|
|
88
66
|
"""
|
|
89
|
-
Workflow entry function.
|
|
90
|
-
|
|
91
|
-
Args:
|
|
92
|
-
document_data: the document data to verify
|
|
93
|
-
config: runtime configuration (model selection, API address, etc.)
|
|
94
|
-
|
|
95
67
|
Returns:
|
|
96
|
-
|
|
68
|
+
{
|
|
69
|
+
"rule_id": "R001",
|
|
70
|
+
"result": "pass" | "fail" | "missing" | "error",
|
|
71
|
+
"extracted_value": ...,
|
|
72
|
+
"confidence": 0.0-1.0,
|
|
73
|
+
"comment": "..." (only when fail),
|
|
74
|
+
"model_used": "...",
|
|
75
|
+
"llm_calls": int,
|
|
76
|
+
"llm_tokens": int
|
|
77
|
+
}
|
|
97
78
|
"""
|
|
98
|
-
# Step 1: preprocessing (pure code)
|
|
99
|
-
# Step 2: LLM extraction (if needed)
|
|
100
|
-
# Step 3: logic checks (pure code)
|
|
101
|
-
# Step 4: format the output
|
|
102
|
-
pass
|
|
103
|
-
```
|
|
104
|
-
|
|
105
|
-
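The entry function must be `run_verification`, with a fixed signature, so that quality monitoring and batch processing can dispatch workflows uniformly.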
|
|
106
|
-
|
|
107
|
-
### config.json
|
|
108
|
-
|
|
109
|
-
```json
|
|
110
|
-
{
|
|
111
|
-
"rule_id": "R001",
|
|
112
|
-
"rule_name": "Invoice date validity",
|
|
113
|
-
"distilled_from": "rule-skills/R001-invoice-date-validity/",
|
|
114
|
-
"version": "v1",
|
|
115
|
-
"model_tier": "TIER3",
|
|
116
|
-
"llm_steps": ["extract_dates"],
|
|
117
|
-
"code_steps": ["normalize_format", "compare_dates", "format_output"],
|
|
118
|
-
"estimated_cost_per_doc": 0.002,
|
|
119
|
-
"api_base_url": "${API_BASE_URL}",
|
|
120
|
-
"api_key": "${API_KEY}"
|
|
121
|
-
}
|
|
122
|
-
```
|
|
123
|
-
|
|
124
|
-
## Prompt Engineering for Worker LLMs
|
|
125
|
-
|
|
126
|
-
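The worker LLM is not you. Its context window is smaller, its reasoning is weaker, and it knows nothing about the business background. Prompts must be designed around these limitations.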
|
|
127
|
-
|
|
128
|
-
### Self-Containment Principle
|
|
129
|
-
|
|
130
|
-
A prompt cannot assume the worker LLM knows any background. All necessary context must be provided explicitly in the prompt:
|
|
131
|
-
|
|
132
|
-
```markdown
|
|
133
|
-
You are a document information extraction assistant. Your task is to extract the issue date from the invoice text below.
|
|
134
|
-
|
|
135
|
-
Extraction rules:
|
|
136
|
-
- Look for the 「开票日期」 (issue date) or 「Date of Issue」 field
|
|
137
|
-
- Output dates uniformly as YYYY-MM-DD
|
|
138
|
-
- If no date is found, output null
|
|
139
|
-
- Extract the date only; do not make any judgment
|
|
140
|
-
|
|
141
|
-
Invoice text:
|
|
142
|
-
{invoice_text}
|
|
143
|
-
```
|
|
144
|
-
|
|
145
|
-
### Enforcing Structured Output
|
|
146
|
-
|
|
147
|
-
The worker LLM's output must be parseable. Explicitly require JSON output in the prompt:
|
|
148
|
-
|
|
149
|
-
```markdown
|
|
150
|
-
Output strictly in the following JSON format and nothing else:
|
|
151
|
-
|
|
152
|
-
{
|
|
153
|
-
"invoice_date": "YYYY-MM-DD or null",
|
|
154
|
-
"extraction_confidence": "high / medium / low"
|
|
155
|
-
}
|
|
156
79
|
```
|
|
157
80
|
|
|
158
|
-
|
|
159
|
-
|
|
160
|
-
Do not dump the entire document on the worker LLM. Pass only the part it needs to process:
|
|
161
|
-
|
|
162
|
-
- If only the invoice date is needed, pass only the text of the invoice header area
|
|
163
|
-
- If contract information must be compared, pass only the relevant clauses of the contract
|
|
164
|
-
- The narrower the context, the more accurate the extraction and the lower the cost
|
|
165
|
-
|
|
166
|
-
### Use the Document's Language
|
|
167
|
-
|
|
168
|
-
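The prompt's instruction language should match the document's language. When verifying Chinese documents, write the prompt in Chinese. This avoids errors the worker LLM introduces when switching languages.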
|
|
169
|
-
|
|
170
|
-
### Few-Shot Strategy
|
|
171
|
-
|
|
172
|
-
Provide 1-2 concise input/output examples in the prompt, but not too many:
|
|
173
|
-
|
|
174
|
-
- The worker LLM's context window is limited; too many examples crowd out the main content
|
|
175
|
-
- Choose the most typical positive example and one common exception
|
|
176
|
-
- Keep examples short, showing only the key features
|
|
177
|
-
|
|
178
|
-
## Model Tier Selection and Progressive Downgrade
|
|
81
|
+
This is a reference, not a rigid contract. Adapt the structure to the specific rule. The important thing is that every workflow produces a result that can be compared against the skill-based ground truth.
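One way the entry point might be filled in is sketched below. It assumes a purely deterministic rule with two hypothetical config keys, `pattern` (a regex with one capture group) and `threshold`; neither key is defined by the package. No worker LLM is called, so the LLM counters stay at zero.

```python
import re


def verify(document_text: str, config: dict) -> dict:
    """Sketch of an entry point that returns the documented result shape."""
    out = {
        "rule_id": config.get("rule_id", "R001"),
        "result": "error",
        "extracted_value": None,
        "confidence": 0.0,
        "model_used": None,  # no worker LLM is called in this sketch
        "llm_calls": 0,
        "llm_tokens": 0,
    }
    try:
        # "pattern" and "threshold" are hypothetical config keys.
        match = re.search(config["pattern"], document_text)
        if match is None:
            out["result"] = "missing"
            return out
        value = float(match.group(1))
        out["extracted_value"] = value
        out["confidence"] = 1.0  # deterministic extraction
        if value >= config["threshold"]:
            out["result"] = "pass"
        else:
            out["result"] = "fail"
            out["comment"] = f"{value} is below threshold {config['threshold']}"
    except (KeyError, IndexError, ValueError, re.error):
        # Bad config or unparsable value: report "error" rather than raising.
        out["result"] = "error"
    return out
```

Returning `"error"` instead of raising keeps every document comparable against the skill-based result, even when the workflow itself misbehaves.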
|
|
179
82
|
|
|
180
|
-
|
|
83
|
+
## Prompt Engineering for Worker LLMs
|
|
181
84
|
|
|
182
|
-
|
|
85
|
+
Worker LLMs have smaller context windows (typically 16K-32K tokens). Design prompts that:
|
|
183
86
|
|
|
184
|
-
-
|
|
185
|
-
|
|
186
|
-
-
|
|
187
|
-
|
|
87
|
+
1. **Are self-contained.** Include everything the model needs in the prompt. Do not assume the model has context from previous calls.
|
|
88
|
+
2. **Specify the output format.** "Return a JSON object with fields: value, confidence, reasoning." Structured output reduces parsing errors.
|
|
89
|
+
3. **Include the narrowed context.** Do not send the entire document. Use the tree-processing pipeline (full document → relevant chapter → relevant section) to narrow the context before calling the worker LLM.
|
|
90
|
+
4. **Are written in the document's language.** Chinese documents get Chinese prompts. English documents get English prompts. Do not mix languages in a single prompt.
|
|
91
|
+
5. **Provide examples sparingly.** One or two examples help. Ten examples waste context window and risk overfitting.
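The first three guidelines can be illustrated with a small prompt builder and reply parser. This is a sketch under assumptions: the template wording, the `max_chars` budget, and both function names are hypothetical, and the excerpt is expected to have been narrowed already by an earlier step.

```python
import json

# Hypothetical template; everything the worker needs is stated in the prompt.
PROMPT_TEMPLATE = """You extract one value from a document excerpt.
Task: {task}
Return only a JSON object with fields: value, confidence, reasoning.

Excerpt:
{excerpt}
"""


def build_prompt(task: str, excerpt: str, max_chars: int = 8000) -> str:
    """Build a self-contained prompt around an already-narrowed excerpt."""
    # The character cap is a crude stand-in for a real token budget.
    return PROMPT_TEMPLATE.format(task=task, excerpt=excerpt[:max_chars])


def parse_reply(reply: str) -> dict:
    """Parse the worker reply, tolerating stray text around the JSON object."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in reply")
    data = json.loads(reply[start:end + 1])
    missing = {"value", "confidence", "reasoning"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data
```

Validating the fields at parse time turns a malformed reply into an explicit error the workflow can count, instead of a silent wrong answer.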
|
|
188
92
|
|
|
189
|
-
|
|
93
|
+
## Model Tier Selection
|
|
190
94
|
|
|
191
|
-
|
|
192
|
-
1. Run all test samples with TIER1 to establish the accuracy ceiling
|
|
193
|
-
2. Run the same test samples with TIER2 and compare against the TIER1 results
|
|
194
|
-
3. If TIER2 accuracy is close to TIER1 → go on to try TIER3
|
|
195
|
-
4. If TIER3 is still close → go on to try TIER4
|
|
196
|
-
5. Choose the lowest tier that meets the WORKFLOW_ACCURACY threshold
|
|
197
|
-
6. If TIER1 itself falls short → go back to the skill and review the prompt design
|
|
198
|
-
```
|
|
199
|
-
|
|
200
|
-
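Note: different steps can use different tiers. For example, date extraction can use TIER4 while semantic judgment uses TIER2. Record the optimal tier per step in config.json.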
|
|
201
|
-
|
|
202
|
-
### Formal Downgrade Protocol
|
|
203
|
-
|
|
204
|
-
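The values and procedure below are recommended starting points; the coding agent and developer user should adjust them freely to fit the actual situation. What matters is the pattern itself (test → compare → record → re-assess on degradation), not the specific numbers.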
|
|
95
|
+
Start with the highest tier (TIER1) for each step. Measure accuracy. Then try lower tiers:
|
|
205
96
|
|
|
206
|
-
|
|
97
|
+
1. Run the workflow with TIER1 on all Samples/. Record accuracy per step.
|
|
98
|
+
2. For each step, try TIER2. If accuracy stays above WORKFLOW_ACCURACY, keep TIER2.
|
|
99
|
+
3. Continue downgrading per step until accuracy drops below threshold.
|
|
100
|
+
4. Record the optimal tier per step in `config.json`.
|
|
207
101
|
|
|
208
|
-
|
|
102
|
+
Different steps within the same workflow can use different model tiers. Extraction might need TIER2 while judgment might work fine with TIER3.
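The per-step downgrade loop could look roughly like this. The `evaluate` callback is a hypothetical hook that runs one step at one tier over Samples/ and returns its accuracy; returning `None` when even TIER1 misses the threshold is an assumption of this sketch, meant to signal that the skill or prompt needs rework first.

```python
from typing import Callable, Dict, List, Optional

TIERS = ["TIER1", "TIER2", "TIER3", "TIER4"]  # highest to lowest


def pick_tiers(
    steps: List[str],
    evaluate: Callable[[str, str], float],
    threshold: float,
) -> Dict[str, Optional[str]]:
    """For each step, keep downgrading while accuracy stays at or above threshold."""
    chosen: Dict[str, Optional[str]] = {}
    for step in steps:
        best: Optional[str] = None
        for tier in TIERS:
            if evaluate(step, tier) >= threshold:
                best = tier  # still acceptable; try the next cheaper tier
            else:
                break  # accuracy dropped; keep the last acceptable tier
        chosen[step] = best
    return chosen
```

The returned mapping is the per-step record that would be written to `config.json`.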
|
|
209
103
|
|
|
210
|
-
|
|
104
|
+
### Formal Downgrade Protocol
|
|
211
105
|
|
|
212
|
-
|
|
106
|
+
The basic approach above works, but a more rigorous protocol prevents premature tier commitments:
|
|
213
107
|
|
|
214
|
-
|
|
108
|
+
**Direction**: Start top-down (TIER1 → TIER4) to establish the accuracy ceiling first. You need to know the best possible accuracy before trading it for cost savings.
|
|
215
109
|
|
|
216
|
-
|
|
110
|
+
**Minimum test runs**: Run at least a meaningful number of documents (e.g., min(10, total_samples)) at each candidate tier before making a tier decision. Small samples are unreliable — a 3-document test could be misleading.
|
|
217
111
|
|
|
218
|
-
|
|
112
|
+
**Accuracy delta trigger**: If a lower tier's accuracy is significantly below the higher tier's (e.g., by more than 5 percentage points), stay at the higher tier for that step. If the delta is within tolerance, use the cheaper tier.
|
|
219
113
|
|
|
220
|
-
|
|
114
|
+
**Per-step independence**: Each workflow step is assessed separately. Record the optimal tier per step in `config.json`. Do not assume the whole workflow must use one tier.
|
|
221
115
|
|
|
222
|
-
|
|
116
|
+
**Re-assessment trigger**: If production quality control shows a step's accuracy degrading (e.g., due to new document formats), re-run the tier assessment for that step.
|
|
223
117
|
|
|
224
|
-
|
|
118
|
+
**Model-task recommendation list**: Maintain a per-project mapping of (task_type → recommended_tier) based on your testing experience. Over time, these lists can be collected across projects to build generalized tier recommendations.
|
|
225
119
|
|
|
226
|
-
|
|
227
|
-
- **Field extraction accuracy**: whether the field values extracted by the workflow match those extracted by the skill
|
|
228
|
-
- **Confidence calibration**: whether cases the workflow reports with high confidence are in fact more accurate
|
|
120
|
+
All numbers here (10 documents, 5 percentage points, etc.) are recommended starting points. The coding agent and developer user should calibrate these — or replace them entirely with a different assessment approach — based on their specific volume, accuracy requirements, and cost constraints. The pattern matters: **test at each tier → compare accuracy → commit when within tolerance → re-assess on degradation**.
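The minimum-run and accuracy-delta rules can be combined into one decision helper. This is a sketch using the starting values quoted in this section (10 documents, 5 percentage points) as defaults; the function name and the three return labels are hypothetical.

```python
def decide_tier(
    higher_acc: float,
    lower_acc: float,
    n_docs: int,
    total_samples: int,
    min_docs: int = 10,
    max_delta_pp: float = 5.0,
) -> str:
    """Decide whether a step may move from the higher tier to the lower one.

    Accuracies are fractions in [0, 1]; n_docs is how many documents were
    run at the lower tier.
    """
    required = min(min_docs, total_samples)
    if n_docs < required:
        return "need_more_runs"  # sample too small to trust
    delta_pp = (higher_acc - lower_acc) * 100  # gap in percentage points
    return "stay_higher" if delta_pp > max_delta_pp else "use_lower"
```

For example, with 40 samples, a drop from 0.96 to 0.93 over 10 documents falls within tolerance, while the same drop measured on 3 documents is not yet a basis for a decision.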
|
|
229
121
|
|
|
230
|
-
|
|
122
|
+
This follows the same tier-transition framework as parser escalation in `document-parsing`: a quality/accuracy score drives the decision to stay, escalate, or skip.
|
|
231
123
|
|
|
232
|
-
|
|
233
|
-
Workflow accuracy = cases matching the skill's judgment / total cases
|
|
234
|
-
```
|
|
235
|
-
|
|
236
|
-
Compute overall accuracy and per-class accuracy (pass, fail, and unverifiable separately) to avoid misjudgments caused by class imbalance.
|
|
124
|
+
## Testing Against Ground Truth
|
|
237
125
|
|
|
238
|
-
|
|
126
|
+
The coding agent's skill-based results are the ground truth. For each document in Samples/:
|
|
239
127
|
|
|
240
|
-
|
|
128
|
+
1. Run the workflow.
|
|
129
|
+
2. Compare the workflow's result against the skill-based result.
|
|
130
|
+
3. Log discrepancies: which step failed, what was expected vs actual.
|
|
131
|
+
4. Compute accuracy: `(matching results) / (total documents)`.
|
|
132
|
+
5. If accuracy < WORKFLOW_ACCURACY, diagnose and fix. Use `evolution-loop` methodology.
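Steps 2 to 4 above can be sketched as a single comparison function. It assumes both result sets are plain dicts mapping a document ID to a result string; the function name and the discrepancy record layout are illustrative. A document the workflow never produced a result for is counted as `"error"`.

```python
from typing import Dict, List, Tuple


def compare_to_ground_truth(
    skill_results: Dict[str, str],
    workflow_results: Dict[str, str],
) -> Tuple[float, List[dict]]:
    """Return (accuracy, discrepancies) with skill results as ground truth."""
    discrepancies: List[dict] = []
    matches = 0
    for doc_id, expected in skill_results.items():
        actual = workflow_results.get(doc_id, "error")
        if actual == expected:
            matches += 1
        else:
            discrepancies.append(
                {"doc_id": doc_id, "expected": expected, "actual": actual}
            )
    # accuracy = (matching results) / (total documents)
    accuracy = matches / len(skill_results) if skill_results else 0.0
    return accuracy, discrepancies
```

The discrepancy list is the input for diagnosis: each entry names the document and shows expected versus actual, which can be extended with the failing step once the workflow reports it.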
|
|
241
133
|
|
|
242
|
-
|
|
243
|
-
- `workflow_v2.py` → version after prompt optimization
|
|
244
|
-
- `workflow_v3.py` → version after changing the model tier
|
|
134
|
+
## Versioning
|
|
245
135
|
|
|
246
|
-
|
|
136
|
+
Each iteration of a workflow is a new version file: `workflow_v1.py`, `workflow_v2.py`, etc. Track which version is active in `config.json`. See `version-control` skill for the full methodology.
|
|
247
137
|
|
|
248
|
-
##
|
|
138
|
+
## Cost Tracking
|
|
249
139
|
|
|
250
|
-
|
|
251
|
-
|
|
252
|
-
|
|
253
|
-
|
|
254
|
-
"rule_id": "R001",
|
|
255
|
-
"workflow_version": "v2",
|
|
256
|
-
"document_id": "DOC-001",
|
|
257
|
-
"llm_calls": 2,
|
|
258
|
-
"total_tokens": 1850,
|
|
259
|
-
"estimated_cost_usd": 0.003,
|
|
260
|
-
"model_used": "TIER3",
|
|
261
|
-
"timestamp": "2025-04-01T10:30:00Z"
|
|
262
|
-
}
|
|
263
|
-
```
|
|
140
|
+
Track the cost of each workflow run:
|
|
141
|
+
- Number of LLM calls per document.
|
|
142
|
+
- Total tokens consumed per document.
|
|
143
|
+
- Model tier used per call.
|
|
264
144
|
|
|
265
|
-
|
|
145
|
+
This data helps the developer user understand the production cost and informs further optimization.
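A small accumulator is one way to collect these three figures per document. The class below is a hypothetical sketch; its field names mirror the `llm_calls` and `llm_tokens` keys of the result dict, and token counts are assumed to come from whatever usage data the API response reports.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CostTracker:
    """Accumulates LLM usage for one document's workflow run."""

    llm_calls: int = 0
    llm_tokens: int = 0
    calls_by_tier: Counter = field(default_factory=Counter)

    def record(self, tier: str, tokens: int) -> None:
        """Register one worker LLM call at the given tier."""
        self.llm_calls += 1
        self.llm_tokens += tokens
        self.calls_by_tier[tier] += 1

    def summary(self) -> dict:
        """Return plain data suitable for logging or merging into a result."""
        return {
            "llm_calls": self.llm_calls,
            "llm_tokens": self.llm_tokens,
            "calls_by_tier": dict(self.calls_by_tier),
        }
```

A typical use is to create one tracker per document, call `record` after each worker LLM call, and merge `summary()` into the result dict.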
|
|
266
146
|
|
|
267
|
-
##
|
|
147
|
+
## Worker LLM API
|
|
268
148
|
|
|
269
|
-
|
|
149
|
+
Worker LLMs are accessed via the SiliconFlow API. Connection details are in `.env`:
|
|
150
|
+
- `SILICONFLOW_API_KEY` for authentication
|
|
151
|
+
- `SILICONFLOW_BASE_URL` for the API endpoint
|
|
152
|
+
- Model names in `TIER1` through `TIER4`
|
|
270
153
|
|
|
271
|
-
|
|
272
|
-
- Use the standard OpenAI-compatible interface format
|
|
273
|
-
- Set reasonable timeouts and retries
|
|
274
|
-
- Degrade gracefully on API errors (e.g., switch to a fallback model when a model is unavailable)
|
|
275
|
-
- Record token usage and response time for every call
|
|
154
|
+
See `references/worker-llm-catalog.md` for current model capabilities and context window sizes.
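Reading these settings might look like the sketch below. It assumes the `.env` values have already been exported into the process environment (loading the file itself is out of scope here); the function name and the returned dict layout are hypothetical. Tiers that are not configured are simply omitted.

```python
import os
from typing import Mapping, Optional

TIER_KEYS = ("TIER1", "TIER2", "TIER3", "TIER4")


def load_worker_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect worker LLM settings from environment-style variables."""
    env = os.environ if env is None else env
    required = ["SILICONFLOW_API_KEY", "SILICONFLOW_BASE_URL"]
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError(f"missing settings: {', '.join(missing)}")
    return {
        "api_key": env["SILICONFLOW_API_KEY"],
        "base_url": env["SILICONFLOW_BASE_URL"],
        # Map each configured tier name to its model name.
        "tiers": {key: env[key] for key in TIER_KEYS if env.get(key)},
    }
```

Accepting an explicit mapping makes the loader testable without touching the real environment, and failing early on a missing key avoids a confusing authentication error later.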
|