kc-beta 0.1.0

Files changed (141)
  1. package/bin/kc-beta.js +16 -0
  2. package/package.json +32 -0
  3. package/src/agent/confidence-scorer.js +120 -0
  4. package/src/agent/context.js +124 -0
  5. package/src/agent/corner-case-registry.js +119 -0
  6. package/src/agent/engine.js +224 -0
  7. package/src/agent/events.js +27 -0
  8. package/src/agent/history.js +101 -0
  9. package/src/agent/llm-client.js +131 -0
  10. package/src/agent/pipelines/base.js +14 -0
  11. package/src/agent/pipelines/distillation.js +113 -0
  12. package/src/agent/pipelines/extraction.js +92 -0
  13. package/src/agent/pipelines/index.js +23 -0
  14. package/src/agent/pipelines/initializer.js +163 -0
  15. package/src/agent/pipelines/production-qc.js +99 -0
  16. package/src/agent/pipelines/skill-authoring.js +83 -0
  17. package/src/agent/pipelines/skill-testing.js +111 -0
  18. package/src/agent/tools/agent-tool.js +100 -0
  19. package/src/agent/tools/base.js +35 -0
  20. package/src/agent/tools/dashboard-render.js +146 -0
  21. package/src/agent/tools/document-parse.js +184 -0
  22. package/src/agent/tools/document-search.js +111 -0
  23. package/src/agent/tools/evolution-cycle.js +150 -0
  24. package/src/agent/tools/qc-sample.js +94 -0
  25. package/src/agent/tools/registry.js +55 -0
  26. package/src/agent/tools/rule-catalog.js +113 -0
  27. package/src/agent/tools/sandbox-exec.js +106 -0
  28. package/src/agent/tools/tier-downgrade.js +114 -0
  29. package/src/agent/tools/worker-llm-call.js +109 -0
  30. package/src/agent/tools/workflow-run.js +138 -0
  31. package/src/agent/tools/workspace-file.js +122 -0
  32. package/src/agent/version-manager.js +130 -0
  33. package/src/agent/workspace.js +82 -0
  34. package/src/cli/components.js +164 -0
  35. package/src/cli/index.js +329 -0
  36. package/src/cli/init.js +80 -0
  37. package/src/cli/onboard.js +182 -0
  38. package/src/cli/terminal.js +143 -0
  39. package/src/config.js +93 -0
  40. package/template/.env.template +31 -0
  41. package/template/CLAUDE.md +137 -0
  42. package/template/Input/.gitkeep +0 -0
  43. package/template/Output/.gitkeep +0 -0
  44. package/template/Rules/.gitkeep +0 -0
  45. package/template/Samples/.gitkeep +0 -0
  46. package/template/skills/en/meta/compliance-judgment/SKILL.md +114 -0
  47. package/template/skills/en/meta/compliance-judgment/references/output-format.md +151 -0
  48. package/template/skills/en/meta/confidence-system/SKILL.md +117 -0
  49. package/template/skills/en/meta/corner-case-management/SKILL.md +111 -0
  50. package/template/skills/en/meta/cross-document-verification/SKILL.md +131 -0
  51. package/template/skills/en/meta/cross-document-verification/references/contradiction-taxonomy.md +73 -0
  52. package/template/skills/en/meta/data-sensibility/SKILL.md +115 -0
  53. package/template/skills/en/meta/document-parsing/SKILL.md +108 -0
  54. package/template/skills/en/meta/document-parsing/references/parser-catalog.md +40 -0
  55. package/template/skills/en/meta/entity-extraction/SKILL.md +129 -0
  56. package/template/skills/en/meta/tree-processing/SKILL.md +103 -0
  57. package/template/skills/en/meta-meta/bootstrap-workspace/SKILL.md +70 -0
  58. package/template/skills/en/meta-meta/dashboard-reporting/SKILL.md +106 -0
  59. package/template/skills/en/meta-meta/dashboard-reporting/scripts/generate_dashboard.py +178 -0
  60. package/template/skills/en/meta-meta/evolution-loop/SKILL.md +210 -0
  61. package/template/skills/en/meta-meta/evolution-loop/references/convergence-guide.md +62 -0
  62. package/template/skills/en/meta-meta/quality-control/SKILL.md +138 -0
  63. package/template/skills/en/meta-meta/quality-control/references/qa-layers.md +92 -0
  64. package/template/skills/en/meta-meta/quality-control/references/sampling-strategies.md +76 -0
  65. package/template/skills/en/meta-meta/rule-extraction/SKILL.md +100 -0
  66. package/template/skills/en/meta-meta/rule-extraction/references/chunking-strategies.md +80 -0
  67. package/template/skills/en/meta-meta/rule-graph/SKILL.md +118 -0
  68. package/template/skills/en/meta-meta/skill-authoring/SKILL.md +108 -0
  69. package/template/skills/en/meta-meta/skill-authoring/references/skill-format-spec.md +78 -0
  70. package/template/skills/en/meta-meta/skill-to-workflow/SKILL.md +150 -0
  71. package/template/skills/en/meta-meta/skill-to-workflow/references/worker-llm-catalog.md +50 -0
  72. package/template/skills/en/meta-meta/task-decomposition/SKILL.md +129 -0
  73. package/template/skills/en/meta-meta/task-decomposition/references/decision-matrix.md +81 -0
  74. package/template/skills/en/meta-meta/version-control/SKILL.md +152 -0
  75. package/template/skills/en/meta-meta/version-control/references/trace-id-spec.md +79 -0
  76. package/template/skills/en/skill-creator/LICENSE.txt +202 -0
  77. package/template/skills/en/skill-creator/SKILL.md +479 -0
  78. package/template/skills/en/skill-creator/agents/analyzer.md +274 -0
  79. package/template/skills/en/skill-creator/agents/comparator.md +202 -0
  80. package/template/skills/en/skill-creator/agents/grader.md +223 -0
  81. package/template/skills/en/skill-creator/assets/eval_review.html +146 -0
  82. package/template/skills/en/skill-creator/eval-viewer/generate_review.py +471 -0
  83. package/template/skills/en/skill-creator/eval-viewer/viewer.html +1325 -0
  84. package/template/skills/en/skill-creator/references/schemas.md +430 -0
  85. package/template/skills/en/skill-creator/scripts/__init__.py +0 -0
  86. package/template/skills/en/skill-creator/scripts/aggregate_benchmark.py +401 -0
  87. package/template/skills/en/skill-creator/scripts/generate_report.py +326 -0
  88. package/template/skills/en/skill-creator/scripts/improve_description.py +248 -0
  89. package/template/skills/en/skill-creator/scripts/package_skill.py +136 -0
  90. package/template/skills/en/skill-creator/scripts/quick_validate.py +103 -0
  91. package/template/skills/en/skill-creator/scripts/run_eval.py +310 -0
  92. package/template/skills/en/skill-creator/scripts/run_loop.py +332 -0
  93. package/template/skills/en/skill-creator/scripts/utils.py +47 -0
  94. package/template/skills/zh/meta/compliance-judgment/SKILL.md +303 -0
  95. package/template/skills/zh/meta/compliance-judgment/references/output-format.md +151 -0
  96. package/template/skills/zh/meta/confidence-system/SKILL.md +228 -0
  97. package/template/skills/zh/meta/corner-case-management/SKILL.md +235 -0
  98. package/template/skills/zh/meta/cross-document-verification/SKILL.md +241 -0
  99. package/template/skills/zh/meta/cross-document-verification/references/contradiction-taxonomy.md +73 -0
  100. package/template/skills/zh/meta/data-sensibility/SKILL.md +235 -0
  101. package/template/skills/zh/meta/document-parsing/SKILL.md +168 -0
  102. package/template/skills/zh/meta/document-parsing/references/parser-catalog.md +40 -0
  103. package/template/skills/zh/meta/entity-extraction/SKILL.md +276 -0
  104. package/template/skills/zh/meta/tree-processing/SKILL.md +233 -0
  105. package/template/skills/zh/meta-meta/bootstrap-workspace/SKILL.md +147 -0
  106. package/template/skills/zh/meta-meta/dashboard-reporting/SKILL.md +281 -0
  107. package/template/skills/zh/meta-meta/dashboard-reporting/scripts/generate_dashboard.py +178 -0
  108. package/template/skills/zh/meta-meta/evolution-loop/SKILL.md +302 -0
  109. package/template/skills/zh/meta-meta/evolution-loop/references/convergence-guide.md +62 -0
  110. package/template/skills/zh/meta-meta/quality-control/SKILL.md +269 -0
  111. package/template/skills/zh/meta-meta/quality-control/references/qa-layers.md +92 -0
  112. package/template/skills/zh/meta-meta/quality-control/references/sampling-strategies.md +76 -0
  113. package/template/skills/zh/meta-meta/rule-extraction/SKILL.md +208 -0
  114. package/template/skills/zh/meta-meta/rule-extraction/references/chunking-strategies.md +80 -0
  115. package/template/skills/zh/meta-meta/rule-graph/SKILL.md +203 -0
  116. package/template/skills/zh/meta-meta/skill-authoring/SKILL.md +235 -0
  117. package/template/skills/zh/meta-meta/skill-authoring/references/skill-format-spec.md +78 -0
  118. package/template/skills/zh/meta-meta/skill-to-workflow/SKILL.md +275 -0
  119. package/template/skills/zh/meta-meta/skill-to-workflow/references/worker-llm-catalog.md +50 -0
  120. package/template/skills/zh/meta-meta/task-decomposition/SKILL.md +224 -0
  121. package/template/skills/zh/meta-meta/task-decomposition/references/decision-matrix.md +81 -0
  122. package/template/skills/zh/meta-meta/version-control/SKILL.md +284 -0
  123. package/template/skills/zh/meta-meta/version-control/references/trace-id-spec.md +79 -0
  124. package/template/skills/zh/skill-creator/LICENSE.txt +202 -0
  125. package/template/skills/zh/skill-creator/SKILL.md +479 -0
  126. package/template/skills/zh/skill-creator/agents/analyzer.md +274 -0
  127. package/template/skills/zh/skill-creator/agents/comparator.md +202 -0
  128. package/template/skills/zh/skill-creator/agents/grader.md +223 -0
  129. package/template/skills/zh/skill-creator/assets/eval_review.html +146 -0
  130. package/template/skills/zh/skill-creator/eval-viewer/generate_review.py +471 -0
  131. package/template/skills/zh/skill-creator/eval-viewer/viewer.html +1325 -0
  132. package/template/skills/zh/skill-creator/references/schemas.md +430 -0
  133. package/template/skills/zh/skill-creator/scripts/__init__.py +0 -0
  134. package/template/skills/zh/skill-creator/scripts/aggregate_benchmark.py +401 -0
  135. package/template/skills/zh/skill-creator/scripts/generate_report.py +326 -0
  136. package/template/skills/zh/skill-creator/scripts/improve_description.py +248 -0
  137. package/template/skills/zh/skill-creator/scripts/package_skill.py +136 -0
  138. package/template/skills/zh/skill-creator/scripts/quick_validate.py +103 -0
  139. package/template/skills/zh/skill-creator/scripts/run_eval.py +310 -0
  140. package/template/skills/zh/skill-creator/scripts/run_loop.py +332 -0
  141. package/template/skills/zh/skill-creator/scripts/utils.py +47 -0
@@ -0,0 +1,80 @@
1
+ # Chunking Strategies for Long Documents
2
+
3
+ When regulation documents exceed what you can process in a single pass, use these proven strategies to decompose them into manageable chunks while preserving semantic coherence.
4
+
5
+ ## The Onion Peeler (Primary Strategy)
6
+
7
+ Hierarchical header-based decomposition. Named because you peel the document layer by layer, from the outermost structure inward.
8
+
9
+ ### How It Works
10
+
11
+ 1. **Parse the document's header hierarchy.** Identify all headers by level (H1, H2, H3, etc. — or their equivalents in the document's formatting: "Part I", "Chapter 1", "Section 1.1", "Article 1").
12
+ 2. **Build a tree.** Each header becomes a node. Content between headers belongs to the nearest preceding header at that level.
13
+ 3. **Check sizes.** Walk the tree. If a node's content (including all its children) fits within your processing limit, stop — this node is a chunk.
14
+ 4. **Split only when necessary.** If a node exceeds the limit, descend to its children. Only split when a node is too large AND has sub-headers to split on.
15
+ 5. **Leaf nodes that are still too large** get handled by the wedge-driving fallback (see below).
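
The size check and descent in steps 3 to 5 can be sketched as below. This is a minimal illustration, not the package's implementation: the `Node` class, the `peel` function, and the "(intro)" label for text sitting directly under an oversized header are all hypothetical names, and token counts are assumed to be precomputed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One header in the document tree (hypothetical structure)."""
    title: str
    own_tokens: int  # tokens of text directly under this header
    children: list["Node"] = field(default_factory=list)

    def total_tokens(self) -> int:
        # Size of this node including all of its descendants.
        return self.own_tokens + sum(c.total_tokens() for c in self.children)


def peel(node: Node, limit: int) -> list[tuple[str, str]]:
    """Return (title, strategy) pairs, where strategy is 'chunk' or 'wedge'."""
    if node.total_tokens() <= limit:
        return [(node.title, "chunk")]  # fits as a whole: stop descending
    if not node.children:
        return [(node.title, "wedge")]  # too large, no sub-headers to split on
    out: list[tuple[str, str]] = []
    if node.own_tokens > 0:
        # Text between this header and its first sub-header becomes its own piece.
        strategy = "chunk" if node.own_tokens <= limit else "wedge"
        out.append((node.title + " (intro)", strategy))
    for child in node.children:
        out.extend(peel(child, limit))
    return out
```

Pieces tagged `wedge` are the ones that would be handed to the fallback strategy; everything else is already a finished chunk.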
16
+
17
+ ### Why This Works
18
+
19
+ - Respects the document's own semantic structure. A "Chapter 3: Risk Disclosure" chunk contains exactly what the author intended that chapter to contain.
20
+ - Minimizes information loss. You never cut in the middle of a thought.
21
+ - Produces chunks of varying size — and that is fine. A short chapter is better as one chunk than split into artificial halves.
22
+
23
+ ### Pattern Discovery Shortcut
24
+
25
+ Before building a full parser, explore several sample documents for structural patterns:
26
+ - Do all chapter titles start with "Chapter X" or "第X章"?
27
+ - Are sections numbered consistently (1.1, 1.2, 1.3)?
28
+ - Are there visual markers (bold text, specific fonts, horizontal rules)?
29
+
30
+ If you find consistent patterns, a regex-based splitter is faster and more reliable than LLM-based structure detection. For example:
31
+ - `^第[一二三四五六七八九十百]+章` for Chinese chapter headers
32
+ - `^Chapter \d+` for English chapter headers
33
+ - `^\d+\.\d+` for numbered sections
34
+
35
+ Always validate the regex against multiple documents before committing to it.
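
A regex-based splitter along these lines might look like the sketch below. The three patterns are the ones listed above; the function names `split_on_headers` and `count_headers` are hypothetical, and the per-document match count is just one simple way to validate a pattern across several samples.

```python
import re

HEADER_PATTERNS = {
    "zh_chapter": re.compile(r"^第[一二三四五六七八九十百]+章", re.MULTILINE),
    "en_chapter": re.compile(r"^Chapter \d+", re.MULTILINE),
    "numbered_section": re.compile(r"^\d+\.\d+", re.MULTILINE),
}


def split_on_headers(text: str, pattern: re.Pattern) -> list[str]:
    """Split text so that each piece starts at a header match."""
    starts = [m.start() for m in pattern.finditer(text)]
    if not starts:
        return [text]  # pattern does not apply to this document
    if starts[0] != 0:
        starts.insert(0, 0)  # keep any preamble before the first header
    bounds = starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]


def count_headers(pattern: re.Pattern, documents: dict[str, str]) -> dict[str, int]:
    """Count matches per document, so a pattern that misses some files is visible."""
    return {name: len(pattern.findall(text)) for name, text in documents.items()}
```

A document whose count is zero, or far lower than its neighbours, is a sign the pattern does not generalize to that sample.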
36
+
37
+ ## Wedge Driving (Fallback Strategy)
38
+
39
+ For content without clear headers — dense legal text, continuous prose, or leaf nodes from the onion peeler that are still too large.
40
+
41
+ ### How It Works
42
+
43
+ The algorithm uses a **rolling context window** to process documents of arbitrary length without loading the full text at once.
44
+
45
+ **Step 1: Window the content.** Load up to MAX_TOKENS (e.g., 100K tokens — configurable) of the remaining unprocessed text into a window. If the remaining text fits in a single chunk, stop — no further splitting needed.
46
+
47
+ **Step 2: Ask an LLM for cut points.** Prompt the LLM to identify 1-3 natural break points within the window where topic or subject changes. For each cut point, the LLM returns:
48
+ - `tokens_before`: ~K tokens (default K=50) immediately BEFORE the cut, copied verbatim from the text.
49
+ - `tokens_after`: ~K tokens immediately AFTER the cut, copied verbatim.
50
+ - `chunk_title`: a 5-10 word title describing the chunk that precedes the cut.
51
+
52
+ Using token count (not word count) gives consistent granularity across languages — critical for Chinese text which has no whitespace-delimited words.
53
+
54
+ **Step 3: Locate the cuts via fuzzy matching.** The LLM's quoted tokens will not be a perfect match to the source text (minor paraphrasing, whitespace differences, encoding artifacts). Use Levenshtein distance (edit distance) to find the best match:
55
+ 1. Search the source text for the position that best matches `tokens_before`. Require at least 70% similarity (similarity = 1 - edit_distance / max_length).
56
+ 2. The cut position is immediately after the matched `tokens_before` region.
57
+ 3. Verify by checking that `tokens_after` appears near the cut position. If `tokens_after` cannot be matched, fall back to the position derived from `tokens_before` alone.
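
The matching in Step 3 could be sketched as follows. This is a brute-force illustration under stated assumptions: it compares characters rather than tokens, it slides a window of the quote's length over the whole source, and the function names (`locate_cut`, `confirms`) are hypothetical. A real implementation would likely restrict the search range or use an optimized edit-distance library.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """similarity = 1 - edit_distance / max_length, as defined in the text."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def locate_cut(source: str, quote_before: str, threshold: float = 0.7) -> Optional[int]:
    """Return the index right after the best match of quote_before, or None."""
    n = len(quote_before)
    if n == 0 or n > len(source):
        return None
    best_score, best_end = 0.0, None
    for start in range(len(source) - n + 1):
        score = similarity(source[start:start + n], quote_before)
        if score > best_score:
            best_score, best_end = score, start + n
    return best_end if best_score >= threshold else None


def confirms(source: str, cut: int, quote_after: str, threshold: float = 0.7) -> bool:
    """Check whether quote_after roughly matches the text starting at the cut."""
    return similarity(source[cut:cut + len(quote_after)], quote_after) >= threshold
```

When `confirms` returns False, the text above says to keep the position derived from `tokens_before` alone, so the check is advisory rather than blocking.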
58
+
59
+ **Step 4: Slide and repeat.** Emit one chunk for each span of text that ends at a confirmed cut. Then move the window forward so that the new window starts at the last confirmed cut. Repeat until the remaining text fits in a single chunk, which becomes the final chunk.
60
+
61
+ ### Why This Works
62
+
63
+ - The LLM identifies semantic boundaries, not arbitrary character counts.
64
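+ - The LLM never regenerates the chunk text. It only quotes short anchors around each cut, and every chunk is sliced from the source, so a misquoted anchor can misplace a cut but cannot alter chunk content.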
65
+ - K-token quoting with Levenshtein matching is language-agnostic. It works for Chinese, English, and mixed-language documents equally well.
66
+ - The rolling window means documents of any length can be processed incrementally — the algorithm is not bounded by context window size.
67
+ - Fuzzy matching handles the inevitable small differences between the LLM's quoted text and the actual source.
68
+
69
+ ### When to Use
70
+
71
+ - Only when the onion peeler cannot split further (no sub-headers available).
72
+ - For documents with no structural markup at all.
73
+ - Cost consideration: this requires LLM calls. Use the cheapest model that can identify topic boundaries (often TIER3 or TIER4 is sufficient).
74
+
75
+ ## Practical Guidelines
76
+
77
+ - **Chunk size depends on the downstream task.** For rule extraction by the coding agent, chunks can be large (100K+ tokens). For worker LLM processing, chunks must fit in 16K-32K context.
78
+ - **Preserve context.** When splitting, include the parent header chain as context. A chunk from "Part II > Chapter 3 > Section 3.2" should include those headers so downstream processing knows where the content belongs.
79
+ - **Cache the tree.** Once a document's structure is parsed, save the tree. Multiple rules may need content from the same document, and re-parsing is wasteful.
80
+ - **Log your chunking decisions.** Which strategy was used, how many chunks were produced, their sizes. This helps debug downstream issues.
@@ -0,0 +1,203 @@
1
+ ---
2
+ name: rule-graph
3
+ description: Build and maintain a graph of relationships between verification rules — shared entities, logical dependencies, and conflicts. Use when analyzing the impact of a regulation change, when optimizing extraction to avoid duplicate work, when checking rule catalog completeness, or when rolling up document-level results into a summary. Critical constraint — the graph is an overlay for analysis, NOT a prerequisite for execution. Every rule must remain independently runnable.
4
+ ---
5
+
6
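+ # Rule Relationship Graph
7
+
8
+ Rules do not exist in isolation. They share extracted entities, depend on one another logically, and sometimes contradict each other. The rule relationship graph makes these relationships explicit, so you can analyze the rule catalog as a system rather than as a flat list.
9
+
10
+ But the graph can also become a trap. If a rule cannot run without the graph, you have built a monolith. The graph is a lens for analysis; it must never become a gate for execution.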
11
+
12
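+ ## Independence First: The Iron Rule
13
+
14
+ **Every rule must be independently executable.** This is not a suggestion. It is the single most important constraint in rule graph design.
15
+
16
+ Most enterprise systems are rule-based. A bank's existing compliance platform will call rule R001 on its own to check the capital adequacy ratio, with no other rule involved. If R001 requires R003 to run first, the integration breaks. If R001 requires loading the whole graph, the integration breaks too.
17
+
18
+ This constraint comes from real enterprise integration scenarios:
19
+ - A credit approval system may call only the 3-5 rules relevant to the current business type, not all 50
20
+ - A regulatory reporting system may care only about capital adequacy rules and have no need for due diligence rules
21
+ - Different departments may use different rule subsets: risk control uses risk rules, compliance uses disclosure rules
22
+ - A rule may be embedded in an existing BPM (business process management) system and invoked as a single node
23
+
24
+ In all of these scenarios the rule must be self-contained. Its workflow receives input (a document or extraction results) and produces a verdict, without depending on the run state of any other rule.
25
+
26
+ **The graph is an overlay.** It serves the coding agent's analysis and optimization and creates no hard dependencies between rules. Any rule, taken out of the graph and placed in a standalone system, must produce correct results from its own input alone.
27
+
28
+ When you find a genuine dependency (rule B is only meaningful after rule A passes), encode it as metadata in the graph. Rule B's workflow must handle the case where rule A's result is unavailable: compute it inline, accept A's result as an optional input, or mark its own result as incomplete.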
29
+
30
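+ ## Four Relationship Types the Graph Captures
31
+
32
+ ### Shared Entity
33
+
34
+ Multiple rules extract the same entity from the same document region.
35
+
36
+ **Example:** R001 (capital adequacy ratio check) and R004 (core tier 1 capital adequacy ratio check) both need capital data extracted from the balance sheet. The graph records this shared relationship, so the optimizer can extract once and reuse the result in several places.
37
+
38
+ Common shared entities in banking regulation rules:
39
+ - Capital data (needed by capital adequacy ratio, core tier 1 capital adequacy ratio, and leverage ratio)
40
+ - Non-performing loan data (needed by NPL ratio, provision coverage ratio, and loan provision ratio)
41
+ - Balance sheet data (shared by several liquidity indicators)
42
+ - Basic loan information (amount, term, and interest rate are referenced by many review rules)
43
+
44
+ But shared extraction is an optimization, not a requirement. Each rule's workflow must include its own extraction logic as the default path. Shared extraction is a fast path available when rules run in batch.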
45
+
46
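+ ### Logical Dependency
47
+
48
+ Rule B applies only when rule A's result satisfies some condition.
49
+
50
+ **Examples:**
51
+ - "If the borrower is rated a high-risk customer (R012), enhanced due diligence documents are required (R013)"
52
+ - "If the loan amount exceeds 10 million (R020), branch approval is required (R021)"
53
+ - "If the collateral is residential property (R030), check against the residential LTV cap (R031); if it is commercial real estate (R032), check against the commercial LTV cap (R033)"
54
+
55
+ The graph captures these dependencies so that:
56
+ - When results are displayed, R013 is presented in the context of R012
57
+ - When R012's logic changes, you know R013 may be affected
58
+ - During completeness checks, you can confirm that every path of a conditional branch has a corresponding rule
59
+
60
+ But R013's workflow must still run when R012's result is unavailable: either determine the high-risk classification itself or mark the result as "applicability cannot be determined".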
61
+
62
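+ ### Conflict
63
+
64
+ Two rules may produce contradictory guidance.
65
+
66
+ **Examples:**
67
+ - Regulatory requirement A requires disclosure of all related-party transaction details; the privacy clause of regulatory requirement B restricts disclosure of certain transaction details
68
+ - Head office risk policy caps LTV at 70%; a local branch's flexible policy relaxes LTV to 80% for specific industrial park projects
69
+ - An older regulatory document requires a provision coverage ratio ≥ 150%; a newer transition-period policy temporarily relaxes it to ≥ 120%
70
+
71
+ The graph flags conflicts so that the coding agent escalates them to the developer user for a decision instead of silently picking one rule. A conflict is not a bug. It is normal in a real regulatory environment and needs a human to rule on priority.
72
+
73
+ **Conflict resolution principles:**
74
+ - Recency first: newer rules override older ones, but watch for transition-period arrangements
75
+ - Hierarchy first: higher-level law overrides lower-level law (银保监会 (CBIRC) > local banking regulatory bureau > internal bank policy)
76
+ - Specific over general: special provisions override general provisions
77
+ - When undetermined: escalate to the developer user, attaching the original text of both rules and the point of contradiction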
78
+
79
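+ ### Shared Corner Case
80
+
81
+ An edge case that affects multiple rules.
82
+
83
+ **Examples:**
84
+ - Merged cells in a document cause table extraction to fail for multiple rules
85
+ - A non-standard date format (such as 「二〇二四年元月」, January 2024 written in Chinese numerals) affects every rule that involves dates
86
+ - A special business type (such as syndicated loans) has a document structure different from ordinary loans, affecting multiple review rules
87
+
88
+ The graph links these rules to the shared corner case, so one fix is visible everywhere it matters.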
89
+
90
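+ ## Four Uses
91
+
92
+ ### 1. Impact Analysis
93
+
94
+ When a regulation changes, which verification rules are affected? Traverse the graph starting from the changed rule:
95
+
96
+ - **First-degree links**: rules that share an entity with, or have a dependency relationship with, the changed rule
97
+ - **Second-degree links**: rules that depend on the affected rules
98
+ - **Conflict links**: rules whose conflict resolution may need to be revisited
99
+
100
+ **Real scenario:** The CBIRC (银保监会) revises how the capital adequacy ratio is calculated. Starting from R001 (capital adequacy ratio):
101
+ ```
102
+ R001 (capital adequacy ratio)
103
+ ├── shares_entity → R004 (core tier 1 capital adequacy ratio): shares capital data, may be affected
104
+ ├── shares_entity → R007 (leverage ratio): shares capital data, may be affected
105
+ ├── depends_on ← R015 (composite rating): uses R001's result, needs retesting
106
+ └── conflicts_with → R050 (transition-period policy): conflict resolution rule may need updating
107
+ ```
108
+
109
+ Without the graph, impact analysis means reading every rule one by one to find links. With the graph, it is a single traversal.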
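
The traversal can be sketched as a breadth-first search over an edge list. This is an illustrative sketch, not the package's code: the `impacted` function is a hypothetical name, the edges are copied from the R001 scenario above, and it assumes that `from` on a `depends_on` edge is the rule whose result the `to` rule uses (as in R012 → R013), so impact flows only from source to target on those edges while the other edge types are treated as symmetric.

```python
from collections import deque

EDGES = [
    {"from": "R001", "to": "R004", "type": "shares_entity"},
    {"from": "R001", "to": "R007", "type": "shares_entity"},
    {"from": "R001", "to": "R015", "type": "depends_on"},  # R015 uses R001's result
    {"from": "R001", "to": "R050", "type": "conflicts_with"},
    {"from": "R012", "to": "R013", "type": "depends_on"},  # R013 applies if R012 says high risk
]


def impacted(edges: list[dict], changed: str, max_degree: int = 2) -> dict[str, int]:
    """Map each rule reachable from `changed` to its degree of separation."""
    neighbors: dict[str, list[str]] = {}
    for edge in edges:
        neighbors.setdefault(edge["from"], []).append(edge["to"])
        if edge["type"] != "depends_on":
            # shares_entity and conflicts_with are treated as symmetric.
            neighbors.setdefault(edge["to"], []).append(edge["from"])
    degree = {changed: 0}
    queue = deque([changed])
    while queue:
        rule = queue.popleft()
        if degree[rule] == max_degree:
            continue  # do not expand beyond the requested degree
        for other in neighbors.get(rule, []):
            if other not in degree:
                degree[other] = degree[rule] + 1
                queue.append(other)
    del degree[changed]
    return degree
```

Calling `impacted(EDGES, "R001")` yields the four first-degree rules from the scenario; the result is only a list of candidates to retest, and it creates no execution dependency between the rules.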
110
+
111
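+ ### 2. Optimization
112
+
113
+ When processing documents in batch, the graph helps identify optimization opportunities:
114
+
115
+ - **Shared extraction**: entity X is extracted only once and used by R001, R004, and R007
116
+ - **Execution ordering**: if R012 runs before R013, R013 can reuse R012's result as a shortcut (but must never depend on it)
117
+ - **Parallel grouping**: rules with no edge between them can run in parallel
118
+
119
+ **Example:** Among 50 rules, shared entity analysis finds:
120
+ - Capital-related entities are shared by 8 rules → extract once, saving 7 repeated extractions
121
+ - Basic loan information is shared by 15 rules → extract once, saving 14
122
+ - The 20 rules with no links can run fully in parallel
123
+
124
+ Optimization is opportunistic. It makes batch processing faster, but it must never make a single rule depend on the batch context. A rule pulled out and run alone must work just the same.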
125
+
126
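+ ### 3. Completeness Checking
127
+
128
+ Map regulation passages to rules. The graph helps identify:
129
+
130
+ - **Uncovered passages**: passages with substantive requirements that have no corresponding rule
131
+ - **Over-covered passages**: passages covered by several overlapping rules (potential redundancy)
132
+ - **Coverage target**: at least 95% of substantive regulation passages map to at least one rule
133
+
134
+ Completeness checking is a health indicator for the rule catalog. Low coverage means extraction was insufficient; high overlap means rule granularity may need adjusting.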
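
The coverage figures can be computed from a passage-to-rules mapping. The sketch below is a minimal illustration; the `coverage_report` function, the shape of the mapping, and the passage identifiers used with it are hypothetical, and only the 95% target comes from the text above.

```python
def coverage_report(passage_to_rules: dict[str, list[str]], target: float = 0.95) -> dict:
    """Summarize how well substantive regulation passages map to rules."""
    total = len(passage_to_rules)
    # Passages with no rule at all.
    uncovered = sorted(p for p, rules in passage_to_rules.items() if not rules)
    # Passages claimed by more than one rule (potential redundancy).
    over_covered = sorted(p for p, rules in passage_to_rules.items() if len(rules) > 1)
    coverage = (total - len(uncovered)) / total if total else 0.0
    return {
        "coverage": coverage,
        "meets_target": coverage >= target,
        "uncovered": uncovered,
        "over_covered": over_covered,
    }
```

The mapping should contain only passages with substantive requirements, since the target is defined over those rather than over every paragraph in the regulation.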
135
+
136
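+ ### 4. Document-Level Rollup
137
+
138
+ When presenting multi-rule verification results for one document, use the graph for:
139
+
140
+ - **Grouped display**: capital adequacy rules together, information disclosure rules together
141
+ - **Dependency ordering**: R012 (risk classification) is shown before R013 (enhanced due diligence)
142
+ - **Conflict highlighting**: when two rules produce contradictory results, surface the conflict instead of burying it in a flat list
143
+ - **Severity aggregation**: when 5 related rules all fail, the cluster may be more severe than 5 unrelated failures
144
+
145
+ ## Graph Representation
146
+
147
+ The graph is stored as a JSON adjacency list in the rule catalog.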
148
+
149
+ ```json
150
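+ {
151
+ "nodes": {
152
+ "R001": {"name": "Capital adequacy ratio", "category": "Capital regulation"},
153
+ "R004": {"name": "Core tier 1 capital adequacy ratio", "category": "Capital regulation"},
154
+ "R007": {"name": "Leverage ratio", "category": "Capital regulation"},
155
+ "R012": {"name": "Borrower risk classification", "category": "Risk assessment"},
156
+ "R013": {"name": "Enhanced due diligence requirement", "category": "Due diligence"},
157
+ "R015": {"name": "Composite rating", "category": "Composite assessment"},
158
+ "R020": {"name": "Large loan determination", "category": "Credit management"},
159
+ "R021": {"name": "Branch approval authority", "category": "Credit management"},
160
+ "R031": {"name": "Residential LTV cap", "category": "Collateral management"},
161
+ "R033": {"name": "Commercial real estate LTV cap", "category": "Collateral management"},
162
+ "R050": {"name": "Transition-period policy applicability", "category": "Policy applicability"}
163
+ },
164
+ "edges": [
165
+ {"from": "R001", "to": "R004", "type": "shares_entity", "entity": "Capital data (core tier 1 capital, supplementary capital, risk-weighted assets)"},
166
+ {"from": "R001", "to": "R007", "type": "shares_entity", "entity": "Capital data (tier 1 capital, adjusted on- and off-balance-sheet asset balance)"},
167
+ {"from": "R012", "to": "R013", "type": "depends_on", "condition": "R012 classifies the borrower as a high-risk customer"},
168
+ {"from": "R020", "to": "R021", "type": "depends_on", "condition": "R020 determines that the loan amount exceeds the large-loan threshold"},
169
+ {"from": "R001", "to": "R050", "type": "conflicts_with", "detail": "Formal standard vs relaxed transition-period standard; the applicable period must be confirmed"},
170
+ {"from": "R001", "to": "R015", "type": "depends_on", "condition": "Composite rating needs the capital adequacy ratio result"},
171
+ {"from": "R031", "to": "R033", "type": "conflicts_with", "detail": "The same collateral may be ambiguous between residential and commercial"}
172
+ ]
173
+ }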
174
+ ```
175
+
176
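+ **Edge types:**
177
+
178
+ | Edge type | Meaning | Use |
179
+ |--------|------|------|
180
+ | `shares_entity` | Both rules extract the same entity | Optimize extraction, reduce duplicate work |
181
+ | `depends_on` | The target rule's applicability depends on the source rule's result | Metadata only, not a hard execution dependency |
182
+ | `conflicts_with` | The rules may produce contradictory guidance | Needs an escalation mechanism |
183
+ | `shares_corner_case` | Both rules are affected by the same edge case | One fix, visible everywhere it matters |
184
+
185
+ The graph is updated whenever the rule catalog changes: rules added, modified, or retired. It is visualized as a node-link diagram through `dashboard-reporting`.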
186
+
187
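+ ## Integration
188
+
189
+ **Input:** the rule catalog produced by `rule-extraction`. The graph is built after rule extraction completes and is updated as rules evolve.
190
+
191
+ **Enables:**
192
+ - `skill-authoring`: when writing a new rule skill, consult the graph for related rules and note shared entities and dependencies in SKILL.md.
193
+ - `quality-control`: when a rule's accuracy drops, consult the graph to locate downstream rules that may be affected along with it.
194
+
195
+ **Outputs to:**
196
+ - `dashboard-reporting`: graph visualization, impact analysis results, coverage metrics.
197
+
198
+ ## Notes
199
+
200
+ - The cost of maintaining the graph should be lower than the benefit it brings. With fewer than 10 rules, tracking links by hand may be more efficient than maintaining a graph.
201
+ - The graph does not replace human judgment. It helps you find links, but the business meaning of a link needs confirmation from the developer user.
202
+ - Check the graph before adding a rule: see which existing rules it relates to and avoid duplicated effort.
203
+ - Check the graph before retiring a rule too: make sure no downstream rule depends on its result (even as a soft dependency).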
@@ -0,0 +1,235 @@
1
+ ---
2
+ name: skill-authoring
3
+ description: Write each verification rule into a Claude Code skill folder following the official skill format. Use when converting extracted rules into skill folders, when iterating on existing rule skills after testing, or when the developer user wants to capture domain knowledge as a skill. Each skill folder must be self-contained with business logic in SKILL.md, code in scripts/, regulation context in references/, and sample data in assets/. Also use the bundled skill-creator for the full eval/iterate workflow.
4
+ ---
5
+
6
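+ # Writing Skill Folders for Verification Rules
7
+
8
+ ## Core Principle
9
+
10
+ Each rule becomes one skill folder. The folder must be self-contained: put everything needed to perform this check inside it. Picture another coding agent that sees only this folder and nothing else. Could it perform the check correctly? If not, the folder is incomplete.
11
+
12
+ ## Skill Folder Structure
13
+
14
+ Each rule's skill folder lives under the `rule-skills/` directory and is named `R{number}-{short-english-name}/`:
15
+
16
+ ```
17
+ rule-skills/R001-invoice-date-validity/
18
+ ├── SKILL.md # Full description of the verification logic (main skill file)
19
+ ├── scripts/ # Code scripts for deterministic operations
20
+ │ ├── extract_date.py # Date field extraction
21
+ │ └── validate.py # Format validation logic
22
+ ├── references/ # Regulation text and interpretation
23
+ │ └── regulation.md # Verbatim excerpts of the relevant regulation clauses
24
+ ├── assets/ # Sample data and corner cases
25
+ │ ├── samples.json # Test samples (with expected results)
26
+ │ └── corner_cases.json # Corner case set
27
+ └── CHANGELOG.md # Change log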
28
+ ```
29
+
30
+ ## Writing SKILL.md
31
+
32
+ SKILL.md is the heart of the skill folder. The coding agent reads this file first when performing a check, so it must give sufficiently clear guidance.
33
+
34
+ ### Frontmatter
35
+
36
+ ```yaml
37
+ ---
38
+ name: R001-invoice-date-validity
39
+ description: Verify that invoice date falls within the contract validity period and complies with statutory time limits. Use when processing invoices against their corresponding contracts. Checks date format, date range, and cross-references with contract effective/expiry dates.
40
+ ---
41
+ ```
42
+
43
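+ **name must match the folder name.**
44
+
45
+ **Write the description to be "pushy"**: tell the system explicitly when to invoke this skill. Do not be vague; list the concrete trigger scenarios. Keep the description in English for compatibility with system dispatch.
46
+
47
+ ### Body Structure
48
+
49
+ Write the body in the language of the documents being checked. If the check targets Chinese documents, write the body in Chinese; if it targets English documents, write it in English. The examples below are drawn from a check on Chinese documents.
50
+
51
+ #### 1. Verification Goal
52
+
53
+ State in one or two sentences what this rule verifies.
54
+
55
+ ```
56
+ Verify that the invoice issue date falls within the validity period of the corresponding contract,
57
+ and that it meets the statutory invoicing deadline.
58
+ ```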
59
+
60
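+ #### 2. Locating the Fields to Check
61
+
62
+ Tell the coding agent exactly where in the document to find the fields to check:
63
+
64
+ - The "issue date" (开票日期) field on the invoice, usually in the upper-right area of the invoice
65
+ - The "contract effective date" (合同生效日期) and "contract expiry date" (合同到期日期) in the contract, usually on the first or last page
66
+ - For an electronic invoice, the JSON path of the date field is `invoice.issue_date`
67
+
68
+ #### 3. Extraction Conventions
69
+
70
+ Specify the normalized format of fields after extraction:
71
+
72
+ - Convert all dates to `YYYY-MM-DD` format
73
+ - If the original date is in Chinese format (such as 「2025年3月15日」), parse it into the standard format first
74
+ - Mark missing fields as `null`; do not guess or fill in default values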
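
A date normalizer following these three conventions might look like the sketch below. It uses only the standard library; the function name `normalize_date` is hypothetical, it handles just the two formats mentioned here, and returning `None` for unparseable or impossible dates is an assumption made to stay consistent with the "do not guess" rule.

```python
import re
from datetime import date
from typing import Optional

# Accepted input shapes: 2025年3月15日 and 2025-3-15 / 2025-03-15.
_ZH_DATE = re.compile(r"^\s*(\d{4})年(\d{1,2})月(\d{1,2})日\s*$")
_ISO_DATE = re.compile(r"^\s*(\d{4})-(\d{1,2})-(\d{1,2})\s*$")


def normalize_date(raw: Optional[str]) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if missing or not parseable."""
    if raw is None:
        return None  # missing stays missing; no default is filled in
    for pattern in (_ZH_DATE, _ISO_DATE):
        match = pattern.match(raw)
        if match:
            year, month, day = map(int, match.groups())
            try:
                return date(year, month, day).isoformat()
            except ValueError:
                return None  # e.g. February 30: reject rather than guess
    return None
```

`None` maps directly to JSON `null` when the result is serialized with the `json` module.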
75
+
76
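+ #### 4. Decision Logic
77
+
78
+ Describe the verification logic with clear conditional expressions:
79
+
80
+ ```
81
+ IF invoice_date IS NULL:
82
+     verdict = "unable to verify", reason = "invoice date missing"
83
+ ELIF contract_start_date IS NULL OR contract_end_date IS NULL:
84
+     verdict = "unable to verify", reason = "contract period information incomplete"
85
+ ELIF invoice_date < contract_start_date:
86
+     verdict = "fail", reason = "invoice issue date is earlier than the contract effective date"
87
+ ELIF invoice_date > contract_end_date:
88
+     verdict = "fail", reason = "invoice issue date is later than the contract expiry date"
89
+ ELSE:
90
+     verdict = "pass"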
91
+ ```
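
Because this decision is fully deterministic, it is a natural candidate for `scripts/validate.py`. The sketch below is one possible shape, not the package's script: the function name `check_invoice_date` and the returned dictionary layout are hypothetical, inputs are assumed to be already normalized to `YYYY-MM-DD` or `None`, and the verdict strings follow the `pass | fail | unable_to_verify` values used in this document's output format.

```python
from datetime import date
from typing import Optional


def check_invoice_date(
    invoice_date: Optional[str],
    contract_start: Optional[str],
    contract_end: Optional[str],
) -> dict:
    """Apply the decision logic to normalized YYYY-MM-DD strings (or None)."""
    if invoice_date is None:
        return {"verdict": "unable_to_verify", "reason": "invoice date missing"}
    if contract_start is None or contract_end is None:
        return {"verdict": "unable_to_verify", "reason": "contract period incomplete"}
    issued = date.fromisoformat(invoice_date)
    start = date.fromisoformat(contract_start)
    end = date.fromisoformat(contract_end)
    if issued < start:
        return {"verdict": "fail", "reason": "invoice date before contract effective date"}
    if issued > end:
        return {"verdict": "fail", "reason": "invoice date after contract expiry date"}
    # Dates equal to either boundary fall through to here and pass.
    return {"verdict": "pass", "reason": ""}
```

Comparing `date` objects rather than raw strings keeps the boundary behaviour explicit: an invoice dated exactly on the effective or expiry date passes.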
92
+
93
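+ #### 5. Corner Cases and Exceptions
94
+
95
+ List the known special situations:
96
+
97
+ - The contract has been extended or renewed: use the latest expiry date
98
+ - The issue date is exactly the contract effective date or expiry date: treat as pass
99
+ - Order invoices under a framework contract: use the execution period of the order, not the overall framework contract term
100
+ - Late-issued invoices: some regulations allow invoices to be issued within a certain period after contract expiry
101
+
102
+ #### 6. Output Format
103
+
104
+ Specify the standard output structure for verification results: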
105
+
106
+ ```json
107
+ {
108
+ "rule_id": "R001",
109
+ "rule_name": "Invoice date validity",
110
+ "verdict": "pass | fail | unable_to_verify",
111
+ "confidence": 0.95,
112
+ "details": {
113
+ "invoice_date": "2025-03-15",
114
+ "contract_start": "2025-01-01",
115
+ "contract_end": "2025-12-31"
116
+ },
117
+ "comment": "Invoice date falls within the contract validity period",
118
+ "source_ref": "《增值税发票管理办法》 Article 15"
119
+ }
120
+ ```
121
+
122
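+ #### 7. Writing Comments
123
+
124
+ When a check finds a problem, the comment should be professional, concise, and actionable:
125
+
126
+ - Good: "Invoice date 2025-03-15 is later than contract expiry date 2025-02-28, a difference of 15 days"
127
+ - Bad: "The date has a problem" (too little information, not actionable)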
128
+
129
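+ ## Writing scripts/
130
+
131
+ The scripts directory holds Python scripts for deterministic operations. The rule: **anything that does not need LLM judgment is implemented in code.**
132
+
133
+ Operations suited to code:
134
+ - Date format parsing and normalization
135
+ - Checking an amount in digits against the same amount written in Chinese capital numerals
136
+ - Tax rate calculation and verification
137
+ - Regular expression matching (invoice number format, unified social credit code format)
138
+ - Field presence checks
139
+
140
+ Operations not suited to code:
141
+ - Extracting semantic information from unstructured text
142
+ - Judging whether descriptive content is compliant
143
+ - Understanding the business meaning of natural-language statements
144
+
145
+ Script requirements:
146
+ - Pure Python with no heavy third-party libraries (standard library modules such as `re`, `json`, and `datetime` are allowed)
147
+ - Functions have clear input and output type annotations
148
+ - Include basic error handling
149
+ - Can be run and tested independently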
150
+
151
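+ ## Writing references/
152
+
153
+ Copy the regulation text directly relevant to this rule, verbatim, into `references/regulation.md`.
154
+
155
+ Requirements:
156
+ - Quote verbatim; do not paraphrase or summarize
157
+ - Cite the source: regulation name, issuing body, document number, clause number
158
+ - If the developer user has given an interpretation of an ambiguous clause, record it under a "business interpretation" (业务解读) label
159
+ - If the rule involves several regulations, quote them in separate sections and note the scope of each
160
+
161
+ The purpose of this file is to make verification conclusions traceable. Every conclusion should have a regulatory basis that can be found in references.
162
+
163
+ ## Writing assets/
164
+
165
+ ### samples.json
166
+
167
+ Provide at least 3 to 5 test samples covering the following scenarios: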
168
+
169
+ ```json
170
+ [
171
+ {
172
+ "id": "S001",
173
+ "description": "Standard passing case",
174
+ "input": { "invoice_date": "2025-06-15", "contract_start": "2025-01-01", "contract_end": "2025-12-31" },
175
+ "expected_verdict": "pass"
176
+ },
177
+ {
178
+ "id": "S002",
179
+ "description": "Invoice date outside the contract period",
180
+ "input": { "invoice_date": "2026-02-01", "contract_start": "2025-01-01", "contract_end": "2025-12-31" },
181
+ "expected_verdict": "fail"
182
+ },
183
+ {
184
+ "id": "S003",
185
+ "description": "Contract dates missing",
186
+ "input": { "invoice_date": "2025-06-15", "contract_start": null, "contract_end": null },
187
+ "expected_verdict": "unable_to_verify"
188
+ }
189
+ ]
190
+ ```
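
Samples in this shape can be replayed against a rule's check function to detect regressions. The runner below is a hedged sketch: `run_samples` is a hypothetical helper, and it assumes the check is a callable that takes a sample's `input` object and returns a verdict string.

```python
import json
from typing import Callable


def run_samples(samples_json: str, check: Callable[[dict], str]) -> list[dict]:
    """Return one record per sample whose verdict differs from expected_verdict."""
    mismatches = []
    for sample in json.loads(samples_json):
        actual = check(sample["input"])
        if actual != sample["expected_verdict"]:
            mismatches.append({
                "id": sample["id"],
                "expected": sample["expected_verdict"],
                "actual": actual,
            })
    return mismatches
```

An empty list means every sample produced its expected verdict; any entry identifies the sample to investigate and, if it reveals a new edge case, to record in `corner_cases.json`.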
191
+
192
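+ ### corner_cases.json
193
+
194
+ Record corner cases discovered during testing. The format matches samples.json, with an extra `discovered_in` field noting where the case was found (test round, production check, and so on).
195
+
196
+ ## Skill Iteration
197
+
198
+ A skill is not finished the first time it is written. It keeps evolving through testing and real checks.
199
+
200
+ ### Iteration Triggers
201
+
202
+ - A misjudgment found in testing (false negative or false positive)
203
+ - The developer user reports that the rule description is inaccurate
204
+ - A new corner case is discovered
205
+ - A regulation change affects this rule
206
+
207
+ ### Iteration Procedure
208
+
209
+ 1. Record the change, its reason, and the date in CHANGELOG.md
210
+ 2. Update the relevant sections of SKILL.md
211
+ 3. If the change involves decision logic, update scripts/ as well
212
+ 4. Add newly discovered corner cases to corner_cases.json
213
+ 5. Re-run all test samples and confirm there are no regressions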
214
+
215
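+ ### CHANGELOG.md Format
216
+
217
+ ```markdown
218
+ ## v1.1 - 2025-04-01
219
+ - Added corner case: handling of order invoices under a framework contract
220
+ - Fixed decision logic: an issue date equal to the contract expiry date counts as pass
221
+ - Source: feedback from the second test round
222
+
223
+ ## v1.0 - 2025-03-20
224
+ - Initial version, extracted from 《增值税发票管理办法》 Article 15
225
+ ```
226
+
227
+ ## Language Selection
228
+
229
+ The SKILL.md body uses the language of the documents being checked:
230
+
231
+ - Checking Chinese documents → write the body in Chinese
232
+ - Checking English documents → write the body in English
233
+ - Checking bilingual Chinese-English documents → write the body in Chinese and keep key English terms in the original
234
+
235
+ The name and description in the frontmatter always use English, to ensure system-level compatibility.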
@@ -0,0 +1,78 @@
1
+ # Claude Code Skill Format Specification
2
+
3
+ Distilled from the official Anthropic skill-creator. This is the authoritative reference for writing correctly formatted skill folders.
4
+
5
+ ## Skill Folder Structure
6
+
7
+ ```
8
+ skill-name/
9
+ ├── SKILL.md (required) Metadata + instructions
10
+ ├── scripts/ (optional) Executable code
11
+ ├── references/ (optional) Detailed documentation, loaded on demand
12
+ └── assets/ (optional) Templates, data files, images
13
+ ```
14
+
15
+ The directory name must match the `name` field in SKILL.md frontmatter exactly.
16
+
17
+ ## SKILL.md Format
18
+
19
+ ### Frontmatter (YAML)
20
+
21
+ ```yaml
22
+ ---
23
+ name: skill-identifier
24
+ description: What this skill does and when to use it.
25
+ ---
26
+ ```
27
+
28
+ **Required fields:**
29
+
30
+ | Field | Constraints |
31
+ |-------|-------------|
32
+ | `name` | Max 64 chars. Lowercase letters, numbers, hyphens only. No leading/trailing/consecutive hyphens. Must match parent directory name. |
33
+ | `description` | Max 1024 chars. Non-empty. Describe what it does AND when to use it. |
34
+
35
+ **Optional fields:** `license`, `compatibility`, `metadata`
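
The constraints in the table above are mechanical enough to check in code. The sketch below is an illustration only: `validate_frontmatter` is a hypothetical helper (the bundled `quick_validate.py` may work differently), and it assumes the frontmatter has already been parsed into plain strings.

```python
import re

# Lowercase letters and digits in hyphen-separated groups: this rules out
# leading, trailing, and consecutive hyphens in one pattern.
_NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def validate_frontmatter(name: str, description: str, directory: str) -> list[str]:
    """Return a list of constraint violations; an empty list means valid."""
    errors = []
    if len(name) > 64:
        errors.append("name exceeds 64 characters")
    if not _NAME_RE.fullmatch(name):
        errors.append("name must use lowercase letters, numbers, and single hyphens")
    if name != directory:
        errors.append("name must match the parent directory name")
    if not description.strip():
        errors.append("description must be non-empty")
    if len(description) > 1024:
        errors.append("description exceeds 1024 characters")
    return errors
```

Returning all violations at once, rather than stopping at the first, makes it easier to fix a skill folder in a single pass.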
36
+
37
+ ### Description Best Practices
38
+
39
+ The description is the primary triggering mechanism — Claude uses it to decide when to invoke the skill. Make descriptions "pushy" to combat under-triggering:
40
+
41
+ - Include both capability AND trigger contexts.
42
+ - Use specific keywords the user might mention.
43
+ - List concrete use cases.
44
+ - State what NOT to use it for.
45
+ - Aim for 100-200 words, staying within the 1024-character limit.
46
+
47
+ ### Markdown Body
48
+
49
+ Following the frontmatter, write instructions in Markdown. Guidelines:
50
+
51
+ - **Under 500 lines.** If approaching this, move detail to references/.
52
+ - **Imperative form.** "Extract the value" not "The value should be extracted."
53
+ - **Explain the why** behind instructions, not just the what.
54
+ - **Include examples** when they clarify the expected behavior. One or two well-chosen examples beat ten mediocre ones.
55
+
56
+ ## Progressive Disclosure
57
+
58
+ Skills use three-level loading:
59
+
60
+ 1. **Metadata** (name + description): Always in context. ~100 tokens.
61
+ 2. **SKILL.md body**: Loaded when skill triggers. <500 lines ideal.
62
+ 3. **Bundled resources**: Loaded on demand. Unlimited size. Scripts can execute without loading.
63
+
64
+ Reference files clearly from SKILL.md with guidance on when to read them. For large reference files (>300 lines), include a table of contents.
65
+
66
+ ## File Referencing
67
+
68
+ Use relative paths from the skill root:
69
+ ```markdown
70
+ See [the reference guide](references/regulation.md) for the full regulation text.
71
+ Run the check script: `python scripts/check.py`
72
+ ```
73
+
74
+ ## Naming Conventions
75
+
76
+ - Directory names: lowercase with hyphens (`my-skill`, not `MySkill` or `my_skill`)
77
+ - Keep names short and descriptive
78
+ - For rule skills, prefix with the rule ID: `rule-001-capital-adequacy`