kc-beta 0.1.0
- package/bin/kc-beta.js +16 -0
- package/package.json +32 -0
- package/src/agent/confidence-scorer.js +120 -0
- package/src/agent/context.js +124 -0
- package/src/agent/corner-case-registry.js +119 -0
- package/src/agent/engine.js +224 -0
- package/src/agent/events.js +27 -0
- package/src/agent/history.js +101 -0
- package/src/agent/llm-client.js +131 -0
- package/src/agent/pipelines/base.js +14 -0
- package/src/agent/pipelines/distillation.js +113 -0
- package/src/agent/pipelines/extraction.js +92 -0
- package/src/agent/pipelines/index.js +23 -0
- package/src/agent/pipelines/initializer.js +163 -0
- package/src/agent/pipelines/production-qc.js +99 -0
- package/src/agent/pipelines/skill-authoring.js +83 -0
- package/src/agent/pipelines/skill-testing.js +111 -0
- package/src/agent/tools/agent-tool.js +100 -0
- package/src/agent/tools/base.js +35 -0
- package/src/agent/tools/dashboard-render.js +146 -0
- package/src/agent/tools/document-parse.js +184 -0
- package/src/agent/tools/document-search.js +111 -0
- package/src/agent/tools/evolution-cycle.js +150 -0
- package/src/agent/tools/qc-sample.js +94 -0
- package/src/agent/tools/registry.js +55 -0
- package/src/agent/tools/rule-catalog.js +113 -0
- package/src/agent/tools/sandbox-exec.js +106 -0
- package/src/agent/tools/tier-downgrade.js +114 -0
- package/src/agent/tools/worker-llm-call.js +109 -0
- package/src/agent/tools/workflow-run.js +138 -0
- package/src/agent/tools/workspace-file.js +122 -0
- package/src/agent/version-manager.js +130 -0
- package/src/agent/workspace.js +82 -0
- package/src/cli/components.js +164 -0
- package/src/cli/index.js +329 -0
- package/src/cli/init.js +80 -0
- package/src/cli/onboard.js +182 -0
- package/src/cli/terminal.js +143 -0
- package/src/config.js +93 -0
- package/template/.env.template +31 -0
- package/template/CLAUDE.md +137 -0
- package/template/Input/.gitkeep +0 -0
- package/template/Output/.gitkeep +0 -0
- package/template/Rules/.gitkeep +0 -0
- package/template/Samples/.gitkeep +0 -0
- package/template/skills/en/meta/compliance-judgment/SKILL.md +114 -0
- package/template/skills/en/meta/compliance-judgment/references/output-format.md +151 -0
- package/template/skills/en/meta/confidence-system/SKILL.md +117 -0
- package/template/skills/en/meta/corner-case-management/SKILL.md +111 -0
- package/template/skills/en/meta/cross-document-verification/SKILL.md +131 -0
- package/template/skills/en/meta/cross-document-verification/references/contradiction-taxonomy.md +73 -0
- package/template/skills/en/meta/data-sensibility/SKILL.md +115 -0
- package/template/skills/en/meta/document-parsing/SKILL.md +108 -0
- package/template/skills/en/meta/document-parsing/references/parser-catalog.md +40 -0
- package/template/skills/en/meta/entity-extraction/SKILL.md +129 -0
- package/template/skills/en/meta/tree-processing/SKILL.md +103 -0
- package/template/skills/en/meta-meta/bootstrap-workspace/SKILL.md +70 -0
- package/template/skills/en/meta-meta/dashboard-reporting/SKILL.md +106 -0
- package/template/skills/en/meta-meta/dashboard-reporting/scripts/generate_dashboard.py +178 -0
- package/template/skills/en/meta-meta/evolution-loop/SKILL.md +210 -0
- package/template/skills/en/meta-meta/evolution-loop/references/convergence-guide.md +62 -0
- package/template/skills/en/meta-meta/quality-control/SKILL.md +138 -0
- package/template/skills/en/meta-meta/quality-control/references/qa-layers.md +92 -0
- package/template/skills/en/meta-meta/quality-control/references/sampling-strategies.md +76 -0
- package/template/skills/en/meta-meta/rule-extraction/SKILL.md +100 -0
- package/template/skills/en/meta-meta/rule-extraction/references/chunking-strategies.md +80 -0
- package/template/skills/en/meta-meta/rule-graph/SKILL.md +118 -0
- package/template/skills/en/meta-meta/skill-authoring/SKILL.md +108 -0
- package/template/skills/en/meta-meta/skill-authoring/references/skill-format-spec.md +78 -0
- package/template/skills/en/meta-meta/skill-to-workflow/SKILL.md +150 -0
- package/template/skills/en/meta-meta/skill-to-workflow/references/worker-llm-catalog.md +50 -0
- package/template/skills/en/meta-meta/task-decomposition/SKILL.md +129 -0
- package/template/skills/en/meta-meta/task-decomposition/references/decision-matrix.md +81 -0
- package/template/skills/en/meta-meta/version-control/SKILL.md +152 -0
- package/template/skills/en/meta-meta/version-control/references/trace-id-spec.md +79 -0
- package/template/skills/en/skill-creator/LICENSE.txt +202 -0
- package/template/skills/en/skill-creator/SKILL.md +479 -0
- package/template/skills/en/skill-creator/agents/analyzer.md +274 -0
- package/template/skills/en/skill-creator/agents/comparator.md +202 -0
- package/template/skills/en/skill-creator/agents/grader.md +223 -0
- package/template/skills/en/skill-creator/assets/eval_review.html +146 -0
- package/template/skills/en/skill-creator/eval-viewer/generate_review.py +471 -0
- package/template/skills/en/skill-creator/eval-viewer/viewer.html +1325 -0
- package/template/skills/en/skill-creator/references/schemas.md +430 -0
- package/template/skills/en/skill-creator/scripts/__init__.py +0 -0
- package/template/skills/en/skill-creator/scripts/aggregate_benchmark.py +401 -0
- package/template/skills/en/skill-creator/scripts/generate_report.py +326 -0
- package/template/skills/en/skill-creator/scripts/improve_description.py +248 -0
- package/template/skills/en/skill-creator/scripts/package_skill.py +136 -0
- package/template/skills/en/skill-creator/scripts/quick_validate.py +103 -0
- package/template/skills/en/skill-creator/scripts/run_eval.py +310 -0
- package/template/skills/en/skill-creator/scripts/run_loop.py +332 -0
- package/template/skills/en/skill-creator/scripts/utils.py +47 -0
- package/template/skills/zh/meta/compliance-judgment/SKILL.md +303 -0
- package/template/skills/zh/meta/compliance-judgment/references/output-format.md +151 -0
- package/template/skills/zh/meta/confidence-system/SKILL.md +228 -0
- package/template/skills/zh/meta/corner-case-management/SKILL.md +235 -0
- package/template/skills/zh/meta/cross-document-verification/SKILL.md +241 -0
- package/template/skills/zh/meta/cross-document-verification/references/contradiction-taxonomy.md +73 -0
- package/template/skills/zh/meta/data-sensibility/SKILL.md +235 -0
- package/template/skills/zh/meta/document-parsing/SKILL.md +168 -0
- package/template/skills/zh/meta/document-parsing/references/parser-catalog.md +40 -0
- package/template/skills/zh/meta/entity-extraction/SKILL.md +276 -0
- package/template/skills/zh/meta/tree-processing/SKILL.md +233 -0
- package/template/skills/zh/meta-meta/bootstrap-workspace/SKILL.md +147 -0
- package/template/skills/zh/meta-meta/dashboard-reporting/SKILL.md +281 -0
- package/template/skills/zh/meta-meta/dashboard-reporting/scripts/generate_dashboard.py +178 -0
- package/template/skills/zh/meta-meta/evolution-loop/SKILL.md +302 -0
- package/template/skills/zh/meta-meta/evolution-loop/references/convergence-guide.md +62 -0
- package/template/skills/zh/meta-meta/quality-control/SKILL.md +269 -0
- package/template/skills/zh/meta-meta/quality-control/references/qa-layers.md +92 -0
- package/template/skills/zh/meta-meta/quality-control/references/sampling-strategies.md +76 -0
- package/template/skills/zh/meta-meta/rule-extraction/SKILL.md +208 -0
- package/template/skills/zh/meta-meta/rule-extraction/references/chunking-strategies.md +80 -0
- package/template/skills/zh/meta-meta/rule-graph/SKILL.md +203 -0
- package/template/skills/zh/meta-meta/skill-authoring/SKILL.md +235 -0
- package/template/skills/zh/meta-meta/skill-authoring/references/skill-format-spec.md +78 -0
- package/template/skills/zh/meta-meta/skill-to-workflow/SKILL.md +275 -0
- package/template/skills/zh/meta-meta/skill-to-workflow/references/worker-llm-catalog.md +50 -0
- package/template/skills/zh/meta-meta/task-decomposition/SKILL.md +224 -0
- package/template/skills/zh/meta-meta/task-decomposition/references/decision-matrix.md +81 -0
- package/template/skills/zh/meta-meta/version-control/SKILL.md +284 -0
- package/template/skills/zh/meta-meta/version-control/references/trace-id-spec.md +79 -0
- package/template/skills/zh/skill-creator/LICENSE.txt +202 -0
- package/template/skills/zh/skill-creator/SKILL.md +479 -0
- package/template/skills/zh/skill-creator/agents/analyzer.md +274 -0
- package/template/skills/zh/skill-creator/agents/comparator.md +202 -0
- package/template/skills/zh/skill-creator/agents/grader.md +223 -0
- package/template/skills/zh/skill-creator/assets/eval_review.html +146 -0
- package/template/skills/zh/skill-creator/eval-viewer/generate_review.py +471 -0
- package/template/skills/zh/skill-creator/eval-viewer/viewer.html +1325 -0
- package/template/skills/zh/skill-creator/references/schemas.md +430 -0
- package/template/skills/zh/skill-creator/scripts/__init__.py +0 -0
- package/template/skills/zh/skill-creator/scripts/aggregate_benchmark.py +401 -0
- package/template/skills/zh/skill-creator/scripts/generate_report.py +326 -0
- package/template/skills/zh/skill-creator/scripts/improve_description.py +248 -0
- package/template/skills/zh/skill-creator/scripts/package_skill.py +136 -0
- package/template/skills/zh/skill-creator/scripts/quick_validate.py +103 -0
- package/template/skills/zh/skill-creator/scripts/run_eval.py +310 -0
- package/template/skills/zh/skill-creator/scripts/run_loop.py +332 -0
- package/template/skills/zh/skill-creator/scripts/utils.py +47 -0

@@ -0,0 +1,151 @@
# Lightweight Output Format Specification

This document defines the compact text markup format for verification results, its grammar, JSON conversion rules, and edge case handling.

## Grammar

```
[RESULT] field_name <- value (constraint) | conf:score | src:location | note:text
```

| Component | Required | Format | Description | Example |
|-----------|----------|--------|-------------|---------|
| `[RESULT]` | Yes | One of: PASS, FAIL, MISSING, ERROR, UNCERTAIN | The judgment outcome. | `[FAIL]` |
| `field_name` | Yes | snake_case identifier | The rule or field being checked. | `capital_adequacy` |
| `<- value` | No (omit for MISSING) | Free text, no pipes | The extracted value from the document. | `<- 12.5%` |
| `(constraint)` | No (omit if no constraint) | Parenthesized expression | The expected value or condition. | `(>= 8.0%)` |
| `conf:score` | Yes | Decimal 0.00-1.00 | Confidence score of the judgment. | `conf:0.95` |
| `src:location` | No | Page-section reference or trace ID prefix | Source location in the document. | `src:p3-s2` |
| `note:text` | No | Free text to end of line | Human-readable comment. | `note:Signing overdue by 45 days` |

Components after `field_name` are separated by ` | ` (space-pipe-space). The `<- value` and `(constraint)` components appear before the first pipe, space-separated.

## Field Definitions

### Result Values

| Value | Meaning | When to Use |
|-------|---------|-------------|
| `PASS` | Entity complies with the rule. | Deterministic or semantic check confirms compliance. |
| `FAIL` | Entity does not comply. | Clear non-compliance detected. Note is strongly recommended. |
| `MISSING` | Entity not found in document. | Extraction could not locate the required field. |
| `ERROR` | Processing failure. | Parsing error, API timeout, unexpected format. |
| `UNCERTAIN` | Ambiguous judgment. | Borderline values, conflicting evidence, low confidence. |

### Confidence Score

A decimal between 0.00 and 1.00 representing the system's confidence in the result. For deterministic Python checks, confidence is typically 0.95-1.00. For LLM semantic judgments, confidence reflects the model's self-assessed certainty. Scores below the configured threshold in `.env` trigger human review.

### Source Location

The `src:` component uses a compact reference format: `p{page}-s{section}`. Example: `src:p3-s2` means page 3, section 2. For trace ID integration, use the trace ID prefix: `src:R001-DOC042-P3-S2` (see Integration with Trace IDs below).

## JSON Conversion

### Markup to JSON

```
Input: [FAIL] sign_date_gap <- 75d (<= 30d) | conf:0.90 | src:p1-s4 | note:Signing overdue by 45 days

Output:
{
  "field": "sign_date_gap",
  "result": "fail",
  "extracted_value": "75d",
  "expected": "<= 30d",
  "confidence": 0.90,
  "source": "p1-s4",
  "comment": "Signing overdue by 45 days"
}
```

Pseudocode:
1. Parse `[RESULT]` -> lowercase -> `result` field.
2. Parse next token -> `field` field.
3. If `<-` follows, parse until `(` or `|` -> `extracted_value`.
4. If `(...)` follows, parse contents -> `expected`.
5. Split remaining by ` | `. For each segment:
   - `conf:X` -> `confidence` (parse as float).
   - `src:X` -> `source`.
   - `note:X` -> `comment`.
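
The five parsing steps above can be sketched in Python as follows. This is a minimal illustration, not the package's own parser: the function name `markup_to_json`, the regular expression, and the error handling are assumptions. It relies on the fact that an escaped pipe is written `\|`, so it never matches the space-pipe-space separator.

```python
import re

# Head of the line: "[RESULT] field_name", then optional "<- value" and "(constraint)".
_HEAD = re.compile(
    r"^\[(?P<result>[A-Z]+)\]\s+(?P<field>\w+)"
    r"(?:\s+<-\s+(?P<value>.*?))?"
    r"(?:\s+\((?P<expected>[^()]*)\))?$"
)
# Markup prefix -> JSON key, as listed in step 5.
_KEYS = {"conf": "confidence", "src": "source", "note": "comment"}


def markup_to_json(line: str) -> dict:
    """Convert one markup line into a result dict (hypothetical helper)."""
    # "\|" has a backslash between the space and the pipe, so it is not split here.
    head, *segments = line.strip().split(" | ")
    match = _HEAD.match(head)
    if match is None:
        raise ValueError(f"unparseable result line: {line!r}")
    record = {"field": match["field"], "result": match["result"].lower()}
    if match["value"] is not None:
        record["extracted_value"] = match["value"].replace("\\|", "|")
    if match["expected"] is not None:
        record["expected"] = match["expected"]
    for segment in segments:
        key, _, text = segment.partition(":")
        if key not in _KEYS:
            raise ValueError(f"unknown component: {segment!r}")
        text = text.replace("\\|", "|")
        record[_KEYS[key]] = float(text) if key == "conf" else text
    return record
```

A value that itself ends in a parenthesized phrase is ambiguous under this grammar; the sketch treats a trailing `(...)` after a value as the constraint.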

### JSON to Markup

Pseudocode:
1. `[` + uppercase(`result`) + `] ` + `field`.
2. If `extracted_value` exists: ` <- ` + `extracted_value`.
3. If `expected` exists: ` (` + `expected` + `)`.
4. ` | conf:` + format(`confidence`, 2 decimal places).
5. If `source` exists: ` | src:` + `source`.
6. If `comment` exists: ` | note:` + `comment`.
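
The reverse direction can be sketched the same way. The function name `json_to_markup` is hypothetical, and two choices here are assumptions rather than part of the specification: an empty string is treated as "does not exist", and pipes inside values and notes are escaped as described under Special Characters.

```python
def json_to_markup(record: dict) -> str:
    """Render a result dict as one markup line (hypothetical helper)."""

    def escape(text) -> str:
        # Pipes inside free text must not collide with the " | " separator.
        return str(text).replace("|", "\\|")

    line = f"[{record['result'].upper()}] {record['field']}"
    if record.get("extracted_value"):
        line += f" <- {escape(record['extracted_value'])}"
    if record.get("expected"):
        line += f" ({record['expected']})"
    # Confidence is required and always printed with two decimals.
    line += f" | conf:{record['confidence']:.2f}"
    if record.get("source"):
        line += f" | src:{record['source']}"
    if record.get("comment"):
        line += f" | note:{escape(record['comment'])}"
    return line
```

Note truncation (see Long Notes) is left out of this sketch so that the output stays a faithful rendering of the JSON fields.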

## Diff Example

Comparing two verification runs is where markup shines.

**Markup diff** (clean, scannable):
```
  [PASS] capital_adequacy <- 12.5% (>= 8.0%) | conf:0.95 | src:p3-s2
- [PASS] sign_date_gap <- 28d (<= 30d) | conf:0.92 | src:p1-s4
+ [FAIL] sign_date_gap <- 75d (<= 30d) | conf:0.90 | src:p1-s4 | note:Signing overdue by 45 days
  [MISSING] collateral_value | conf:0.60 | note:Collateral valuation not found
```

**JSON diff** (noisy, hard to scan):
```json
  {
    "field": "sign_date_gap",
-   "result": "pass",
+   "result": "fail",
-   "extracted_value": "28d",
+   "extracted_value": "75d",
    "expected": "<= 30d",
-   "confidence": 0.92,
+   "confidence": 0.90,
    "source": "p1-s4",
-   "comment": ""
+   "comment": "Signing overdue by 45 days"
  }
```

The markup diff communicates the same information in one changed line, where the JSON diff needs four changed fields (eight diff lines).

## Edge Cases

### Multi-Value Fields
When a field has multiple extracted values (e.g., the same metric appears in two places with different values), separate values with semicolons:
```
[UNCERTAIN] total_assets <- 1,234,567;1,234,590 | conf:0.50 | src:p3-s1;p7-s2 | note:Conflicting values found
```
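
A consumer of the JSON form may want the individual values back. The sketch below splits the semicolon-separated fields; the helper name `split_multi_value` is hypothetical, and pairing the n-th value with the n-th source is an assumption that the example suggests but the specification does not state.

```python
from itertools import zip_longest


def split_multi_value(record: dict) -> list:
    """Return (value, source) pairs for a multi-value result (hypothetical helper)."""
    values = [v for v in record.get("extracted_value", "").split(";") if v]
    sources = [s for s in record.get("source", "").split(";") if s]
    # Assumption: values and sources line up by position; missing partners become None.
    return list(zip_longest(values, sources))
```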

### Long Notes
In markup, truncate notes longer than 80 characters with `...`. The full text is preserved in JSON. Example:
```
[FAIL] risk_disclosure <- (see detail) | conf:0.85 | note:Missing discussion of liquidity risk, market risk, and operational ri...
```

### Special Characters
If a value or note contains the pipe character `|`, escape it with a backslash: `\|`. During JSON conversion, unescape back to `|`.
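
Both note rules can be applied in one small helper when rendering markup. This is a sketch under stated assumptions: the name `format_note` is hypothetical, and it reads the 80-character limit as including the trailing `...`, which the text leaves open.

```python
def format_note(text: str, limit: int = 80) -> str:
    """Truncate a long note, then escape pipes for markup (hypothetical helper)."""
    if len(text) > limit:
        # Assumption: the limit counts the three dots of the ellipsis.
        text = text[: limit - 3] + "..."
    # Escaping happens after truncation, so each escaped pipe adds one character.
    return text.replace("|", "\\|")
```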

### Fields with No Constraint
Omit the parenthetical entirely:
```
[MISSING] collateral_value | conf:0.60 | note:Collateral valuation not found in document
```

### Fields with No Extracted Value
Omit the `<-` component (common for MISSING and ERROR results):
```
[ERROR] capital_adequacy | conf:0.00 | note:PDF parsing failed on page 3
```

## Integration with Trace IDs

The `src:` component can encode trace ID prefixes, linking each result line to the full trace ID defined by `version-control`. Use the trace ID format directly:

```
[PASS] capital_adequacy <- 12.5% (>= 8.0%) | conf:0.95 | src:R001-DOC042-P3-S2
[FAIL] sign_date_gap <- 75d (<= 30d) | conf:0.90 | src:R003-DOC042-P1-S4 | note:Signing overdue by 45 days
```

When converting to JSON, the `src:` value maps to the `trace_id` field in the full result object. The character range (`C{start}:{end}`) can be appended when full precision is needed: `src:R001-DOC042-P3-S2-C120:180`.
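
Both `src:` forms shown in this document can be recognized with two regular expressions. The sketch below is inferred only from the examples here (`p3-s2`, `R001-DOC042-P3-S2`, `R001-DOC042-P3-S2-C120:180`); the authoritative grammar lives in the trace ID specification, and the names `parse_src`, `rule_id`, and `document_id` are assumptions.

```python
import re

# Compact page-section reference, e.g. "p3-s2".
_PAGE_SECTION = re.compile(r"^p(?P<page>\d+)-s(?P<section>\d+)$")
# Trace ID prefix with an optional character range, e.g. "R001-DOC042-P3-S2-C120:180".
_TRACE = re.compile(
    r"^(?P<rule>R\d+)-(?P<doc>DOC\d+)-P(?P<page>\d+)-S(?P<section>\d+)"
    r"(?:-C(?P<start>\d+):(?P<end>\d+))?$"
)


def parse_src(src: str) -> dict:
    """Split a src: value into its parts (hypothetical helper)."""
    match = _TRACE.match(src)
    if match:
        parsed = {
            "rule_id": match["rule"],      # assumed meaning of the R... segment
            "document_id": match["doc"],   # assumed meaning of the DOC... segment
            "page": int(match["page"]),
            "section": int(match["section"]),
        }
        if match["start"] is not None:
            parsed["char_range"] = (int(match["start"]), int(match["end"]))
        return parsed
    match = _PAGE_SECTION.match(src)
    if match:
        return {"page": int(match["page"]), "section": int(match["section"])}
    raise ValueError(f"unrecognized src value: {src!r}")
```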

@@ -0,0 +1,117 @@
---
name: confidence-system
description: Design and calibrate confidence scoring for extraction and verification results. Use when building any workflow that needs to quantify trust in its output, when setting up quality control sampling thresholds, or when calibrating existing confidence scores against actual accuracy. Confidence is the bridge between workflows and quality control; high confidence means less review and low confidence means more review. Also use when the quality control skill reports that confidence scores do not correlate with actual correctness.
---

# Confidence System

Confidence is not about the model's certainty; it is about your system's track record. A confidence score should predict: "If I see this score, how likely is the result to be correct?" If your 0.9 confidence results are correct 90% of the time, your confidence system is calibrated.

## Why Confidence Matters

Without confidence, you have two choices:
1. Review everything (expensive, defeats the purpose of automation).
2. Review nothing (risky, errors slip through).

With confidence, you can review intelligently: spend your review budget where errors are most likely.

## Composite Scoring

Confidence for a single extraction or judgment result should combine multiple signals. No single signal is reliable enough on its own.

### Signal: Extraction Method Prior
How inherently reliable is the extraction method?
- Regex match with validated format: 0.90-0.95
- LLM extraction with structured output: 0.75-0.85
- LLM extraction with free-form output: 0.60-0.75
- Fallback or inferred value: 0.40-0.50

This is a prior: it reflects the method's general reliability, not this specific result.

### Signal: Source Text Presence
Was the extracted value clearly present in the source text?
- Exact string found in source: high signal
- Approximate match found: medium signal
- No matching text in source (model inferred or generated): low signal

This catches hallucination. If the model claims "capital adequacy ratio is 12.5%" but "12.5" does not appear anywhere in the source section, that is a red flag.
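
A source-presence check can be as simple as a substring test with a looser fallback. The sketch below is illustrative only: the function name `source_presence`, the numeric scores 1.0, 0.6, and 0.2, and the definition of "approximate" (ignoring whitespace, thousands separators, and case) are all assumptions.

```python
import re


def source_presence(value: str, source_text: str) -> float:
    """Score how clearly an extracted value appears in the source (hypothetical helper)."""
    if not value:
        return 0.2  # nothing to look for, treat as absent
    if value in source_text:
        return 1.0  # exact string found: high signal

    def normalize(text: str) -> str:
        # Drop whitespace and commas, fold case, so "1,234,567" matches "1234567".
        return re.sub(r"[\s,]", "", text).lower()

    if normalize(value) in normalize(source_text):
        return 0.6  # approximate match: medium signal
    return 0.2      # no matching text: low signal
```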

### Signal: Historical Accuracy
How often has this rule, on this document type, with this extraction method, been correct in the past?
- First iteration (no history): use the method prior only.
- After QC reviews: compute actual accuracy and blend it in.

This is the most valuable signal over time. It reflects real performance, not assumptions.

### Signal: Corner Case Proximity
Does this document match any known corner case pattern?
- Exact match: lower confidence (the standard workflow may not apply).
- Near miss: slightly lower confidence.
- No match: neutral (no adjustment).

### Combining Signals

Start with a simple weighted average:

```
confidence = w1 * method_prior + w2 * source_presence + w3 * historical_accuracy + w4 * corner_case_adjustment
```

Initial weights (adjust through calibration):
- w1 (method): 0.25
- w2 (source): 0.25
- w3 (history): 0.35 (most important once available)
- w4 (corner case): 0.15

When historical accuracy is not yet available (early iterations), redistribute its weight to the other signals.
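
The weighted average and the redistribution rule can be sketched together. Two points here are assumptions: the missing signal's weight is redistributed proportionally (by renormalizing the remaining weights), and `corner_case_adjustment` is expressed on the same 0-1 scale as the other signals, with 1.0 meaning "no match". The function name is hypothetical.

```python
# Initial weights from the list above; they sum to 1.0.
DEFAULT_WEIGHTS = {"method": 0.25, "source": 0.25, "history": 0.35, "corner_case": 0.15}


def composite_confidence(method_prior, source_presence, corner_case_adjustment,
                         historical_accuracy=None, weights=None):
    """Weighted average of the available signals (hypothetical helper)."""
    weights = weights or DEFAULT_WEIGHTS
    signals = {
        "method": method_prior,
        "source": source_presence,
        "history": historical_accuracy,
        "corner_case": corner_case_adjustment,
    }
    # Signals that are not yet available (None) drop out of the average.
    available = {name: value for name, value in signals.items() if value is not None}
    total_weight = sum(weights[name] for name in available)
    score = sum(weights[name] * value for name, value in available.items()) / total_weight
    return max(0.0, min(1.0, score))
```

With all four signals present the divisor is 1.0 and the result is the plain formula; without history the remaining weights are scaled up by 1 / 0.65.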

## Threshold Bands

Define bands that map confidence to review action:

| Band | Confidence Range | Action |
|------|-----------------|--------|
| High | Above auto-accept threshold | Spot-check only (5-10% random sample) |
| Medium | Between thresholds | Sample at MONITOR_FREQUENCY rate |
| Low | Below full-review threshold | Review every result |

Starting thresholds: auto-accept = 0.85, full-review = 0.60. These are defaults; calibrate per rule.
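
Mapping a score to its band is a two-comparison function. In this sketch the name `review_band` is hypothetical, and the treatment of scores exactly equal to a threshold (they fall into the medium band) is an assumption, since the table only says "above" and "below".

```python
def review_band(confidence: float, auto_accept: float = 0.85, full_review: float = 0.60) -> str:
    """Return "high", "medium", or "low" for a confidence score (hypothetical helper)."""
    if confidence > auto_accept:
        return "high"    # spot-check only
    if confidence < full_review:
        return "low"     # review every result
    return "medium"      # sample at the monitoring rate
```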

## Calibration

Calibration is the process of checking: "Do my confidence scores actually predict accuracy?"

### How to Calibrate

After each QC review cycle:
1. Group reviewed results by confidence band (e.g., 0.0-0.1, 0.1-0.2, ..., 0.9-1.0).
2. For each band, compute the actual accuracy (% of results that QC confirmed as correct).
3. Compare actual accuracy to the confidence band's midpoint.
4. If they match (0.8-0.9 band has ~85% actual accuracy), the system is calibrated.
5. If they diverge (0.8-0.9 band has only 60% actual accuracy), the confidence is overestimated, so adjust the weights.
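
The calibration steps above can be sketched as a small report builder. Assumptions in this sketch: scores carry two decimals (as in the output format), bands are 0.1 wide, and a gap of more than 0.10 between midpoint and accuracy counts as a divergence. The name `calibration_table` and the `tolerance` parameter are hypothetical.

```python
from collections import defaultdict


def calibration_table(reviewed, n_bands: int = 10, tolerance: float = 0.10) -> list:
    """Compare actual accuracy with band midpoints (hypothetical helper).

    reviewed: iterable of (confidence, is_correct) pairs from a QC cycle.
    """
    outcomes = defaultdict(list)
    for confidence, is_correct in reviewed:
        # Work in integer hundredths to avoid float edge cases such as 0.3 / 0.1.
        hundredths = round(confidence * 100)
        band = min(hundredths * n_bands // 100, n_bands - 1)
        outcomes[band].append(bool(is_correct))

    rows = []
    for band in sorted(outcomes):
        results = outcomes[band]
        accuracy = sum(results) / len(results)
        midpoint = (band + 0.5) / n_bands
        if midpoint - accuracy > tolerance:
            verdict = "overestimated"
        elif accuracy - midpoint > tolerance:
            verdict = "underestimated"
        else:
            verdict = "calibrated"
        rows.append({"band": band, "midpoint": midpoint, "n": len(results),
                     "accuracy": accuracy, "verdict": verdict})
    return rows
```

Bands with only a handful of reviewed results give noisy accuracy figures, so the `n` column is worth reading before acting on a verdict.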

### When to Recalibrate

- After the first QC review cycle (establishing initial calibration).
- After a workflow version change (new code may have different reliability characteristics).
- After confidence thresholds are adjusted.
- When the QC skill reports that confidence does not predict correctness.

## Integration Points

- **Entity extraction** assigns initial confidence based on method prior and source presence.
- **Compliance judgment** may adjust confidence based on the complexity of the judgment.
- **Quality control** uses confidence bands to determine review sampling.
- **Evolution loop** uses confidence trends to detect degradation.
- **Dashboard** displays the confidence distribution so the developer user can see it.

## Keep It Simple Initially

Do not over-engineer the confidence system upfront. Start with the method prior alone:
- Regex: 0.90
- LLM: 0.75
- Fallback: 0.50

Run QC on the first few batches. See whether these scores predict actual accuracy. If they do, you are done for now. If they do not, add signals incrementally.

The confidence system should earn its complexity, not start with it.

@@ -0,0 +1,111 @@
---
name: corner-case-management
description: Identify, catalog, and handle corner cases that do not fit the mainstream verification workflow. Use when the evolution loop classifies a failure as a corner case (affecting less than ~10% of documents), when adding a new edge case to the registry, or when deciding whether a corner case should be promoted to a systemic fix. Also use when designing the corner case detection mechanism for a workflow.
---

# Corner Case Management

A good workflow handles 90% of cases cleanly. Corner cases are the other 10%. They are individually rare but collectively significant. The key insight: do NOT patch the main workflow to handle them. That leads to spaghetti logic, fragile code, and regressions.

Instead, maintain a separate registry. Check incoming documents against the registry before the standard workflow. Handle matches with specific resolutions.

## Philosophy

Corner cases are a fact of life in document verification. Financial documents are produced by thousands of different organizations, each with their own formatting quirks, templates, and interpretations of regulations. No workflow will handle every variant.

The question is not "how do I eliminate corner cases?" but "how do I manage them efficiently?"

The answer is: separate them from the main logic. Keep the workflow clean. Keep the corner cases cataloged, detectable, and resolvable.

## The Corner Case Registry

A structured file (`corner_cases.json`) in the rule skill's `assets/` directory:

```json
[
  {
    "id": "CC001",
    "rule_id": "R001",
    "description": "Some reports express capital adequacy as a decimal (0.125) instead of percentage (12.5%)",
    "affected_documents": ["report_bank_xyz_2024.pdf"],
    "detection_pattern": {
      "type": "regex",
      "pattern": "资本充足率[::]*\\s*0\\.\\d+",
      "confidence_threshold": 0.8
    },
    "resolution": {
      "type": "code",
      "action": "Multiply extracted value by 100 before threshold comparison",
      "code_snippet": "if value < 1.0: value *= 100"
    },
    "discovered_at": "2026-04-01",
    "iteration": 3,
    "status": "active"
  }
]
```

Each entry captures:
- **What** the corner case is (description).
- **How to detect** it (detection pattern with type: regex, keyword, structural, or model-based).
- **How to resolve** it (resolution with type: code, regex, prompt, or manual).
- **When** it was discovered and in which iteration.
- **Status**: active, promoted (moved to main workflow), or deprecated.

## Detection During Execution

Before running the standard workflow on a document:

1. Load the corner case registry for the relevant rule.
2. Check the document against each active corner case's detection pattern.
3. If a match exceeds the confidence threshold, apply the specific resolution instead of (or in addition to) the standard workflow.
4. Log that a corner case was triggered.

This is similar to a RAG pipeline with progressive disclosure:
- The registry is the knowledge base.
- Detection patterns are the retrieval queries.
- High confidence thresholds prevent false matches.
- Only relevant corner cases are loaded into context.
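
The detection steps above can be sketched for regex-type entries, using the field names from the registry example. Assumptions in this sketch: the function name `match_corner_cases` is hypothetical, a regex hit is treated as match confidence 1.0, and keyword, structural, and model-based detection are left out.

```python
import re


def match_corner_cases(document_text: str, registry: list, rule_id: str) -> list:
    """Return the active registry entries whose regex pattern matches (hypothetical helper)."""
    hits = []
    for entry in registry:
        if entry["rule_id"] != rule_id or entry["status"] != "active":
            continue
        detection = entry["detection_pattern"]
        if detection["type"] != "regex":
            continue  # other detection types are not covered by this sketch
        # Assumption: a regex hit counts as confidence 1.0, a miss as 0.0.
        matched = re.search(detection["pattern"], document_text) is not None
        match_confidence = 1.0 if matched else 0.0
        if matched and match_confidence >= detection.get("confidence_threshold", 0.0):
            hits.append(entry)
    return hits
```

The caller would then apply each hit's `resolution` and log the triggered `id` values, as in steps 3 and 4.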

## When to Add a Corner Case

Add a corner case when:
- The evolution loop classifies a failure as non-systemic (affects <10% of documents).
- The failure has a recognizable, describable pattern.
- The resolution is clear and self-contained.

Do NOT add a corner case when:
- The failure affects many documents (that is a systemic issue; fix the workflow).
- The failure has no discernible pattern (that may be a data quality issue; escalate to the developer user).
- The resolution would require changing the core judgment logic (that belongs in the main workflow).
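
These criteria amount to a short decision procedure. The sketch below is one possible encoding; the function name `triage_failure`, the return labels, and the handling of a share of exactly 10% (treated as systemic) are assumptions.

```python
def triage_failure(affected_docs: int, total_docs: int, has_pattern: bool,
                   self_contained: bool, threshold: float = 0.10) -> str:
    """Decide where a failure belongs (hypothetical helper)."""
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    share = affected_docs / total_docs
    if share >= threshold:
        return "fix_workflow"      # systemic issue
    if not has_pattern:
        return "escalate"          # possible data quality issue
    if not self_contained:
        return "fix_workflow"      # would change core judgment logic
    return "add_corner_case"
```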

## When to Promote a Corner Case

A corner case should be promoted to the main workflow (i.e., the resolution becomes part of the standard logic) when:
- It starts appearing in >10% of documents. It is no longer a corner case; it is a pattern.
- Multiple similar corner cases suggest a common underlying issue.
- The developer user explicitly says "this is how it always works."

When promoting, remove the corner case from the registry and update the workflow. Version both changes.

## Human Visibility

The corner case registry must be readable and manageable by the developer user:
- Format it clearly (JSON or a markdown table).
- Include enough context that a domain expert can understand each case without reading the code.
- Report new corner cases in the dashboard.
- Allow the developer user to add corner cases from their own expertise.

Developer users often know about edge cases that the coding agent has not yet encountered. They should be able to add entries like:
- "Bank XYZ always uses a different template for their Q4 reports."
- "Mutual fund documents from before 2020 follow the old regulation format."

These are valuable inputs that prevent future failures.

## Corner Case Cost

Every corner case has a runtime cost: the detection check runs on every document. Keep the registry lean:
- Remove deprecated corner cases.
- Merge similar corner cases into a single entry with a broader pattern.
- Keep detection patterns efficient (prefer regex over LLM-based detection).
- Monitor the registry size. If it grows beyond ~50 entries for a single rule, that suggests the workflow itself needs improvement.

@@ -0,0 +1,131 @@
---
name: cross-document-verification
description: Perform case-level analysis across multiple documents for the same transaction. Use when documents do not exist in isolation, for example when main contracts have appendices or loan applications come bundled with income certificates, bank statements, credit reports, and property appraisals. Use to build comparison matrices, detect contradictions (hard mismatches and soft implausibilities), classify severity, and flag fraud signals. Also use when a user or end user reports a cross-document inconsistency; these reports are ground truth and take priority over agent judgment.
---

# Cross-Document Verification

Single-document verification asks: does this document comply with the rules? Cross-document verification asks a different question: do all the documents in this transaction tell a consistent story?

This is the difference between a document checker and a case analyst. A document checker reviews files one at a time. A case analyst lays them all on the table and looks for the threads that connect, or contradict, each other. When you activate cross-document verification, you are upgrading the system from checker to analyst.

## The Case Concept

A case is the set of documents that belong to one transaction, one borrower, or one deal. Documents within a case share entities: names, dates, amounts, identifiers. These shared entities are your anchors for comparison.

Two common case patterns:

1. **Main contract + appendices/supplements.** Same issuer, formally cross-referenced. The main contract states the total, the appendices break it down. The main contract references Appendix B; Appendix B must exist and match. The team has extensive experience with inconsistencies in this pattern: totals that do not match line item sums, referenced appendices that are missing or outdated, version mismatches between main body and supplements.

2. **Multi-source bundle.** Loan application + income certificate + bank statement + credit report + property appraisal. Different issuers, different formats, same borrower. The applicant's name, ID number, income, and employment must be consistent across all documents.

In both patterns, the shared entities are the anchors. Every anchor that appears in more than one document is a candidate for cross-verification.
|
|
23
|
+
|
|
24
|
+
## Building the Comparison Matrix
|
|
25
|
+
|
|
26
|
+
The comparison matrix is the core artifact. Rows are shared fields. Columns are source documents. Each cell contains the value found in that document for that field, or is marked absent.
|
|
27
|
+
|
|
28
|
+
Example matrix for a loan case:
|
|
29
|
+
|
|
30
|
+
| Field | Application | Income Cert | Bank Statement | Credit Report |
|
|
31
|
+
|-------|-------------|-------------|----------------|---------------|
|
|
32
|
+
| Applicant name | Zhang Wei | Zhang Wei | Zhang Wei | Zhang W. |
|
|
33
|
+
| ID number | 310...1234 | 310...1234 | — | 310...1234 |
|
|
34
|
+
| Monthly income | 85,000 | 82,000 | avg 43,000 | — |
|
|
35
|
+
| Employer | ABC Corp | ABC Corp Ltd | — | ABC Corporation |
|
|
36
|
+
| Employment start | 2019-03 | 2019-06 | — | 2019 |
|
|
37
|
+
|
|
38
|
+
Example matrix for a contract + appendices case:
|
|
39
|
+
|
|
40
|
+
| Field | Main Contract | Appendix A | Appendix B |
|
|
41
|
+
|-------|---------------|------------|------------|
|
|
42
|
+
| Total amount | 5,000,000 | — | 4,850,000 (sum of items) |
|
|
43
|
+
| Party B name | Shenzhen XX Co. | Shenzhen XX Co., Ltd | Shenzhen XX Co. |
|
|
44
|
+
| Effective date | 2024-01-15 | 2024-01-15 | 2024-02-01 |
|
|
45
|
+
|
|
46
|
+
Populate the matrix using output from `entity-extraction`. An empty cell means the field was not found in that document — this is absence, not necessarily an error. But absence in a field that should be present is itself a finding.
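
As a rough illustration of the matrix layout described above, the following Python sketch pivots per-document extraction results into one row per field. The input shape (a dict of document id to field/value pairs) and the `build_matrix` helper are assumptions for illustration only; the actual output format of `entity-extraction` is not specified here.

```python
from typing import Dict, List, Optional

def build_matrix(extracted: Dict[str, Dict[str, object]]) -> List[dict]:
    """Pivot {doc_id: {field: value}} into one row per field."""
    doc_ids = list(extracted)
    # Union of all field names, in first-seen order.
    fields: List[str] = []
    for values in extracted.values():
        for name in values:
            if name not in fields:
                fields.append(name)
    rows = []
    for name in fields:
        # None marks absence; it is recorded, not treated as an error.
        cells: Dict[str, Optional[object]] = {
            d: extracted[d].get(name) for d in doc_ids
        }
        present = sum(v is not None for v in cells.values())
        # Only fields found in more than one document are comparison anchors.
        rows.append({"field": name, "values": cells, "shared": present > 1})
    return rows

# Hypothetical extraction output for three documents in one case.
extracted = {
    "application": {"applicant_name": "Zhang Wei", "monthly_income": 85000},
    "income_cert": {"applicant_name": "Zhang Wei", "monthly_income": 82000},
    "bank_statement": {"applicant_name": "Zhang Wei"},
}
matrix = build_matrix(extracted)
```

Keeping absent cells as explicit `None` values lets a later step decide whether the absence is itself a finding.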
|
|
47
|
+
|
|
48
|
+
## Contradiction Types
|
|
49
|
+
|
|
50
|
+
### Hard Contradictions
|
|
51
|
+
|
|
52
|
+
Exact mismatches in fields that must be identical across documents:
|
|
53
|
+
|
|
54
|
+
- **Identity mismatch**: Name spelled differently, ID number differs by digits, date of birth inconsistent.
|
|
55
|
+
- **Amount mismatch**: Main contract states 5,000,000 but appendix line items sum to 4,850,000. Income certificate says 82,000/month but application says 85,000/month.
|
|
56
|
+
- **Date mismatch**: Contract effective date in the main body differs from the date in an appendix.
|
|
57
|
+
|
|
58
|
+
Hard contradictions are binary. Either the values match (within defined tolerance) or they do not.
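
A minimal sketch of the "match within defined tolerance" test, assuming relative tolerance is measured against the larger of the two values (the source does not say which base to use). The `values_match` helper is hypothetical; the 0.1% figure is the loan amount tolerance listed in `references/contradiction-taxonomy.md`.

```python
def values_match(a: float, b: float, rel_tol: float = 0.0) -> bool:
    """Return True when a and b agree within a relative tolerance."""
    if a == b:
        return True
    larger = max(abs(a), abs(b))
    # Assumption: the relative difference is taken against the larger value.
    return abs(a - b) / larger <= rel_tol

# Loan amount must agree within 0.1% across application and contract.
same_amount = values_match(2_800_000, 2_800_000, rel_tol=0.001)
# A 5,000 difference on 2.8M is about 0.18%, which exceeds the tolerance.
off_amount = values_match(2_800_000, 2_805_000, rel_tol=0.001)
```

The result is deliberately a plain boolean: a hard contradiction either exists or it does not, and severity is assigned in a separate step.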
|
|
59
|
+
|
|
60
|
+
### Soft Contradictions
|
|
61
|
+
|
|
62
|
+
Values that are not directly inconsistent but are implausible when considered together:
|
|
63
|
+
|
|
64
|
+
- Monthly income claimed as 100,000 but bank statement average deposits are 5,000.
|
|
65
|
+
- Property appraised at 3,000,000 but loan amount is 2,800,000 (93% LTV — technically possible but suspicious).
|
|
66
|
+
- Employment start date 2019-03 on the application but 2019-06 on the income certificate — a 3-month gap that could be rounding or could be fabrication.
|
|
67
|
+
|
|
68
|
+
Soft contradictions require thresholds and judgment. They are findings, not verdicts.
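
To make "findings, not verdicts" concrete, here is a hedged sketch of an income plausibility check. The `income_plausibility` helper and the shape of the returned finding are assumptions; the 50% floor is only a starting default and should be set per project.

```python
from typing import Optional

def income_plausibility(stated_income: float, avg_deposits: float,
                        floor_ratio: float = 0.5) -> Optional[dict]:
    """Return a finding when deposits fall below floor_ratio of stated income."""
    if stated_income <= 0:
        return None  # nothing to compare against
    ratio = avg_deposits / stated_income
    if ratio >= floor_ratio:
        return None
    # The finding describes the evidence; it does not assert fraud.
    return {
        "type": "soft_contradiction",
        "field": "monthly_income",
        "detail": f"avg deposits are {ratio:.0%} of stated income",
    }

# The example from the list above: 100,000 claimed, 5,000 average deposits.
finding = income_plausibility(100_000, 5_000)
```

Returning `None` for plausible combinations keeps the report limited to items a reviewer actually needs to look at.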
|
|
69
|
+
|
|
70
|
+
### Cross-Reference Failures
|
|
71
|
+
|
|
72
|
+
Structural integrity problems within formally linked documents:
|
|
73
|
+
|
|
74
|
+
- Main contract references "Appendix B — Payment Schedule" but Appendix B is titled "Technical Specifications" or is missing entirely.
|
|
75
|
+
- Appendix references a clause in the main contract that does not exist in the provided version.
|
|
76
|
+
- Missing reciprocal reference: Appendix C references the main contract, but the main contract does not list Appendix C.
|
|
77
|
+
- Version mismatch: main contract dated January, appendix dated March with different terms.
|
|
78
|
+
|
|
79
|
+
See `references/contradiction-taxonomy.md` for the full field-level taxonomy with tolerances and severity classifications.
|
|
80
|
+
|
|
81
|
+
## Severity Classification
|
|
82
|
+
|
|
83
|
+
Not all contradictions are equal. Classify by impact:
|
|
84
|
+
|
|
85
|
+
| Severity | Criteria | Example |
|
|
86
|
+
|----------|----------|---------|
|
|
87
|
+
| **Critical** | Identity field mismatch | ID number differs between documents |
|
|
88
|
+
| **High** | Financial amount discrepancy > 10% | Income 85K vs bank avg 43K |
|
|
89
|
+
| **Medium** | Date or employment detail mismatch | Employment start date 3 months apart |
|
|
90
|
+
| **Low** | Formatting or abbreviation difference | "ABC Corp" vs "ABC Corp Ltd" |
|
|
91
|
+
|
|
92
|
+
The developer user sets severity thresholds per field in the project configuration. The defaults above are starting points. Adjust based on the business context — in some scenarios, a 5% amount discrepancy is critical; in others, 15% is acceptable.
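
The default table and the per-project threshold could be expressed roughly as follows. This is a sketch under stated assumptions: the category names, the `classify_severity` helper, and the choice to return `None` for amounts at or below the threshold are all illustrative, not part of the project configuration format.

```python
from typing import Optional

# Default category-to-severity mapping taken from the table above.
CATEGORY_SEVERITY = {
    "identity": "critical",
    "date": "medium",
    "employment": "medium",
    "formatting": "low",
}

def classify_severity(category: str, rel_discrepancy: float = 0.0,
                      amount_threshold: float = 0.10) -> Optional[str]:
    """Map a contradiction to a severity label, or None if not escalated."""
    if category == "financial":
        # Assumption: amounts at or under the threshold are not escalated.
        return "high" if rel_discrepancy > amount_threshold else None
    return CATEGORY_SEVERITY.get(category)

# Income 85K vs bank average 43K: about a 49% discrepancy.
income_gap = (85_000 - 43_000) / 85_000
severity_default = classify_severity("financial", income_gap)
# A stricter project might treat 5% as the cutoff.
severity_strict = classify_severity("financial", 0.06, amount_threshold=0.05)
```

Passing the threshold as a parameter mirrors the idea that the same 6% gap can be ignorable in one business context and escalated in another.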
|
|
93
|
+
|
|
94
|
+
## Fraud Signal Patterns
|
|
95
|
+
|
|
96
|
+
Cross-document analysis reveals patterns that single-document review cannot detect. Flag these — do not accuse, flag:
|
|
97
|
+
|
|
98
|
+
- **Consistent small discrepancies across documents.** Income inflated by exactly 5% in every document. Dates shifted by exactly one month. This consistency in error suggests coordinated fabrication rather than honest mistakes.
|
|
99
|
+
- **Suspicious document consistency.** Multiple documents from supposedly different issuers use identical formatting, identical phrasing, or identical typos. Legitimate documents from different organizations look different.
|
|
100
|
+
- **Values at regulatory thresholds.** LTV ratio at exactly 69.9% when the limit is 70%. DTI at exactly 49.8% when the limit is 50%. One occurrence is coincidence. A pattern across the case is a signal.
|
|
101
|
+
- **Temporal impossibilities.** Income certificate issued before the employment start date. Bank statement covering a period before the account opening date. Appraisal dated after the loan disbursement.
|
|
102
|
+
|
|
103
|
+
These are signals for escalation, not conclusions. Present them with evidence and let the developer user or downstream review process make the determination.
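
One of the patterns above, values sitting just under a regulatory limit, can be sketched as a simple scan. The `hugging_threshold` helper, the 0.5 percentage point margin, and the rule of escalating at two or more hits are assumptions chosen for illustration; `other_ratio` is a hypothetical third metric.

```python
def hugging_threshold(value: float, limit: float, margin: float = 0.005) -> bool:
    """True when value sits at or just under limit, within margin."""
    return 0 <= limit - value <= margin

# (name, observed value, regulatory limit) for one case.
ratios = [
    ("ltv", 0.699, 0.70),
    ("dti", 0.498, 0.50),
    ("other_ratio", 0.55, 0.70),
]
hits = [name for name, value, limit in ratios if hugging_threshold(value, limit)]
# A single hit may be coincidence; escalate only when a pattern emerges.
escalate = len(hits) >= 2
```

The output is a flag with the supporting metric names attached, which leaves the determination to the developer user or the downstream review process.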
|
|
104
|
+
|
|
105
|
+
## Workflow Sequence
|
|
106
|
+
|
|
107
|
+
The recommended sequence for cross-document verification within a case:
|
|
108
|
+
|
|
109
|
+
1. **Identify the case boundary.** Which documents belong together? Use shared identifiers (borrower name, loan number, contract reference) to group documents into cases.
|
|
110
|
+
2. **Extract anchors.** Run `entity-extraction` on each document independently. Collect the shared fields.
|
|
111
|
+
3. **Build the matrix.** Populate the comparison matrix. Flag empty cells.
|
|
112
|
+
4. **Detect contradictions.** Apply hard/soft/cross-reference checks per `references/contradiction-taxonomy.md`.
|
|
113
|
+
5. **Classify severity.** Assign severity per the configured thresholds.
|
|
114
|
+
6. **Scan for fraud signals.** Run pattern checks across the matrix.
|
|
115
|
+
7. **Produce the report.** Output the case-level consistency report with all findings.
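
Step 1 of the sequence above can be sketched as a grouping pass over document metadata. The `group_into_cases` helper and the `loan_number` key are hypothetical; a real project would use whichever shared identifier its documents actually carry.

```python
from collections import defaultdict
from typing import Dict, List, Optional

def group_into_cases(documents: List[dict],
                     key: str = "loan_number") -> Dict[Optional[str], List[str]]:
    """Group document ids by a shared identifier."""
    cases: Dict[Optional[str], List[str]] = defaultdict(list)
    for doc in documents:
        # Documents without the identifier land under None for manual triage.
        cases[doc.get(key)].append(doc["doc_id"])
    return dict(cases)

# Hypothetical document metadata.
documents = [
    {"doc_id": "DOC-001", "loan_number": "LN-42"},
    {"doc_id": "DOC-002", "loan_number": "LN-42"},
    {"doc_id": "DOC-003", "loan_number": "LN-77"},
    {"doc_id": "DOC-004"},
]
cases = group_into_cases(documents)
```

Collecting unidentified documents under `None` instead of dropping them keeps the case boundary decision visible.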
|
|
116
|
+
|
|
117
|
+
## Integration
|
|
118
|
+
|
|
119
|
+
**Inputs:**
|
|
120
|
+
- Entity extraction results from `entity-extraction` (the raw field values per document).
|
|
121
|
+
- Compliance judgment results from `compliance-judgment` (per-document pass/fail already computed).
|
|
122
|
+
|
|
123
|
+
**Outputs:**
|
|
124
|
+
- Case-level consistency report: the comparison matrix, all contradictions found, severity classifications, and fraud signal flags.
|
|
125
|
+
|
|
126
|
+
**Feeds into:**
|
|
127
|
+
- `confidence-system`: Cross-document contradictions lower confidence in affected fields.
|
|
128
|
+
- `evolution-loop`: Recurring contradiction patterns trigger workflow refinement.
|
|
129
|
+
- `dashboard-reporting`: Case-level view alongside document-level results.
|
|
130
|
+
|
|
131
|
+
**Ground truth principle:** User and end-user contradiction reports feed back as ground truth. When a user or end user reports a cross-document inconsistency that the system missed, that report takes priority over agent judgment. Log it, learn from it, and adjust detection thresholds accordingly. The system's job is to catch what humans catch, and then go further.
|
package/template/skills/en/meta/cross-document-verification/references/contradiction-taxonomy.md
ADDED
|
@@ -0,0 +1,73 @@
|
|
|
1
|
+
# Contradiction Taxonomy
|
|
2
|
+
|
|
3
|
+
Field-level reference for cross-document contradiction detection. Use this taxonomy to configure comparison rules per field type.
|
|
4
|
+
|
|
5
|
+
## Identity Fields
|
|
6
|
+
|
|
7
|
+
| Field | Match Type | Tolerance | Severity | Notes |
|
|
8
|
+
|-------|-----------|-----------|----------|-------|
|
|
9
|
+
| Full name | Fuzzy | Abbreviation, spacing, honorifics | Critical | "Zhang Wei" vs "Zhang W." is fuzzy match; "Zhang Wei" vs "Li Ming" is hard fail |
|
|
10
|
+
| ID number | Exact | None (zero tolerance) | Critical | Single-digit difference = different person or transcription error — both critical |
|
|
11
|
+
| Date of birth | Exact | None | Critical | Cross-check against ID number encoding where applicable |
|
|
12
|
+
| Address | Fuzzy | Abbreviation, floor/unit formatting | Medium | "Rm 1201, Bldg A" vs "Room 1201, Building A" is acceptable |
|
|
13
|
+
| Phone number | Exact | Country code prefix | Medium | +86 prefix presence/absence is tolerated |
|
|
14
|
+
| Company name | Fuzzy | Ltd/Co/Inc suffix, punctuation | High | Must match on core name; suffix variation tolerated |
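
As one possible reading of "must match on core name; suffix variation tolerated", the sketch below strips punctuation and common corporate suffixes before comparing. The suffix list and both helper names are assumptions; extend the list per jurisdiction.

```python
import re

# Corporate suffixes treated as noise (an assumed, non-exhaustive list).
_SUFFIXES = {"ltd", "co", "inc", "corp", "corporation", "limited", "company"}

def core_company_name(name: str) -> str:
    """Lowercase, drop punctuation, and remove corporate suffix tokens."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in _SUFFIXES)

def company_names_match(a: str, b: str) -> bool:
    core_a, core_b = core_company_name(a), core_company_name(b)
    # An empty core name means only suffixes were present; never auto-match.
    return bool(core_a) and core_a == core_b

suffix_variation = company_names_match("Shenzhen XX Co.", "Shenzhen XX Co., Ltd")
different_company = company_names_match("ABC Corp", "XYZ Corp")
```

This only covers the suffix and punctuation tolerance from the table; name abbreviations such as "Zhang W." would need a separate rule.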
|
|
15
|
+
|
|
16
|
+
## Financial Fields
|
|
17
|
+
|
|
18
|
+
| Field | Comparison Method | Tolerance | Severity | Notes |
|
|
19
|
+
|-------|------------------|-----------|----------|-------|
|
|
20
|
+
| Stated income | Cross-document | 10% relative | High | Application vs income certificate vs credit report |
|
|
21
|
+
| Bank avg deposits | Income plausibility | 50% of stated income (floor) | High | If avg deposits < 50% of claimed income, flag |
|
|
22
|
+
| Loan amount | Exact across docs | 0.1% relative | Critical | Must be identical in application and contract |
|
|
23
|
+
| Property value | Appraisal consistency | 5% relative | High | Application estimate vs formal appraisal |
|
|
24
|
+
| Existing debt | Cross-source | 20% relative | Medium | Self-reported vs credit report |
|
|
25
|
+
| Net assets | Calculated consistency | Sum of components vs stated total | High | Assets - liabilities must equal stated net |
|
|
26
|
+
| Contract total vs line items | Sum check | 0.01 absolute | Critical | Main contract total must equal appendix line item sum |
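
The sum check in the last row is sensitive to floating-point rounding, so a sketch using Python's `decimal` module may be safer than plain floats. The `line_items_match_total` helper and the string-based inputs are assumptions for illustration.

```python
from decimal import Decimal
from typing import Iterable

def line_items_match_total(total: str, items: Iterable[str],
                           abs_tol: str = "0.01") -> bool:
    """Check that line items sum to the stated total within abs_tol."""
    item_sum = sum((Decimal(i) for i in items), Decimal("0"))
    return abs(Decimal(total) - item_sum) <= Decimal(abs_tol)

# Main contract states 5,000,000; appendix items sum to 4,850,000.
mismatch = line_items_match_total("5000000", ["3000000", "1850000"])
# A half-cent rounding difference stays inside the 0.01 tolerance.
rounding_ok = line_items_match_total("5000000.00", ["1999999.995", "3000000.00"])
```

Parsing amounts from strings avoids introducing binary floating-point error before the comparison even starts.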
|
|
27
|
+
|
|
28
|
+
## Temporal Fields
|
|
29
|
+
|
|
30
|
+
| Field | Consistency Check | Tolerance | Severity | Notes |
|
|
31
|
+
|-------|------------------|-----------|----------|-------|
|
|
32
|
+
| Employment start date | Cross-document match | 90 days | Medium | Application vs income cert vs credit report |
|
|
33
|
+
| Contract signing date | Sequence plausibility | Must be after application date | High | Cannot sign before applying |
|
|
34
|
+
| Document issuance date | Freshness and sequence | Per business rule (typically 30-90 days) | Medium | Income cert issued 6 months ago may be stale |
|
|
35
|
+
| Loan maturity date | Contract consistency | Exact match across docs | High | Application vs contract vs amortization schedule |
|
|
36
|
+
| Appraisal date | Sequence plausibility | Must precede loan approval | Medium | Appraisal after disbursement is a red flag |
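
A minimal sketch of the 90-day employment start tolerance, assuming month-only values such as `2019-03` are pinned to the first day of the month (the source does not define how partial dates are normalized). The `dates_within` helper is hypothetical.

```python
from datetime import date

def dates_within(a: date, b: date, tolerance_days: int = 90) -> bool:
    """True when two dates differ by at most tolerance_days."""
    return abs((a - b).days) <= tolerance_days

# 2019-03 vs 2019-06, both pinned to the first of the month: 92 days apart.
gap_days = abs((date(2019, 6, 1) - date(2019, 3, 1)).days)
gap_ok = dates_within(date(2019, 3, 1), date(2019, 6, 1))
```

Note that under this normalization a "3 months apart" pair lands two days outside a 90-day tolerance, so the normalization rule and the tolerance should be chosen together.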
|
|
37
|
+
|
|
38
|
+
## Logical Consistency Checks
|
|
39
|
+
|
|
40
|
+
These are not single-field comparisons but cross-field logical validations:
|
|
41
|
+
|
|
42
|
+
- **LTV ratio consistency**: Property value × LTV % should equal or exceed the loan amount. Check across appraisal, application, and contract.
|
|
43
|
+
- **DTI ratio reasonableness**: (Monthly debt payments (from credit report) + proposed payment) / monthly income (from income cert) should not exceed the stated DTI or regulatory limit.
|
|
44
|
+
- **Timeline plausibility**: Employment start < income cert issuance < application date < contract signing < disbursement. Any violation of this sequence is a finding.
|
|
45
|
+
- **Appendix completeness**: Every appendix referenced in the main contract must be present in the case file. Every appendix present must be referenced in the main contract.
|
|
46
|
+
- **Guarantor cross-check**: If a guarantor is listed, their identity fields must also pass cross-document verification against any guarantor-specific documents.
|
|
47
|
+
- **Amount decomposition**: If the contract specifies principal + interest + fees, these must sum to the total obligation stated elsewhere.
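
Two of the checks above, timeline plausibility and the DTI ratio, can be sketched as follows. The event names, the `timeline_violations` and `dti_ratio` helpers, and the sample figures are assumptions. The timeline check follows the strict ordering as written, so same-day events are also reported.

```python
from datetime import date
from typing import Dict, List, Tuple

# Expected order of events, per the timeline plausibility rule.
SEQUENCE = ["employment_start", "income_cert_issued", "application",
            "contract_signing", "disbursement"]

def timeline_violations(events: Dict[str, date]) -> List[Tuple[str, str]]:
    """Return adjacent (earlier, later) pairs whose dates are out of order."""
    known = [(name, events[name]) for name in SEQUENCE if name in events]
    return [(a, b) for (a, da), (b, db) in zip(known, known[1:]) if da >= db]

def dti_ratio(existing_payments: float, proposed_payment: float,
              monthly_income: float) -> float:
    """(existing debt payments + proposed payment) / monthly income."""
    return (existing_payments + proposed_payment) / monthly_income

# Income certificate issued before the employment start date.
violations = timeline_violations({
    "employment_start": date(2023, 6, 1),
    "income_cert_issued": date(2023, 5, 15),
    "application": date(2023, 7, 1),
})
dti = dti_ratio(12_000, 20_000, 82_000)  # roughly 0.39
```

Events missing from a case are skipped instead of failing, so the check still runs on partial bundles.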
|
|
48
|
+
|
|
49
|
+
## Comparison Matrix Template
|
|
50
|
+
|
|
51
|
+
Output schema for the case-level comparison matrix:
|
|
52
|
+
|
|
53
|
+
```json
|
|
54
|
+
{
|
|
55
|
+
"case_id": "CASE-2024-0042",
|
|
56
|
+
"documents": [
|
|
57
|
+
{"doc_id": "DOC-001", "type": "loan_application", "source": "applicant"},
|
|
58
|
+
{"doc_id": "DOC-002", "type": "income_certificate", "source": "employer"}
|
|
59
|
+
],
|
|
60
|
+
"matrix": [
|
|
61
|
+
{"field": "applicant_name", "category": "identity",
|
|
62
|
+
"values": {"DOC-001": "Zhang Wei", "DOC-002": "Zhang Wei"},
|
|
63
|
+
"status": "consistent", "severity": null}
|
|
64
|
+
],
|
|
65
|
+
"contradictions": [
|
|
66
|
+
{"field": "monthly_income", "documents": ["DOC-001", "DOC-002"],
|
|
67
|
+
"values": [85000, 82000], "type": "hard_contradiction",
|
|
68
|
+
"severity": "medium", "detail": "3.5% discrepancy in stated income"}
|
|
69
|
+
],
|
|
70
|
+
"fraud_signals": [],
|
|
71
|
+
"summary": {"total_fields_compared": 8, "consistent": 6, "soft_mismatch": 1, "hard_mismatch": 1, "absent": 0}
|
|
72
|
+
}
|
|
73
|
+
```
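
The `summary` object in the schema above could be derived from the matrix rows themselves. This is a hedged sketch: only the `consistent` status appears in the schema example, so the other status labels are inferred from the summary keys and should be treated as assumptions, as should the `summarize` helper.

```python
from collections import Counter
from typing import Dict, List

# Status labels inferred from the summary keys in the schema example.
STATUSES = ("consistent", "soft_mismatch", "hard_mismatch", "absent")

def summarize(matrix: List[dict]) -> Dict[str, int]:
    """Count matrix rows per status and report the total compared."""
    counts = Counter(row["status"] for row in matrix)
    summary = {"total_fields_compared": len(matrix)}
    summary.update({status: counts.get(status, 0) for status in STATUSES})
    return summary

# Hypothetical matrix rows; only the fields needed for the summary are shown.
rows = [
    {"field": "applicant_name", "status": "consistent"},
    {"field": "id_number", "status": "consistent"},
    {"field": "monthly_income", "status": "hard_mismatch"},
]
summary = summarize(rows)
```

Deriving the counts from the rows, instead of maintaining them separately, keeps the summary from drifting out of sync with the matrix.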
|