open-classify 0.5.0 → 0.6.0
- package/README.md +60 -63
- package/dist/src/aggregator.d.ts +7 -23
- package/dist/src/aggregator.js +108 -186
- package/dist/src/classifiers/{routing → model_tier}/manifest.json +2 -2
- package/dist/src/classifiers/{routing → model_tier}/prompt.md +1 -1
- package/dist/src/classifiers/preflight/manifest.json +9 -8
- package/dist/src/classifiers/preflight/prompt.md +12 -6
- package/dist/src/classifiers/prompt_injection/manifest.json +2 -3
- package/dist/src/classify.d.ts +1 -2
- package/dist/src/classify.js +0 -2
- package/dist/src/config.d.ts +0 -2
- package/dist/src/config.js +1 -23
- package/dist/src/index.js +2 -3
- package/dist/src/manifest.d.ts +25 -70
- package/dist/src/pipeline.d.ts +1 -2
- package/dist/src/pipeline.js +22 -89
- package/dist/src/stock-validation.js +8 -4
- package/docs/adding-a-classifier.md +5 -3
- package/docs/manifests.md +6 -6
- package/docs/resolver.md +20 -44
- package/docs/signals.md +18 -8
- package/open-classify.config.example.json +1 -4
- package/package.json +1 -1
package/docs/resolver.md
CHANGED
@@ -1,12 +1,10 @@
 # Aggregation and model resolution

-The aggregator merges classifier outputs into
+The aggregator merges classifier outputs into a `PipelineResult` with a flat shape — no nested `audit` or `downstream` envelope.

-## Certainty
+## Certainty labels

-
-
-Per-classifier outputs carry `certainty` tags. The aggregator maps tags to scores:
+Classifier outputs carry a `certainty` label. The aggregator maps labels to numeric scores for comparison and reporting:

 ```ts
 {
@@ -21,23 +19,27 @@ Per-classifier outputs carry `certainty` tags. The aggregator maps tags to score
 }
 ```

-
+Labels stay in classifier prompts (the model understands them as semantic grades). Floats appear only in the final `PipelineResult` fields: `avg_certainty`, `min_certainty`, and `classifier_outputs[name].certainty`.

-
+## Reserved-field merging

-
+When multiple classifiers emit the same reserved field, the aggregator picks the highest-certainty contributor. Ties are broken by manifest `dispatch_order` ascending (first wins). Classifiers without `dispatch_order` sort last for tie-break purposes.

-
+There is no certainty threshold gate — the highest-certainty value always wins, regardless of score. Values below any particular threshold are still reported in `classifier_outputs` for the caller to inspect.
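
The selection rule in the "Reserved-field merging" text can be sketched roughly as follows. This is an illustration only, not the package's code: `Contributor` and `pickWinner` are hypothetical names, certainty is assumed to already be a float, and keeping the first entry when two contributors tie on both certainty and dispatch order is an assumption the docs do not state.

```typescript
// Hypothetical shape for one classifier's contribution to a reserved field.
interface Contributor<T> {
  classifier: string;
  value: T;
  certainty: number;      // numeric score after label mapping
  dispatchOrder?: number; // mirrors the manifest's `dispatch_order`, if set
}

// Highest certainty wins. Ties go to the lowest dispatch order, and
// contributors without a dispatch order sort last among ties.
function pickWinner<T>(contributors: Contributor<T>[]): Contributor<T> | undefined {
  const rank = (c: Contributor<T>): number =>
    c.dispatchOrder ?? Number.POSITIVE_INFINITY;
  let best: Contributor<T> | undefined;
  for (const c of contributors) {
    if (
      best === undefined ||
      c.certainty > best.certainty ||
      (c.certainty === best.certainty && rank(c) < rank(best))
    ) {
      best = c; // no threshold check: any score can win
    }
  }
  return best;
}

// Illustrative values only.
const winner = pickWinner([
  { classifier: "a", value: "small", certainty: 0.5, dispatchOrder: 2 },
  { classifier: "b", value: "large", certainty: 0.9 },
  { classifier: "c", value: "medium", certainty: 0.9, dispatchOrder: 1 },
]);
```

Here `b` and `c` tie on certainty, and `c` takes the slot because `b` has no dispatch order.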

-
+## Action

-
+Every result has `action: "route" | "block" | "reply"`.

-
+**`"reply"`** — `preflight` emitted `final_reply`. The classifier determined it can answer the message immediately; no downstream model is needed. `result.reply` contains the text. All other classifiers still ran.

-
+**`"block"`** — something prevented routing. `result.block_reason` names the cause:
+- `"prompt_injection"` — `risk_level` is `"high_risk"` or `"unknown"`, regardless of certainty. This takes priority over other causes.
+- `"classification_error"` — one or more classifiers failed or timed out, or preflight provided no reply (which means the pipeline cannot fulfill its reply contract), or no model could be resolved.

-
+**`"route"`** — all classifiers succeeded and `result.model_id` names the downstream model to call.
+
+Even on `"block"`, `model_id` and `reply` are populated when they can be (the caller may want to store them). `failed_classifiers` lists every classifier that errored or timed out.
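
A caller might branch on `action` along these lines. This is a minimal sketch under stated assumptions: `PipelineResultLike` models only the fields named in the "Action" text (the package's real type may differ), and `describeNextStep` is a hypothetical helper.

```typescript
// Assumed, simplified view of the result: only the fields named above.
interface PipelineResultLike {
  action: "route" | "block" | "reply";
  reply?: string;
  model_id?: string;
  block_reason?: "prompt_injection" | "classification_error";
  failed_classifiers: string[];
}

// Returns a short description of what the caller would do next.
function describeNextStep(result: PipelineResultLike): string {
  if (result.action === "reply") {
    // preflight answered directly; no downstream model call is needed.
    return `send reply: ${result.reply ?? ""}`;
  }
  if (result.action === "route") {
    return `call model: ${result.model_id ?? "(unresolved)"}`;
  }
  // action === "block": report the cause and any failed classifiers.
  const reason = result.block_reason ?? "unknown";
  const failed = result.failed_classifiers.join(", ");
  return failed.length > 0
    ? `blocked (${reason}); failed: ${failed}`
    : `blocked (${reason})`;
}
```

Because `model_id` and `reply` may still be populated on `"block"`, a caller that logs results could store them alongside `block_reason` rather than discarding them.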

 ## Model resolution

@@ -60,36 +62,10 @@ Within a pass, candidates are ranked:
 3. larger `context_window`
 4. earlier catalog order

-If every pass returns no candidates, the resolver
-
-## Resolution audit
+If every pass returns no candidates, the resolver uses `catalog.default`. In practice the no-constraints pass always finds at least one model unless the catalog is empty, so the default-fallback path is defensive.

-
-
-```ts
-{
-  constraints_used: { model_specialization?: ..., model_tier?: ... },
-  constraints_dropped: Array<{
-    axis: "model_specialization" | "model_tier",
-    reason: "low_confidence" | "no_match_relaxed" | "default_fallback"
-  }>,
-  confidences: { routing?: number },
-  fell_back_to_default: boolean,
-}
-```
-
-Drop reasons:
-
-- `low_confidence` — a classifier emitted the axis but its certainty was below threshold.
-- `no_match_relaxed` — the axis was requested but no model matched, so the resolver relaxed it.
-- `default_fallback` — every pass failed; the resolver used `catalog.default`.
-
-## Audit envelope
+## Whole-run certainty summary

-
+Every run includes `avg_certainty` and `min_certainty` at the top level of `PipelineResult`. These are the arithmetic mean and minimum certainty scores across all classifiers, including failed classifiers that fell back to their manifest fallback (which use `no_signal` and score `0`).

-
-- `classifier_outputs[]` — every classifier's full output, in registry order, including `reason`, `certainty`, all reserved fields, and all custom fields
-- `model_recommendation` with the resolution audit above
-- `meta.classifiers[name]` — per-classifier full output plus `status` and `version`
-- `meta.certainty.{min, avg}` — whole-run certainty summary
+The pipeline does not block based on these values — the caller inspects them and decides whether to trust the result or fall back to a safer behavior.
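
The whole-run summary described above amounts to a mean and a minimum over per-classifier float scores. The sketch below assumes the scores are already numeric and that failed classifiers are included with a score of `0`; `summarizeCertainty` is a hypothetical name, and the behavior for an empty input is a placeholder because the docs do not specify it.

```typescript
interface CertaintySummary {
  avg_certainty: number;
  min_certainty: number;
}

// Computes the arithmetic mean and minimum over per-classifier scores.
// Failed classifiers are expected to be present with a score of 0.
function summarizeCertainty(scores: number[]): CertaintySummary {
  if (scores.length === 0) {
    // Assumption: the docs do not say what an empty run reports.
    return { avg_certainty: 0, min_certainty: 0 };
  }
  let sum = 0;
  let min = Number.POSITIVE_INFINITY;
  for (const s of scores) {
    sum += s;
    if (s < min) min = s;
  }
  return { avg_certainty: sum / scores.length, min_certainty: min };
}

// Illustrative scores: two successful classifiers and one that fell back.
const summary = summarizeCertainty([1, 0.5, 0]);
```

A single failed classifier pulls `min_certainty` to `0`, which is one reason a caller might check `failed_classifiers` before reading too much into the minimum.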
package/docs/signals.md
CHANGED
@@ -14,7 +14,7 @@ type Certainty =
 | "near_certain";
 ```

-The aggregator maps certainty tags to numeric scores.
+The aggregator maps certainty tags to numeric scores. `classifier_outputs[name].certainty` is a float; `avg_certainty` and `min_certainty` on the top-level result are also floats. Certainty labels are internal to classifier prompts; floats are what callers see.

 ## Reserved fields

@@ -26,9 +26,9 @@ A manifest declares which reserved fields its classifier may emit via the `reser
 { text: string } // 1–200 chars; must contain at least one non-whitespace character
 ```

-Use only for tiny terminal answers (greetings, thanks, spelling, simple arithmetic). The text IS the complete answer
+Use only for tiny terminal answers (greetings, thanks, spelling, simple arithmetic). The text IS the complete answer — nothing else happens after this. Mutually exclusive with `ack_reply`.

-When emitted
+When emitted, the pipeline sets `action: "reply"` and surfaces the text in `result.reply`. All other classifiers still run to completion.

 ### `ack_reply`

@@ -36,7 +36,9 @@ When emitted with sufficient certainty, the highest-certainty value is surfaced
 { text: string } // 1–200 chars; must contain at least one non-whitespace character
 ```

-A brief acknowledgement to show while downstream work continues.
+A brief, task-specific acknowledgement to show while downstream work continues. Mutually exclusive with `final_reply`.
+
+When emitted (and the action is `"route"`), the text is surfaced in `result.reply`. This is the immediate response your UI can show while the downstream model works.

 ### `model_tier`

@@ -62,22 +64,30 @@ Soft constraint for the catalog resolver. The resolver picks the cheapest catalo
 string[] // each id must appear in the manifest's allowed_tools list
 ```

-Sets `
+Sets `result.tools`. Any classifier emitting this reserved field must declare `allowed_tools` on its manifest — that menu of allowed ids becomes both the JSON Schema constraint and the prompt listing.

 Common tool-id aliases (`browser`, `browsing`, `internet`, `web_browsing`, `web_search`) are normalized to `web` before validation, so the model can drift on phrasing without breaking.

+`result.tools` is always an array (empty if no classifier emitted it or no tools were selected).
+
73
|
### `risk_level`
|
|
70
74
|
|
|
71
75
|
```ts
|
|
72
76
|
"normal" | "suspicious" | "high_risk" | "unknown"
|
|
73
77
|
```
|
|
74
78
|
|
|
75
|
-
Prompt-injection posture for the target message. Surfaced in `
|
|
79
|
+
Prompt-injection posture for the target message. Surfaced in `result.prompt_injection`.
|
|
80
|
+
|
|
81
|
+
`"high_risk"` and `"unknown"` trigger `action: "block"` with `block_reason: "prompt_injection"`, regardless of certainty. `"suspicious"` is advisory — the pipeline routes normally and the caller decides whether to act on it.
|
|
82
|
+
|
|
83
|
+
When the `prompt_injection` classifier fails (runtime error or timeout), it uses its fallback which does **not** include `risk_level`. The pipeline then blocks with `block_reason: "classification_error"`, not `"prompt_injection"` — a classifier failure is distinct from an assessed injection risk.
|
|
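
The distinction between an assessed injection risk and a classifier failure can be sketched as a small decision function. `blockReasonFor` is a hypothetical helper that covers only the two causes discussed for `risk_level`; other `classification_error` triggers, such as an unresolvable model, are left out for brevity.

```typescript
type RiskLevel = "normal" | "suspicious" | "high_risk" | "unknown";
type BlockReason = "prompt_injection" | "classification_error";

// riskLevel is undefined when the prompt_injection classifier fell back
// after a failure, since its fallback carries no risk_level.
function blockReasonFor(
  riskLevel: RiskLevel | undefined,
  anyClassifierFailed: boolean,
): BlockReason | undefined {
  // An assessed risk takes priority over every other cause.
  if (riskLevel === "high_risk" || riskLevel === "unknown") {
    return "prompt_injection";
  }
  // A failure (including a failed prompt_injection classifier) is reported
  // as a classification error, never as an injection finding.
  if (anyClassifierFailed) {
    return "classification_error";
  }
  // "normal" and "suspicious" do not block on their own.
  return undefined;
}
```

For example, `blockReasonFor(undefined, true)` models the failed-classifier case and yields `"classification_error"`, while `blockReasonFor("suspicious", false)` yields `undefined` and leaves the decision to the caller.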

 ## Custom fields

-Anything not in the reserved list lives in your manifest's `output_schema.properties`. The runtime validates each output against the composed schema (custom properties + reserved sub-schemas + `reason` + `certainty`) at runtime, and surfaces the full output on `result.classifier_outputs[name]
+Anything not in the reserved list lives in your manifest's `output_schema.properties`. The runtime validates each output against the composed schema (custom properties + reserved sub-schemas + `reason` + `certainty`) at runtime, and surfaces the full output on `result.classifier_outputs[name]`.
+
+`classifier_outputs[name]` contains all payload fields plus `reason` (string) and `certainty` (float). The raw certainty label is not exposed; only the float score.

 ## Picking between reserved-field contributors

-When two classifiers declare the same reserved field, the aggregator picks the highest-certainty value
+When two classifiers declare the same reserved field, the aggregator picks the highest-certainty value. Ties are broken by manifest `dispatch_order` ascending (first in registry order keeps the slot). Both classifiers' full outputs still appear in `classifier_outputs` regardless of which one "won" the slot.
package/open-classify.config.example.json
@@ -11,15 +11,12 @@
     },
     "models": {
       "preflight": "gemma4:e4b-it-q4_K_M",
-      "
+      "model_tier": "gemma4:e4b-it-q4_K_M",
       "model_specialization": "gemma4:e4b-it-q4_K_M",
       "tools": "gemma4:e4b-it-q4_K_M",
       "prompt_injection": "gemma4:e4b-it-q4_K_M",
       "memory_retrieval_queries": "gemma4:e4b-it-q4_K_M"
     }
   },
-  "aggregator": {
-    "certaintyThreshold": 0.65
-  },
   "catalog": "downstream-models.json"
 }