instar 1.2.60 → 1.2.62
- package/README.md +10 -0
- package/dist/commands/server.d.ts.map +1 -1
- package/dist/commands/server.js +55 -0
- package/dist/commands/server.js.map +1 -1
- package/dist/config/ConfigDefaults.d.ts.map +1 -1
- package/dist/config/ConfigDefaults.js +13 -0
- package/dist/config/ConfigDefaults.js.map +1 -1
- package/dist/core/MessagingToneGate.d.ts +2 -2
- package/dist/core/MessagingToneGate.d.ts.map +1 -1
- package/dist/core/MessagingToneGate.js +18 -1
- package/dist/core/MessagingToneGate.js.map +1 -1
- package/dist/core/TopicIntent.d.ts +62 -1
- package/dist/core/TopicIntent.d.ts.map +1 -1
- package/dist/core/TopicIntent.js +131 -2
- package/dist/core/TopicIntent.js.map +1 -1
- package/dist/core/TopicIntentCapture.d.ts +124 -0
- package/dist/core/TopicIntentCapture.d.ts.map +1 -0
- package/dist/core/TopicIntentCapture.js +232 -0
- package/dist/core/TopicIntentCapture.js.map +1 -0
- package/dist/core/TopicIntentExtractor.d.ts +32 -0
- package/dist/core/TopicIntentExtractor.d.ts.map +1 -1
- package/dist/core/TopicIntentExtractor.js +52 -3
- package/dist/core/TopicIntentExtractor.js.map +1 -1
- package/dist/core/types.d.ts +10 -0
- package/dist/core/types.d.ts.map +1 -1
- package/dist/core/types.js.map +1 -1
- package/dist/server/CapabilityIndex.d.ts.map +1 -1
- package/dist/server/CapabilityIndex.js +1 -0
- package/dist/server/CapabilityIndex.js.map +1 -1
- package/dist/server/topicIntentRoutes.d.ts.map +1 -1
- package/dist/server/topicIntentRoutes.js +61 -1
- package/dist/server/topicIntentRoutes.js.map +1 -1
- package/package.json +1 -1
- package/src/data/builtin-manifest.json +2 -2
- package/upgrades/1.2.61.md +37 -0
- package/upgrades/1.2.62.md +75 -0
- package/upgrades/side-effects/topic-intent-capture-loop.md +100 -0
- package/upgrades/side-effects/wall-is-a-hypothesis-standard.md +49 -0
|
package/upgrades/1.2.62.md
@@ -0,0 +1,75 @@
|
|
|
1
|
+
# Upgrade Guide — topic-intent auto-capture loop (rung 0)
|
|
2
|
+
|
|
3
|
+
<!-- bump: minor -->
|
|
4
|
+
<!-- minor = new features, new APIs, new capabilities (backwards-compatible) -->
|
|
5
|
+
|
|
6
|
+
## What Changed
|
|
7
|
+
|
|
8
|
+
**The topic-intent "cabinet" now fills itself from live conversation.**
|
|
9
|
+
|
|
10
|
+
The Topic-Intent Layer shipped a while back — a per-topic store of the facts
|
|
11
|
+
and decisions a conversation establishes, a session-start briefing that reads
|
|
12
|
+
from it, and an ArcCheck that guards against acting on shaky ground. But its
|
|
13
|
+
defining capability was never switched on: nothing ever read a real
|
|
14
|
+
conversation turn and filed anything. The store stayed empty, so the briefing
|
|
15
|
+
rendered blank and ArcCheck had nothing to gate against. This is why the
|
|
16
|
+
original methodology-drift incident found "topic-intent had no record for the
|
|
17
|
+
topic" — the drift-catching machine was shipped but asleep.
|
|
18
|
+
|
|
19
|
+
This change wires the capture "clerk" that reads each substantive turn and
|
|
20
|
+
gives it a cheap, fast-tier "anything worth filing here?" read, with **broader
|
|
21
|
+
context** — the topic's already-established refs plus a rolling summary — so it
|
|
22
|
+
judges significance against what the conversation is actually about, not one
|
|
23
|
+
line in isolation.
|
|
24
|
+
|
|
25
|
+
Built in:
|
|
26
|
+
|
|
27
|
+
- **A deterministic, fail-open pre-filter** so trivial turns (empty, bare
|
|
28
|
+
"ok"/"thanks", agent status/heartbeat lines) never reach the model. Registered
|
|
29
|
+
as a state-detector with a canary (`docs/specs/06-state-detector-registry.md`)
|
|
30
|
+
that guards against sentinel-format drift silently dropping real captures.
|
|
31
|
+
- **Cost controls**: every extraction is admitted through the shared LlmQueue on
|
|
32
|
+
the background lane (yields to interactive work, shares the daily cap), runs on
|
|
33
|
+
the subscription transport (never the raw paid API), is bounded by a per-topic
|
|
34
|
+
rate ceiling, and backs off under quota pressure. Fire-and-forget, so capture
|
|
35
|
+
latency can never slow a message reaching you.
|
|
36
|
+
- **Prompt-injection hardening**: prior notes and the rolling summary are fed
|
|
37
|
+
back to the model only inside delimited untrusted-data blocks, truncated to
|
|
38
|
+
hard caps.
|
|
39
|
+
- **Concurrency-safe writes**: the store's append path holds a per-topic lock so
|
|
40
|
+
two sessions capturing the same topic can't drop each other's events.
|
|
41
|
+
- **Whole-loop observability** (`GET /topic-intent/:topicId/capture-metrics`): the
|
|
42
|
+
funnel of captured → surfaced → used → corrected. Paired with the
|
|
43
|
+
human-as-detector heat map (what we MISSED), this is the read for tuning
|
|
44
|
+
effectiveness over time.
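
The prompt-injection hardening bullet above can be illustrated with a minimal sketch. It assumes a hypothetical `wrapUntrusted` helper and made-up delimiter strings; the package's actual delimiters, caps, and function names are not shown in this diff.

```typescript
// Hypothetical delimiters and helper: none of these names come from the package.
const OPEN = "<<<UNTRUSTED_DATA";
const CLOSE = "UNTRUSTED_DATA>>>";

function wrapUntrusted(label: string, text: string, maxChars: number): string {
  // Truncate first so the hard cap bounds what the model sees.
  const capped = text.length > maxChars ? text.slice(0, maxChars) : text;
  // Neutralise delimiter look-alikes so the data cannot close its own block.
  const safe = capped.split(OPEN).join("[removed]").split(CLOSE).join("[removed]");
  // `label` is assumed to be supplied by trusted code, never by the conversation.
  return `${OPEN} label=${label}\n${safe}\n${CLOSE}`;
}
```

Truncating before sanitising keeps the cap simple to reason about, and the replacement token is shorter than either delimiter, so the wrapped payload never grows past the cap.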
|
|
45
|
+
|
|
46
|
+
ON by default (ratified), with a kill-switch: `topicIntent.capture.enabled:
|
|
47
|
+
false`. Existing agents get the default on update.
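
As a rough illustration of the kill-switch semantics, the sketch below resolves the flag with an inline fallback. The trimmed `AgentConfig` shape and the `isCaptureEnabled` helper are assumptions for this example; per the side-effects review, the package itself supplies the default through `SHARED_DEFAULTS` rather than an inline fallback.

```typescript
// Hypothetical, trimmed-down config shape; the real InstarConfig has many more fields.
interface CaptureConfig { enabled?: boolean }
interface TopicIntentConfig { capture?: CaptureConfig }
interface AgentConfig { topicIntent?: TopicIntentConfig }

// Capture is on unless the operator explicitly sets `enabled` to false.
function isCaptureEnabled(config: AgentConfig): boolean {
  return config.topicIntent?.capture?.enabled ?? true;
}

// What an operator would set to turn capture off.
const killSwitchOff: AgentConfig = { topicIntent: { capture: { enabled: false } } };
```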
|
|
48
|
+
|
|
49
|
+
Spec: `docs/specs/topic-intent-capture-loop.md` (converged iter 3, approved).
|
|
50
|
+
ELI16: `docs/specs/topic-intent-capture-loop.eli16.md`.
|
|
51
|
+
Side-effects review: `upgrades/side-effects/topic-intent-capture-loop.md`.
|
|
52
|
+
|
|
53
|
+
## What to Tell Your User
|
|
54
|
+
|
|
55
|
+
- **Conversation memory**: "I now quietly remember the facts and decisions from
|
|
56
|
+
our conversations, so when we pick a topic back up I already know where we
|
|
57
|
+
left off — instead of you having to remind me."
|
|
58
|
+
|
|
59
|
+
## Summary of New Capabilities
|
|
60
|
+
|
|
61
|
+
| Capability | How to Use |
|
|
62
|
+
|-----------|-----------|
|
|
63
|
+
| Per-turn topic-intent capture | Automatic (ON by default; disable via `topicIntent.capture.enabled: false`) |
|
|
64
|
+
| Capture-loop funnel metrics | `GET /topic-intent/:topicId/capture-metrics` (operator-only) |
|
|
65
|
+
|
|
66
|
+
## Evidence
|
|
67
|
+
|
|
68
|
+
Not a bug fix in the runtime sense — this wires a capability that shipped inert.
|
|
69
|
+
Verified end-to-end: 30 new tests across all three tiers (20 unit, 6
|
|
70
|
+
integration, 4 e2e) plus the 13 pre-existing capture tests — 138 topic-intent
|
|
71
|
+
tests green. The e2e **wiring-integrity** test fires a real inbound turn through
|
|
72
|
+
the live `onMessageLogged` callback and asserts a ref lands in the store and the
|
|
73
|
+
briefing goes from empty to non-empty (the anti-"shipped-but-asleep" guard), and
|
|
74
|
+
that the extraction went through the queue's background lane on the injected
|
|
75
|
+
provider, never a raw API client. `tsc` + lint clean.
|
|
package/upgrades/side-effects/topic-intent-capture-loop.md
@@ -0,0 +1,100 @@
|
|
|
1
|
+
# Side-effects review — topic-intent auto-capture loop (rung 0)
|
|
2
|
+
|
|
3
|
+
**Scope**: Wire the topic-intent capture loop so the per-topic store actually
|
|
4
|
+
fills from live conversation (closing the "shipped but asleep" gap — the store,
|
|
5
|
+
read routes, and session-start briefing all shipped, but nothing ever invoked
|
|
6
|
+
`ingest()` on a real turn). Adds the adapter-agnostic capture "clerk", broader
|
|
7
|
+
context (rolling summary + established refs) feeding the extractor, cost
|
|
8
|
+
controls, prompt-injection-hardened extraction, the live wiring on the inbound
|
|
9
|
+
message path, and whole-loop observability. Spec:
|
|
10
|
+
`docs/specs/topic-intent-capture-loop.md` (converged iter 3, approved by justin).
|
|
11
|
+
|
|
12
|
+
**Files touched**:
|
|
13
|
+
- `src/core/TopicIntent.ts` — add `CaptureCounters` to `TelemetryCounters`
|
|
14
|
+
(defaulted on read for back-compat) + `defaultCaptureCounters()` +
|
|
15
|
+
`bumpCaptureCounters()` (atomic under the existing per-topic lock). Switch the
|
|
16
|
+
two `withTopicLock` lock-dir removals to `SafeFsExecutor.safeRmdirSync`.
|
|
17
|
+
- `src/core/TopicIntentExtractor.ts` — `createLlmExtractFn` gains an optional
|
|
18
|
+
`onDegrade(reason, topicId)` observability hook; still returns `[]` on every
|
|
19
|
+
degrade path (degrade-safety unchanged).
|
|
20
|
+
- `src/core/TopicIntentCapture.ts` — NEW. The capture step: `isSubstantiveTurn`
|
|
21
|
+
pre-filter (deterministic, fail-open) + canary; `createQueuedIntelligence`
|
|
22
|
+
(queue-backed, subscription-transport); `captureTurn` + `createCaptureLoop`
|
|
23
|
+
(rate-state-owning closure).
|
|
24
|
+
- `src/server/topicIntentRoutes.ts` — NEW `GET /topic-intent/:id/capture-metrics`
|
|
25
|
+
(the whole-loop funnel); `briefing_served` metering on the briefing route;
|
|
26
|
+
`arccheck_fired`/`arccheck_signalled` metering on the arccheck route.
|
|
27
|
+
- `src/server/CapabilityIndex.ts` — add `topic-intent` to `INTERNAL_PREFIXES`
|
|
28
|
+
(operator-only; not a discoverable agent endpoint).
|
|
29
|
+
- `src/config/ConfigDefaults.ts` — `topicIntent.capture.enabled: true` in
|
|
30
|
+
`SHARED_DEFAULTS` (auto-applies on init AND migration → migration parity).
|
|
31
|
+
- `src/core/types.ts` — add optional `topicIntent` to `InstarConfig`.
|
|
32
|
+
- `src/commands/server.ts` — construct the queue-backed extractor + capture loop
|
|
33
|
+
and chain it onto `telegram.onMessageLogged` (preserving prior callbacks),
|
|
34
|
+
gated on `sharedIntelligence && config.topicIntent.capture.enabled`.
|
|
35
|
+
- Tests: `tests/unit/TopicIntentCapture.test.ts`,
|
|
36
|
+
`tests/integration/topic-intent-capture-routes.test.ts`,
|
|
37
|
+
`tests/e2e/topic-intent-capture-lifecycle.test.ts`.
|
|
38
|
+
- `docs/specs/06-state-detector-registry.md` — NEW registry; pre-filter entry.
|
|
39
|
+
|
|
40
|
+
**Under-block**: The pre-filter is fail-open — when unsure it passes the turn to
|
|
41
|
+
the LLM, so it cannot silently swallow a substantive turn on an ambiguous input.
|
|
42
|
+
Its only confident skips are empty/whitespace, whole-message bare acks, and
|
|
43
|
+
agent sentinel/heartbeat lines (agent turns only). Risk of under-block (a real
|
|
44
|
+
turn skipped) is bounded to sentinel-format drift, which the canary guards.
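
A fail-open pre-filter of the kind described here might look like the sketch below. The `Turn` shape, the ack list, and the sentinel pattern are all assumptions made for illustration; the diff does not show the real `isSubstantiveTurn` body or the actual sentinel format.

```typescript
// Hypothetical turn shape and patterns; the package's CaptureTurnEntry and
// sentinel format are not shown in this diff.
interface Turn { text: string; fromAgent: boolean }

const BARE_ACKS = new Set(["ok", "okay", "thanks", "thank you", "got it"]);
// Assumed sentinel format, for illustration only.
const SENTINEL = /^\[(status|heartbeat)\]/i;

function isSubstantiveTurn(turn: Turn): boolean {
  const text = turn.text.trim();
  if (text.length === 0) return false; // empty or whitespace
  const normalized = text.toLowerCase().replace(/[.!]+$/, "");
  if (BARE_ACKS.has(normalized)) return false; // the whole message is an ack
  if (turn.fromAgent && SENTINEL.test(text)) return false; // agent turns only
  return true; // fail open: anything not confidently trivial goes to the model
}
```

The final `return true` is the fail-open property: only the three confident skip cases return `false`, so an unrecognised input always reaches the extractor.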
|
|
45
|
+
|
|
46
|
+
**Over-block**: The only "block"-shaped behavior is the pre-filter skip and the
|
|
47
|
+
QuotaTracker shed. Over-skipping costs a missed cheap extraction, never a
|
|
48
|
+
delivery failure or a user-visible block. The canary asserts known substantive
|
|
49
|
+
turns (including ack-prefixed ones) are NOT skipped.
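
The canary idea can be sketched as a function that runs known-substantive samples through a filter and reports any that would be skipped. The `runCanary` name, the samples, and both example filters are hypothetical; the registry entry in `docs/specs/06-state-detector-registry.md` defines the real canary.

```typescript
type TurnFilter = (text: string) => boolean; // true = substantive, keep

// Returns the known-substantive samples that the filter would wrongly skip.
function runCanary(filter: TurnFilter, knownSubstantive: string[]): string[] {
  return knownSubstantive.filter((sample) => !filter(sample));
}

// Hypothetical samples, including ack-prefixed turns.
const CANARY_SAMPLES = [
  "ok, but move the launch to Friday",
  "thanks - the budget cap is 500 a month",
  "We decided to use Postgres for the store",
];

// A deliberately broken filter that skips anything starting with an ack.
const overEagerFilter: TurnFilter = (text) => !/^(ok|thanks)\b/i.test(text);
// A filter that only skips whole-message acks.
const wholeMessageFilter: TurnFilter = (text) => !/^(ok|thanks)[.!]?$/i.test(text.trim());
```

An empty result means the filter passes the canary; a non-empty result names the exact samples it would have dropped.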
|
|
50
|
+
|
|
51
|
+
**Level-of-abstraction fit**: The capture step is adapter-agnostic — it takes a
|
|
52
|
+
generic `CaptureTurnEntry`, not a Telegram type; Telegram is merely the first
|
|
53
|
+
wiring (other adapters tracked as `cwa-multi-adapter-capture`). The store stays
|
|
54
|
+
the single authority for persistence/projection; the extractor owns extraction;
|
|
55
|
+
the capture helper only orchestrates. Transport is delegated to the injected
|
|
56
|
+
`sharedIntelligence` provider (subscription/REPL-pool) through the shared
|
|
57
|
+
`LlmQueue` — capture never reaches for a raw API client.
|
|
58
|
+
|
|
59
|
+
**Signal vs authority**: Capture only RECORDS (append-only evidence); it has no
|
|
60
|
+
blocking authority. ArcCheck SIGNALS; neither blocks a send. The pre-filter is a
|
|
61
|
+
brittle low-context detector emitting a skip signal, never a gate. This matches
|
|
62
|
+
`[[feedback_signal_vs_authority]]`.
|
|
63
|
+
|
|
64
|
+
**Interactions**:
|
|
65
|
+
- `telegram.onMessageLogged` is a single-assignment property already chained by
|
|
66
|
+
PresenceProxy, human-as-detector, and the keep-watching detector. The capture
|
|
67
|
+
wiring preserves the prior callback (`const before = ...; cb = (e) => { before?.(e); capture(e); }`),
|
|
68
|
+
verified by the e2e chain test (prior callback still fires).
|
|
69
|
+
- Capture is fire-and-forget (`void captureLoop(...)`) so extraction latency
|
|
70
|
+
can never reach the delivery path (acceptance #4).
|
|
71
|
+
- Extraction is admitted on the LlmQueue **background** lane, so it yields to
|
|
72
|
+
interactive (PresenceProxy/PromiseBeacon) work and shares the daily cap. On
|
|
73
|
+
cap breach the queue throws → `createLlmExtractFn` catches → degrades to a
|
|
74
|
+
`degraded_cap_or_error` tick.
|
|
75
|
+
- `bumpCaptureCounters`, `bumpTurn`, and `appendEvidence` each take the per-topic
|
|
76
|
+
lock separately (multiple short acquisitions per turn). Correct under
|
|
77
|
+
concurrency (the concurrency test still passes); accepted minor lock churn for
|
|
78
|
+
v1 since capture runs off the delivery path.
|
|
79
|
+
- The briefing route now has a metering side-effect on a GET (writes
|
|
80
|
+
`briefing_served`). Intentional per spec §10; best-effort and never blocks the
|
|
81
|
+
fetch.
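
The callback chaining and fire-and-forget behavior from the first two bullets above can be sketched as follows. The `Adapter` and `LoggedMessage` shapes and the `chainCapture` helper are assumptions for illustration; the real wiring lives in `src/commands/server.ts` and is not shown in this diff.

```typescript
// Hypothetical adapter and event shapes, for illustration only.
interface LoggedMessage { topicId: string; text: string }
interface Adapter { onMessageLogged?: (entry: LoggedMessage) => void }

function chainCapture(
  adapter: Adapter,
  capture: (entry: LoggedMessage) => Promise<void>,
): void {
  const before = adapter.onMessageLogged; // keep whatever was installed earlier
  adapter.onMessageLogged = (entry) => {
    before?.(entry);
    // Fire-and-forget: do not await, and swallow failures so a capture error
    // can never surface on the delivery path.
    void capture(entry).catch(() => undefined);
  };
}
```

Because the property is single-assignment, each new listener has to capture the previous value before overwriting it; skipping that step would silently drop the earlier listeners.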
|
|
82
|
+
|
|
83
|
+
**External surfaces**:
|
|
84
|
+
- New endpoint: `GET /topic-intent/:topicId/capture-metrics` (operator-only).
|
|
85
|
+
- New config field: `topicIntent.capture.enabled` (default true; kill-switch).
|
|
86
|
+
- New additive `TopicIntentFile` fields (`telemetry.capture.*`), defaulted on
|
|
87
|
+
read — old files load unchanged.
|
|
88
|
+
- New exported symbols in `TopicIntentCapture.ts`; no breaking change to
|
|
89
|
+
existing exports.
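
"Defaulted on read" can be sketched as merging zeroed counters under whatever the file already contains. The counter names below follow the funnel wording in the upgrade guide but are assumptions, as are the `Telemetry` shape and the `readTelemetry` helper; only the name `defaultCaptureCounters` appears in the diff, and its real body is not shown.

```typescript
// Hypothetical counter names; the diff does not list the real CaptureCounters fields.
interface CaptureCounters { captured: number; surfaced: number; used: number; corrected: number }
interface Telemetry { turns: number; capture: CaptureCounters }

function defaultCaptureCounters(): CaptureCounters {
  return { captured: 0, surfaced: 0, used: 0, corrected: 0 };
}

// Files written before this release have no `capture` key; fill it in on read.
function readTelemetry(raw: { turns?: number; capture?: Partial<CaptureCounters> }): Telemetry {
  return {
    turns: raw.turns ?? 0,
    capture: { ...defaultCaptureCounters(), ...raw.capture },
  };
}
```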
|
|
90
|
+
|
|
91
|
+
**Cost (the one genuinely new ongoing cost)**: This is the product's first
|
|
92
|
+
always-on per-turn LLM path. Bounded by: the deterministic pre-filter (most
|
|
93
|
+
turns never reach the model), a per-topic rate ceiling (30/60s), the LlmQueue
|
|
94
|
+
daily cap (best-effort, per-process), and QuotaTracker load-shedding. ON by
|
|
95
|
+
default is ratified.
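
One way to picture the per-topic rate ceiling is a closure that owns a sliding window of admission timestamps per topic. This is only a sketch: the 30-per-60s figure comes from the text above, while the `createRateCeiling` name and the sliding-window policy are assumptions (the package may use a fixed window or a different structure).

```typescript
// Closure that owns per-topic rate state. Names are hypothetical.
function createRateCeiling(limit = 30, windowMs = 60_000) {
  const admitted = new Map<string, number[]>(); // topicId -> admission timestamps

  return function admit(topicId: string, nowMs: number): boolean {
    // Keep only the admissions that still fall inside the window.
    const recent = (admitted.get(topicId) ?? []).filter((t) => nowMs - t < windowMs);
    if (recent.length >= limit) {
      admitted.set(topicId, recent);
      return false; // over the ceiling: skip this extraction
    }
    recent.push(nowMs);
    admitted.set(topicId, recent);
    return true;
  };
}
```

Passing the clock in as `nowMs` keeps the sketch deterministic and easy to test; a production version would more likely read the time itself.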
|
|
96
|
+
|
|
97
|
+
**Rollback cost**: Low. Config kill-switch `topicIntent.capture.enabled: false`
|
|
98
|
+
makes capture inert immediately (store + routes remain, as today). Full revert:
|
|
99
|
+
drop the server.ts wiring block + the new file; the store/routes/briefing return
|
|
100
|
+
to the inert pre-capture state. Additive store fields are harmless if left.
|
|
package/upgrades/side-effects/wall-is-a-hypothesis-standard.md
@@ -0,0 +1,49 @@
|
|
|
1
|
+
# Side-Effects Review — A Wall Is a Hypothesis (B16_UNVERIFIED_WALL)
|
|
2
|
+
|
|
3
|
+
**Slug:** `wall-is-a-hypothesis-standard`
|
|
4
|
+
**Date:** 2026-05-24
|
|
5
|
+
**Author:** echo
|
|
6
|
+
**Second-pass reviewer:** internal conformance pass
|
|
7
|
+
|
|
8
|
+
## Summary of the change
|
|
9
|
+
|
|
10
|
+
Adds the constitution standard "A Wall Is a Hypothesis" to `docs/STANDARDS-REGISTRY.md` and its structural enforcement: a new always-evaluated rule **B16_UNVERIFIED_WALL** in `MessagingToneGate` (the existing outbound-message authority that hosts B15). B16 blocks an outbound message that declares a path impossible/blocked/infeasible because an interface/API/mechanism is missing, when the message shows no evidence the agent inventoried its own capabilities first. Also registers the standard in `docs/INSTAR-DESIGN-PRINCIPLES-AND-LESSONS.md` (the catalog the `/spec-converge` reviewer loads).
|
|
11
|
+
|
|
12
|
+
## Decision-point inventory
|
|
13
|
+
|
|
14
|
+
- `VALID_RULES` set: **add** `'B16_UNVERIFIED_WALL'`. Without this, the gate's drift detection treats a legitimate B16 citation as an unknown rule and fails open.
|
|
15
|
+
- `buildPrompt()` rule section — **add** the B16 definition after B15 (always-evaluated, no precondition).
|
|
16
|
+
- Response-format enumeration + two stale doc comments — **modify** to include B16 (the comments already lagged at B14).
|
|
17
|
+
- No route changes: `checkOutboundMessage` → 422 is rule-agnostic; B16 rides the existing outbound paths.
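
The role of `VALID_RULES` in the list above can be sketched as follows. The `JudgeVerdict` shape and `resolveVerdict` helper are hypothetical, and the set here holds only the one rule id named in this review; the real set in `MessagingToneGate` contains the other rule ids as well.

```typescript
// Hypothetical verdict shape; only the B16 rule id comes from the text above.
interface JudgeVerdict { pass: boolean; rule?: string }

const VALID_RULES = new Set<string>(["B16_UNVERIFIED_WALL"]); // the real set has more ids

function resolveVerdict(verdict: JudgeVerdict): { allow: boolean; rule?: string } {
  if (verdict.pass) return { allow: true };
  // A block must cite a known rule; otherwise treat it as judge drift and fail open.
  if (!verdict.rule || !VALID_RULES.has(verdict.rule)) return { allow: true };
  return { allow: false, rule: verdict.rule };
}
```

This is why the rule id has to be registered: if `'B16_UNVERIFIED_WALL'` were missing from the set, a correct B16 block would take the drift branch and the message would be delivered.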
|
|
18
|
+
|
|
19
|
+
## 1. Over-block
|
|
20
|
+
|
|
21
|
+
The principal over-block risk: blocking ordinary "I can't do X" messages. Mitigated in the rule text — severity explicitly favors false-negatives; genuinely-external limits ("can't read your email until you connect it"), walls reported after a visible inventory, real either/or questions, real runtime errors, and messages discussing the rule all pass. The rule targets only the precise pattern: an internal feasibility verdict resting on a missing interface with no inventory shown.
|
|
22
|
+
|
|
23
|
+
## 2. Under-block (a real wall-surrender slipping through)
|
|
24
|
+
|
|
25
|
+
Possible if the LLM judge misses a borderline case — acceptable by design (favor false-negatives), matching the gate's stated philosophy (high signal, not adversarial correctness). The standard + the `/spec-converge` registration provide the softer review-time catch as backup.
|
|
26
|
+
|
|
27
|
+
## 3. Level-of-abstraction fit
|
|
28
|
+
|
|
29
|
+
Correct: the guard lives inside the single outbound authority (where B15 lives), not as a new detector with independent block power. Signal-vs-authority compliant.
|
|
30
|
+
|
|
31
|
+
## 4. Blocking authority
|
|
32
|
+
|
|
33
|
+
No new brittle authority. B16 is one more rule the existing authority may cite; the 422 plumbing is unchanged. Fail-open behavior (LLM error/timeout/invalid-rule) is inherited unchanged.
|
|
34
|
+
|
|
35
|
+
## 5. Interactions
|
|
36
|
+
|
|
37
|
+
B16 is always evaluated alongside B15 and the signal/health rules in one LLM call — no extra calls, marginally longer prompt. No interaction with the health-alert (B12-B14) or style (B11) rules, which remain gated by their preconditions. The drift-detection branch is unaffected (an invented rule id still fails open — covered by a regression test).
|
|
38
|
+
|
|
39
|
+
## 6. External surfaces
|
|
40
|
+
|
|
41
|
+
None. No new endpoints, credentials, or network calls. The standard's enforcement claim was verified against code before authoring (the registry is not parsed at runtime; the conformance gate and Usher are unbuilt North Star designs) — so the "Applied through" line states only what exists.
|
|
42
|
+
|
|
43
|
+
## 7. Rollback cost
|
|
44
|
+
|
|
45
|
+
Low. Reverting removes the rule from the set + prompt and the doc entries; no state, no migration, no schema. An older server simply lacks the rule.
|
|
46
|
+
|
|
47
|
+
## 8. Test evidence
|
|
48
|
+
|
|
49
|
+
Unit (`messaging-tone-gate-b16.test.ts`, 9 tests) + integration (`telegram-reply-b16-wall.test.ts`, 2 tests) green; tsc clean. Both sides of the decision boundary covered with realistic inputs; the /goal-style wall blocks through the real route (422, message suppressed) and the happy path still delivers (200).
|