agenr 0.7.6 → 0.7.7

package/CHANGELOG.md CHANGED
@@ -1,5 +1,18 @@
  # Changelog
 
+ ## [0.7.7] - 2026-02-20
+
+ ### Fixed
+ - fix(extractor): rewrote importance score calibration in SYSTEM_PROMPT -- per-score definitions (5-10) replace undifferentiated 8-10 band
+ - fix(extractor): added signal-cost framing -- 8+ fires real-time cross-session alerts; prompt now uses this as conservative filter
+ - fix(extractor): made score 7 the explicit default workhorse; 8+ now requires cross-session justification
+ - fix(extractor): added dev-session-observations rule -- verified/tested/confirmed patterns cap at 6 unless result is surprising or breaking
+ - fix(extractor): resolved conflict between dev-session cap and explicit memory request rule ("remember this" overrides cap)
+ - fix(extractor): removed "verified again today" from score-8 pnpm example to avoid contradicting dev-session rule
+ - fix(extractor): added NOT-8 negative examples alongside existing NOT-9 callouts
+ - fix(extractor): added 3 non-developer few-shot examples (health at 8, personal at 7, preference at 6) to prevent domain bias
+ - fix(extractor): lowered 8+ calibration cap from 30% to 20%
+
  ## [0.7.6] - 2026-02-20
 
  ### Fixed
@@ -22,6 +35,7 @@
  - fix(extractor): increase MAX_PREFETCH_RESULTS from 3 to 5 and lower PREFETCH_SIMILARITY_THRESHOLD from 0.78 to 0.72
  - fix(extractor): increase PREFETCH_CANDIDATE_LIMIT from 10 to 15 for broader elaborative encoding candidates
  - fix(extractor): tighten extractor prompt to suppress near-variant entries already captured in DB
+ - fix(extractor): recalibrate importance scoring anchors so routine verifications and test-pass observations default to 6-7; reserve 8+ for cross-session alert-worthy updates
 
  ### Added
  - feat(plugin): signalCooldownMs config - minimum ms between signal batches per session (default: 30000)
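The 0.7.7 entries in this changelog describe two prompt rules that interact: routine verifications cap at 6 unless the result is surprising or breaking, and an explicit "remember this" request overrides that cap. In the package these rules are prompt text interpreted by a model, not code. The TypeScript below is only a minimal sketch that restates the precedence. Every identifier (`Candidate`, `resolveImportance`, the field names) is hypothetical, and the floor of 7 for explicit requests is an assumption read from the pre-emit checklist wording ("justify importance >= 7").

```typescript
// Hypothetical restatement of the prompt rules; not part of agenr's code.
interface Candidate {
  proposed: number;              // score the extractor would otherwise assign (1-10)
  devVerification: boolean;      // "we tested X and it worked", "verified X", "X is passing"
  surprisingOrBreaking: boolean; // a critical bug or breaking change was discovered
  explicitRemember: boolean;     // user said "remember this" or "remember that"
}

function resolveImportance(c: Candidate): number {
  // Explicit memory requests take precedence: the 6-cap does not apply,
  // and the request justifies a score of at least 7.
  if (c.explicitRemember) {
    return Math.max(c.proposed, 7);
  }
  // Routine dev-session verifications are capped at 6.
  if (c.devVerification && !c.surprisingOrBreaking) {
    return Math.min(c.proposed, 6);
  }
  // Everything else keeps its proposed score.
  return c.proposed;
}
```

Checking the explicit request first mirrors the changelog's statement that "remember this" overrides the cap; reversing the two branches would reintroduce the conflict that 0.7.7 resolved.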
package/dist/cli-main.js CHANGED
@@ -8964,17 +8964,69 @@ If uncertain whether durable, skip.
 
  ## Importance (1-10)
 
- Emit only importance >= 5. Start every candidate at 5; raise only with clear justification.
+ Emit only importance >= 5. Start every candidate at 7; lower or raise only with clear justification.
 
- 8-10: biographical facts, durable strategic decisions, foundational architecture
- 6-7: meaningful project facts, preferences, events that matter beyond this week
- 5: borderline but still durable for days/weeks and actionable in future context
- 1-4: noise \u2014 do not emit
+ Importance scores map to real behavior in the memory system:
+ - 8 or higher fires a real-time cross-session signal (an alert to other active AI sessions)
+ - 7 is stored silently; no alert fires
+ - Below 7 is stored but deprioritized in recall
+
+ Use that signal cost as your conservative filter. Ask: "Does someone in another session need to know this RIGHT NOW?" If no, stay at 7 or below.
+
+ Score anchors:
+
+ 10: Once-per-project facts. Core identity, permanent constraints, "never forget this."
+ Example: "This project must never use GPL-licensed dependencies."
+ Example: "The production database password rotation requires manual approval."
+
+ 9: Critical breaking changes or decisions with immediate cross-session impact.
+ Use for: major architecture reversals, breaking API changes, critical blockers discovered.
+ Example: "agenr embed API changed: model param is now required; all callers must update."
+ Example: "Decided to abandon SQLite-vec in favor of Postgres pgvector - all storage code changes."
+ NOT 9: "we verified signals work" (that is a 6)
+ NOT 9: "tests are passing" (that is a 5-6)
+ NOT 9: "deployed feature X" (that is a 7 event at most)
+
+ 8: Things an active parallel session would act on if notified right now.
+ Use for: new user preferences discovered, important architectural facts just learned,
+ active blocking issues, key decisions made today that others need to know.
+ Example: "User prefers pnpm over npm for all projects in this workspace."
+ Example: "The chunker silently drops chunks over 8k tokens - callers must split first."
+ If in doubt between 7 and 8, use 7.
+ NOT 8: "we decided to use TypeScript strict mode" (that is a 7 decision)
+ NOT 8: "the user's daughter is named Emma" (that is a 7 biographical fact)
+ NOT 8: "confirmed the import worked" (that is a 6 verification)
+
+ 7: Default for solid durable facts. Stored, retrievable, no alert.
+ Use for: project facts, preferences (non-critical), completed milestones, stable architecture notes.
+ Example: "agenr stores entries in SQLite with sqlite-vec for vector search."
+ Example: "Completed brain audit. Found 73% noise rate in knowledge base."
+ This is the right score for most extracted entries.
+
+ 6: Routine durable observations. Worth storing but minor.
+ Use for: dev session observations, test results, routine verifications, minor notes.
+ Example: "Verified that signal emission works end to end in local testing."
+ Example: "Confirmed the import path change did not break CLI startup."
+ Example: "agenr extraction runs in about 2s per chunk on the test dataset."
+
+ 5: Borderline. Only emit if clearly durable beyond today and actionable in a future session.
+ Example: "Port 4242 is the default for the local test server."
 
  Calibration:
  - Typical chunk: 0-3 entries. Most chunks: 0.
- - 8+ entries: usually 0-1 per chunk
- - If >30% of emitted entries are 8+, you are inflating
+ - Score 9 or 10: very rare, at most 1 per significant session, often 0
+ - Score 8: at most 1-2 per session; ask the cross-session-alert question before assigning
+ - Score 7: this is your workhorse; most emitted entries should be 7
+ - Score 6: routine dev observations that are still worth storing
+ - If more than 20% of your emitted entries are 8 or higher, you are inflating
+
+ Dev session observations rule: Anything in the form "we tested X and it worked", "verified X",
+ "confirmed X runs", "X is passing" belongs at 6 unless the result was surprising or breaks
+ something. Surprising means a critical bug was found or a breaking change was discovered --
+ minor timing differences or slightly unexpected output do not count. Test passes and routine
+ verifications are not cross-session alerts.
+ Exception: if the user explicitly said "remember this" or "remember that", the explicit memory
+ request rule takes precedence and the 6-cap does not apply.
 
  ## Subject (critical)
 
@@ -9038,7 +9090,8 @@ Before emitting EACH entry, all five must be true:
  2. Durable beyond the immediate step
  3. Non-duplicate of another entry in this batch
  4. Importance >= 5 with a concrete reason
- 5. Explicit user "remember this/that" requests justify importance >= 7 regardless of content type
+ 5. Explicit user "remember this/that" requests justify importance >= 7 regardless of content
+ type, including dev session verifications (this overrides the dev-session-observations rule).
  If any check fails, do not emit.
 
  ## Few-Shot Examples
@@ -9122,6 +9175,50 @@ EVENT:
  "source_context": "Brain audit completed and findings documented"
  }
 
+ EVENT:
+ {
+ "type": "event",
+ "subject": "agenr signal emission",
+ "content": "Verified end-to-end signal emission works in local testing. Entry stored, signal fired, received in OpenClaw session.",
+ "importance": 6,
+ "expiry": "temporary",
+ "tags": ["agenr", "signals", "testing"],
+ "source_context": "Dev session verification of signal feature"
+ }
+
+ FACT:
+ {
+ "type": "fact",
+ "subject": "user penicillin allergy",
+ "content": "User is allergic to penicillin. Must never suggest it or related antibiotics.",
+ "importance": 8,
+ "expiry": "permanent",
+ "tags": ["health", "allergy", "critical"],
+ "source_context": "User mentioned allergy when discussing a doctor visit -- scored 8 not 9/10 because a parallel session needs to know this now (e.g. to avoid recommending antibiotics), but it is not a breaking change or architectural decision"
+ }
+
+ FACT:
+ {
+ "type": "fact",
+ "subject": "user daughter name",
+ "content": "User's daughter is named Emma. She is 8 years old.",
+ "importance": 7,
+ "expiry": "permanent",
+ "tags": ["family", "personal"],
+ "source_context": "User mentioned daughter while discussing weekend plans"
+ }
+
+ PREFERENCE:
+ {
+ "type": "preference",
+ "subject": "meeting time preference",
+ "content": "User prefers morning meetings before noon and avoids late afternoon calls.",
+ "importance": 6,
+ "expiry": "long-term",
+ "tags": ["schedule", "preference"],
+ "source_context": "User mentioned scheduling preference during calendar discussion -- scored 6 not 8 because no parallel session needs to act on this immediately; it is a low-urgency convenience preference"
+ }
+
  ### BORDERLINE \u2014 skip these
 
  SKIP: "The assistant read the config file and found the port was 3000."
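The revised prompt ties each importance score to a system behavior (8 or higher fires a cross-session signal, 7 is stored silently, below 7 is deprioritized, below 5 is not emitted) and flags inflation when more than 20% of emitted entries score 8 or higher. The following TypeScript is a minimal sketch of those two statements, written only to make the thresholds concrete. The names `Disposition`, `dispositionFor`, and `isInflating` are hypothetical and do not come from agenr's source; how the memory system actually implements this is not shown in the diff.

```typescript
// Hypothetical names; the thresholds are taken from the prompt text in this diff.
type Disposition =
  | "not-emitted"           // 1-4: noise
  | "stored-deprioritized"  // 5-6: stored but deprioritized in recall
  | "stored-silent"         // 7: stored, no alert
  | "stored-with-signal";   // 8-10: stored and fires a cross-session signal

function dispositionFor(importance: number): Disposition {
  if (!Number.isInteger(importance) || importance < 1 || importance > 10) {
    throw new RangeError(`importance must be an integer from 1 to 10, got ${importance}`);
  }
  if (importance < 5) return "not-emitted";
  if (importance < 7) return "stored-deprioritized";
  if (importance === 7) return "stored-silent";
  return "stored-with-signal";
}

// Returns true when more than capPercent of the emitted entries (score >= 5)
// are 8 or higher. Integer arithmetic avoids floating-point comparison issues.
function isInflating(scores: number[], capPercent: number = 20): boolean {
  const emitted = scores.filter((s) => s >= 5);
  if (emitted.length === 0) return false;
  const high = emitted.filter((s) => s >= 8).length;
  return high * 100 > capPercent * emitted.length;
}
```

For example, one score of 8 among five emitted entries sits exactly at the 20% cap and is not flagged, while two of three would be. Passing `30` as `capPercent` reproduces the threshold that 0.7.6 used.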
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "agenr",
- "version": "0.7.6",
+ "version": "0.7.7",
  "openclaw": {
  "extensions": [
  "dist/openclaw-plugin/index.js"
@@ -11,6 +11,13 @@
  "bin": {
  "agenr": "dist/cli.js"
  },
+ "scripts": {
+ "build": "tsup src/cli.ts src/cli-main.ts src/openclaw-plugin/index.ts --format esm --dts",
+ "dev": "tsup src/cli.ts src/cli-main.ts --format esm --watch",
+ "test": "vitest run",
+ "test:watch": "vitest",
+ "typecheck": "tsc --noEmit"
+ },
  "dependencies": {
  "@clack/prompts": "^1.0.1",
  "@libsql/client": "^0.17.0",
@@ -54,11 +61,9 @@
  "README.md"
  ],
  "author": "agenr-ai",
- "scripts": {
- "build": "tsup src/cli.ts src/cli-main.ts src/openclaw-plugin/index.ts --format esm --dts",
- "dev": "tsup src/cli.ts src/cli-main.ts --format esm --watch",
- "test": "vitest run",
- "test:watch": "vitest",
- "typecheck": "tsc --noEmit"
+ "pnpm": {
+ "overrides": {
+ "fast-xml-parser": "^5.3.6"
+ }
  }
- }
+ }