agenr 0.7.6 → 0.7.7
- package/CHANGELOG.md +14 -0
- package/dist/cli-main.js +105 -8
- package/package.json +13 -8
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,18 @@
 # Changelog
 
+## [0.7.7] - 2026-02-20
+
+### Fixed
+- fix(extractor): rewrote importance score calibration in SYSTEM_PROMPT -- per-score definitions (5-10) replace undifferentiated 8-10 band
+- fix(extractor): added signal-cost framing -- 8+ fires real-time cross-session alerts; prompt now uses this as conservative filter
+- fix(extractor): made score 7 the explicit default workhorse; 8+ now requires cross-session justification
+- fix(extractor): added dev-session-observations rule -- verified/tested/confirmed patterns cap at 6 unless result is surprising or breaking
+- fix(extractor): resolved conflict between dev-session cap and explicit memory request rule ("remember this" overrides cap)
+- fix(extractor): removed "verified again today" from score-8 pnpm example to avoid contradicting dev-session rule
+- fix(extractor): added NOT-8 negative examples alongside existing NOT-9 callouts
+- fix(extractor): added 3 non-developer few-shot examples (health at 8, personal at 7, preference at 6) to prevent domain bias
+- fix(extractor): lowered 8+ calibration cap from 30% to 20%
+
 ## [0.7.6] - 2026-02-20
 
 ### Fixed
@@ -22,6 +35,7 @@
 - fix(extractor): increase MAX_PREFETCH_RESULTS from 3 to 5 and lower PREFETCH_SIMILARITY_THRESHOLD from 0.78 to 0.72
 - fix(extractor): increase PREFETCH_CANDIDATE_LIMIT from 10 to 15 for broader elaborative encoding candidates
 - fix(extractor): tighten extractor prompt to suppress near-variant entries already captured in DB
+- fix(extractor): recalibrate importance scoring anchors so routine verifications and test-pass observations default to 6-7; reserve 8+ for cross-session alert-worthy updates
 
 ### Added
 - feat(plugin): signalCooldownMs config - minimum ms between signal batches per session (default: 30000)
package/dist/cli-main.js
CHANGED
@@ -8964,17 +8964,69 @@ If uncertain whether durable, skip.
 
 ## Importance (1-10)
 
-Emit only importance >= 5. Start every candidate at
+Emit only importance >= 5. Start every candidate at 7; lower or raise only with clear justification.
 
-
-
-
-
+Importance scores map to real behavior in the memory system:
+- 8 or higher fires a real-time cross-session signal (an alert to other active AI sessions)
+- 7 is stored silently; no alert fires
+- Below 7 is stored but deprioritized in recall
+
+Use that signal cost as your conservative filter. Ask: "Does someone in another session need to know this RIGHT NOW?" If no, stay at 7 or below.
+
+Score anchors:
+
+10: Once-per-project facts. Core identity, permanent constraints, "never forget this."
+Example: "This project must never use GPL-licensed dependencies."
+Example: "The production database password rotation requires manual approval."
+
+9: Critical breaking changes or decisions with immediate cross-session impact.
+Use for: major architecture reversals, breaking API changes, critical blockers discovered.
+Example: "agenr embed API changed: model param is now required; all callers must update."
+Example: "Decided to abandon SQLite-vec in favor of Postgres pgvector - all storage code changes."
+NOT 9: "we verified signals work" (that is a 6)
+NOT 9: "tests are passing" (that is a 5-6)
+NOT 9: "deployed feature X" (that is a 7 event at most)
+
+8: Things an active parallel session would act on if notified right now.
+Use for: new user preferences discovered, important architectural facts just learned,
+active blocking issues, key decisions made today that others need to know.
+Example: "User prefers pnpm over npm for all projects in this workspace."
+Example: "The chunker silently drops chunks over 8k tokens - callers must split first."
+If in doubt between 7 and 8, use 7.
+NOT 8: "we decided to use TypeScript strict mode" (that is a 7 decision)
+NOT 8: "the user's daughter is named Emma" (that is a 7 biographical fact)
+NOT 8: "confirmed the import worked" (that is a 6 verification)
+
+7: Default for solid durable facts. Stored, retrievable, no alert.
+Use for: project facts, preferences (non-critical), completed milestones, stable architecture notes.
+Example: "agenr stores entries in SQLite with sqlite-vec for vector search."
+Example: "Completed brain audit. Found 73% noise rate in knowledge base."
+This is the right score for most extracted entries.
+
+6: Routine durable observations. Worth storing but minor.
+Use for: dev session observations, test results, routine verifications, minor notes.
+Example: "Verified that signal emission works end to end in local testing."
+Example: "Confirmed the import path change did not break CLI startup."
+Example: "agenr extraction runs in about 2s per chunk on the test dataset."
+
+5: Borderline. Only emit if clearly durable beyond today and actionable in a future session.
+Example: "Port 4242 is the default for the local test server."
 
 Calibration:
 - Typical chunk: 0-3 entries. Most chunks: 0.
--
--
+- Score 9 or 10: very rare, at most 1 per significant session, often 0
+- Score 8: at most 1-2 per session; ask the cross-session-alert question before assigning
+- Score 7: this is your workhorse; most emitted entries should be 7
+- Score 6: routine dev observations that are still worth storing
+- If more than 20% of your emitted entries are 8 or higher, you are inflating
+
+Dev session observations rule: Anything in the form "we tested X and it worked", "verified X",
+"confirmed X runs", "X is passing" belongs at 6 unless the result was surprising or breaks
+something. Surprising means a critical bug was found or a breaking change was discovered --
+minor timing differences or slightly unexpected output do not count. Test passes and routine
+verifications are not cross-session alerts.
+Exception: if the user explicitly said "remember this" or "remember that", the explicit memory
+request rule takes precedence and the 6-cap does not apply.
 
 ## Subject (critical)
 
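The score-to-behavior mapping that this hunk adds to the prompt can be sketched as code. The TypeScript below is a minimal illustration, not code from the agenr package: the type `Disposition`, the constants, and the function `dispositionFor` are hypothetical names, and only the thresholds (emit at 5 or above, default of 7, signal at 8 or above) come from the prompt text.

```typescript
// Hypothetical sketch: names are illustrative, thresholds come from the prompt text.
type Disposition =
  | "skip"                 // below the emit floor, never stored
  | "store-deprioritized"  // stored, but ranked lower in recall
  | "store-silent"         // stored, no alert
  | "store-and-signal";    // stored, and a cross-session signal fires

const EMIT_FLOOR = 5;       // "Emit only importance >= 5"
const DEFAULT_SCORE = 7;    // "Start every candidate at 7"
const SIGNAL_THRESHOLD = 8; // "8 or higher fires a real-time cross-session signal"

function dispositionFor(importance: number): Disposition {
  if (!Number.isInteger(importance) || importance < 1 || importance > 10) {
    throw new RangeError(`importance must be an integer in 1-10, got ${importance}`);
  }
  if (importance < EMIT_FLOOR) return "skip";
  if (importance >= SIGNAL_THRESHOLD) return "store-and-signal";
  if (importance === DEFAULT_SCORE) return "store-silent";
  return "store-deprioritized"; // scores 5 and 6
}

// Usage: walk the scale from the default upward and downward.
for (const score of [DEFAULT_SCORE, 8, 6, 4]) {
  console.log(score, dispositionFor(score));
}
```

Framing the score as a routing decision makes the cost of inflation concrete: moving an entry from 7 to 8 changes what other sessions are interrupted with, not just a ranking number.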
@@ -9038,7 +9090,8 @@ Before emitting EACH entry, all five must be true:
 2. Durable beyond the immediate step
 3. Non-duplicate of another entry in this batch
 4. Importance >= 5 with a concrete reason
-5. Explicit user "remember this/that" requests justify importance >= 7 regardless of content
+5. Explicit user "remember this/that" requests justify importance >= 7 regardless of content
+type, including dev session verifications (this overrides the dev-session-observations rule).
 If any check fails, do not emit.
 
 ## Few-Shot Examples
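The precedence between the dev-session cap and the explicit memory request rule can be written out as a small function. In the package these rules are instructions to the model inside the prompt; the TypeScript below is only a sketch of the same logic. The `Candidate` shape, the `VERIFICATION_PATTERN` regex, and `applyDevSessionRule` are all assumptions made for illustration.

```typescript
// Hypothetical sketch of the rule precedence described in the prompt.
interface Candidate {
  content: string;
  importance: number;             // score proposed before the rule is applied
  surprisingOrBreaking: boolean;  // a critical bug or breaking change was discovered
  explicitMemoryRequest: boolean; // the user said "remember this" or "remember that"
}

// Rough stand-in for "we tested X", "verified X", "confirmed X runs", "X is passing".
const VERIFICATION_PATTERN = /\b(verified|confirmed|tested)\b|\bis passing\b/i;

function applyDevSessionRule(c: Candidate): number {
  // The explicit memory request rule takes precedence: at least 7, no 6-cap.
  if (c.explicitMemoryRequest) return Math.max(c.importance, 7);

  // Routine verifications cap at 6 unless the result was surprising or breaking.
  const routine = VERIFICATION_PATTERN.test(c.content) && !c.surprisingOrBreaking;
  return routine ? Math.min(c.importance, 6) : c.importance;
}

// Usage: a routine verification that was over-scored at 8.
const verification: Candidate = {
  content: "Verified that signal emission works end to end in local testing.",
  importance: 8,
  surprisingOrBreaking: false,
  explicitMemoryRequest: false,
};
console.log(applyDevSessionRule(verification));
console.log(applyDevSessionRule({ ...verification, explicitMemoryRequest: true }));
```

Checking the explicit request first mirrors the wording of check 5: the override is unconditional, so it has to run before the cap rather than after it.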
@@ -9122,6 +9175,50 @@ EVENT:
 "source_context": "Brain audit completed and findings documented"
 }
 
+EVENT:
+{
+"type": "event",
+"subject": "agenr signal emission",
+"content": "Verified end-to-end signal emission works in local testing. Entry stored, signal fired, received in OpenClaw session.",
+"importance": 6,
+"expiry": "temporary",
+"tags": ["agenr", "signals", "testing"],
+"source_context": "Dev session verification of signal feature"
+}
+
+FACT:
+{
+"type": "fact",
+"subject": "user penicillin allergy",
+"content": "User is allergic to penicillin. Must never suggest it or related antibiotics.",
+"importance": 8,
+"expiry": "permanent",
+"tags": ["health", "allergy", "critical"],
+"source_context": "User mentioned allergy when discussing a doctor visit -- scored 8 not 9/10 because a parallel session needs to know this now (e.g. to avoid recommending antibiotics), but it is not a breaking change or architectural decision"
+}
+
+FACT:
+{
+"type": "fact",
+"subject": "user daughter name",
+"content": "User's daughter is named Emma. She is 8 years old.",
+"importance": 7,
+"expiry": "permanent",
+"tags": ["family", "personal"],
+"source_context": "User mentioned daughter while discussing weekend plans"
+}
+
+PREFERENCE:
+{
+"type": "preference",
+"subject": "meeting time preference",
+"content": "User prefers morning meetings before noon and avoids late afternoon calls.",
+"importance": 6,
+"expiry": "long-term",
+"tags": ["schedule", "preference"],
+"source_context": "User mentioned scheduling preference during calendar discussion -- scored 6 not 8 because no parallel session needs to act on this immediately; it is a low-urgency convenience preference"
+}
+
 ### BORDERLINE \u2014 skip these
 
 SKIP: "The assistant read the config file and found the port was 3000."
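The 20% calibration cap from the prompt ("If more than 20% of your emitted entries are 8 or higher, you are inflating") can be checked mechanically over a batch of entries shaped like the few-shot examples. The TypeScript below is a minimal sketch under that assumption; `Entry`, `MAX_HIGH_PERCENT`, and `isInflating` are hypothetical names, and nothing in the diff indicates that agenr enforces the cap in code rather than in the prompt.

```typescript
// Hypothetical sketch: a post-hoc check of the prompt's 20% calibration cap.
interface Entry {
  subject: string;
  importance: number;
}

const SIGNAL_THRESHOLD = 8;  // scores at or above this fire a cross-session signal
const MAX_HIGH_PERCENT = 20; // "more than 20% ... 8 or higher" counts as inflating

function isInflating(entries: Entry[]): boolean {
  if (entries.length === 0) return false;
  const high = entries.filter((e) => e.importance >= SIGNAL_THRESHOLD).length;
  // Integer comparison avoids floating-point edge cases at exactly 20%.
  return high * 100 > entries.length * MAX_HIGH_PERCENT;
}

// Usage: subjects and scores borrowed from the examples in the prompt.
const batch: Entry[] = [
  { subject: "agenr signal emission", importance: 6 },
  { subject: "user penicillin allergy", importance: 8 },
  { subject: "user daughter name", importance: 7 },
  { subject: "meeting time preference", importance: 6 },
  { subject: "agenr storage backend", importance: 7 },
];
console.log(isInflating(batch));
```

One entry at 8 out of five is exactly 20%, which the rule as worded still allows; a second high-scoring entry in the same batch would cross the cap.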
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "agenr",
-"version": "0.7.
+"version": "0.7.7",
 "openclaw": {
 "extensions": [
 "dist/openclaw-plugin/index.js"
@@ -11,6 +11,13 @@
 "bin": {
 "agenr": "dist/cli.js"
 },
+"scripts": {
+"build": "tsup src/cli.ts src/cli-main.ts src/openclaw-plugin/index.ts --format esm --dts",
+"dev": "tsup src/cli.ts src/cli-main.ts --format esm --watch",
+"test": "vitest run",
+"test:watch": "vitest",
+"typecheck": "tsc --noEmit"
+},
 "dependencies": {
 "@clack/prompts": "^1.0.1",
 "@libsql/client": "^0.17.0",
@@ -54,11 +61,9 @@
 "README.md"
 ],
 "author": "agenr-ai",
-"
-"
-
-
-"test:watch": "vitest",
-"typecheck": "tsc --noEmit"
+"pnpm": {
+"overrides": {
+"fast-xml-parser": "^5.3.6"
+}
 }
-}
+}