get-claudia 1.58.0 → 1.59.1
- package/CHANGELOG.md +29 -0
- package/memory-daemon/claudia_memory/daemon/README.md +16 -0
- package/memory-daemon/claudia_memory/extraction/README.md +16 -0
- package/memory-daemon/claudia_memory/mcp/README.md +17 -0
- package/memory-daemon/claudia_memory/services/README.md +25 -0
- package/package.json +1 -1
- package/template-v2/.claude/manifest.json +2 -2
- package/memory-daemon/claudia_memory/services/metrics.py +0 -312
- package/memory-daemon/claudia_memory/services/verify.py +0 -279
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,35 @@
 
 All notable changes to Claudia will be documented in this file.
 
+## 1.59.1 (2026-05-15)
+
+### Docs uplift
+
+Pure documentation. No code change.
+
+### Documentation
+
+- Added subpackage READMEs under `memory-daemon/claudia_memory/`: `services/`, `daemon/`, `extraction/`, `mcp/`. Each names the public entry points, lists the most relevant files, and captures the conventions a contributor needs to know before editing.
+- Expanded `CONTRIBUTING.md` with a "Your first PR" walkthrough covering the seven-step path from picking a starter issue to opening the PR.
+
+No user-visible behavior change.
+
+---
+
+## 1.59.0 (2026-05-15)
+
+### Removed
+
+- `claudia_memory.services.verify.VerifyService` and `run_verification`. No production callers since v1.35; only the dedicated test referenced it.
+- `claudia_memory.services.metrics.MetricsService` and `get_metrics_service`. Never wired into the scheduler, MCP server, or any service; only the dedicated test referenced it.
+- Orphan test files `tests/test_verify.py` (7 tests) and `tests/test_metrics.py` (12 tests).
+
+### Documentation
+
+- Added "Verifying dead code" section to `CONTRIBUTING.md` documenting the audit method used in this release.
+- Refactor design plan: `docs/plans/2026-05-15-craft-refactor-design.md` (shipped in PR #58).
+
+No user-visible behavior change. No CLI flag change. No MCP tool change. No database schema change.
+
+---
+
 ## 1.58.0 (2026-05-13)
 
 ### The Memory Reliability Release
package/memory-daemon/claudia_memory/daemon/README.md
@@ -0,0 +1,16 @@
+# daemon/
+
+Long-running side of the memory daemon: scheduled background jobs and a local HTTP health endpoint. Only used when the daemon runs in standalone mode (`--standalone`); the MCP-server path (stdio transport, default) does not run these.
+
+## Where to look first
+
+| Concern | File | Notes |
+|---------|------|-------|
+| Scheduled background work | `scheduler.py` | APScheduler with three jobs: `daily_decay` at 02:00, `pattern_detection` every 6 hours, `full_consolidation` at 03:00. Optional `vault_sync` at 03:15 if `vault_sync_enabled` is set. |
+| Health endpoint | `health.py` | HTTP server bound to `localhost:3848`. The `/health` route is what the npm installer probes during Step 5 of install. The `/status` route powers the `memory_system_health` MCP tool. |
+
+## Conventions
+
+- **Bind localhost only.** Never `0.0.0.0`. The health server exposes internal state and is not auth-gated.
+- **New scheduled jobs go through the same path as existing ones.** Add to `scheduler.py`'s job registration. Don't spawn ad-hoc background threads from service modules.
+- **Service code stays in `services/`.** The daemon module is for *scheduling and exposing* that work, not implementing it. If you find yourself writing business logic here, move it to a service.
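The "bind localhost only" convention from the daemon README can be sketched with the standard library alone. This is an illustration under assumptions, not the contents of `health.py`: the `HealthHandler` class, the `start_health_server` helper, and the `{"status": "ok"}` payload are hypothetical. Only the `/health` route and port 3848 come from the README.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers /health and rejects everything else."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence per-request logging to keep the example quiet.
        pass


def start_health_server(port: int = 3848) -> ThreadingHTTPServer:
    # Bind to the loopback address only, never "0.0.0.0": the endpoint
    # exposes internal state and has no authentication.
    server = ThreadingHTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Passing `port=0` asks the operating system for a free port, which is convenient in tests; `server.server_address` then reports the address actually bound.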
package/memory-daemon/claudia_memory/extraction/README.md
@@ -0,0 +1,16 @@
+# extraction/
+
+Pulling structured signal out of free-form text. Used by the remember pipeline (to identify which entities a memory is about) and by the ingest flow (to extract entities, commitments, and dates from longer documents).
+
+## Where to look first
+
+| Concern | File | Notes |
+|---------|------|-------|
+| Named-entity recognition | `entity_extractor.py` | Detects people, organizations, projects, concepts. Uses spaCy when the optional `[nlp]` extra is installed; falls back to pattern-based heuristics otherwise. |
+| Date and time parsing | `temporal.py` | Resolves relative phrases (e.g., "next Thursday") to absolute dates in the user's timezone. Used by commitment detection and event extraction. |
+
+## Conventions
+
+- **spaCy is optional.** Anything in `extraction/` must work without it. Test the no-spaCy path. If you require it, gate behind a clear `ImportError` message that tells the user to install `claudia-memory[nlp]`.
+- **Return structured results, never raw text.** Extractors emit typed dicts or dataclasses; the caller decides how to render them. This keeps the boundary clean and the call sites testable.
+- **Idempotent on re-extraction.** If a document is re-ingested, extractors must produce the same result. No hidden state.
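The three extraction conventions (optional spaCy, structured results, idempotent re-extraction) might look roughly like the sketch below. Everything here is hypothetical and is not taken from `entity_extractor.py`: the `ExtractedEntity` dataclass, the `require_spacy` and `extract_entities_fallback` helpers, and the deliberately naive capitalized-words pattern are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExtractedEntity:
    """Hypothetical structured result; callers decide how to render it."""
    name: str
    kind: str
    start: int


def require_spacy():
    """Import spaCy lazily and fail with an actionable message."""
    try:
        import spacy
    except ImportError as exc:
        raise ImportError(
            "spaCy is not installed. Install the optional extra "
            "`claudia-memory[nlp]` to enable spaCy-based extraction."
        ) from exc
    return spacy


# Naive stand-in heuristic: runs of two or more capitalized words.
_NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def extract_entities_fallback(text: str) -> List[ExtractedEntity]:
    # A pure function of its input with no hidden state, so extracting
    # the same text twice yields equal results.
    return [
        ExtractedEntity(name=m.group(0), kind="unknown", start=m.start())
        for m in _NAME_PATTERN.finditer(text)
    ]
```

Because the fallback never touches spaCy, it can be exercised in a test environment that has only the base install, which is the "no-spaCy path" the README asks contributors to cover.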
package/memory-daemon/claudia_memory/mcp/README.md
@@ -0,0 +1,17 @@
+# mcp/
+
+The Model Context Protocol surface: how Claude Code talks to the memory daemon. Stdio transport, configured in the user's `.mcp.json` after install.
+
+## Where to look first
+
+| Concern | File | Notes |
+|---------|------|-------|
+| Tool registration | `server.py` | Single file defining all ~33 `memory_*` MCP tools. Each tool is a handler function paired with a JSONSchema-style parameter declaration. |
+
+## Conventions
+
+- **Tool names are a public API.** Never rename a `memory_*` tool. Users have skills and workflows that invoke them by name. Add new tools rather than renaming old ones.
+- **Tool docstrings are how Claude Code decides when to call.** Like skill descriptions, they need a clear verb, expected inputs, and example trigger phrases. Vague tool docs cause inconsistent invocation.
+- **Each tool is a thin wrapper.** The real work lives in `services/`. Handlers in `server.py` parse the MCP request, call into a service, format the response. No business logic here.
+- **Parameter aliases are supported for ergonomics.** `memory_about`, `memory_relate`, and `memory_recall` accept multiple parameter names (e.g., `entity` and `name`) so users can call them naturally. See PR #57 for the canonical example.
+- **Errors should be actionable.** When a handler raises, the error message reaches the user verbatim. Say what went wrong and what to try, not just "failed."
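The thin-wrapper, parameter-alias, and actionable-error conventions can be illustrated together. The sketch below is an assumption about shape only and does not reproduce `server.py` or PR #57: `resolve_alias`, `lookup_entity`, and `handle_memory_about` are hypothetical names, and `lookup_entity` merely stands in for a function in `services/`.

```python
from typing import Any, Dict, Sequence


def resolve_alias(args: Dict[str, Any], names: Sequence[str]) -> Any:
    """Return the value of the first alias that is present in args."""
    for name in names:
        if args.get(name) is not None:
            return args[name]
    # The message reaches the user verbatim, so say what to try.
    raise ValueError(
        f"Missing required parameter. Pass one of: {', '.join(names)}."
    )


def lookup_entity(name: str) -> Dict[str, Any]:
    """Stand-in for a services/ function; not the real service API."""
    return {"entity": name, "memories": []}


def handle_memory_about(args: Dict[str, Any]) -> Dict[str, Any]:
    # Thin wrapper: parse the request, call the service, return the
    # result. No business logic lives in the handler.
    entity = resolve_alias(args, ("entity", "name"))
    return lookup_entity(entity)
```

With this shape, `handle_memory_about({"entity": "Ada"})` and `handle_memory_about({"name": "Ada"})` reach the same service call, and an empty request fails with a message that lists the accepted parameter names.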
package/memory-daemon/claudia_memory/services/README.md
@@ -0,0 +1,25 @@
+# services/
+
+Business logic for the memory daemon. One module per concern. Every MCP tool exposed by `mcp/server.py` ultimately calls into a function here.
+
+## Where to look first
+
+| Concern | File | Public entry points |
+|---------|------|--------------------|
+| Write a memory, entity, or relationship | `remember.py` | `remember_fact`, `remember_entity`, `relate_entities`, `invalidate_memory` |
+| Recall memories or find entities | `recall.py` | `recall`, `recall_about`, `search_entities`, `deep_recall` |
+| Background decay + dedup + pattern detection | `consolidate.py` | `run_full_consolidation`, decay/dedup helpers, prediction lifecycle |
+| Entity type inference and naming | `entities.py` | `infer_entity_type` |
+| Memory and input validation rules | `guards.py` | `validate_memory`, `validate_entity`, `validate_relationship` |
+| File storage for filed source material | `filestore.py`, `documents.py` | `LocalFileStore`, document filing pipeline |
+| Provenance and audit trail | `audit.py` | source links, correction history |
+| Bulk historical fixes | `backfill.py` | one-shot maintenance utilities |
+| Compact session summaries for greeting | `context_builder.py` | `build_briefing_context` and friends |
+| Multi-document intake pipeline | `ingest.py` | the Extract-Then-Aggregate flow |
+| Obsidian vault projection | `vault_sync.py`, `canvas_generator.py` | PARA-layout write of entities, MOC canvases |
+
+## Conventions
+
+- **Soft-delete columns differ by table.** `memories.invalidated_at` vs. `entities.deleted_at`. Always check the schema before writing recall queries that filter "active" rows.
+- **Embedding storage is JSON text** (via `json.dumps`), not binary `struct.pack` blobs. Match this when writing new embedding paths.
+- Functions exported from a service module are the unit of testability. Tests for `recall.py` live at `tests/test_recall*.py` and call the module's public functions directly. Don't add internal coupling that bypasses those entry points.
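The two storage conventions in the services README (different soft-delete columns per table, embeddings stored as JSON text) can be demonstrated against an in-memory SQLite database. The schema below is a simplified assumption: only the column names `invalidated_at` and `deleted_at` and the use of `json.dumps` come from the README, while the remaining columns and the helper functions are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
# Simplified, hypothetical schema for illustration only.
conn.executescript(
    """
    CREATE TABLE memories (
        id INTEGER PRIMARY KEY, content TEXT,
        embedding TEXT, invalidated_at TEXT
    );
    CREATE TABLE entities (
        id INTEGER PRIMARY KEY, name TEXT, deleted_at TEXT
    );
    """
)


def store_memory(content, embedding, invalidated_at=None):
    # Embeddings are serialized as JSON text, not packed binary blobs.
    conn.execute(
        "INSERT INTO memories (content, embedding, invalidated_at) "
        "VALUES (?, ?, ?)",
        (content, json.dumps(embedding), invalidated_at),
    )


def active_memories():
    # Memories are soft-deleted through invalidated_at.
    rows = conn.execute(
        "SELECT content, embedding FROM memories "
        "WHERE invalidated_at IS NULL ORDER BY id"
    ).fetchall()
    return [(r["content"], json.loads(r["embedding"])) for r in rows]


def active_entity_names():
    # Entities use a different soft-delete column: deleted_at.
    rows = conn.execute(
        "SELECT name FROM entities WHERE deleted_at IS NULL ORDER BY id"
    ).fetchall()
    return [r["name"] for r in rows]
```

Filtering `memories` on `deleted_at` or `entities` on `invalidated_at` would fail against this schema with "no such column", which is a quick way to catch the mix-up in a test.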
package/package.json
CHANGED
package/template-v2/.claude/manifest.json
CHANGED
@@ -1,6 +1,6 @@
 {
-  "version": "1.
-  "generated": "2026-05-
+  "version": "1.59.1",
+  "generated": "2026-05-15T13:14:19.001Z",
   "algorithm": "sha256",
   "files": {
     ".claude/rules/claudia-principles.md": "939e9720421628e7f2e4c8dfbaa4aeb9c1e18e8c6a5379cd6b772a6835b812e5",
package/memory-daemon/claudia_memory/services/metrics.py
@@ -1,312 +0,0 @@
-"""
-Metrics Service for Claudia Memory System
-
-Tracks system health and improvement over time.
-"""
-
-import json
-import logging
-from datetime import datetime, timedelta
-from typing import Any, Dict, List, Optional
-
-from ..database import get_db
-
-logger = logging.getLogger(__name__)
-
-
-class MetricsService:
-    """Track system health and quality metrics"""
-
-    def __init__(self):
-        self.db = get_db()
-
-    def record(
-        self,
-        name: str,
-        value: float,
-        dimensions: Optional[Dict[str, Any]] = None,
-    ) -> int:
-        """
-        Record a metric value.
-
-        Args:
-            name: Metric name (e.g., 'entity_count', 'memory_avg_importance')
-            value: Metric value
-            dimensions: Optional key-value pairs for filtering
-
-        Returns:
-            Metric entry ID
-        """
-        entry_id = self.db.insert(
-            "metrics",
-            {
-                "timestamp": datetime.utcnow().isoformat(),
-                "metric_name": name,
-                "metric_value": value,
-                "dimensions": json.dumps(dimensions) if dimensions else None,
-            },
-        )
-        logger.debug(f"Metric recorded: {name}={value}")
-        return entry_id
-
-    def collect_system_health(self) -> Dict[str, Any]:
-        """
-        Collect current system health metrics.
-
-        Returns a snapshot of entity counts, memory stats, reflection stats,
-        prediction engagement, and data quality indicators.
-
-        Returns:
-            Dict with current health metrics
-        """
-        health = {
-            "collected_at": datetime.utcnow().isoformat(),
-            "entities": {},
-            "memories": {},
-            "reflections": {},
-            "predictions": {},
-            "data_quality": {},
-        }
-
-        # Entity counts by type
-        try:
-            entity_rows = self.db.execute(
-                """
-                SELECT type, COUNT(*) as count
-                FROM entities
-                WHERE deleted_at IS NULL
-                GROUP BY type
-                """,
-                fetch=True,
-            ) or []
-            for row in entity_rows:
-                health["entities"][row["type"]] = row["count"]
-            health["entities"]["total"] = sum(health["entities"].values())
-        except Exception as e:
-            logger.debug(f"Could not collect entity metrics: {e}")
-
-        # Memory stats
-        try:
-            mem_row = self.db.execute(
-                """
-                SELECT
-                    COUNT(*) as total,
-                    AVG(importance) as avg_importance,
-                    COUNT(CASE WHEN invalidated_at IS NOT NULL THEN 1 END) as invalidated,
-                    COUNT(CASE WHEN corrected_at IS NOT NULL THEN 1 END) as corrected
-                FROM memories
-                """,
-                fetch=True,
-            )
-            if mem_row:
-                health["memories"] = {
-                    "total": mem_row[0]["total"],
-                    "avg_importance": round(mem_row[0]["avg_importance"] or 0, 3),
-                    "invalidated": mem_row[0]["invalidated"],
-                    "corrected": mem_row[0]["corrected"],
-                }
-
-            # Memory counts by type
-            mem_type_rows = self.db.execute(
-                """
-                SELECT type, COUNT(*) as count
-                FROM memories
-                WHERE invalidated_at IS NULL
-                GROUP BY type
-                """,
-                fetch=True,
-            ) or []
-            health["memories"]["by_type"] = {row["type"]: row["count"] for row in mem_type_rows}
-        except Exception as e:
-            logger.debug(f"Could not collect memory metrics: {e}")
-
-        # Reflection stats
-        try:
-            ref_row = self.db.execute(
-                """
-                SELECT
-                    COUNT(*) as total,
-                    AVG(importance) as avg_importance,
-                    AVG(aggregation_count) as avg_confirmations
-                FROM reflections
-                """,
-                fetch=True,
-            )
-            if ref_row:
-                health["reflections"] = {
-                    "total": ref_row[0]["total"],
-                    "avg_importance": round(ref_row[0]["avg_importance"] or 0, 3),
-                    "avg_confirmations": round(ref_row[0]["avg_confirmations"] or 0, 1),
-                }
-
-            # By type
-            ref_type_rows = self.db.execute(
-                """
-                SELECT reflection_type, COUNT(*) as count
-                FROM reflections
-                GROUP BY reflection_type
-                """,
-                fetch=True,
-            ) or []
-            health["reflections"]["by_type"] = {row["reflection_type"]: row["count"] for row in ref_type_rows}
-        except Exception as e:
-            logger.debug(f"Could not collect reflection metrics: {e}")
-
-        # Prediction engagement rate
-        try:
-            pred_row = self.db.execute(
-                """
-                SELECT
-                    COUNT(*) as total,
-                    COUNT(CASE WHEN is_shown = 1 THEN 1 END) as shown,
-                    COUNT(CASE WHEN is_acted_on = 1 THEN 1 END) as acted_on
-                FROM predictions
-                """,
-                fetch=True,
-            )
-            if pred_row:
-                total = pred_row[0]["total"]
-                shown = pred_row[0]["shown"]
-                acted = pred_row[0]["acted_on"]
-                health["predictions"] = {
-                    "total": total,
-                    "shown": shown,
-                    "acted_on": acted,
-                    "engagement_rate": round(acted / shown, 3) if shown > 0 else 0,
-                }
-        except Exception as e:
-            logger.debug(f"Could not collect prediction metrics: {e}")
-
-        # Data quality indicators
-        try:
-            # Potential duplicates count (using name similarity > 0.85)
-            # This is expensive so we just count entities without the full comparison
-            entity_count = health["entities"].get("total", 0)
-
-            # Orphan memories (no entity links)
-            orphan_row = self.db.execute(
-                """
-                SELECT COUNT(*) as count
-                FROM memories m
-                WHERE NOT EXISTS (
-                    SELECT 1 FROM memory_entities me WHERE me.memory_id = m.id
-                )
-                AND m.invalidated_at IS NULL
-                """,
-                fetch=True,
-            )
-            orphan_count = orphan_row[0]["count"] if orphan_row else 0
-
-            # Stale entities (no activity in 90 days)
-            stale_cutoff = (datetime.utcnow() - timedelta(days=90)).isoformat()
-            stale_row = self.db.execute(
-                """
-                SELECT COUNT(*) as count
-                FROM entities
-                WHERE updated_at < ? AND deleted_at IS NULL
-                """,
-                (stale_cutoff,),
-                fetch=True,
-            )
-            stale_count = stale_row[0]["count"] if stale_row else 0
-
-            health["data_quality"] = {
-                "orphan_memories": orphan_count,
-                "stale_entities": stale_count,
-                "entities_needing_review": stale_count,
-            }
-        except Exception as e:
-            logger.debug(f"Could not collect data quality metrics: {e}")
-
-        return health
-
-    def get_trend(
-        self,
-        metric_name: str,
-        days: int = 30,
-    ) -> List[Dict[str, Any]]:
-        """
-        Get metric values over time.
-
-        Args:
-            metric_name: Name of the metric
-            days: Number of days to look back
-
-        Returns:
-            List of {timestamp, value} ordered by time
-        """
-        cutoff = (datetime.utcnow() - timedelta(days=days)).isoformat()
-        rows = self.db.execute(
-            """
-            SELECT timestamp, metric_value, dimensions
-            FROM metrics
-            WHERE metric_name = ? AND timestamp > ?
-            ORDER BY timestamp ASC
-            """,
-            (metric_name, cutoff),
-            fetch=True,
-        ) or []
-        return [
-            {
-                "timestamp": row["timestamp"],
-                "value": row["metric_value"],
-                "dimensions": json.loads(row["dimensions"]) if row["dimensions"] else None,
-            }
-            for row in rows
-        ]
-
-    def collect_and_store(self) -> Dict[str, Any]:
-        """
-        Collect health metrics and store them.
-
-        Called by scheduler for daily metrics collection.
-
-        Returns:
-            The collected health metrics
-        """
-        health = self.collect_system_health()
-
-        # Store key metrics for trend tracking
-        self.record("entities_total", health["entities"].get("total", 0))
-        self.record("memories_total", health["memories"].get("total", 0))
-        self.record("memories_avg_importance", health["memories"].get("avg_importance", 0))
-        self.record("reflections_total", health["reflections"].get("total", 0))
-        self.record("predictions_engagement_rate", health["predictions"].get("engagement_rate", 0))
-        self.record("orphan_memories", health["data_quality"].get("orphan_memories", 0))
-        self.record("stale_entities", health["data_quality"].get("stale_entities", 0))
-
-        logger.info("Daily metrics collected and stored")
-        return health
-
-
-# Global service instance
-_service: Optional[MetricsService] = None
-
-
-def get_metrics_service() -> MetricsService:
-    """Get or create the global metrics service"""
-    global _service
-    if _service is None:
-        _service = MetricsService()
-    return _service
-
-
-# Convenience functions
-def record_metric(name: str, value: float, **kwargs) -> int:
-    """Record a metric value"""
-    return get_metrics_service().record(name, value, **kwargs)
-
-
-def get_system_health() -> Dict[str, Any]:
-    """Collect current system health metrics"""
-    return get_metrics_service().collect_system_health()
-
-
-def get_metric_trend(metric_name: str, days: int = 30) -> List[Dict[str, Any]]:
-    """Get metric values over time"""
-    return get_metrics_service().get_trend(metric_name, days)
-
-
-def collect_daily_metrics() -> Dict[str, Any]:
-    """Collect and store daily metrics"""
-    return get_metrics_service().collect_and_store()
package/memory-daemon/claudia_memory/services/verify.py
@@ -1,279 +0,0 @@
-"""
-Background Verification Service for Claudia Memory System
-
-Async background verification of recently stored memories.
-Never blocks conversation. Runs on a schedule via the daemon scheduler.
-
-Verification cascade (cheapest to most expensive):
-1. Commitment deadline check (deterministic, reuses guards)
-2. Entity duplicate check (deterministic, reuses guards)
-3. Fact contradiction check (LLM, only if available)
-4. Commitment completeness check (LLM, only if available)
-"""
-
-import json
-import logging
-from datetime import datetime
-from typing import Any, Dict, List, Optional
-
-from ..config import get_config
-from ..database import get_db
-from .guards import DEADLINE_PATTERNS, validate_entity
-
-logger = logging.getLogger(__name__)
-
-
-class VerifyService:
-    """Background memory verification"""
-
-    _instance: Optional["VerifyService"] = None
-
-    def __init__(self):
-        self.db = get_db()
-        self.config = get_config()
-
-    @classmethod
-    def get_instance(cls) -> "VerifyService":
-        """Singleton accessor"""
-        if cls._instance is None:
-            cls._instance = cls()
-        return cls._instance
-
-    def run_verification(self) -> Dict[str, Any]:
-        """
-        Verify recently stored memories that have passed the 5-minute buffer.
-
-        Returns:
-            Dict with verification stats
-        """
-        batch_size = self.config.verify_batch_size
-
-        # Query pending memories older than 5 minutes (buffer prevents mid-session verification)
-        # Use REPLACE to normalize T-separator from Python's isoformat() for comparison
-        pending = self.db.execute(
-            """
-            SELECT id, content, type, importance, metadata
-            FROM memories
-            WHERE verification_status = 'pending'
-              AND julianday(REPLACE(created_at, 'T', ' ')) < julianday('now', '-5 minutes')
-            ORDER BY created_at ASC
-            LIMIT ?
-            """,
-            (batch_size,),
-            fetch=True,
-        ) or []
-
-        if not pending:
-            return {"verified": 0, "flagged": 0, "contradicts": 0, "skipped": 0}
-
-        stats = {"verified": 0, "flagged": 0, "contradicts": 0, "skipped": 0}
-
-        for memory in pending:
-            try:
-                result = self._verify_single(memory)
-                self._apply_result(memory["id"], result)
-                stats[result["status"]] = stats.get(result["status"], 0) + 1
-            except Exception as e:
-                logger.warning(f"Verification failed for memory {memory['id']}: {e}")
-                stats["skipped"] += 1
-
-        logger.info(f"Verification batch complete: {stats}")
-        return stats
-
-    def _verify_single(self, memory: dict) -> Dict[str, Any]:
-        """
-        Run verification cascade on a single memory.
-
-        Returns:
-            Dict with status ('verified', 'flagged', 'contradicts') and reasons
-        """
-        reasons = []
-        memory_type = memory["type"]
-        content = memory["content"]
-        memory_id = memory["id"]
-
-        # Check 1: Commitment deadline (deterministic)
-        if memory_type == "commitment":
-            has_deadline = any(p.search(content) for p in DEADLINE_PATTERNS)
-            if not has_deadline:
-                reasons.append("Commitment has no detected deadline")
-
-        # Check 2: Entity duplicate (deterministic)
-        entity_links = self.db.execute(
-            """
-            SELECT e.name, e.canonical_name
-            FROM memory_entities me
-            JOIN entities e ON me.entity_id = e.id
-            WHERE me.memory_id = ?
-            """,
-            (memory_id,),
-            fetch=True,
-        ) or []
-
-        if entity_links:
-            all_canonical = [
-                row["canonical_name"]
-                for row in self.db.query("entities", columns=["canonical_name"])
-            ]
-            for linked in entity_links:
-                result = validate_entity(
-                    linked["name"],
-                    entity_type="person",
-                    existing_canonical_names=[
-                        n for n in all_canonical if n != linked["canonical_name"]
-                    ],
-                )
-                for w in result.warnings:
-                    if "near-duplicate" in w.lower():
-                        reasons.append(w)
-
-        # Check 3: Fact contradiction (LLM, only if available)
-        if memory_type == "fact" and self._has_language_model():
-            contradiction = self._check_fact_contradiction(memory_id, content, entity_links)
-            if contradiction:
-                return {"status": "contradicts", "reasons": [contradiction]}
-
-        # Check 4: Commitment completeness (LLM, only if available)
-        if memory_type == "commitment" and self._has_language_model():
-            completeness = self._check_commitment_completeness(content)
-            if completeness:
-                reasons.append(completeness)
-
-        # Determine final status
-        if reasons:
-            return {"status": "flagged", "reasons": reasons}
-        return {"status": "verified", "reasons": []}
-
-    def _apply_result(self, memory_id: int, result: Dict[str, Any]) -> None:
-        """Apply verification result to the database"""
-        status = result["status"]
-        update_data = {
-            "verification_status": status,
-            "verified_at": datetime.utcnow().isoformat(),
-        }
-
-        # Flagged/contradicting memories get importance reduced
-        if status in ("flagged", "contradicts"):
-            update_data["importance"] = 0.1
-
-        # Store reasons in metadata
-        if result.get("reasons"):
-            memory = self.db.get_one("memories", where="id = ?", where_params=(memory_id,))
-            if memory:
-                meta = json.loads(memory["metadata"] or "{}")
-                meta["verification_reasons"] = result["reasons"]
-                update_data["metadata"] = json.dumps(meta)
-
-        self.db.update("memories", update_data, "id = ?", (memory_id,))
-
-    def _has_language_model(self) -> bool:
-        """Check if a language model is configured"""
-        return bool(self.config.language_model)
-
-    def _check_fact_contradiction(
-        self, memory_id: int, content: str, entity_links: list
-    ) -> Optional[str]:
-        """
-        Check if a new fact contradicts existing verified facts about the same entities.
-        Uses LLM to assess contradiction.
-
-        Returns:
-            Contradiction description or None
-        """
-        if not entity_links:
-            return None
-
-        try:
-            from ..language_model import get_language_model_service
-
-            lm = get_language_model_service()
-            if not lm.is_available():
-                return None
-
-            # Get existing verified facts about the same entities
-            entity_ids = []
-            for link in entity_links:
-                entity = self.db.get_one(
-                    "entities",
-                    where="canonical_name = ?",
-                    where_params=(link["canonical_name"],),
-                )
-                if entity:
-                    entity_ids.append(entity["id"])
-
-            if not entity_ids:
-                return None
-
-            placeholders = ", ".join(["?" for _ in entity_ids])
-            existing_facts = self.db.execute(
-                f"""
-                SELECT DISTINCT m.content
-                FROM memories m
-                JOIN memory_entities me ON m.id = me.memory_id
-                WHERE me.entity_id IN ({placeholders})
-                  AND m.type = 'fact'
-                  AND m.verification_status = 'verified'
-                  AND m.id != ?
-                  AND m.importance > 0.1
-                ORDER BY m.importance DESC
-                LIMIT 10
-                """,
-                tuple(entity_ids) + (memory_id,),
-                fetch=True,
-            ) or []
-
-            if not existing_facts:
-                return None
-
-            facts_text = "\n".join(f"- {f['content']}" for f in existing_facts)
-            prompt = (
-                f"Existing verified facts:\n{facts_text}\n\n"
-                f"New fact: {content}\n\n"
-                f"Does the new fact directly contradict any existing fact? "
-                f"Answer ONLY 'no' or describe the specific contradiction in one sentence."
-            )
-
-            response = lm.generate_sync(prompt)
-            if response and response.strip().lower() != "no":
-                return f"Potential contradiction: {response.strip()[:200]}"
-
-        except Exception as e:
-            logger.debug(f"Fact contradiction check failed: {e}")
-
-        return None
-
-    def _check_commitment_completeness(self, content: str) -> Optional[str]:
-        """
-        Check if a commitment has a clear owner and deadline using LLM.
-
-        Returns:
-            Incompleteness description or None
-        """
-        try:
-            from ..language_model import get_language_model_service
-
-            lm = get_language_model_service()
-            if not lm.is_available():
-                return None
-
-            prompt = (
-                f"Commitment: {content}\n\n"
-                f"Does this commitment have a clear owner (who is responsible) "
-                f"and a clear deadline (when it should be done)? "
-                f"Answer ONLY 'yes' or describe what is missing in one sentence."
-            )
-
-            response = lm.generate_sync(prompt)
-            if response and response.strip().lower() != "yes":
-                return f"Incomplete commitment: {response.strip()[:200]}"
-
-        except Exception as e:
-            logger.debug(f"Commitment completeness check failed: {e}")
-
-        return None
-
-
-# Convenience function
-def run_verification() -> Dict[str, Any]:
-    """Run background verification"""
-    return VerifyService.get_instance().run_verification()