ltcai 0.3.1 → 0.4.0
- package/README.md +285 -208
- package/docs/CHANGELOG.md +73 -0
- package/kg_schema.py +42 -0
- package/knowledge_graph.py +232 -36
- package/latticeai/api/security_dashboard.py +6 -2
- package/latticeai/core/agent.py +453 -0
- package/latticeai/core/audit.py +4 -1
- package/latticeai/core/config.py +178 -0
- package/latticeai/core/graph_curator.py +60 -4
- package/latticeai/core/model_compat.py +67 -24
- package/latticeai/core/timezones.py +80 -0
- package/package.json +2 -2
- package/server.py +108 -441
- package/static/scripts/chat.js +105 -16
- package/tools.py +87 -115
package/docs/CHANGELOG.md
CHANGED
@@ -1,5 +1,78 @@
 # Changelog

+## [0.4.0] - 2026-05-31
+
+> Knowledge Graph v2 read/write cutover: legacy/v2 equivalence guaranteed, dual-write
+> projection, deterministic ordering, and deletion mirroring completed. Graph stabilization release.
+
+### Changed
+
+- **KGStoreV2 read/write cutover complete**: the graph read methods (`search`,
+  `context_for_query`, `graph`, `neighbors`, `multi_hop_context`,
+  `search_for_document_generation`, `stats`) and writes use the v2 store as a single
+  path. The public `KnowledgeGraphStore` interface keeps its signatures and return types.
+- **Single read code path**: `_read_tables()` lets the same code query either the legacy
+  tables or the v2 reconstruction views (`kgv2_nodes`/`kgv2_edges`). The source is
+  toggled with `LATTICEAI_KG_READ_V2` (default: v2).
+
+### Added
+
+- **Dual-write projection**: `_upsert_node`/`_upsert_edge` write a projection into
+  `nodes_v2`/`edges_v2` in the same transaction. The legacy type string is preserved in
+  the v2 type column, and the summary and original metadata_json are preserved in `attrs._kg`, keeping results equivalent.
+- **Deletion mirroring**: every node/edge deletion in `clear_all`, `delete_conversation`,
+  and local-folder reindexing is mirrored to v2 (`_v2_delete_nodes`/`_v2_delete_edges_from`,
+  relying on the edges_v2 FK cascade).
+- **Deterministic ordering**: an `id ASC` tie-break is added to the `ORDER BY` of every
+  graph read (including edge/neighbor queries), so legacy and v2 results always share the same order.
+- **Legacy/V2 equivalence test suite**: `test_kg_v2_read_equivalence.py` (7
+  reads + dual-write + tied timestamps + re-upsert + deletion propagation),
+  `test_kg_v2_backfill.py` (projection, self-heal, idempotency).
+- v2 schema self-heal: when an *empty* v2 table created by an older init is missing columns,
+  it is dropped and recreated (a table that holds rows is never dropped).
+
+### Internal
+
+- Extracted the agent loop into `latticeai/core/agent.py` (`AgentRuntime` + ports), unified app
+  configuration in `latticeai/core/config.py` (`Config.from_env`), and introduced a tool
+  registry in `tools.py` (removing the `execute_tool` if/elif chain). server.py shrinks substantially.
+- Unit tests: 181 passed.
+
+## [0.3.2] - 2026-05-29
+
+> Stabilization release: model current consistency, three-way smoke test classification, security
+> dashboard timezone bug fix, less Korean noise in the auto graph, and toned-down README overstatements.
+
+### Model loading & UI
+
+- Unified web UI model selection into a single flow (`selectModelByCard` → `prepareAndLoadModel`
+  → smoke test → update `current` → show whether chat is available). The cloud (`loadSelectedModel`)
+  path also uses the backend `current` as the single source of truth. This removes the
+  "displayed model ≠ model used for chat" problem.
+- Extended smoke test results to three classes: **ok / degraded / failed**
+  (`model_compat.classify_smoke_response()`). It detects leaked special/role tokens, runaway
+  repetition, and excessive length. `degraded` still allows chat but shows a compatibility warning in the UI.
+  The `compatibility_status` field in `/models/load` and `/engines/prepare-model/stream` responses
+  exposes the three-class value as-is.
+
+### Security dashboard
+
+- **Timezone bug fix**: audit timestamps were recorded in local time while
+  "events_today" was computed in UTC, so dates were off for users in Korea.
+  The new module `latticeai/core/timezones.py` unifies the reference timezone (`LATTICE_TZ` /
+  `LTCAI_TZ` environment variables, default: system local). Added a `timezone` field to the overview response.
+
+### Auto graph curator
+
+- Reduced Korean noise: particle stripping, a blacklist of generic words and file extensions, and a
+  score penalty for single-source candidates (only concepts repeated across multiple sources are promoted).
+
+### Docs & tests
+
+- Toned down overstatements in the README/extension description (telemetry, skills/plugins counts, etc.).
+- Added unit tests: timezone, three-way smoke classification, graph noise, export secret redaction.
+  (tests/unit 149 passed)
+
 ## [0.3.1] - 2026-05-29

 > Model loading reliability + auto-graph curation + AI Security & Audit Command Center.
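The timezone fix listed under 0.3.2 is easier to see with concrete dates. The following is a minimal sketch, not the code in `latticeai/core/timezones.py`: it assumes audit timestamps are stored as naive local ISO strings, uses a fixed UTC+9 offset for Korea, and the helper `count_today` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+9 offset stands in for the configured local timezone.
KST = timezone(timedelta(hours=9))


def count_today(iso_timestamps, today):
    """Count naive ISO timestamps whose calendar date equals `today`."""
    return sum(1 for ts in iso_timestamps if datetime.fromisoformat(ts).date() == today)


# Two audit events written in local (Korean) time early on 29 May.
stamps = ["2026-05-29T00:30:00", "2026-05-29T07:45:00"]
now = datetime(2026, 5, 29, 8, 0, tzinfo=KST)

# Deriving "today" from UTC yields 28 May, so no local timestamp matches.
buggy = count_today(stamps, now.astimezone(timezone.utc).date())
# Deriving "today" in the same zone the timestamps were written in matches both.
fixed = count_today(stamps, now.astimezone(KST).date())
```

At 08:00 in Korea the UTC date is still the previous day, which is the window in which the two counts disagree.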
package/kg_schema.py
CHANGED
@@ -59,6 +59,7 @@ from __future__ import annotations
 import json
 import os
 import re
+import logging
 import sqlite3
 import struct
 import time
@@ -435,8 +436,49 @@ class KGStoreV2:
         finally:
             conn.close()

+    # Columns the current code writes; used to detect schema-evolution drift in
+    # v2 tables that an older ``CREATE TABLE IF NOT EXISTS`` left behind.
+    _V2_EXPECTED_COLUMNS = {
+        "edges_v2": {"id", "source", "target", "type", "weight", "confidence",
+                     "evidence", "created_by", "created_at"},
+        "nodes_v2": {"id", "type", "label", "attrs", "embedding", "owner_id",
+                     "visibility", "created_at", "updated_at", "style", "tone",
+                     "importance_score", "last_used"},
+    }
+
+    def _drop_stale_empty_v2_tables(self, conn: sqlite3.Connection) -> None:
+        """Drop v2 tables that predate a schema change — but only when empty.
+
+        ``CREATE TABLE IF NOT EXISTS`` never upgrades an existing table, so a
+        v2 table created by an older version keeps its old columns and breaks
+        inserts. Recreating is safe precisely because these tables have never
+        held data (the v2 read-path isn't wired yet); we refuse to drop any
+        table that contains rows.
+        """
+        # edges_v2 first (it has FKs into nodes_v2)
+        for table in ("edges_v2", "nodes_v2"):
+            exists = conn.execute(
+                "SELECT 1 FROM sqlite_master WHERE type='table' AND name=?", (table,)
+            ).fetchone()
+            if not exists:
+                continue
+            cols = {r[1] for r in conn.execute(f"PRAGMA table_info({table})").fetchall()}
+            missing = self._V2_EXPECTED_COLUMNS[table] - cols
+            if not missing:
+                continue
+            count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
+            if count == 0:
+                conn.execute(f"DROP TABLE {table}")
+            else:
+                logging.warning(
+                    "kg_schema: %s is missing columns %s but holds %d rows — "
+                    "leaving it untouched (manual migration required).",
+                    table, sorted(missing), count,
+                )
+
     def init_schema(self) -> None:
         with self._conn() as conn:
+            self._drop_stale_empty_v2_tables(conn)
             conn.executescript(SCHEMA_SQL)
             conn.execute(
                 "INSERT OR REPLACE INTO kg_meta(key, value) VALUES (?, ?)",
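The drop-only-when-empty rule in `_drop_stale_empty_v2_tables` can be tried against an in-memory database. This is a standalone sketch under assumptions: the function `drop_if_stale_and_empty`, its string return values, and the trimmed column set are illustrative and not part of `kg_schema.py`.

```python
import sqlite3

# Assumption: a trimmed, illustrative column set rather than the full v2 schema.
EXPECTED_NODE_COLUMNS = {"id", "type", "label", "attrs"}


def drop_if_stale_and_empty(conn, table, expected):
    """Drop `table` only if it lacks expected columns AND holds no rows."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type='table' AND name=?", (table,)
    ).fetchone()
    if not exists:
        return "absent"
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    cols = {r[1] for r in conn.execute(f"PRAGMA table_info({table})")}
    if not (expected - cols):
        return "ok"
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    if count == 0:
        conn.execute(f"DROP TABLE {table}")
        return "dropped"
    return "kept"


conn = sqlite3.connect(":memory:")

# An old-shape table with no rows is safe to drop and recreate.
conn.execute("CREATE TABLE nodes_v2 (id TEXT PRIMARY KEY, type TEXT)")
empty_result = drop_if_stale_and_empty(conn, "nodes_v2", EXPECTED_NODE_COLUMNS)

# The same old shape, but holding data, is left untouched.
conn.execute("CREATE TABLE nodes_v2 (id TEXT PRIMARY KEY, type TEXT)")
conn.execute("INSERT INTO nodes_v2 VALUES ('n1', 'Topic')")
populated_result = drop_if_stale_and_empty(conn, "nodes_v2", EXPECTED_NODE_COLUMNS)
```

The second `CREATE TABLE` succeeding without `IF NOT EXISTS` confirms that the first call really removed the table.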
package/knowledge_graph.py
CHANGED
@@ -27,6 +27,10 @@ try:
 except Exception:  # pragma: no cover - v2 schema is optional at import time
     KGStoreV2 = None  # type: ignore[assignment]

+# Default read source for the graph queries: v2 reconstruction views.
+# Override with LATTICEAI_KG_READ_V2=0 to fall back to the legacy tables.
+_READ_FROM_V2_DEFAULT = os.getenv("LATTICEAI_KG_READ_V2", "1") != "0"
+
 _llm_router_ref = None

 def set_llm_router(router_instance):
@@ -799,6 +803,19 @@ class KnowledgeGraphStore:
         self.db_path.parent.mkdir(parents=True, exist_ok=True)
         self.blob_dir.mkdir(parents=True, exist_ok=True)
         self._init_db()
+        # Read graph queries from the v2 projection (kgv2_* views) when available.
+        # Toggle off (e.g. in tests) to compare against the legacy tables.
+        self._read_from_v2 = KGStoreV2 is not None and _READ_FROM_V2_DEFAULT
+
+    def _read_tables(self) -> tuple:
+        """Return (nodes_table, edges_table) for read queries.
+
+        Same read code runs against the legacy tables or the v2 reconstruction
+        views, so the two paths are equivalent by construction.
+        """
+        if self._read_from_v2:
+            return ("kgv2_nodes", "kgv2_edges")
+        return ("nodes", "edges")

     def _connect(self) -> sqlite3.Connection:
         conn = sqlite3.connect(str(self.db_path))
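The toggle semantics follow from the `!= "0"` comparison in this hunk: only the literal string `0` selects the legacy tables. A minimal sketch of that rule, where `read_tables` is a hypothetical free function that takes the environment as a dict (the real code reads `os.getenv` once at import time and also requires `KGStoreV2` to be importable):

```python
def read_tables(env):
    """Mirror the toggle rule: every value except the string "0" selects v2."""
    read_from_v2 = env.get("LATTICEAI_KG_READ_V2", "1") != "0"
    return ("kgv2_nodes", "kgv2_edges") if read_from_v2 else ("nodes", "edges")


default_tables = read_tables({})                               # variable unset
legacy_tables = read_tables({"LATTICEAI_KG_READ_V2": "0"})     # explicit opt-out
other_value = read_tables({"LATTICEAI_KG_READ_V2": "false"})   # not "0", so still v2
```

Values such as `false` or `no` therefore do not switch back to the legacy tables under this rule.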
@@ -899,14 +916,158 @@ class KnowledgeGraphStore:
             )
         self._init_v2_schema()

+    # SQL views that reconstruct the *exact* legacy row shape on top of the v2
+    # tables, so the read methods can run unchanged against either source. The
+    # projection (see _v2_project_node/_edge) stashes summary + the original
+    # metadata_json + (via the type column) the legacy type string, so these
+    # views are byte-faithful to the legacy nodes/edges tables.
+    _V2_VIEWS_SQL = """
+    CREATE VIEW IF NOT EXISTS kgv2_nodes AS
+        SELECT id, type,
+               label AS title,
+               COALESCE(json_extract(attrs, '$._kg.summary'), '') AS summary,
+               COALESCE(json_extract(attrs, '$._kg.metadata_json'), '{}') AS metadata_json,
+               created_at, updated_at
+        FROM nodes_v2;
+    CREATE VIEW IF NOT EXISTS kgv2_edges AS
+        SELECT id, source AS from_node, target AS to_node, type, weight,
+               COALESCE(evidence, '{}') AS metadata_json, created_at
+        FROM edges_v2;
+    """
+
     def _init_v2_schema(self) -> None:
-        """Initialize the
+        """Initialize the v2 tables + reconstruction views and backfill from legacy.
+
+        Completes the v2 migration: both write (dual-write projection in
+        _upsert_node/_upsert_edge) and read (read methods route to the kgv2_*
+        views when ``_READ_FROM_V2`` is on) flow through the v2 tables. Legacy
+        nodes/edges are retained as the durable source until the v2 path bakes in.
+        """
         if KGStoreV2 is None:
             return
         try:
             KGStoreV2(self.db_path).init_schema()
+            with self._connect() as conn:
+                conn.executescript(self._V2_VIEWS_SQL)
+            self._backfill_v2_if_needed()
         except Exception as e:
-            logging.warning("knowledge_graph: v2 schema init skipped: %s", e)
+            logging.warning("knowledge_graph: v2 schema init/backfill skipped: %s", e)
+
+    def _backfill_v2_if_needed(self) -> None:
+        """Project legacy nodes/edges into the v2 tables when v2 is empty or stale.
+
+        Non-destructive to legacy. Reprojects when the v2 rows predate the
+        ``_kg`` reconstruction blob (older enum-only backfill), so the views
+        stay faithful. Idempotent: no-ops once v2 carries the current projection.
+        """
+        try:
+            with self._connect() as conn:
+                v2_nodes = conn.execute("SELECT COUNT(*) FROM nodes_v2").fetchone()[0]
+                legacy_nodes = conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
+                if legacy_nodes == 0:
+                    return
+                if v2_nodes > 0:
+                    has_kg = conn.execute(
+                        "SELECT COUNT(*) FROM nodes_v2 WHERE json_extract(attrs,'$._kg') IS NOT NULL"
+                    ).fetchone()[0]
+                    if has_kg > 0:
+                        return  # current projection already present
+                # (re)project: clear v2 graph (not authoritative) and rebuild
+                conn.execute("DELETE FROM edges_v2")
+                conn.execute("DELETE FROM nodes_v2")
+                n = e = 0
+                for r in conn.execute(
+                    "SELECT id, type, title, summary, metadata_json, created_at, updated_at FROM nodes"
+                ).fetchall():
+                    self._v2_project_node(
+                        conn, r["id"], r["type"], r["title"] or "", r["summary"] or "",
+                        _safe_loads(r["metadata_json"]),
+                        created_at=r["created_at"], updated_at=r["updated_at"],
+                    )
+                    n += 1
+                for r in conn.execute(
+                    "SELECT id, from_node, to_node, type, weight, metadata_json, created_at FROM edges"
+                ).fetchall():
+                    self._v2_project_edge(
+                        conn, r["from_node"], r["to_node"], r["type"], float(r["weight"] or 1.0),
+                        _safe_loads(r["metadata_json"]), edge_id=r["id"], created_at=r["created_at"],
+                    )
+                    e += 1
+                logging.info("knowledge_graph: projected legacy → v2 (%d nodes, %d edges)", n, e)
+        except Exception as ex:
+            logging.warning("knowledge_graph: v2 backfill skipped: %s", ex)
+
+    # ── v2 dual-write projection (legacy types + summary/metadata in attrs._kg) ──
+    def _v2_project_node(
+        self, conn: sqlite3.Connection, node_id: str, node_type: str, title: str,
+        summary: str, metadata: Optional[Dict[str, Any]],
+        *, created_at: Optional[str] = None, updated_at: Optional[str] = None,
+    ) -> None:
+        if KGStoreV2 is None:
+            return
+        ts = updated_at or _now()
+        attrs = _json({"_kg": {"summary": (summary or "")[:1000], "metadata_json": _json(metadata)}})
+        try:
+            conn.execute(
+                """
+                INSERT INTO nodes_v2(id, type, label, attrs, owner_id, visibility,
+                                     created_at, updated_at, importance_score)
+                VALUES (?, ?, ?, ?, NULL, 'private', ?, ?, 0.0)
+                ON CONFLICT(id) DO UPDATE SET
+                    type=excluded.type, label=excluded.label,
+                    attrs=excluded.attrs, updated_at=excluded.updated_at
+                """,
+                (node_id, node_type, (title or "")[:240], attrs, created_at or ts, ts),
+            )
+        except Exception as ex:
+            logging.debug("knowledge_graph: v2 node projection skipped (%s): %s", node_id, ex)
+
+    def _v2_project_edge(
+        self, conn: sqlite3.Connection, from_node: str, to_node: str, edge_type: str,
+        weight: float, metadata: Optional[Dict[str, Any]],
+        *, edge_id: Optional[str] = None, created_at: Optional[str] = None,
+    ) -> None:
+        if KGStoreV2 is None:
+            return
+        meta = metadata or {}
+        eid = edge_id or f"edge:{_sha256_text(f'{from_node}|{edge_type}|{to_node}')[:24]}"
+        try:
+            conn.execute(
+                """
+                INSERT INTO edges_v2(id, source, target, type, weight, confidence,
+                                     evidence, created_by, created_at)
+                VALUES (?, ?, ?, ?, ?, ?, ?, 'legacy', ?)
+                ON CONFLICT(source, target, type) DO UPDATE SET
+                    weight=max(edges_v2.weight, excluded.weight),
+                    evidence=excluded.evidence
+                """,
+                (eid, from_node, to_node, edge_type, float(weight),
+                 float(meta.get("confidence", 1.0)), _json(meta), created_at or _now()),
+            )
+        except Exception as ex:
+            logging.debug("knowledge_graph: v2 edge projection skipped (%s->%s): %s", from_node, to_node, ex)
+
+    def _v2_delete_nodes(self, conn: sqlite3.Connection, ids) -> None:
+        """Mirror legacy node deletions into v2 (edges_v2 cascade on the FK)."""
+        if KGStoreV2 is None:
+            return
+        ids = list(ids)
+        if not ids:
+            return
+        ph = ",".join("?" * len(ids))
+        try:
+            conn.execute(f"DELETE FROM nodes_v2 WHERE id IN ({ph})", ids)
+        except Exception as ex:
+            logging.debug("knowledge_graph: v2 node delete mirror skipped: %s", ex)
+
+    def _v2_delete_edges_from(self, conn: sqlite3.Connection, node_id: str) -> None:
+        """Mirror a legacy ``DELETE FROM edges WHERE from_node=?`` into v2."""
+        if KGStoreV2 is None:
+            return
+        try:
+            conn.execute("DELETE FROM edges_v2 WHERE source=?", (node_id,))
+        except Exception as ex:
+            logging.debug("knowledge_graph: v2 edge delete mirror skipped: %s", ex)

     def _upsert_node(
         self,
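The `kgv2_nodes` view recovers the legacy columns because the projection stores `metadata_json` as a JSON string nested inside `attrs._kg`, so `json_extract` returns the original text unchanged. A reduced sketch of that round trip, assuming an SQLite build with the JSON functions available; the table is trimmed to the columns the view reads, and `json.dumps` stands in for the project's `_json` helper:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
# Assumption: nodes_v2 reduced to the columns the view needs.
conn.executescript("""
CREATE TABLE nodes_v2 (id TEXT PRIMARY KEY, type TEXT, label TEXT, attrs TEXT,
                       created_at TEXT, updated_at TEXT);
CREATE VIEW kgv2_nodes AS
    SELECT id, type, label AS title,
           COALESCE(json_extract(attrs, '$._kg.summary'), '') AS summary,
           COALESCE(json_extract(attrs, '$._kg.metadata_json'), '{}') AS metadata_json,
           created_at, updated_at
    FROM nodes_v2;
""")

metadata = {"source": "local_folder", "tags": ["a", "b"]}
# metadata is encoded twice: once as a string value, once as part of attrs.
attrs = json.dumps({"_kg": {"summary": "A short summary",
                            "metadata_json": json.dumps(metadata)}})
ts = "2026-05-31T00:00:00"
conn.execute("INSERT INTO nodes_v2 VALUES (?, ?, ?, ?, ?, ?)",
             ("n1", "Document", "Title", attrs, ts, ts))
# A row without the _kg blob falls back to the COALESCE defaults.
conn.execute("INSERT INTO nodes_v2 VALUES (?, ?, ?, ?, ?, ?)",
             ("n2", "Topic", "Bare", "{}", ts, ts))

rows = {r["id"]: r for r in conn.execute("SELECT * FROM kgv2_nodes ORDER BY id ASC")}
```

Rows without a `_kg` blob are the "stale" case that `_backfill_v2_if_needed` detects and reprojects.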
@@ -932,6 +1093,9 @@ class KnowledgeGraphStore:
             """,
             (node_id, node_type, title[:240], summary[:1000], _json(metadata), _json(raw), now, now),
         )
+        # dual-write: project into the v2 graph on the same transaction
+        self._v2_project_node(conn, node_id, node_type, title, summary, metadata,
+                              created_at=now, updated_at=now)
         return node_id

     def _upsert_edge(
@@ -944,6 +1108,7 @@ class KnowledgeGraphStore:
         metadata: Optional[Dict[str, Any]] = None,
     ) -> str:
         edge_id = f"edge:{_sha256_text(f'{from_node}|{edge_type}|{to_node}')[:24]}"
+        now = _now()
         conn.execute(
             """
             INSERT INTO edges(id, from_node, to_node, type, weight, metadata_json, created_at)
@@ -952,8 +1117,11 @@ class KnowledgeGraphStore:
                 weight=max(edges.weight, excluded.weight),
                 metadata_json=excluded.metadata_json
             """,
-            (edge_id, from_node, to_node, edge_type, float(weight), _json(metadata),
+            (edge_id, from_node, to_node, edge_type, float(weight), _json(metadata), now),
         )
+        # dual-write: project into the v2 graph on the same transaction
+        self._v2_project_edge(conn, from_node, to_node, edge_type, float(weight), metadata,
+                              edge_id=edge_id, created_at=now)
         return edge_id

     # ── Local folder sources → Graph RAG ──────────────────────────────────
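Both the legacy and the v2 edge upserts keep the larger weight and overwrite the metadata on conflict. The sketch below isolates that behaviour; it assumes a simplified `edges_v2` table with only a `UNIQUE(source, target, type)` constraint (the real table also has an `id` and further columns), and `upsert_edge` is an illustrative helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: simplified table; the uniqueness constraint is the upsert target.
conn.execute(
    "CREATE TABLE edges_v2 (source TEXT, target TEXT, type TEXT, weight REAL, "
    "evidence TEXT, UNIQUE(source, target, type))"
)


def upsert_edge(conn, source, target, edge_type, weight, evidence):
    """Insert an edge, or on conflict keep the max weight and the newest evidence."""
    conn.execute(
        """
        INSERT INTO edges_v2(source, target, type, weight, evidence)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT(source, target, type) DO UPDATE SET
            weight=max(edges_v2.weight, excluded.weight),
            evidence=excluded.evidence
        """,
        (source, target, edge_type, float(weight), evidence),
    )


upsert_edge(conn, "a", "b", "mentions", 0.9, '{"v": 1}')
upsert_edge(conn, "a", "b", "mentions", 0.4, '{"v": 2}')  # lower weight, newer evidence

edge_rows = conn.execute("SELECT weight, evidence FROM edges_v2").fetchall()
```

Because the same rule is applied on both sides, re-upserting an edge leaves the legacy and v2 rows with matching weight and metadata.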
@@ -1307,7 +1475,7 @@ class KnowledgeGraphStore:
                     SELECT id, root_path, os_type, drive_id, label, status, include_ocr,
                            watch_enabled, consent_json, created_at, updated_at, last_scanned_at
                     FROM knowledge_sources
-                    ORDER BY updated_at DESC
+                    ORDER BY updated_at DESC, id ASC
                     """
                 )
             ]
@@ -1571,6 +1739,7 @@ class KnowledgeGraphStore:
             conn.execute("DELETE FROM chunks WHERE source_node=?", (file_node_id,))
             conn.execute("DELETE FROM edges WHERE from_node=? OR to_node=?", (file_node_id, file_node_id))
             conn.execute("DELETE FROM nodes WHERE id=?", (file_node_id,))
+            self._v2_delete_nodes(conn, [file_node_id])

         def delete_nodes(node_ids: set) -> None:
             if not node_ids:
@@ -1580,6 +1749,7 @@ class KnowledgeGraphStore:
             conn.execute(f"DELETE FROM chunks WHERE source_node IN ({placeholders})", params)
             conn.execute(f"DELETE FROM edges WHERE from_node IN ({placeholders}) OR to_node IN ({placeholders})", params * 2)
             conn.execute(f"DELETE FROM nodes WHERE id IN ({placeholders})", params)
+            self._v2_delete_nodes(conn, params)

         delete_nodes(owned_ids)

@@ -1619,6 +1789,7 @@ class KnowledgeGraphStore:
             placeholders = ",".join("?" * len(leaf_ids))
             conn.execute(f"DELETE FROM edges WHERE from_node IN ({placeholders}) OR to_node IN ({placeholders})", leaf_ids * 2)
             conn.execute(f"DELETE FROM nodes WHERE id IN ({placeholders})", leaf_ids)
+            self._v2_delete_nodes(conn, leaf_ids)

         for node_type in ("Drive", "Computer"):
             rows = conn.execute("SELECT id FROM nodes WHERE type=?", (node_type,)).fetchall()
@@ -1634,6 +1805,7 @@ class KnowledgeGraphStore:
                 placeholders = ",".join("?" * len(removable))
                 conn.execute(f"DELETE FROM edges WHERE from_node IN ({placeholders}) OR to_node IN ({placeholders})", removable * 2)
                 conn.execute(f"DELETE FROM nodes WHERE id IN ({placeholders})", removable)
+                self._v2_delete_nodes(conn, removable)

     def _local_file_index_has_extracted_text(self, row: sqlite3.Row) -> bool:
         metadata = _safe_loads(row["metadata_json"])
@@ -1691,7 +1863,9 @@ class KnowledgeGraphStore:
         if child_ids:
             placeholders = ",".join("?" * len(child_ids))
             conn.execute(f"DELETE FROM nodes WHERE id IN ({placeholders})", child_ids)
+            self._v2_delete_nodes(conn, child_ids)
         conn.execute("DELETE FROM edges WHERE from_node=?", (file_node_id,))
+        self._v2_delete_edges_from(conn, file_node_id)

         metadata = {
             "source": "local_folder",
@@ -2591,6 +2765,7 @@ class KnowledgeGraphStore:
     def graph(self, limit: int = 300) -> Dict[str, Any]:
         limit = max(1, min(int(limit or 300), 2000))
         visible = ",".join(f"'{t}'" for t in self._GRAPH_VISIBLE_TYPES)
+        nt, et = self._read_tables()
         with self._connect() as conn:
             nodes = [
                 {
@@ -2602,7 +2777,7 @@ class KnowledgeGraphStore:
                     "updated_at": row["updated_at"],
                 }
                 for row in conn.execute(
-                    f"SELECT id, type, title, summary, metadata_json, updated_at FROM
+                    f"SELECT id, type, title, summary, metadata_json, updated_at FROM {nt} WHERE type IN ({visible}) ORDER BY updated_at DESC, id ASC LIMIT ?",
                     (limit,),
                 )
             ]
@@ -2612,16 +2787,16 @@ class KnowledgeGraphStore:
             edge_rows = conn.execute(
                 f"""
                 SELECT id, from_node, to_node, type, weight, metadata_json
-                FROM
+                FROM {et}
                 WHERE from_node IN (
-                    SELECT id FROM
-                    ORDER BY updated_at DESC LIMIT ?
+                    SELECT id FROM {nt} WHERE type IN ({visible})
+                    ORDER BY updated_at DESC, id ASC LIMIT ?
                 )
                 AND to_node IN (
-                    SELECT id FROM
-                    ORDER BY updated_at DESC LIMIT ?
+                    SELECT id FROM {nt} WHERE type IN ({visible})
+                    ORDER BY updated_at DESC, id ASC LIMIT ?
                 )
-                ORDER BY weight DESC, created_at DESC
+                ORDER BY weight DESC, created_at DESC, id ASC
                 """,
                 (limit, limit),
             ).fetchall()
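The `id ASC` tie-break matters whenever rows share a timestamp: without it, SQL leaves the relative order of tied rows unspecified, so the legacy tables and the v2 views could legitimately return them in different orders. A small sketch with a hypothetical two-column `nodes` table and an illustrative helper `ordered_ids`:

```python
import sqlite3


def ordered_ids(insert_order):
    """Insert rows in the given order and read them back with the tie-break applied."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (id TEXT PRIMARY KEY, updated_at TEXT)")
    conn.executemany("INSERT INTO nodes VALUES (?, ?)", insert_order)
    query = "SELECT id FROM nodes ORDER BY updated_at DESC, id ASC"
    return [r[0] for r in conn.execute(query)]


sample = [
    ("b", "2026-05-31T10:00:00"),
    ("a", "2026-05-31T10:00:00"),  # same timestamp as "b": a tie
    ("c", "2026-05-31T09:00:00"),
]

first = ordered_ids(sample)
second = ordered_ids(list(reversed(sample)))  # different physical insert order
```

With the tie-break in place, the result depends only on the data and not on how either store happens to lay out its rows.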
@@ -2708,15 +2883,16 @@ class KnowledgeGraphStore:
         query = str(query or "").strip()
         q = f"%{query}%"
         limit = max(1, min(int(limit or 30), 100))
+        nt, et = self._read_tables()
         with self._connect() as conn:
             rows = []
             if query:
                 rows = conn.execute(
-                    """
+                    f"""
                     SELECT id, type, title, summary, metadata_json, updated_at
-                    FROM
+                    FROM {nt}
                     WHERE title LIKE ? OR summary LIKE ? OR metadata_json LIKE ?
-                    ORDER BY updated_at DESC
+                    ORDER BY updated_at DESC, id ASC
                     LIMIT ?
                     """,
                     (q, q, q, limit),
@@ -2733,9 +2909,9 @@ class KnowledgeGraphStore:
                     extra = conn.execute(
                         f"""
                         SELECT id, type, title, summary, metadata_json, updated_at
-                        FROM
+                        FROM {nt}
                         WHERE {' OR '.join(clauses)}
-                        ORDER BY updated_at DESC
+                        ORDER BY updated_at DESC, id ASC
                         LIMIT ?
                         """,
                         (*params, limit * 3),
@@ -2780,15 +2956,16 @@ class KnowledgeGraphStore:
         if not matches:
             topics = _topic_candidates(query, limit=4)
             if topics:
+                nt, et = self._read_tables()
                 with self._connect() as conn:
                     rows = []
                     for topic in topics:
                         rows.extend(conn.execute(
-                            """
+                            f"""
                             SELECT id, type, title, summary, metadata_json
-                            FROM
+                            FROM {nt}
                             WHERE title LIKE ? OR metadata_json LIKE ?
-                            ORDER BY updated_at DESC
+                            ORDER BY updated_at DESC, id ASC
                             LIMIT 3
                             """,
                             (f"%{topic}%", f"%{topic}%"),
@@ -2824,9 +3001,10 @@ class KnowledgeGraphStore:

     def neighbors(self, node_id: str) -> Dict[str, Any]:
         """Return direct neighbors (1-hop) of a node."""
+        nt, et = self._read_tables()
         with self._connect() as conn:
             edge_rows = conn.execute(
-                "SELECT from_node, to_node, type, weight FROM
+                f"SELECT from_node, to_node, type, weight FROM {et} WHERE from_node=? OR to_node=? ORDER BY id ASC",
                 (node_id, node_id),
             ).fetchall()
             neighbor_ids: set = set()
@@ -2848,7 +3026,7 @@ class KnowledgeGraphStore:
                     "metadata": _safe_loads(row["metadata_json"]),
                 }
                 for row in conn.execute(
-                    f"SELECT id, type, title, summary, metadata_json FROM
+                    f"SELECT id, type, title, summary, metadata_json FROM {nt} WHERE id IN ({placeholders}) ORDER BY id ASC",
                     list(neighbor_ids),
                 )
             ]
@@ -2880,6 +3058,8 @@ class KnowledgeGraphStore:
                     remove_ids.add(conv_id)
             for node_id in remove_ids:
                 conn.execute("DELETE FROM nodes WHERE id=?", (node_id,))
+                if KGStoreV2 is not None:
+                    conn.execute("DELETE FROM nodes_v2 WHERE id=?", (node_id,))  # edges_v2 cascade
             conn.execute(
                 """
                 DELETE FROM nodes
@@ -2888,6 +3068,15 @@ class KnowledgeGraphStore:
                   AND id NOT IN (SELECT from_node FROM edges)
                 """
             )
+            if KGStoreV2 is not None:
+                conn.execute(
+                    """
+                    DELETE FROM nodes_v2
+                    WHERE type='Topic'
+                      AND id NOT IN (SELECT target FROM edges_v2)
+                      AND id NOT IN (SELECT source FROM edges_v2)
+                    """
+                )
         return {"status": "ok", "conversation_id": conversation_id, "removed_nodes": len(remove_ids)}

     def clear_all(self) -> Dict[str, Any]:
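The mirrored deletion in `delete_conversation` relies on two steps: deleting a node lets the foreign key remove its edges, and a follow-up statement prunes `Topic` nodes left without any edge. The sketch below reproduces both steps on a toy schema. The `ON DELETE CASCADE` clauses and the `PRAGMA foreign_keys = ON` call are assumptions inferred from the "edges_v2 cascade" comments; the actual `SCHEMA_SQL` and connection setup are not shown in this diff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and cascades) when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE nodes_v2 (id TEXT PRIMARY KEY, type TEXT);
CREATE TABLE edges_v2 (
    id TEXT PRIMARY KEY,
    source TEXT NOT NULL REFERENCES nodes_v2(id) ON DELETE CASCADE,
    target TEXT NOT NULL REFERENCES nodes_v2(id) ON DELETE CASCADE,
    type TEXT
);
INSERT INTO nodes_v2 VALUES ('conv1', 'Chat'), ('conv2', 'Chat'),
                            ('t1', 'Topic'), ('t2', 'Topic');
INSERT INTO edges_v2 VALUES ('e1', 'conv1', 't1', 'mentions'),
                            ('e2', 'conv2', 't2', 'mentions');
""")

# Step 1: delete the conversation node; its outgoing edge e1 is removed by the cascade.
conn.execute("DELETE FROM nodes_v2 WHERE id=?", ("conv1",))

# Step 2: prune Topic nodes that no longer appear on either end of any edge.
conn.execute("""
    DELETE FROM nodes_v2
    WHERE type='Topic'
      AND id NOT IN (SELECT target FROM edges_v2)
      AND id NOT IN (SELECT source FROM edges_v2)
""")

remaining_nodes = [r[0] for r in conn.execute("SELECT id FROM nodes_v2 ORDER BY id ASC")]
remaining_edges = [r[0] for r in conn.execute("SELECT id FROM edges_v2 ORDER BY id ASC")]
```

If foreign keys were not enforced on the connection, step 1 would leave `e1` behind and `t1` would survive the pruning, which is the kind of divergence the equivalence tests are meant to catch.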
@@ -2904,20 +3093,24 @@ class KnowledgeGraphStore:
             conn.execute("DELETE FROM chunks")
             conn.execute("DELETE FROM edges")
             conn.execute("DELETE FROM nodes")
+            if KGStoreV2 is not None:
+                conn.execute("DELETE FROM edges_v2")
+                conn.execute("DELETE FROM nodes_v2")
         if self.blob_dir.exists():
             shutil.rmtree(self.blob_dir, ignore_errors=True)
         self.blob_dir.mkdir(parents=True, exist_ok=True)
         return {"status": "ok", "removed": counts}

     def stats(self) -> Dict[str, Any]:
+        nt, et = self._read_tables()
         with self._connect() as conn:
             node_counts = {
                 row["type"]: row["count"]
-                for row in conn.execute("SELECT type, COUNT(*) AS count FROM
+                for row in conn.execute(f"SELECT type, COUNT(*) AS count FROM {nt} GROUP BY type")
             }
             edge_counts = {
                 row["type"]: row["count"]
-                for row in conn.execute("SELECT type, COUNT(*) AS count FROM
+                for row in conn.execute(f"SELECT type, COUNT(*) AS count FROM {et} GROUP BY type")
             }
             local_sources = conn.execute("SELECT COUNT(*) AS c FROM knowledge_sources").fetchone()["c"]
             local_file_status = {
@@ -2953,6 +3146,7 @@ class KnowledgeGraphStore:
         limit = max(1, min(int(limit or 10), 50))
         terms = _topic_candidates(query, limit=12)
         now = datetime.now()
+        nt, et = self._read_tables()

         with self._connect() as conn:
             candidate_rows = []
@@ -2961,15 +3155,15 @@ class KnowledgeGraphStore:
             if query:
                 q = f"%{query}%"
                 rows = conn.execute(
-                    """
+                    f"""
                     SELECT id, type, title, summary, metadata_json, updated_at
-                    FROM
+                    FROM {nt}
                     WHERE (title LIKE ? OR summary LIKE ? OR metadata_json LIKE ?)
                       AND type IN ('Document', 'File', 'CodeFile', 'SlideDeck',
                                    'Spreadsheet', 'Image', 'ImageText', 'Chat',
                                    'Decision', 'Task', 'Concept', 'Feature',
                                    'Page', 'Slide')
-                    ORDER BY updated_at DESC
+                    ORDER BY updated_at DESC, id ASC
                     LIMIT ?
                     """,
                     (q, q, q, limit * 5),
@@ -2982,15 +3176,15 @@ class KnowledgeGraphStore:
             for term in terms:
                 t = f"%{term}%"
                 rows = conn.execute(
-                    """
+                    f"""
                     SELECT id, type, title, summary, metadata_json, updated_at
-                    FROM
+                    FROM {nt}
                     WHERE (title LIKE ? OR summary LIKE ? OR metadata_json LIKE ?)
                       AND type IN ('Document', 'File', 'CodeFile', 'SlideDeck',
                                    'Spreadsheet', 'Image', 'ImageText', 'Chat',
                                    'Decision', 'Task', 'Concept', 'Feature',
                                    'Page', 'Slide')
-                    ORDER BY updated_at DESC
+                    ORDER BY updated_at DESC, id ASC
                     LIMIT ?
                     """,
                     (t, t, t, limit * 3),
@@ -3008,7 +3202,7 @@ class KnowledgeGraphStore:
                 text_score = min(1.0, text_hits / max(len(terms), 1))

                 edge_count = conn.execute(
-                    "SELECT COUNT(*) AS c FROM
+                    f"SELECT COUNT(*) AS c FROM {et} WHERE from_node=? OR to_node=?",
                     (row["id"], row["id"]),
                 ).fetchone()["c"]
                 graph_score = min(1.0, math.log1p(edge_count) / 4.0)
@@ -3028,9 +3222,9 @@ class KnowledgeGraphStore:
                 meta = _safe_loads(row["metadata_json"])
                 neighbor_concepts = []
                 neighbor_rows = conn.execute(
-                    """
-                    SELECT n.title, n.type FROM
-                    JOIN
+                    f"""
+                    SELECT n.title, n.type FROM {et} e
+                    JOIN {nt} n ON n.id = CASE WHEN e.from_node = ? THEN e.to_node ELSE e.from_node END
                     WHERE (e.from_node = ? OR e.to_node = ?)
                       AND n.type IN ('Concept', 'Feature', 'Decision', 'Task')
                     LIMIT 8
@@ -3066,6 +3260,7 @@ class KnowledgeGraphStore:
         all_nodes = []
         all_edges = []
         frontier = set(node_ids)
+        nt, et = self._read_tables()

         with self._connect() as conn:
             for hop in range(max_hops):
@@ -3077,7 +3272,7 @@ class KnowledgeGraphStore:
                         continue
                     visited_nodes.add(nid)
                     row = conn.execute(
-                        "SELECT id, type, title, summary, metadata_json, updated_at FROM
+                        f"SELECT id, type, title, summary, metadata_json, updated_at FROM {nt} WHERE id=?",
                         (nid,),
                     ).fetchone()
                     if row:
@@ -3088,9 +3283,10 @@ class KnowledgeGraphStore:
                             "hop": hop,
                         })
                     edge_rows = conn.execute(
-                        """
+                        f"""
                         SELECT id, from_node, to_node, type, weight
-                        FROM
+                        FROM {et} WHERE from_node=? OR to_node=?
+                        ORDER BY id ASC
                         """,
                         (nid, nid),
                     ).fetchall()