npm - @pentatonic-ai/ai-agent-sdk - Versions diffs - 0.10.19 → 0.10.21 - Mend

@pentatonic-ai/ai-agent-sdk 0.10.19 → 0.10.21


Files changed (10)

package/dist/index.cjs +1 -1
package/dist/index.js +1 -1
package/package.json +1 -1
package/packages/memory-engine-v2/RFC-decay-and-fusion.md +122 -8
package/packages/memory-engine-v2/compat/server.py +55 -10
package/packages/memory-engine-v2/extractor-async/test_email_alias_guard.py +78 -0
package/packages/memory-engine-v2/extractor-async/worker.py +52 -0
package/packages/memory-engine-v2/scripts/build_retrain_corpus.py +240 -0
package/packages/memory-engine-v2/scripts/fusion_defrag.py +440 -0
package/packages/memory-engine-v2/scripts/redistill.py +236 -0

package/dist/index.cjs CHANGED

@@ -878,7 +878,7 @@ function fireAndForgetEmit(clientConfig, sessionOpts, messages, result, model) {
 }
 // src/telemetry.js
-var VERSION = "0.10.19";
+var VERSION = "0.10.21";
 var TELEMETRY_URL = "https://sdk-telemetry.philip-134.workers.dev";
 function machineId() {
   const raw = typeof process !== "undefined" ? `${process.env?.USER || process.env?.USERNAME || "u"}:${process.platform || "x"}:${process.arch || "x"}` : "browser";

package/dist/index.js CHANGED

@@ -847,7 +847,7 @@ function fireAndForgetEmit(clientConfig, sessionOpts, messages, result, model) {
 }
 // src/telemetry.js
-var VERSION = "0.10.19";
+var VERSION = "0.10.21";
 var TELEMETRY_URL = "https://sdk-telemetry.philip-134.workers.dev";
 function machineId() {
   const raw = typeof process !== "undefined" ? `${process.env?.USER || process.env?.USERNAME || "u"}:${process.platform || "x"}:${process.arch || "x"}` : "browser";

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@pentatonic-ai/ai-agent-sdk",
-  "version": "0.10.19",
+  "version": "0.10.21",
   "description": "TES SDK — LLM observability and lifecycle tracking via Pentatonic Thing Event System. Track token usage, tool calls, and conversations. Manage things through event-sourced lifecycle stages with AI enrichment and vector search.",
   "type": "module",
   "main": "./dist/index.cjs",

package/packages/memory-engine-v2/RFC-decay-and-fusion.md CHANGED

@@ -1,10 +1,18 @@
 # RFC: the Fusion Drive — v2 memory self-healing (cross-run node fusion + decay)
 > **Fusion Drive** = the continuous, arena-scoped background engine that keeps the v2
-> memory graph self-healing: it *fuses* duplicate/near-duplicate nodes from different
-> distillation runs into a single master node (horizontal convergence) and *decays* stale,
-> low-value, and junk nodes out of existence (vertical aging). Named for the drive that
-> does the fusing — the decay pass rides the same engine.
+> memory graph self-healing. It triages every node into one of **three** outcomes:
+> it *fuses* duplicate/near-duplicate nodes from different distillation runs into a single
+> master node (horizontal convergence); it *re-distills* high-value extractions produced by
+> a superseded teacher/prompt — regenerating them from the still-present source event through
+> the current clean teacher (depth refresh); and it *decays* stale, low-value, and junk nodes
+> out of existence (vertical aging). Named for the drive that does the fusing — the re-distill
+> and decay passes ride the same engine.
+>
+> *(Revised 2026-06-22: added Part B′ — re-distillation — as the third triage verb, with the
+> prompt-version-drift trigger. Motivated by the clean-prompt deploy (SDK 0.10.19, #126 +
+> #129) which made "the current teacher is materially better than the one that produced most
+> of the graph" concrete and measurable via `system_prompt_hash`.)*
 **Status:** draft / spec — 2026-06-12
 **Builds on:** `RFC-entity-reconciliation.md`, `scripts/entity_resolution_v2.py` (#82),
@@ -139,15 +147,101 @@ sparse backfill.
 ---
+## Part B′ — Re-distillation: regenerate stale-prompt extractions from source
+Fusion (A) needs a *correct counterpart* to converge toward; Decay (B) just *deletes*. But
+the common case after a teacher/prompt upgrade is a **high-value node with no correct
+counterpart yet** — the only extraction that exists is the stale-prompt one. Fusion has
+nothing to fuse to; decay would throw away real information. The cure is the third verb: the
+**source event still exists** (`events` table, 376k rows live), so regenerate the extraction
+by re-running that event through the *current clean teacher*. Fusion converges horizontally,
+decay ages vertically; re-distill refreshes **in depth**.
+### B′1. Trigger — prompt-version drift, not raw age
+The defect population is *exactly* the facts/entities whose provenance traces an **old
+`system_prompt_hash`** — `bbdaba6b…` / `f1e0ff55…` / `ef0647c7…` (pre-clean), vs the clean
+`6ccfe70f…` deployed with 0.10.19 (#126 modality/attribution + #129 email-discipline &
+entity-separation). #118 propagated source onto facts, so provenance → the event's
+`distillation_traces.system_prompt_hash` is queryable. **Age is a weak proxy; prompt-version
+selects the defect set directly** — a months-old node the clean teacher would extract
+identically needs nothing; a two-day-old node from the dirty prompt is a defect. Prioritize
+by `salience` (B1) so high-value stale nodes go first.
+### B′2. Triage routing — 3-way, by salience × prompt-version
+Per assessed node/event:
+| condition | outcome |
+|---|---|
+| stale prompt-hash **+** high salience **+** source event present | **re-distill** (this part) |
+| has a correct newer-teacher counterpart in the arena | **fuse** (Part A) |
+| low salience, junk-born (B2), no corroboration, never accessed | **decay** (Part B) |
+### B′3. Mechanism — re-enqueue, don't mutate in place
+Re-distill = re-insert the source `event_id` into `distillation_queue` (`status='pending'`,
+`attempts=0`). The existing **extractor-async** worker claims it, runs the clean teacher,
+writes the new extraction **and a fresh `6ccfe70f` trace**. No new pipeline — it reuses the
+distiller, the combined-demand **autoscaler**, and the trace ledger. (Re-distill is a
+*producer* of queue demand; the autoscaler's student-aware floor already keeps a teacher box
+warm for it — see the deploy notes.)
+### B′4. Supersedence — the load-bearing requirement
+The store is **pure-accretion** (the whole motivation of this RFC). A naive re-enqueue makes
+the clean extraction land **beside** the dirty one → it *worsens* fragmentation. So
+re-distill MUST close the loop through Fusion's tombstone machinery — it is **sequenced into
+the Fusion Drive, not bolted on**:
+1. Each re-distill is recorded in a `redistill_runs` ledger with its triggering
+   `(event_id, old_prompt_hash)`.
+2. When the clean extraction completes, **Fusion converges old ↔ new for that event** using
+   the teacher-version master signal (A2/A3): the new `6ccfe70f` extraction wins as master;
+   the old extraction's now-orphaned nodes (those whose **only** provenance was this event
+   under the old hash) are tombstoned/repointed via `entity_merges` / `fact_merges`.
+3. Where an old node carries **other live provenance** (multi-event corroboration), only this
+   event's contribution is repointed — **never blind-delete a multi-source node** (the
+   over-merge failure mode: a hotel email wrongly attached to a person must not let one
+   event's repoint nuke an otherwise-corroborated node).
+This dependency is hard: **re-distill is unsafe until Fusion's cross-run / teacher-version
+master selection (E3) is live.** Until then a re-distill loop accretes. An interim cheaper
+option (Open Q): explicit **event-scoped supersede** — delete only the facts/entities whose
+provenance set is exactly `{this event}` under the old hash before re-enqueue — covers the
+single-provenance majority without the full fusion adjudicator.
+### B′5. Corpus-as-byproduct — one loop, three wins
+Every re-distill emits a clean `6ccfe70f` `distillation_trace`. A prompt-version-drift
+re-distill loop therefore **builds the student retrain corpus while it repairs the graph**
+(`scripts/build_retrain_corpus.py` consumes those traces). It subsumes the one-shot full
+re-distill: gradual, rate-limited, no nuke — graph repair **+** corpus **+** self-healing
+from a single engine. This is the durable answer to "is the corpus building?": it is, as a
+side effect of the gardener.
+### B′6. Cadence + cost + safety
+Rolling, rate-limited, autoscaler-aware, off-peak. Budget *N* events/hour against teacher
+capacity; order by `salience × staleness`. **Never big-bang the full backlog** — gradual
+migration is the point. Arena-scoped, dry-run → `--apply`, `redistill_runs` ledger for
+observability and rollback. Same operational shape as fusion/decay/autoscaler.
+---
 ## Part C — Ordering & how they combine
-Per arena, on schedule: **(1) fusion → (2) decay.** Fusion first so a master node absorbs
-its duplicates' provenance/salience *before* decay judges it (else a real node split across
-two weak dupes could wrongly decay out). Then decay ages + evicts the survivors.
+Per arena, on schedule: **(1) triage → re-distill the high-value stale-prompt set (async via
+the queue) → (2) fusion → (3) decay.** Re-distill is enqueued first so that by the time
+fusion runs, the clean counterpart exists for it to crown as master (else fusion has only
+stale renderings to choose between). Fusion then absorbs each master's duplicates'
+provenance/salience *before* decay judges it (else a real node split across two weak dupes
+could wrongly decay out). Then decay ages + evicts the survivors.
+*(Re-distill is asynchronous — it completes on the teacher's schedule — so in practice a
+node re-distilled in this pass is fused/decayed in the **next** per-arena pass, once its
+clean trace + extraction have landed. The ledger links the two.)*
 **This is what finally cures immortal pollution:**
 - 7B polluted node *with* a correct Qwen3.6 counterpart → **fused**, correct one as master,
   polluted demoted to alias / tombstoned.
+- stale-prompt node, *high-value*, *no* correct counterpart, source event present →
+  **re-distilled** through the clean teacher → new master extraction; old superseded via
+  fusion (B′4). The information is *recovered*, not lost.
 - 7B pure-junk node with *no* correct counterpart (numeric-ID-person, ungrounded) → born-low
   salience + no corroboration + never accessed → **decays out and is evicted**.
@@ -165,8 +259,15 @@ reset, but no longer the *only* path).
 - `relationships`: `+ salience REAL`, `+ last_accessed` (already has `weight`,
   `first/last_seen`).
 - new `fact_merges` audit (mirror `entity_merges` incl. `rollback_payload`).
-- new `fusion_runs` + `decay_runs` ledgers for observability.
+- new `fusion_runs` + `decay_runs` + `redistill_runs` ledgers for observability. `redistill_runs`:
+  `(id, arena, event_id, old_prompt_hash, new_prompt_hash, salience_at_trigger, enqueued_at,
+  completed_at, fused_at, mode)` — links a re-distill to its triggering node and to the fusion
+  that superseded the old extraction.
 - `/search` gains a `last_accessed = NOW()` bump on returned nodes (batched).
+- re-distill trigger needs provenance → prompt-version: either denormalize `system_prompt_hash`
+  onto `facts`/`entities` at write time (cheap filter), or join through
+  `distillation_traces(event_id → system_prompt_hash)` on the provenance event ids (no schema
+  change, costlier query). Prefer the join until the trigger volume justifies denormalizing.
 ## Part E — Rollout (each flag-gated, arena-scoped, dry-run-first, audited)
@@ -176,6 +277,13 @@ reset, but no longer the *only* path).
 3. **Fusion extension** — scored canonical selection (fix typo-crowning) + cross-run
    detection + fact fusion, dry-run → apply.
 4. **Online/continuous** — wire fusion+decay to run after distillation per arena.
+5. **Re-distill loop (Part B′)** — dry-run triage first (count stale-prompt nodes by
+   `system_prompt_hash` × salience bucket to size the work), then a **bounded `--apply` slice**
+   on one curated arena (re-enqueue + verify clean trace + verify fusion supersedes the old
+   extraction), then wire continuous. **Gated on step 3** (Fusion cross-run / teacher-version
+   master selection): until that's live, re-distill must use the interim **event-scoped
+   supersede** (B′4) or it accretes. Ships as `scripts/redistill.py` (dry-run default,
+   `--apply` gate, arena-scoped, `redistill_runs` ledger).
 ## Open questions
 - Half-life constants per category — needs a calibration pass against real arenas.
@@ -183,3 +291,9 @@ reset, but no longer the *only* path).
 - Directory authority source for canonical anchoring — HubSpot contacts? a curated table?
 - Interaction with the (still-open) source_id supersede mode — fusion partly subsumes it,
   but explicit supersede is cheaper for known-mutable sources.
+- **Re-distill supersedence before full fusion is live** — is event-scoped supersede (delete
+  only nodes whose provenance set is exactly `{this event}` under the old hash) a safe enough
+  interim, or do we hard-gate the loop on E3? Single-provenance nodes are the majority, but
+  the multi-provenance tail is where the over-merge risk concentrates.
+- **Re-distill prioritization** — pure `salience × staleness`, or weight toward the entities
+  behind known user-visible confabulations (Vickers/Boedecker) first?
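
The three-way triage table in Part B′2 of the RFC diff above can be read as a small pure function. The sketch below is only an illustration under stated assumptions: the `Node` fields, the `HIGH_SALIENCE` cutoff, and the "keep" fallback are hypothetical and do not come from the package, and the order of the checks (fuse before re-distill) is one reading of Part C, which re-distills only when no correct counterpart exists. The clean hash is matched by the `6ccfe70f` prefix because the RFC quotes it in truncated form.

```python
from dataclasses import dataclass

# Prefix of the clean prompt hash as quoted in the RFC (the full hash is elided there).
CLEAN_HASH_PREFIX = "6ccfe70f"
# Hypothetical cutoff: the RFC leaves salience calibration as an open question.
HIGH_SALIENCE = 0.5


@dataclass
class Node:
    """Hypothetical view of one assessed node; field names are assumptions."""
    prompt_hash: str              # system_prompt_hash of the producing trace
    salience: float
    source_event_present: bool    # the originating event still exists
    has_clean_counterpart: bool   # a newer-teacher rendering exists in the arena
    junk_born: bool = False
    corroborated: bool = False
    ever_accessed: bool = False


def triage(node: Node) -> str:
    """Route a node to 'fuse', 'redistill', 'decay', or 'keep'."""
    stale = not node.prompt_hash.startswith(CLEAN_HASH_PREFIX)
    # Fusion needs a correct counterpart to converge toward, so check that first.
    if node.has_clean_counterpart:
        return "fuse"
    # No counterpart yet: regenerate high-value stale nodes from the source event.
    if stale and node.salience >= HIGH_SALIENCE and node.source_event_present:
        return "redistill"
    # Low-value junk with no corroboration and no reads ages out.
    if (node.salience < HIGH_SALIENCE and node.junk_born
            and not node.corroborated and not node.ever_accessed):
        return "decay"
    # Anything else is left untouched in this pass (an assumption of the sketch).
    return "keep"
```

A node produced under an old prompt hash with high salience and a surviving source event routes to re-distill; the same node with a clean counterpart routes to fusion instead, which matches the first two "immortal pollution" bullets in Part C.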

package/packages/memory-engine-v2/compat/server.py CHANGED

@@ -896,6 +896,8 @@ class GraphQueryRequest(BaseModel):
     entity_type: str | None = None
     name: str | None = None             # canonical_name (ILIKE)
     subject: str | None = None          # entity name OR canonical_name (facts.subject_entity)
+    subject_entity_id: str | None = None  # EXACT facts.subject_entity_id — strict, no name bleed
+    object_entity_id: str | None = None   # EXACT facts.object_entity_id
     predicate: str | None = None
     category: str | None = None         # facts.category
     from_name: str | None = None        # relationships.from_entity.canonical_name
@@ -911,6 +913,34 @@ def _resolve_arenas(req: GraphQueryRequest) -> list[str]:
     return arenas
+# Decay access signal (RFC-decay-and-fusion Part B1): the Fusion Drive decay pass
+# ages salience by the most recent of (last_accessed, last_seen/asserted_at), so
+# without an access bump a frequently-retrieved memory still decays and can be
+# evicted. Bump last_accessed on the nodes a read returns so retrieval keeps them
+# alive. THROTTLED to once / _ACCESS_BUMP_THROTTLE per node (the UPDATE no-ops for
+# nodes touched recently) to bound write amplification on these hot read paths,
+# and BEST-EFFORT — a bump failure must never fail the read. `table` is always a
+# trusted literal (never user input).
+_ACCESS_BUMP_THROTTLE = "6 hours"
+async def _bump_last_accessed(conn, cur, table: str, ids: list[str]) -> None:
+    ids = [i for i in ids if i]
+    if not ids:
+        return
+    try:
+        await cur.execute(
+            f"UPDATE {table} SET last_accessed = NOW() "
+            f"WHERE id = ANY(%s) AND (last_accessed IS NULL "
+            f"OR last_accessed < NOW() - interval '{_ACCESS_BUMP_THROTTLE}')",
+            (ids,),
+        )
+        await conn.commit()
+    except Exception as e:  # noqa: BLE001 — never let the access bump break a read
+        await conn.rollback()
+        log.warning("last_accessed bump failed on %s: %s", table, e)
 @app.post("/entities")
 async def list_entities(req: GraphQueryRequest):
     """Filter entities by arena + optional type + optional name pattern.
@@ -928,10 +958,10 @@ async def list_entities(req: GraphQueryRequest):
         params.extend([pattern, pattern])
     sql = f"""
         SELECT id, arena, entity_type, canonical_name, aliases,
-               provenance_event_ids, attributes, last_seen
+               provenance_event_ids, attributes, salience, last_seen
           FROM entities
          WHERE {' AND '.join(conditions)}
-      ORDER BY last_seen DESC
+      ORDER BY salience DESC, last_seen DESC
          LIMIT %s
     """
     params.append(req.limit)
@@ -939,14 +969,20 @@ async def list_entities(req: GraphQueryRequest):
         async with conn.cursor() as cur:
             await cur.execute(sql, params)
             rows = await cur.fetchall()
+            await _bump_last_accessed(conn, cur, "entities", [r["id"] for r in rows])
     return {"results": [dict(r) for r in rows]}
 @app.post("/facts")
 async def list_facts(req: GraphQueryRequest):
-    """Filter facts by arena + optional category/predicate + optional
-    subject-entity name. Subject filter joins facts → entities via
-    subject_entity_id."""
+    """Filter facts by arena + optional category/predicate + subject.
+    PREFER `subject_entity_id` (exact id match) over `subject` (name ILIKE):
+    name matching bleeds one person's facts into another's answer when names
+    collide or fragment (the Will Vickers ⟵ Will Spencer confabulation — a
+    query resolved to one entity must NOT pull a same/similar-named entity's
+    facts). The name path is kept for back-compat callers that haven't resolved
+    an id yet, but entity-id is the strict, bleed-free path."""
     arenas = _resolve_arenas(req)
     conditions = ["f.arena = ANY(%s)"]
     params: list[Any] = [arenas]
@@ -956,17 +992,24 @@ async def list_facts(req: GraphQueryRequest):
     if req.predicate:
         conditions.append("f.predicate ILIKE %s")
         params.append(f"%{req.predicate}%")
-    if req.subject:
+    if req.subject_entity_id:
+        conditions.append("f.subject_entity_id = %s")
+        params.append(req.subject_entity_id)
+    if req.object_entity_id:
+        conditions.append("f.object_entity_id = %s")
+        params.append(req.object_entity_id)
+    # Name path: only when no exact id was given (back-compat / unresolved callers).
+    if req.subject and not req.subject_entity_id:
         conditions.append("EXISTS (SELECT 1 FROM entities e WHERE e.id = f.subject_entity_id AND (e.canonical_name ILIKE %s OR %s = ANY(e.aliases)))")
         params.extend([f"%{req.subject}%", req.subject])
     sql = f"""
         SELECT f.id, f.arena, f.category, f.predicate, f.statement,
                f.subject_entity_id, f.object_entity_id,
                f.confidence, f.stage, f.asserted_at,
-               f.provenance_event_ids
+               f.salience, f.provenance_event_ids
           FROM facts f
          WHERE {' AND '.join(conditions)}
-      ORDER BY f.asserted_at DESC
+      ORDER BY f.salience DESC, f.asserted_at DESC
          LIMIT %s
     """
     params.append(req.limit)
@@ -974,6 +1017,7 @@ async def list_facts(req: GraphQueryRequest):
         async with conn.cursor() as cur:
             await cur.execute(sql, params)
             rows = await cur.fetchall()
+            await _bump_last_accessed(conn, cur, "facts", [r["id"] for r in rows])
     return {"results": [dict(r) for r in rows]}
@@ -999,13 +1043,13 @@ async def list_relationships(req: GraphQueryRequest):
                r.from_entity_id, r.to_entity_id,
                ef.canonical_name AS from_name,
                et.canonical_name AS to_name,
-               r.first_seen, r.last_seen,
+               r.first_seen, r.last_seen, r.salience,
                r.provenance_event_ids
           FROM relationships r
           JOIN entities ef ON ef.id = r.from_entity_id
           JOIN entities et ON et.id = r.to_entity_id
          WHERE {' AND '.join(conditions)}
-      ORDER BY r.last_seen DESC
+      ORDER BY r.salience DESC, r.last_seen DESC
          LIMIT %s
     """
     params.append(req.limit)
@@ -1013,6 +1057,7 @@ async def list_relationships(req: GraphQueryRequest):
         async with conn.cursor() as cur:
             await cur.execute(sql, params)
             rows = await cur.fetchall()
+            await _bump_last_accessed(conn, cur, "relationships", [r["id"] for r in rows])
     return {"results": [dict(r) for r in rows]}
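
The precedence rule added to `list_facts` in the hunk above (an exact entity id wins, and the name path only applies when no id was supplied) can be isolated as a pure function for illustration. This is a minimal sketch, not the server's code: `build_subject_conditions` is a hypothetical helper name, the `EXISTS` clause is abbreviated, and the entity ids in the usage note are made up.

```python
from typing import Any, List, Optional, Tuple


def build_subject_conditions(
    subject: Optional[str] = None,
    subject_entity_id: Optional[str] = None,
    object_entity_id: Optional[str] = None,
) -> Tuple[List[str], List[Any]]:
    """Return (SQL conditions, bound params) for the subject/object filters."""
    conditions: List[str] = []
    params: List[Any] = []
    # Strict path: exact id match, so a same-named entity cannot bleed in.
    if subject_entity_id:
        conditions.append("f.subject_entity_id = %s")
        params.append(subject_entity_id)
    if object_entity_id:
        conditions.append("f.object_entity_id = %s")
        params.append(object_entity_id)
    # Back-compat path: fuzzy name match, used only when no exact id was given.
    if subject and not subject_entity_id:
        conditions.append(
            "EXISTS (SELECT 1 FROM entities e WHERE e.id = f.subject_entity_id "
            "AND (e.canonical_name ILIKE %s OR %s = ANY(e.aliases)))"
        )
        params.extend([f"%{subject}%", subject])
    return conditions, params
```

Calling `build_subject_conditions(subject="Will", subject_entity_id="ent-1")` yields only the id condition, so the name is ignored once an id is known. Calling it with `subject="Will"` alone falls back to the `ILIKE` and alias match, which is the path the docstring warns can mix facts from similarly named entities.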

package/packages/memory-engine-v2/extractor-async/test_email_alias_guard.py ADDED

@@ -0,0 +1,78 @@
+"""Unit tests for the email-alias guard (_email_plausibly_belongs).
+Pins the live pollution case (the "Johann Boedecker" node, 2026-06-22): keep the
+person's own addresses; drop the bystander emails (a hotel, newsletters, unrelated
+gmails) the LLM stapled on from co-occurring documents.
+"""
+from __future__ import annotations
+import importlib.util
+import sys
+from pathlib import Path
+import pytest
+_THIS = Path(__file__).resolve().parent
+def _load(name="extractor_async_worker_aliasguard"):
+    spec = importlib.util.spec_from_file_location(name, _THIS / "worker.py")
+    mod = importlib.util.module_from_spec(spec)
+    sys.modules[name] = mod
+    spec.loader.exec_module(mod)
+    return mod
+try:
+    worker = _load()
+except ImportError as e:
+    pytest.skip(f"extractor-async deps unavailable: {e}", allow_module_level=True)
+belongs = lambda n, e: worker._email_plausibly_belongs(n, e)
+# ── KEEP: the person's own addresses ─────────────────────────────────────
+@pytest.mark.parametrize("email", [
+    "johann@pentatonic.com",
+    "johann.boedecker@pentatonic.com",
+    "boedeckerjohann@gmail.com",
+    "JOHANN@pentatonic.com",          # case-insensitive
+    "jb@pentatonic.com",              # initials
+    "j.boedecker@pentatonic.com",     # surname token
+])
+def test_keeps_owner_emails(email):
+    assert belongs("Johann Boedecker", email) is True
+# ── DROP: the actual bystander emails found on the live Johann node ──────
+@pytest.mark.parametrize("email", [
+    "reservations.nyc@acehotel.com",
+    "marketingadmin@sustainablebrands.com",
+    "martinvasquez87@gmail.com",
+    "schwaabd@yahoo.de",
+    "cvanderlip@redish.com",
+    "leechihshan33@gmail.com",
+])
+def test_drops_bystander_emails(email):
+    assert belongs("Johann Boedecker", email) is False
+# ── edges ────────────────────────────────────────────────────────────────
+def test_initials_either_order():
+    assert belongs("Johann Boedecker", "bj@pentatonic.com") is True   # reversed initials
+def test_no_usable_name_does_not_overfilter():
+    # a bare/empty name has nothing to check against → keep (don't strip)
+    assert belongs("", "anything@x.com") is True
+    assert belongs("J", "anything@x.com") is True  # single letter < 2 → no tokens
+def test_surname_only_person_keeps_surname_email():
+    assert belongs("Vickers", "will.vickers@vickers-oil.com") is True
+    assert belongs("Vickers", "reservations.nyc@acehotel.com") is False
+def test_guard_flag_default_on():
+    assert worker.EMAIL_ALIAS_GUARD is True
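
To see why the parametrized cases above land on keep or drop, it can help to expose the intermediate values the guard compares. The sketch below restates the matching steps of `_email_plausibly_belongs` from the `worker.py` diff in a standalone form; `explain_alias_decision` and its returned dictionary are hypothetical and exist only for illustration.

```python
import re

_NONALPHA = re.compile(r"[^a-z]")
_SPLIT = re.compile(r"[^a-z]+")


def explain_alias_decision(person_name: str, email: str) -> dict:
    """Return the guard's intermediate values plus a keep/drop verdict."""
    # Local-part with everything except a-z stripped: "j.boedecker" -> "jboedecker".
    local = email.split("@", 1)[0].lower()
    local_letters = _NONALPHA.sub("", local)
    # Name tokens of at least two letters: "Johann Boedecker" -> {"johann", "boedecker"}.
    name_tokens = {t for t in _SPLIT.split(person_name.lower()) if len(t) >= 2}
    # First letter of each whitespace-separated name part: "jb".
    initials = "".join(t[0] for t in person_name.lower().split() if t[:1].isalpha())

    if not name_tokens or not local_letters:
        reason = "nothing to check"      # kept, to avoid over-filtering
    elif any(t in local_letters for t in name_tokens):
        reason = "name token"            # kept
    elif len(initials) >= 2 and local_letters in (initials, initials[::-1]):
        reason = "initials"              # kept, in either order
    else:
        reason = "bystander"             # dropped
    return {
        "local_letters": local_letters,
        "initials": initials,
        "reason": reason,
        "keep": reason != "bystander",
    }
```

For example, `explain_alias_decision("Johann Boedecker", "reservations.nyc@acehotel.com")` reduces the local-part to `reservationsnyc`, which contains neither name token and is not `jb` or `bj`, so the address is classed as a bystander.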

package/packages/memory-engine-v2/extractor-async/worker.py CHANGED

@@ -1253,6 +1253,43 @@ def org_node_id_key(entity_type: str, name: str, stamped_domain: str | None) ->
     return name
+# --------------------------------------------------------------------
+# Email-alias guard — stop bystander emails polluting a person
+# --------------------------------------------------------------------
+# The async LLM pass sometimes emits a PERSON entity whose `email` is a BYSTANDER
+# address co-occurring in the same doc/thread (a hotel booking, a newsletter, an
+# unrelated gmail). _parse_guided_json promotes it into the entity's aliases and
+# upsert_entities then stores + RESOLVES on it — folding strangers' identities
+# (and their facts) onto the person. Measured live (pentatonic-team): a "Johann
+# Boedecker" node carrying reservations.nyc@acehotel.com + unrelated gmails, all
+# from STUDENT-distilled `doc` events. This guard keeps an email alias on a person
+# only when its local-part plausibly relates to the person's name; clear bystanders
+# are dropped BEFORE resolution/storage. Conservative: dropping a genuine but
+# non-name-matching alias is a mild loss; keeping a bystander is a confabulation
+# source. Flag-revertible (EMAIL_ALIAS_GUARD=false). Persons only — org domain
+# stamping is untouched.
+EMAIL_ALIAS_GUARD = _envflag("EMAIL_ALIAS_GUARD", "true")
+_ALIAS_NONALPHA = re.compile(r"[^a-z]")
+_ALIAS_SPLIT = re.compile(r"[^a-z]+")
+def _email_plausibly_belongs(person_name: str, email: str) -> bool:
+    """True ⇒ keep this email as an alias of `person_name`; False ⇒ drop (clear
+    bystander). Match = a name token appears in the local-part, OR the local-part
+    is the person's initials. Pure + deterministic."""
+    local = email.split("@", 1)[0].lower()
+    local_letters = _ALIAS_NONALPHA.sub("", local)
+    name_tokens = {t for t in _ALIAS_SPLIT.split(person_name.lower()) if len(t) >= 2}
+    if not name_tokens or not local_letters:
+        return True  # nothing to check against — don't over-filter
+    if any(nt in local_letters for nt in name_tokens):
+        return True  # johann@…, johann.boedecker@…, boedeckerjohann@…
+    initials = "".join(t[0] for t in person_name.lower().split() if t[:1].isalpha())
+    if len(initials) >= 2 and local_letters in (initials, initials[::-1]):
+        return True  # jb@… / bj@… for "Johann Boedecker"
+    return False
 def upsert_entities(
     conn: psycopg.Connection,
     arena: str,
@@ -1345,6 +1382,21 @@ def upsert_entities(
                 continue
             aliases = [a for a in (e.get("aliases") or []) if a]
+            # Email-alias guard (persons only): drop bystander emails the LLM
+            # stapled on from a co-occurring doc/thread, BEFORE they reach
+            # resolution or storage. See _email_plausibly_belongs.
+            if EMAIL_ALIAS_GUARD and etype == "person" and aliases:
+                kept = []
+                for a in aliases:
+                    if "@" in a and " " not in a and not _email_plausibly_belongs(name, a):
+                        log.info(
+                            f"alias-guard: dropped bystander email {a!r} from "
+                            f"person {name!r} (arena={arena})"
+                        )
+                        continue
+                    kept.append(a)
+                aliases = kept
             # Hard-key stamps for THIS entity, merged onto the node's attributes
             # and (for domain) into the resolution aliases. Adding domain to
             # aliases before forms are computed is deliberate — that's what makes