vividembed-1.0.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,142 @@
# PolyForm Noncommercial License 1.0.0

<https://polyformproject.org/licenses/noncommercial/1.0.0>

## Acceptance

In order to get any license under these terms, you must agree
to them as both strict obligations and conditions to all
your licenses.

## Copyright License

The licensor grants you a copyright license for the
software to do everything you might do with the software
that would otherwise infringe the licensor's copyright
in it for any permitted purpose. However, you may
only distribute the software according to [Distribution
License](#distribution-license) and make changes or new works
based on the software according to [Changes and New Works
License](#changes-and-new-works-license).

## Distribution License

The licensor grants you an additional copyright license
to distribute copies of the software. Your license
to distribute covers distributing the software with
changes and new works permitted by [Changes and New Works
License](#changes-and-new-works-license).

## Notices

You must ensure that anyone who gets a copy of any part of
the software from you also gets a copy of these terms or the
URL for them above, as well as copies of any plain-text lines
beginning with `Required Notice:` that the licensor provided
with the software. For example:

> Required Notice: Copyright Scott Kronick (2026)

## Changes and New Works License

The licensor grants you an additional copyright license to
make changes and new works based on the software for any
permitted purpose.

## Patent License

The licensor grants you a patent license for the software that
covers patent claims the licensor can license, or becomes able
to license, that you would infringe by using the software.

## Noncommercial Purposes

Any noncommercial purpose is a permitted purpose.

## Personal Uses

Personal use for research, experiment, and testing for
the benefit of public knowledge, personal study, private
entertainment, hobby projects, amateur pursuits, or religious
observance, without any anticipated commercial application,
is use for a permitted purpose.

## Noncommercial Organizations

Use by any charitable organization, educational institution,
public research organization, public safety or health
organization, environmental protection organization, or
government institution is use for a permitted purpose
regardless of the source of funding or obligations resulting
from the funding.

## Fair Use

You may have "fair use" rights for the software under the
law. These terms do not limit them.

## No Other Rights

These terms do not allow you to sublicense or transfer any of
your licenses to anyone else, or prevent the licensor from
granting licenses to anyone else. These terms do not imply
any other licenses.

## Patent Defense

If you make any written claim that the software infringes or
contributes to infringement of any patent, your patent license
for the software granted under these terms ends immediately. If
your company makes such a claim, your patent license ends
immediately for work on behalf of your company.

## Violations

The first time you are notified in writing that you have
violated any of these terms, or done anything with the software
not covered by your licenses, your licenses can nonetheless
continue if you come into full compliance with these terms,
and take practical steps to correct past violations, within
32 days of receiving notice. Otherwise, all your licenses
end immediately.

## No Liability

***As far as the law allows, the software comes as is, without
any warranty or condition, and the licensor will not be liable
to anyone for any damages related to this software or this
license, under any kind of legal claim.***

## Definitions

The **licensor** is the individual or entity offering these
terms, and the **software** is the software the licensor makes
available under these terms.

**You** refers to the individual or entity agreeing to these
terms.

**Your company** is any legal entity, sole proprietorship,
or other kind of organization that you work for, plus all
organizations that have control over, are under the control of,
or are under common control with that organization. **Control**
means ownership of substantially all the assets of an entity,
or the power to direct its management and policies by vote,
contract, or otherwise. Control can be direct or indirect.

**Your licenses** are all the licenses granted to you for the
software under these terms.

**Use** means anything you do with the software requiring one
of your licenses.

---

## Commercial Licensing

For commercial use of VividEmbed, please contact:

**Scott Kronick**
GitHub: [@Kronic90](https://github.com/Kronic90)

Commercial licenses are available on a case-by-case basis.
@@ -0,0 +1,489 @@
Metadata-Version: 2.4
Name: vividembed
Version: 1.0.0
Summary: Neuroscience-inspired memory embeddings for AI companions — emotion, vividness, mood-congruent retrieval
Author: Kronic90
License-Expression: PolyForm-Noncommercial-1.0.0
Project-URL: Homepage, https://github.com/Kronic90/VividEmbed
Project-URL: Repository, https://github.com/Kronic90/VividEmbed
Project-URL: Issues, https://github.com/Kronic90/VividEmbed/issues
Keywords: ai,memory,embeddings,llm,agent,emotion,neuroscience,vividness,pad-model,companion,reconsolidation,mood-congruent
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Requires-Dist: sentence-transformers>=2.2
Requires-Dist: torch>=2.0
Provides-Extra: viz
Requires-Dist: matplotlib>=3.5; extra == "viz"
Requires-Dist: scikit-learn>=1.0; extra == "viz"
Provides-Extra: cortex
Requires-Dist: matplotlib>=3.5; extra == "cortex"
Requires-Dist: scikit-learn>=1.0; extra == "cortex"
Provides-Extra: all
Requires-Dist: matplotlib>=3.5; extra == "all"
Requires-Dist: scikit-learn>=1.0; extra == "all"
Dynamic: license-file

<p align="center">
  <img src="VividEmbedLogo.png" alt="VividEmbed Logo" width="600"/>
</p>

<h1 align="center">VividEmbed</h1>

<p align="center">
  <b>Neuroscience-Inspired Memory Embeddings for AI Companions</b><br/>
  <i>Because memory should feel human — not just retrieve text.</i>
</p>

<p align="center">
  <img src="https://img.shields.io/badge/tests-190%2F190%20passing-brightgreen?style=flat-square" alt="Tests"/>
  <img src="https://img.shields.io/badge/python-3.10%2B-blue?style=flat-square" alt="Python"/>
  <img src="https://img.shields.io/badge/license-PolyForm%20NC%201.0-orange?style=flat-square" alt="License"/>
  <img src="https://img.shields.io/badge/params-22M-purple?style=flat-square" alt="Parameters"/>
</p>

---

## What is VividEmbed?

VividEmbed is a memory embedding system designed for AI companions that need to *remember like a person* — not just search like a database. Standard embedding models treat every piece of text the same: a flat vector, a cosine lookup, done. VividEmbed does something fundamentally different.

It encodes **emotion**, **importance**, **recency**, **vividness decay**, and **mood-congruent retrieval** directly into the embedding space. When your AI companion is sad, it naturally recalls sad memories first — just like you do. Memories that haven't been thought about in months gradually fade. Vivid, emotionally charged moments persist longer. And every time a memory is recalled, it subtly shifts — just like real human reconsolidation.

This isn't a wrapper around a vector database. It's a purpose-built embedding architecture grounded in cognitive neuroscience research.

---

## Key Results

VividEmbed outperforms leading memory systems across all three standard metrics on the MemGPT/Letta benchmark (EmbedBench, 500 evaluations across 5 seeds):

| Metric | Leading System | VividEmbed v3 | Delta |
|--------|---------------|---------------|-------|
| **Tool Accuracy** | 0.4300 | **0.4400** | +2.3% |
| **F1 Score** | 0.4945 | **0.5151** | +4.2% |
| **BLEU-1** | 0.6310 | **0.6660** | +5.5% |

All improvements achieved with a **22M parameter** fine-tuned model — no GPT-4, no cloud APIs, fully local.

### Visual Proof

The full test suite generates 17 diagnostic visualisations. Here are the most important:

<p align="center">
  <img src="visual_reports/35_architecture_summary.png" alt="Architecture Summary — feature inventory, test results, and pass rates" width="900"/><br/>
  <i>Architecture Summary — complete feature inventory with 190/190 tests passing across all subsystems.</i>
</p>

<p align="center">
  <img src="visual_reports/01_emotion_clustering.png" alt="Emotion Clustering — memories group by emotional tone" width="900"/><br/>
  <i>Emotion Clustering — memories naturally group by emotional tone in embedding space. Intra-group similarity (0.39) consistently exceeds inter-group similarity (0.13).</i>
</p>

<p align="center">
  <img src="visual_reports/29_reconsolidation.png" alt="Memory Reconsolidation — vectors drift less with each recall" width="900"/><br/>
  <i>Memory Reconsolidation — each recall produces diminishing vector drift, modelling how real memories consolidate over time.</i>
</p>

---

## What Makes VividEmbed Different

### The Problem with Standard Embeddings

Traditional embedding systems (sentence-transformers, OpenAI, Cohere) produce static vectors that capture *what* was said but nothing about *how it felt*, *when it happened*, or *how important it was*. Retrieval is a flat cosine lookup — the same results whether your AI is happy, sad, or angry.

This is fine for search engines. It's terrible for companions that need to feel like they actually know you.

### The VividEmbed Approach

VividEmbed extends a 384-dimensional base embedding with 5 additional dimensions that encode the psychological context of each memory:

```
┌─────────────────────────────────────────────────────────┐
│ 384-d Semantic Core (what was said)                     │
│   ├── Fine-tuned all-MiniLM-L6-v2 backbone              │
│   └── 58 special tokens for emotion/arc/transition cues │
├─────────────────────────────────────────────────────────┤
│ 3-d PAD Emotion Space (how it felt)                     │
│   ├── Pleasure  [-1, +1]                                │
│   ├── Arousal   [-1, +1]                                │
│   └── Dominance [-1, +1]                                │
├─────────────────────────────────────────────────────────┤
│ 1-d Importance (how much it mattered)                   │
│ 1-d Stability  (how resistant to forgetting)            │
├─────────────────────────────────────────────────────────┤
│ = 389-d VividVector                                     │
└─────────────────────────────────────────────────────────┘
```

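To make the layout concrete, here is a minimal sketch of assembling such a vector. The PAD coordinates, the importance scaling, and the `vivid_vector` helper are illustrative assumptions, not VividEmbed's actual internals:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Illustrative PAD coordinates; the real package maps 76 emotions.
PAD = {"peaceful": (0.55, -0.45, 0.35), "angry": (-0.60, 0.75, 0.30)}

model = SentenceTransformer("all-MiniLM-L6-v2")

def vivid_vector(text, emotion, importance, stability):
    """Concatenate the 384-d semantic core with PAD, importance and stability."""
    core = model.encode(text, normalize_embeddings=True)            # 384-d
    meta = np.array([*PAD[emotion], importance / 10.0, stability])  # 5-d
    return np.concatenate([core, meta])                             # 389-d VividVector

v = vivid_vector("We had a huge argument about finances", "angry",
                 importance=7, stability=0.8)
assert v.shape == (389,)
```
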
Retrieval then uses a multi-signal scoring function instead of raw cosine:

```
score = 0.45 × semantic_similarity
      + 0.20 × vividness_decay
      + 0.20 × mood_congruence
      + 0.15 × recency
```

This means the *same query* returns different results depending on the AI's current mood, the age of the memories, and how vivid they still are — matching how human memory actually works.

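A sketch of that blend, assuming unit-norm vectors and simple stand-ins for each signal. Only the four weights come from the formula above; the memory field names and the 30-day recency scale are assumptions:

```python
import numpy as np

def retrieval_score(mem, query_vec, mood_pad, now_days):
    """Weighted multi-signal score using the 0.45/0.20/0.20/0.15 mix above."""
    semantic = float(np.dot(mem.vector, query_vec))   # cosine, given unit vectors
    vividness = (mem.importance / 10.0) * np.exp(
        -(now_days - mem.created_day) / mem.stability)
    # Mood congruence: 1.0 at identical PAD coordinates, 0.0 at the
    # maximum possible PAD distance of 2*sqrt(3).
    mood = 1.0 - np.linalg.norm(
        np.asarray(mem.pad) - np.asarray(mood_pad)) / (2.0 * np.sqrt(3.0))
    recency = np.exp(-(now_days - mem.last_recall_day) / 30.0)  # assumed scale
    return 0.45 * semantic + 0.20 * vividness + 0.20 * mood + 0.15 * recency
```
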
---

## Neuroscience-Inspired Features

VividEmbed implements four mechanisms drawn directly from cognitive neuroscience research. These aren't metaphors — they're functional implementations that produce measurable effects on retrieval quality.

### 1. Memory Reconsolidation

**Based on:** Nader et al. (2000) — memories destabilise during recall and are re-stored with contextual influence.

Every time a memory is recalled, its vector is subtly blended toward the retrieval context:

```
v' = α·v + (1−α)·q, then rescale to preserve ‖v‖
```

- `α` starts at **0.98** (2% drift per recall) and increases toward **0.995** as recall count grows
- Early memories are more plastic; frequently recalled memories consolidate and resist drift
- A similarity gate (`cos_sim > 0.5`) prevents unrelated queries from corrupting memories

**Effect:** Memories naturally evolve with the conversation. A memory about "boxing at the gym" gradually incorporates the context of later fitness discussions, just like real memories do.

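A runnable sketch of that update rule, using the constants quoted above. The exact schedule by which α climbs toward 0.995 is an assumption:

```python
import numpy as np

def reconsolidate(v, q, recall_count):
    """v' = α·v + (1−α)·q, rescaled to preserve ‖v‖, per the formula above."""
    cos_sim = float(np.dot(v, q) / (np.linalg.norm(v) * np.linalg.norm(q)))
    if cos_sim <= 0.5:          # similarity gate: unrelated query, no drift
        return v
    alpha = min(0.98 + 0.001 * recall_count, 0.995)  # assumed consolidation schedule
    blended = alpha * v + (1.0 - alpha) * q
    return blended * (np.linalg.norm(v) / np.linalg.norm(blended))
```
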
### 2. Emotional Transitions

**Based on:** Affect-as-information theory — emotional *change* is a strong contextual cue.

Each memory tracks the emotional state that preceded it (`prev_emotion`). When the AI transitions from calm to anxious, that transition becomes part of the memory encoding via the `[FROM:calm]` special token.

**Effect:** The model learns that memories formed during emotional shifts are contextually distinct from memories formed in stable emotional states, improving retrieval precision for emotionally charged conversations.

### 3. Hippocampal Pattern Separation

**Based on:** Hippocampal orthogonalisation — the brain actively de-correlates similar memories to reduce interference.

When a new memory is stored with cosine similarity > **0.92** to an existing memory (but with different content), a micro-repulsion nudge of magnitude **ε = 0.015** pushes the existing vector away:

```
if cos_sim(new, existing) > 0.92 and content differs:
    nudge = ε × normalised_difference
    existing.vector += nudge  (then rescale)
```

**Effect:** Prevents semantic collapse where "I went to the coffee shop on Monday" and "I went to the coffee shop on Tuesday" merge into indistinguishable vectors. Each stays retrievable independently.

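The pseudocode above, written out as a runnable numpy sketch. Taking `normalised_difference` to point from the new vector toward the existing one is an assumption:

```python
import numpy as np

THRESHOLD = 0.92   # cosine-similarity trigger from the text
EPSILON = 0.015    # micro-repulsion magnitude from the text

def pattern_separate(existing, new, content_differs):
    """Nudge a near-duplicate existing vector away from the new one, keeping its norm."""
    cos_sim = float(np.dot(existing, new) /
                    (np.linalg.norm(existing) * np.linalg.norm(new)))
    if cos_sim <= THRESHOLD or not content_differs:
        return existing
    diff = existing - new                       # assumed repulsion direction
    nudged = existing + EPSILON * diff / np.linalg.norm(diff)
    return nudged * (np.linalg.norm(existing) / np.linalg.norm(nudged))
```
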
### 4. Narrative Arcs

**Based on:** Story-grammar theory — humans organise episodic memories along narrative structures.

Each memory is tagged with a position in a five-act narrative arc (a toy version of the inference follows the table):

| Position | Description | Example Keywords |
|----------|-------------|------------------|
| **Setup** | Introduction, new beginnings | "started", "first time", "day one" |
| **Rising** | Building tension, progress | "getting better", "improving" |
| **Climax** | Peak moments, turning points | "finally", "breakthrough", "changed everything" |
| **Falling** | Aftermath, settling | "after that", "coming down" |
| **Resolution** | Reflection, lessons learned | "looking back", "at peace", "moved on" |

Arc position is inferred automatically from keywords and emotional arousal, or can be set explicitly. The fine-tuned model encodes this as an `[ARC:climax]` special token in the embedding.

**Effect:** When the AI is asked about "turning points" or "how things resolved," it can retrieve memories by narrative position — not just keyword match.

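A toy version of the keyword half of that inference, using the example keywords from the table. The real classifier also weighs emotional arousal, and the fallback default here is an assumption:

```python
ARC_KEYWORDS = {  # example keywords from the table above
    "setup":      ("started", "first time", "day one"),
    "rising":     ("getting better", "improving"),
    "climax":     ("finally", "breakthrough", "changed everything"),
    "falling":    ("after that", "coming down"),
    "resolution": ("looking back", "at peace", "moved on"),
}

def infer_arc(text: str, default: str = "setup") -> str:
    """Return the first arc position whose keywords appear in the text."""
    lowered = text.lower()
    for arc, keywords in ARC_KEYWORDS.items():
        if any(k in lowered for k in keywords):
            return arc
    return default

infer_arc("Looking back, it made me stronger")  # -> "resolution"
```
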
---

## Architecture

VividEmbed operates across three tiers:

```
┌──────────────────────────────────────────────────────────────┐
│  Tier 3: VividCortex (LLM-Powered Intelligence)              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │  Query Decomposition — breaks vague queries into       │  │
│  │    1-3 precise sub-queries for better retrieval        │  │
│  │  Memory Extraction — auto-extracts facts from          │  │
│  │    conversation with emotion/importance tagging        │  │
│  │  Agentic Ops — UPDATE, PROMOTE, DEMOTE, FORGET,        │  │
│  │    CONSOLIDATE operations on the memory index          │  │
│  │  Reflection — surfaces patterns, contradictions,       │  │
│  │    and insights across the memory store                │  │
│  └────────────────────────────────────────────────────────┘  │
├──────────────────────────────────────────────────────────────┤
│  Tier 2: VividEmbed (Embedding Layer)                        │
│  ┌────────────────────────────────────────────────────────┐  │
│  │  389-d VividVectors with PAD emotion encoding          │  │
│  │  Multi-signal scoring (semantic + vividness +          │  │
│  │    mood + recency)                                     │  │
│  │  Reconsolidation, pattern separation, narrative arcs   │  │
│  │  76 emotions mapped to 3D PAD space                    │  │
│  └────────────────────────────────────────────────────────┘  │
├──────────────────────────────────────────────────────────────┤
│  Tier 1: Core Memory                                         │
│  ┌────────────────────────────────────────────────────────┐  │
│  │  Always-in-context blocks: persona, user, system       │  │
│  │  Working memory: rolling conversation window (20 turns)│  │
│  │  Persistent scratch pad for session-level state        │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
```

### The PAD Emotion Space

VividEmbed maps **76 emotions** to Pleasure-Arousal-Dominance coordinates. This isn't a sentiment label — it's a continuous 3D space where emotions have geometry:

- **Pleasure** (P): negative ↔ positive feeling
- **Arousal** (A): calm ↔ excited activation
- **Dominance** (D): submissive ↔ in-control sense of agency

Examples:

| Emotion | P | A | D |
|---------|---:|---:|---:|
| Happy | 0.80 | 0.40 | 0.50 |
| Anxious | −0.50 | 0.70 | −0.40 |
| Calm | 0.50 | −0.50 | 0.30 |
| Nostalgic | 0.30 | −0.20 | 0.10 |
| Furious | −0.80 | 0.80 | 0.40 |

This means "anxious" and "excited" are close in arousal but opposite in pleasure — and the embedding captures that distinction natively.

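That claim is easy to check numerically. In this sketch the first three entries come from the table; the `excited` coordinates are an assumed illustration, since that emotion isn't in the excerpt:

```python
import numpy as np

PAD = {
    "happy":   ( 0.80,  0.40,  0.50),   # from the table
    "anxious": (-0.50,  0.70, -0.40),   # from the table
    "calm":    ( 0.50, -0.50,  0.30),   # from the table
    "excited": ( 0.70,  0.75,  0.40),   # assumed coordinates
}

def pad_distance(a: str, b: str) -> float:
    """Euclidean distance in PAD space (small means emotionally similar)."""
    return float(np.linalg.norm(np.array(PAD[a]) - np.array(PAD[b])))

# Arousal is nearly identical (0.70 vs 0.75), yet opposite pleasure keeps them apart:
pad_distance("anxious", "excited")   # ~1.44
pad_distance("happy", "excited")     # ~0.38
```
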
### Vividness Decay

Memories don't last forever. VividEmbed models forgetting with an exponential decay:

```
vividness = importance × exp(−age_days / stability)
```

- High-importance (8-10) memories with high stability decay slowly over months
- Low-importance (1-3) memories with low stability fade within days
- Mood congruence modulates decay: negative memories in negative moods get a **capped** boost (reappraisal model) that itself decays over time

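Read literally, that formula is a one-liner; treating `stability` as a timescale in days is an assumption here:

```python
import math

def vividness(importance: float, age_days: float, stability: float) -> float:
    """vividness = importance * exp(-age_days / stability), per the formula above."""
    return importance * math.exp(-age_days / stability)

# One week later: an important, stable memory vs. a trivial, fragile one.
vividness(9, age_days=7, stability=90)   # ~8.33 (barely faded)
vividness(2, age_days=7, stability=3)    # ~0.19 (nearly gone)
```
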
---

## Fine-Tuned Model

VividEmbed includes an optional purpose-built fine-tuned model (`all-MiniLLM-VividTuned`) that learns emotion-aware embeddings natively:

| Property | Value |
|----------|-------|
| Base model | all-MiniLM-L6-v2 |
| Parameters | 22M |
| Output dimension | 384-d |
| Special tokens | 58 (emotion, mood, arc, transition prefixes) |
| Training objectives | 10 |
| Training examples | ~35,000 |
| Final loss | 0.0208 |

The fine-tuned model encodes emotion, importance, arc position, and emotional transitions directly as token prefixes:

```
[EMO:happy] [IMP:8] [ARC:climax] [FROM:anxious] I finally got the promotion!
```

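Composing that input format is mechanical; this helper is purely illustrative and not part of the package's API:

```python
def prefixed(text, emotion, importance, arc=None, prev_emotion=None):
    """Build the token-prefixed encoding string in the format shown above."""
    tags = [f"[EMO:{emotion}]", f"[IMP:{importance}]"]
    if arc:
        tags.append(f"[ARC:{arc}]")
    if prev_emotion:
        tags.append(f"[FROM:{prev_emotion}]")
    return " ".join(tags + [text])

prefixed("I finally got the promotion!", "happy", 8,
         arc="climax", prev_emotion="anxious")
# -> '[EMO:happy] [IMP:8] [ARC:climax] [FROM:anxious] I finally got the promotion!'
```
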
This means the 384-d output already captures what the vanilla model needs 5 extra dimensions to represent — and it does so in the learned embedding space rather than as concatenated features.

When the fine-tuned model is detected, VividEmbed automatically:

- Uses 384-d vectors (no PAD/meta concatenation needed)
- Encodes importance via vector magnitude (not a separate dimension)
- Enables auto-reconsolidation during `query()` calls
- Uses a magnitude-aware scoring function

---

## Usage

### Basic Usage

```python
from VividEmbed import VividEmbed

# Initialise (uses all-MiniLM-L6-v2 by default)
ve = VividEmbed()

# Store memories with emotional context
ve.add("Scott took me to the beach at sunset", emotion="peaceful", importance=8)
ve.add("We had a huge argument about finances", emotion="angry", importance=7)
ve.add("I learned to make pasta from scratch", emotion="proud", importance=6)

# Retrieve — mood affects what comes back
results = ve.query("tell me about a good day", mood="happy", top_k=3)
for r in results:
    print(f"  [{r.emotion}] {r.content} (score: {r.score:.3f})")
```

### With the Fine-Tuned Model

```python
ve = VividEmbed(model_name="all-MiniLLM-VividTuned/best")

# Emotional transitions are tracked automatically
ve.add("I was feeling calm this morning", emotion="calm", importance=5)
ve.add("Then I got terrible news", emotion="anxious", importance=9)
# ^ prev_emotion="calm" is set automatically

# Narrative arcs are inferred or set explicitly
ve.add("Looking back, it made me stronger", emotion="hopeful", importance=7,
       arc_position="resolution")

# Reconsolidation happens automatically during query
results = ve.query("how did I handle the bad news", mood="reflective", top_k=5)
```

### Mood-Congruent Retrieval

```python
# Same query, different moods → different results
happy_results = ve.query("tell me about work", mood="happy", top_k=3)
sad_results = ve.query("tell me about work", mood="sad", top_k=3)

# happy_results favours positive work memories
# sad_results favours stressful/negative work memories
```

### Contradiction Detection

```python
contradictions = ve.find_contradictions(top_k=5)
for c in contradictions:
    print(f"  '{c['a'].content[:40]}...' vs '{c['b'].content[:40]}...'")
    print(f"  Valence difference: {c['valence_diff']:.2f}")
```

### Persistence

```python
# Save to disk
ve.save("my_memories.json")

# Load later — vectors are stored in binary for efficiency
ve2 = VividEmbed.load("my_memories.json")
```

### VividCortex (Tier 3 — LLM Integration)

```python
from VividEmbed import VividCortex

cortex = VividCortex(llm_fn=my_llm_function)

# Process a conversation — extracts facts automatically
cortex.ingest_conversation([
    {"role": "user", "content": "I've been boxing three times a week"},
    {"role": "assistant", "content": "That's great! How's it going?"},
    {"role": "user", "content": "I love it, really helps with stress"}
])

# Smart retrieval with query decomposition
results = cortex.query("what does the user do for exercise and stress relief?")

# Generate a context block for your LLM
context = cortex.build_context("Tell me about your hobbies")
```

---

## Test Suite

VividEmbed ships with a comprehensive test suite (**190 assertions** across **35 tests**) that validates every feature with quantitative checks and generates visual reports.

### Test Categories

| Category | Tests | Assertions | What's Verified |
|----------|-------|------------|-----------------|
| **Core Embedding** | 1–8, 10–18 | 108 | Emotion clustering, semantic grouping, vividness decay, mood congruence, importance weighting, contradiction detection, PAD space, vector properties, filtering, persistence, batch ops, edge cases |
| **VividCortex** | 20–28 | 45 | Core memory blocks, query decomposition, conversation extraction, context building, agentic ops, reflection, JSON parsing |
| **Novel Features** | 29–33 | 30 | Reconsolidation drift, emotional transitions, pattern separation, narrative arcs, entity grounding |
| **Model & Summary** | 34–35 | 7 | Fine-tuned vs vanilla comparison, architecture summary with full feature inventory |

Run the test suite:

```bash
python test_vividembed.py
```

Output: 17 PNG visualisations + `test_results.json` in the `visual_reports/` directory.

---

## Project Structure

```
VividEmbed/
├── VividEmbed.py              # Core module (~2,500 lines)
├── VividEmbedLogo.png         # Project logo
├── README.md                  # This file
├── build_training_data.py     # Generates ~35,000 training examples
├── train_vivid_model.py       # Multi-objective fine-tuning script
├── tests/                     # Test suite
├── visual_reports/            # Generated test visualisations (17 PNGs)
│   └── test_results.json      # Machine-readable test results
├── benchmark_results/         # EmbedBench evaluation data
└── benchmarks/                # Benchmark scripts
```

---

## Requirements

- Python 3.10+
- `sentence-transformers`
- `numpy`
- `torch`
- `matplotlib` (for visual test reports)
- `scikit-learn` (for PCA/t-SNE in visualisations)

Optional:

- A local LLM function for VividCortex (Tier 3) features
- The fine-tuned `all-MiniLLM-VividTuned` model for enhanced emotion-aware embeddings

---

## How It Compares

| Feature | Leading Systems | VividEmbed |
|---------|----------------|------------|
| Embedding type | Static semantic vectors | 389-d emotion + semantic + meta |
| Emotion awareness | None (post-hoc labels at best) | Native PAD space (76 emotions) |
| Mood-congruent retrieval | No | Yes — same query, different mood → different results |
| Memory decay | TTL or manual expiry | Exponential vividness decay modulated by importance |
| Reconsolidation | No | Yes — vectors evolve with each recall |
| Pattern separation | No | Yes — near-duplicates are actively de-correlated |
| Narrative structure | No | Yes — 5-act arc position encoding |
| Emotional transitions | No | Yes — tracks emotional state changes |
| Contradiction detection | Requires separate LLM call | Built-in, uses PAD valence geometry |
| Model size | 100M–1B+ or cloud API | **22M parameters, fully local** |

---

## Citation

If you use VividEmbed in your research or projects:

```
@software{vividembed2026,
  title  = {VividEmbed: Neuroscience-Inspired Memory Embeddings for AI Companions},
  author = {Kronic90},
  year   = {2026},
  url    = {https://github.com/Kronic90/VividnessMem-Ai-Roommates}
}
```

---

<p align="center">
  <i>Built for companions that remember — not just retrieve.</i>
</p>