vividembed 1.0.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- vividembed-1.0.0/LICENSE +142 -0
- vividembed-1.0.0/PKG-INFO +489 -0
- vividembed-1.0.0/README.md +450 -0
- vividembed-1.0.0/pyproject.toml +51 -0
- vividembed-1.0.0/setup.cfg +4 -0
- vividembed-1.0.0/src/vividembed/__init__.py +43 -0
- vividembed-1.0.0/src/vividembed/vividembed.py +2547 -0
- vividembed-1.0.0/src/vividembed.egg-info/PKG-INFO +489 -0
- vividembed-1.0.0/src/vividembed.egg-info/SOURCES.txt +10 -0
- vividembed-1.0.0/src/vividembed.egg-info/dependency_links.txt +1 -0
- vividembed-1.0.0/src/vividembed.egg-info/requires.txt +15 -0
- vividembed-1.0.0/src/vividembed.egg-info/top_level.txt +1 -0
vividembed-1.0.0/LICENSE
ADDED
# PolyForm Noncommercial License 1.0.0

<https://polyformproject.org/licenses/noncommercial/1.0.0>

## Acceptance

In order to get any license under these terms, you must agree to them as both strict obligations and conditions to all your licenses.

## Copyright License

The licensor grants you a copyright license for the software to do everything you might do with the software that would otherwise infringe the licensor's copyright in it for any permitted purpose. However, you may only distribute the software according to [Distribution License](#distribution-license) and make changes or new works based on the software according to [Changes and New Works License](#changes-and-new-works-license).

## Distribution License

The licensor grants you an additional copyright license to distribute copies of the software. Your license to distribute covers distributing the software with changes and new works permitted by [Changes and New Works License](#changes-and-new-works-license).

## Notices

You must ensure that anyone who gets a copy of any part of the software from you also gets a copy of these terms or the URL for them above, as well as copies of any plain-text lines beginning with `Required Notice:` that the licensor provided with the software. For example:

> Required Notice: Copyright Scott Kronick (2026)

## Changes and New Works License

The licensor grants you an additional copyright license to make changes and new works based on the software for any permitted purpose.

## Patent License

The licensor grants you a patent license for the software that covers patent claims the licensor can license, or becomes able to license, that you would infringe by using the software.

## Noncommercial Purposes

Any noncommercial purpose is a permitted purpose.

## Personal Uses

Personal use for research, experiment, and testing for the benefit of public knowledge, personal study, private entertainment, hobby projects, amateur pursuits, or religious observance, without any anticipated commercial application, is use for a permitted purpose.

## Noncommercial Organizations

Use by any charitable organization, educational institution, public research organization, public safety or health organization, environmental protection organization, or government institution is use for a permitted purpose regardless of the source of funding or obligations resulting from the funding.

## Fair Use

You may have "fair use" rights for the software under the law. These terms do not limit them.

## No Other Rights

These terms do not allow you to sublicense or transfer any of your licenses to anyone else, or prevent the licensor from granting licenses to anyone else. These terms do not imply any other licenses.

## Patent Defense

If you make any written claim that the software infringes or contributes to infringement of any patent, your patent license for the software granted under these terms ends immediately. If your company makes such a claim, your patent license ends immediately for work on behalf of your company.

## Violations

The first time you are notified in writing that you have violated any of these terms, or done anything with the software not covered by your licenses, your licenses can nonetheless continue if you come into full compliance with these terms, and take practical steps to correct past violations, within 32 days of receiving notice. Otherwise, all your licenses end immediately.

## No Liability

***As far as the law allows, the software comes as is, without any warranty or condition, and the licensor will not be liable to anyone for any damages related to this software or this license, under any kind of legal claim.***

## Definitions

The **licensor** is the individual or entity offering these terms, and the **software** is the software the licensor makes available under these terms.

**You** refers to the individual or entity agreeing to these terms.

**Your company** is any legal entity, sole proprietorship, or other kind of organization that you work for, plus all organizations that have control over, are under the control of, or are under common control with that organization. **Control** means ownership of substantially all the assets of an entity, or the power to direct its management and policies by vote, contract, or otherwise. Control can be direct or indirect.

**Your licenses** are all the licenses granted to you for the software under these terms.

**Use** means anything you do with the software requiring one of your licenses.

---

## Commercial Licensing

For commercial use of VividEmbed, please contact:

**Scott Kronick**
GitHub: [@Kronic90](https://github.com/Kronic90)

Commercial licenses are available on a case-by-case basis.
vividembed-1.0.0/PKG-INFO
ADDED

Metadata-Version: 2.4
Name: vividembed
Version: 1.0.0
Summary: Neuroscience-inspired memory embeddings for AI companions — emotion, vividness, mood-congruent retrieval
Author: Kronic90
License-Expression: PolyForm-Noncommercial-1.0.0
Project-URL: Homepage, https://github.com/Kronic90/VividEmbed
Project-URL: Repository, https://github.com/Kronic90/VividEmbed
Project-URL: Issues, https://github.com/Kronic90/VividEmbed/issues
Keywords: ai,memory,embeddings,llm,agent,emotion,neuroscience,vividness,pad-model,companion,reconsolidation,mood-congruent
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Requires-Dist: sentence-transformers>=2.2
Requires-Dist: torch>=2.0
Provides-Extra: viz
Requires-Dist: matplotlib>=3.5; extra == "viz"
Requires-Dist: scikit-learn>=1.0; extra == "viz"
Provides-Extra: cortex
Requires-Dist: matplotlib>=3.5; extra == "cortex"
Requires-Dist: scikit-learn>=1.0; extra == "cortex"
Provides-Extra: all
Requires-Dist: matplotlib>=3.5; extra == "all"
Requires-Dist: scikit-learn>=1.0; extra == "all"
Dynamic: license-file

<p align="center">
  <img src="VividEmbedLogo.png" alt="VividEmbed Logo" width="600"/>
</p>

<h1 align="center">VividEmbed</h1>

<p align="center">
  <b>Neuroscience-Inspired Memory Embeddings for AI Companions</b><br/>
  <i>Because memory should feel human — not just retrieve text.</i>
</p>

<p align="center">
  <img src="https://img.shields.io/badge/tests-190%2F190%20passing-brightgreen?style=flat-square" alt="Tests"/>
  <img src="https://img.shields.io/badge/python-3.10%2B-blue?style=flat-square" alt="Python"/>
  <img src="https://img.shields.io/badge/license-PolyForm%20NC%201.0-orange?style=flat-square" alt="License"/>
  <img src="https://img.shields.io/badge/params-22M-purple?style=flat-square" alt="Parameters"/>
</p>

---

## What is VividEmbed?

VividEmbed is a memory embedding system designed for AI companions that need to *remember like a person* — not just search like a database. Standard embedding models treat every piece of text the same: a flat vector, a cosine lookup, done. VividEmbed does something fundamentally different.

It encodes **emotion**, **importance**, **recency**, **vividness decay**, and **mood-congruent retrieval** directly into the embedding space. When your AI companion is sad, it naturally recalls sad memories first — just like you do. Memories that haven't been thought about in months gradually fade. Vivid, emotionally charged moments persist longer. And every time a memory is recalled, it subtly shifts — just like real human reconsolidation.

This isn't a wrapper around a vector database. It's a purpose-built embedding architecture grounded in cognitive neuroscience research.

---

## Key Results

VividEmbed outperforms leading memory systems across all three standard metrics on the MemGPT/Letta benchmark (EmbedBench, 500 evaluations across 5 seeds):

| Metric | Leading System | VividEmbed v3 | Delta |
|--------|---------------|---------------|-------|
| **Tool Accuracy** | 0.4300 | **0.4400** | +2.3% |
| **F1 Score** | 0.4945 | **0.5151** | +4.2% |
| **BLEU-1** | 0.6310 | **0.6660** | +5.5% |

All improvements achieved with a **22M parameter** fine-tuned model — no GPT-4, no cloud APIs, fully local.

### Visual Proof

The full test suite generates 17 diagnostic visualisations. Here are the most important:

<p align="center">
  <img src="visual_reports/35_architecture_summary.png" alt="Architecture Summary — feature inventory, test results, and pass rates" width="900"/><br/>
  <i>Architecture Summary — complete feature inventory with 190/190 tests passing across all subsystems.</i>
</p>

<p align="center">
  <img src="visual_reports/01_emotion_clustering.png" alt="Emotion Clustering — memories group by emotional tone" width="900"/><br/>
  <i>Emotion Clustering — memories naturally group by emotional tone in embedding space. Intra-group similarity (0.39) consistently exceeds inter-group similarity (0.13).</i>
</p>

<p align="center">
  <img src="visual_reports/29_reconsolidation.png" alt="Memory Reconsolidation — vectors drift less with each recall" width="900"/><br/>
  <i>Memory Reconsolidation — each recall produces diminishing vector drift, modelling how real memories consolidate over time.</i>
</p>

---

## What Makes VividEmbed Different

### The Problem with Standard Embeddings

Traditional embedding systems (sentence-transformers, OpenAI, Cohere) produce static vectors that capture *what* was said but nothing about *how it felt*, *when it happened*, or *how important it was*. Retrieval is a flat cosine lookup — the same results whether your AI is happy, sad, or angry.

This is fine for search engines. It's terrible for companions that need to feel like they actually know you.

### The VividEmbed Approach

VividEmbed extends a 384-dimensional base embedding with 5 additional dimensions that encode the psychological context of each memory:

```
┌─────────────────────────────────────────────────────────┐
│ 384-d Semantic Core (what was said)                     │
│   ├── Fine-tuned all-MiniLM-L6-v2 backbone              │
│   └── 58 special tokens for emotion/arc/transition cues │
├─────────────────────────────────────────────────────────┤
│ 3-d PAD Emotion Space (how it felt)                     │
│   ├── Pleasure  [-1, +1]                                │
│   ├── Arousal   [-1, +1]                                │
│   └── Dominance [-1, +1]                                │
├─────────────────────────────────────────────────────────┤
│ 1-d Importance (how much it mattered)                   │
│ 1-d Stability  (how resistant to forgetting)            │
├─────────────────────────────────────────────────────────┤
│ = 389-d VividVector                                     │
└─────────────────────────────────────────────────────────┘
```

Retrieval then uses a multi-signal scoring function instead of raw cosine:

```
score = 0.45 × semantic_similarity
      + 0.20 × vividness_decay
      + 0.20 × mood_congruence
      + 0.15 × recency
```

This means the *same query* returns different results depending on the AI's current mood, the age of the memories, and how vivid they still are — matching how human memory actually works.
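For illustration, the weighting can be written out as a plain function. This is a sketch, not the library's internal API; the parameter names and the assumption that all four signals arrive pre-normalised to [0, 1] are ours:

```python
# Minimal sketch of the multi-signal score. All four inputs are
# assumed to be pre-normalised into [0, 1] by the caller.
def vivid_score(semantic_similarity: float,
                vividness_decay: float,
                mood_congruence: float,
                recency: float) -> float:
    return (0.45 * semantic_similarity
            + 0.20 * vividness_decay
            + 0.20 * mood_congruence
            + 0.15 * recency)

# Identical semantic match, but the vivid, mood-congruent, recent
# memory outranks the faded one.
fresh = vivid_score(0.8, 0.9, 0.9, 0.9)   # 0.855
faded = vivid_score(0.8, 0.1, 0.2, 0.1)   # 0.435
```

Because the weights sum to 1.0, the combined score stays in [0, 1] whenever the inputs do.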
---

## Neuroscience-Inspired Features

VividEmbed implements four mechanisms drawn directly from cognitive neuroscience research. These aren't metaphors — they're functional implementations that produce measurable effects on retrieval quality.

### 1. Memory Reconsolidation

**Based on:** Nader et al. (2000) — memories destabilise during recall and are re-stored with contextual influence.

Every time a memory is recalled, its vector is subtly blended toward the retrieval context:

```
v' = α·v + (1−α)·q, then rescale to preserve ‖v‖
```

- `α` starts at **0.98** (2% drift per recall) and increases toward **0.995** as recall count grows
- Early memories are more plastic; frequently-recalled memories consolidate and resist drift
- A similarity gate (`cos_sim > 0.5`) prevents unrelated queries from corrupting memories

**Effect:** Memories naturally evolve with the conversation. A memory about "boxing at the gym" gradually incorporates the context of later fitness discussions, just like real memories do.
### 2. Emotional Transitions

**Based on:** Affect-as-information theory — emotional *change* is a strong contextual cue.

Each memory tracks the emotional state that preceded it (`prev_emotion`). When the AI transitions from calm to anxious, that transition becomes part of the memory encoding via the `[FROM:calm]` special token.

**Effect:** The model learns that memories formed during emotional shifts are contextually distinct from memories formed in stable emotional states, improving retrieval precision for emotionally charged conversations.
### 3. Hippocampal Pattern Separation

**Based on:** Hippocampal orthogonalisation — the brain actively de-correlates similar memories to reduce interference.

When a new memory is stored with cosine similarity > **0.92** to an existing memory (but with different content), a micro-repulsion nudge of magnitude **ε = 0.015** pushes the existing vector away:

```
if cos_sim(new, existing) > 0.92 and content differs:
    nudge = ε × normalised_difference
    existing.vector += nudge  (then rescale)
```

**Effect:** Prevents semantic collapse where "I went to the coffee shop on Monday" and "I went to the coffee shop on Tuesday" merge into indistinguishable vectors. Each stays retrievable independently.
### 4. Narrative Arcs

**Based on:** Story-grammar theory — humans organise episodic memories along narrative structures.

Each memory is tagged with a position in a five-act narrative arc:

| Position | Description | Example Keywords |
|----------|-------------|------------------|
| **Setup** | Introduction, new beginnings | "started", "first time", "day one" |
| **Rising** | Building tension, progress | "getting better", "improving" |
| **Climax** | Peak moments, turning points | "finally", "breakthrough", "changed everything" |
| **Falling** | Aftermath, settling | "after that", "coming down" |
| **Resolution** | Reflection, lessons learned | "looking back", "at peace", "moved on" |

Arc position is inferred automatically from keywords and emotional arousal, or can be set explicitly. The fine-tuned model encodes this as an `[ARC:climax]` special token in the embedding.

**Effect:** When the AI is asked about "turning points" or "how things resolved," it can retrieve memories by narrative position — not just keyword match.
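A minimal sketch of keyword-based arc inference, using the cue lists from the table above (the first-match ordering, the fallback default, and the omission of the arousal signal are simplifications of ours, not the library's actual heuristic):

```python
# Keyword cues copied from the arc table; matching is substring-based.
ARC_KEYWORDS = {
    "setup":      ["started", "first time", "day one"],
    "rising":     ["getting better", "improving"],
    "climax":     ["finally", "breakthrough", "changed everything"],
    "falling":    ["after that", "coming down"],
    "resolution": ["looking back", "at peace", "moved on"],
}

def infer_arc(text: str, default: str = "rising") -> str:
    """Return the first arc whose cue appears in the text (sketch:
    the real inference also weighs emotional arousal)."""
    lowered = text.lower()
    for arc, cues in ARC_KEYWORDS.items():
        if any(cue in lowered for cue in cues):
            return arc
    return default

infer_arc("I finally got the promotion!")       # → "climax"
infer_arc("Looking back, it made me stronger")  # → "resolution"
```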
---

## Architecture

VividEmbed operates across three tiers:

```
┌──────────────────────────────────────────────────────────────┐
│ Tier 3: VividCortex (LLM-Powered Intelligence)               │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Query Decomposition — breaks vague queries into        │  │
│  │   1-3 precise sub-queries for better retrieval         │  │
│  │ Memory Extraction — auto-extracts facts from           │  │
│  │   conversation with emotion/importance tagging         │  │
│  │ Agentic Ops — UPDATE, PROMOTE, DEMOTE, FORGET,         │  │
│  │   CONSOLIDATE operations on the memory index           │  │
│  │ Reflection — surfaces patterns, contradictions,        │  │
│  │   and insights across the memory store                 │  │
│  └────────────────────────────────────────────────────────┘  │
├──────────────────────────────────────────────────────────────┤
│ Tier 2: VividEmbed (Embedding Layer)                         │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ 389-d VividVectors with PAD emotion encoding           │  │
│  │ Multi-signal scoring (semantic + vividness +           │  │
│  │   mood + recency)                                      │  │
│  │ Reconsolidation, pattern separation, narrative arcs    │  │
│  │ 76 emotions mapped to 3D PAD space                     │  │
│  └────────────────────────────────────────────────────────┘  │
├──────────────────────────────────────────────────────────────┤
│ Tier 1: Core Memory                                          │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Always-in-context blocks: persona, user, system        │  │
│  │ Working memory: rolling conversation window (20 turns) │  │
│  │ Persistent scratch pad for session-level state         │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
```

### The PAD Emotion Space

VividEmbed maps **76 emotions** to Pleasure-Arousal-Dominance coordinates. This isn't a sentiment label — it's a continuous 3D space where emotions have geometry:

- **Pleasure** (P): negative ↔ positive feeling
- **Arousal** (A): calm ↔ excited activation
- **Dominance** (D): submissive ↔ in-control sense of agency

Examples:

| Emotion | P | A | D |
|---------|---:|---:|---:|
| Happy | 0.80 | 0.40 | 0.50 |
| Anxious | −0.50 | 0.70 | −0.40 |
| Calm | 0.50 | −0.50 | 0.30 |
| Nostalgic | 0.30 | −0.20 | 0.10 |
| Furious | −0.80 | 0.80 | 0.40 |

This means "anxious" and "excited" are close in arousal but opposite in pleasure — and the embedding captures that distinction natively.
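One way to see the geometry in action is to turn PAD distance into a congruence score. The coordinates below are copied from the table; the `1 / (1 + distance)` mapping is an illustrative assumption, not VividEmbed's actual formula:

```python
import math

# PAD coordinates from the table above
PAD = {
    "happy":   ( 0.80,  0.40,  0.50),
    "anxious": (-0.50,  0.70, -0.40),
    "calm":    ( 0.50, -0.50,  0.30),
    "furious": (-0.80,  0.80,  0.40),
}

def mood_congruence(a: str, b: str) -> float:
    """Map Euclidean PAD distance to a (0, 1] congruence score.
    Sketch only: the mapping is hypothetical."""
    d = math.dist(PAD[a], PAD[b])
    return 1.0 / (1.0 + d)

# "anxious" sits closer to "furious" than to "calm" in PAD space,
# so an anxious mood is more congruent with furious memories.
```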
### Vividness Decay

Memories don't last forever. VividEmbed models forgetting with an exponential decay:

```
vividness = importance × exp(−age_days / stability)
```

- High-importance (8-10) memories with high stability decay slowly over months
- Low-importance (1-3) memories with low stability fade within days
- Mood congruence modulates decay: negative memories in negative moods get a **capped** boost (reappraisal model) that itself decays over time
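The decay curve is easy to sanity-check directly. This is a transcription of the formula above; the example importance/stability numbers are hypothetical:

```python
import math

def vividness(importance: float, age_days: float, stability: float) -> float:
    """importance * exp(-age_days / stability), as given above."""
    return importance * math.exp(-age_days / stability)

# After one week: a trivial, unstable memory has all but faded,
# while an important, stable one has barely moved.
minor = vividness(importance=2, age_days=7, stability=3)    # ≈ 0.19
major = vividness(importance=9, age_days=7, stability=60)   # ≈ 8.01
```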
---

## Fine-Tuned Model

VividEmbed includes an optional purpose-built fine-tuned model (`all-MiniLLM-VividTuned`) that learns emotion-aware embeddings natively:

| Property | Value |
|----------|-------|
| Base model | all-MiniLM-L6-v2 |
| Parameters | 22M |
| Output dimension | 384-d |
| Special tokens | 58 (emotion, mood, arc, transition prefixes) |
| Training objectives | 10 |
| Training examples | ~35,000 |
| Final loss | 0.0208 |

The fine-tuned model encodes emotion, importance, arc position, and emotional transitions directly as token prefixes:

```
[EMO:happy] [IMP:8] [ARC:climax] [FROM:anxious] I finally got the promotion!
```

This means the 384-d output already captures what the vanilla model needs 5 extra dimensions to represent — and it does so in the learned embedding space rather than as concatenated features.

When the fine-tuned model is detected, VividEmbed automatically:

- Uses 384-d vectors (no PAD/meta concatenation needed)
- Encodes importance via vector magnitude (not a separate dimension)
- Enables auto-reconsolidation during `query()` calls
- Uses a magnitude-aware scoring function
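The prefix format can be assembled with a small helper. This is a sketch: `vivid_prefix` is a hypothetical name, and the real encoder maps these prefixes to learned special tokens rather than treating them as plain text:

```python
def vivid_prefix(text, emotion, importance, arc=None, prev_emotion=None):
    """Build the [EMO:…] [IMP:…] [ARC:…] [FROM:…] prefix string
    shown above. Arc and transition tags are optional."""
    parts = [f"[EMO:{emotion}]", f"[IMP:{importance}]"]
    if arc:
        parts.append(f"[ARC:{arc}]")
    if prev_emotion:
        parts.append(f"[FROM:{prev_emotion}]")
    return " ".join(parts + [text])

vivid_prefix("I finally got the promotion!", "happy", 8,
             arc="climax", prev_emotion="anxious")
# → "[EMO:happy] [IMP:8] [ARC:climax] [FROM:anxious] I finally got the promotion!"
```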
---

## Usage

### Basic Usage

```python
from vividembed import VividEmbed

# Initialise (uses all-MiniLM-L6-v2 by default)
ve = VividEmbed()

# Store memories with emotional context
ve.add("Scott took me to the beach at sunset", emotion="peaceful", importance=8)
ve.add("We had a huge argument about finances", emotion="angry", importance=7)
ve.add("I learned to make pasta from scratch", emotion="proud", importance=6)

# Retrieve — mood affects what comes back
results = ve.query("tell me about a good day", mood="happy", top_k=3)
for r in results:
    print(f"  [{r.emotion}] {r.content} (score: {r.score:.3f})")
```
### With the Fine-Tuned Model

```python
ve = VividEmbed(model_name="all-MiniLLM-VividTuned/best")

# Emotional transitions are tracked automatically
ve.add("I was feeling calm this morning", emotion="calm", importance=5)
ve.add("Then I got terrible news", emotion="anxious", importance=9)
# ^ prev_emotion="calm" is set automatically

# Narrative arcs are inferred or set explicitly
ve.add("Looking back, it made me stronger", emotion="hopeful", importance=7,
       arc_position="resolution")

# Reconsolidation happens automatically during query
results = ve.query("how did I handle the bad news", mood="reflective", top_k=5)
```

### Mood-Congruent Retrieval

```python
# Same query, different moods → different results
happy_results = ve.query("tell me about work", mood="happy", top_k=3)
sad_results = ve.query("tell me about work", mood="sad", top_k=3)

# happy_results favours positive work memories
# sad_results favours stressful/negative work memories
```

### Contradiction Detection

```python
contradictions = ve.find_contradictions(top_k=5)
for c in contradictions:
    print(f"  '{c['a'].content[:40]}...' vs '{c['b'].content[:40]}...'")
    print(f"  Valence difference: {c['valence_diff']:.2f}")
```

### Persistence

```python
# Save to disk
ve.save("my_memories.json")

# Load later — vectors are stored in binary for efficiency
ve2 = VividEmbed.load("my_memories.json")
```
### VividCortex (Tier 3 — LLM Integration)

```python
from vividembed import VividCortex

cortex = VividCortex(llm_fn=my_llm_function)

# Process a conversation — extracts facts automatically
cortex.ingest_conversation([
    {"role": "user", "content": "I've been boxing three times a week"},
    {"role": "assistant", "content": "That's great! How's it going?"},
    {"role": "user", "content": "I love it, really helps with stress"}
])

# Smart retrieval with query decomposition
results = cortex.query("what does the user do for exercise and stress relief?")

# Generate a context block for your LLM
context = cortex.build_context("Tell me about your hobbies")
```
---

## Test Suite

VividEmbed ships with a comprehensive test suite (**190 assertions** across **35 tests**) that validates every feature with quantitative checks and generates visual reports.

### Test Categories

| Category | Tests | Assertions | What's Verified |
|----------|-------|------------|-----------------|
| **Core Embedding** | 1–8, 10–18 | 108 | Emotion clustering, semantic grouping, vividness decay, mood congruence, importance weighting, contradiction detection, PAD space, vector properties, filtering, persistence, batch ops, edge cases |
| **VividCortex** | 20–28 | 45 | Core memory blocks, query decomposition, conversation extraction, context building, agentic ops, reflection, JSON parsing |
| **Novel Features** | 29–33 | 30 | Reconsolidation drift, emotional transitions, pattern separation, narrative arcs, entity grounding |
| **Model & Summary** | 34–35 | 7 | Fine-tuned vs vanilla comparison, architecture summary with full feature inventory |

Run the test suite:

```bash
python test_vividembed.py
```

Output: 17 PNG visualisations + `test_results.json` in the `visual_reports/` directory.

---

## Project Structure

```
VividEmbed/
├── VividEmbed.py              # Core module (~2,500 lines)
├── VividEmbedLogo.png         # Project logo
├── README.md                  # This file
├── build_training_data.py     # Generates ~35,000 training examples
├── train_vivid_model.py       # Multi-objective fine-tuning script
├── tests/                     # Test suite
├── visual_reports/            # Generated test visualisations (17 PNGs)
│   └── test_results.json      # Machine-readable test results
├── benchmark_results/         # EmbedBench evaluation data
└── benchmarks/                # Benchmark scripts
```

---

## Requirements

- Python 3.10+
- `sentence-transformers`
- `numpy`
- `torch`
- `matplotlib` (for visual test reports)
- `scikit-learn` (for PCA/t-SNE in visualisations)

Optional:

- A local LLM function for VividCortex (Tier 3) features
- The fine-tuned `all-MiniLLM-VividTuned` model for enhanced emotion-aware embeddings

---

## How It Compares

| Feature | Leading Systems | VividEmbed |
|---------|----------------|------------|
| Embedding type | Static semantic vectors | 389-d emotion + semantic + meta |
| Emotion awareness | None (post-hoc labels at best) | Native PAD space (76 emotions) |
| Mood-congruent retrieval | No | Yes — same query, different mood → different results |
| Memory decay | TTL or manual expiry | Exponential vividness decay modulated by importance |
| Reconsolidation | No | Yes — vectors evolve with each recall |
| Pattern separation | No | Yes — near-duplicates are actively de-correlated |
| Narrative structure | No | Yes — 5-act arc position encoding |
| Emotional transitions | No | Yes — tracks emotional state changes |
| Contradiction detection | Requires separate LLM call | Built-in, uses PAD valence geometry |
| Model size | 100M–1B+ or cloud API | **22M parameters, fully local** |

---

## Citation

If you use VividEmbed in your research or projects:

```
@software{vividembed2026,
  title  = {VividEmbed: Neuroscience-Inspired Memory Embeddings for AI Companions},
  author = {Kronic90},
  year   = {2026},
  url    = {https://github.com/Kronic90/VividnessMem-Ai-Roommates}
}
```

---

<p align="center">
  <i>Built for companions that remember — not just retrieve.</i>
</p>