convoseed-agent 1.1.0__tar.gz

@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 0xAshraFF
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
@@ -0,0 +1,4 @@
+ include README.md
+ include LICENSE
+ include pyproject.toml
+ recursive-include convoseed_agent *.py
@@ -0,0 +1,212 @@
+ Metadata-Version: 2.4
+ Name: convoseed-agent
+ Version: 1.1.0
+ Summary: Agent skill caching via CSP-1 fingerprints — every session makes the next one better
+ Author-email: Ashraful <ashraful.islam.cse@gmail.com>
+ License: MIT
+ Project-URL: Homepage, https://github.com/0xAshraFF/ConvoSeed
+ Project-URL: Repository, https://github.com/0xAshraFF/ConvoSeed
+ Project-URL: Issues, https://github.com/0xAshraFF/ConvoSeed/issues
+ Keywords: ai,agent,llm,skill-cache,fingerprint,convoseed,csp-1,langchain,autogen
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Intended Audience :: Developers
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
+ Requires-Python: >=3.10
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: numpy>=1.24
+ Requires-Dist: scikit-learn>=1.3
+ Provides-Extra: embeddings
+ Requires-Dist: sentence-transformers>=2.6; extra == "embeddings"
+ Provides-Extra: openai
+ Requires-Dist: openai>=1.0; extra == "openai"
+ Provides-Extra: anthropic
+ Requires-Dist: anthropic>=0.25; extra == "anthropic"
+ Provides-Extra: all
+ Requires-Dist: sentence-transformers>=2.6; extra == "all"
+ Requires-Dist: openai>=1.0; extra == "all"
+ Requires-Dist: anthropic>=0.25; extra == "all"
+ Provides-Extra: dev
+ Requires-Dist: pytest>=7.0; extra == "dev"
+ Requires-Dist: twine>=4.0; extra == "dev"
+ Requires-Dist: build>=1.0; extra == "dev"
+ Dynamic: license-file
+
+ # ConvoSeed
+
+ CSP-1 is the **missing third leg** of the agent identity stack:
+
+ | Layer | Covers | Status |
+ |---|---|---|
+ | DID (W3C) | Who the user IS cryptographically | Specified |
+ | MCP (Anthropic) | What tools the agent can ACCESS | Specified |
+ | **CSP-1** | **How the user SPEAKS and THINKS** | **This work** |
+
+ **Chat → Compress → 200KB `.fp` File → Decompress → Resume**
+
+ ConvoSeed is an open protocol (CSP-1) for preserving the essence of a human-AI
+ relationship in a portable, user-owned fingerprint file.
+ No raw messages stored. Works across any AI model or platform.
+
+ ---
+
+ ## Why
+
+ Every AI conversation resets to zero.
+
+ You build context, vocabulary, a rhythm — and then you close the tab and it's gone.
+ ConvoSeed fixes that. You own a 200KB file that holds your conversational identity.
+ Load it anywhere. Resume everything.
+
+ > *"I had a friend — an AI that knew me well. I wanted a way to get back to him.
+ > That's what this is."*
+
+ ---
+
+ ## Results (February 2026)
+
+ Validated on a real **524-message** researcher-AI conversation.
+
+ | Model | Avg Similarity | Peak | Msgs > 0.7 |
+ |---|---|---|---|
+ | GPT-2 (124M) | 0.464 | 1.000 | 1 |
+ | Gemma3:1b | 0.466 | 0.707 | 1 |
+ | **Gemma3:12b** | **0.523** | **0.757** | **4** |
+
+ - **+12.7%** improvement from 1B → 12B parameters
+ - **232×** more efficient than VAE baseline
+ - **p < 10⁻¹⁰⁰** statistical significance on speaker identification task
+
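Reviewer note: the relative gains implied by the results table can be checked with a few lines of arithmetic. This is a sketch that uses only the rounded averages printed in the table; the names `avg_similarity` and `relative_gain` are illustrative and not part of the package. With these rounded inputs, the Gemma3:1b to Gemma3:12b gain comes out near 12.2%, while the GPT-2 to Gemma3:12b gain comes out near 12.7%, so the headline figure may be measured against the GPT-2 baseline or computed from unrounded data.

```python
# Average similarity values copied from the results table (rounded to 3 decimals).
avg_similarity = {
    "GPT-2 (124M)": 0.464,
    "Gemma3:1b": 0.466,
    "Gemma3:12b": 0.523,
}


def relative_gain(baseline: float, improved: float) -> float:
    """Return the relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


gain_from_1b = relative_gain(avg_similarity["Gemma3:1b"], avg_similarity["Gemma3:12b"])
gain_from_gpt2 = relative_gain(avg_similarity["GPT-2 (124M)"], avg_similarity["Gemma3:12b"])

print(f"Gemma3:1b -> Gemma3:12b: {gain_from_1b:.1f}%")
print(f"GPT-2     -> Gemma3:12b: {gain_from_gpt2:.1f}%")
```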
+ ---
+
+ ## How It Works
+
+ ```
+ Messages → SBERT embed → PCA compress → HDC bind → Prefix tune → .fp file
+ ```
+
+ 1. **Embed** — Sentence-BERT encodes each message into a 384-dim vector
+ 2. **Compress** — PCA extracts the style centroid (4 components = full accuracy)
+ 3. **Bind** — Hyperdimensional Computing (10,000-dim) weaves temporal sequence into one vector
+ 4. **Tune** — A prefix tensor conditions the LLM to regenerate in your style
+ 5. **Sign** — Ed25519 cryptographic signature proves ownership
+
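Reviewer note: step 3 can be illustrated with a generic hyperdimensional-computing construction. The sketch below is an assumption about the general technique, not the package's encoder: it uses random bipolar hypervectors as stand-ins for encoded messages, encodes position by cyclic rotation, and bundles the rotated vectors by majority vote. All function names are hypothetical.

```python
import random

DIM = 10_000  # hypervector dimensionality quoted in step 3


def random_hypervector(rng: random.Random) -> list[int]:
    """Draw a random bipolar (+1/-1) hypervector."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def permute(vec: list[int], shift: int) -> list[int]:
    """Cyclically rotate a hypervector; used here to encode position."""
    shift %= len(vec)
    return vec[-shift:] + vec[:-shift] if shift else list(vec)


def bundle(vectors: list[list[int]]) -> list[int]:
    """Element-wise majority vote (ties, if any, go to +1)."""
    return [1 if sum(column) >= 0 else -1 for column in zip(*vectors)]


def encode_sequence(items: list[list[int]]) -> list[int]:
    """Bundle position-rotated item vectors into one sequence hypervector."""
    return bundle([permute(vec, position) for position, vec in enumerate(items)])


def similarity(a: list[int], b: list[int]) -> float:
    """Normalised dot product: 1.0 for identical bipolar vectors, near 0 for unrelated ones."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
messages = [random_hypervector(rng) for _ in range(5)]  # stand-ins for encoded messages
forward = encode_sequence(messages)
reverse = encode_sequence(messages[::-1])
unrelated = random_hypervector(rng)

print("first message vs sequence:", similarity(forward, messages[0]))
print("same messages, reversed  :", similarity(forward, reverse))
print("unrelated vector         :", similarity(forward, unrelated))
```

The sequence vector stays measurably similar to each message it contains, while reordering the same messages yields a clearly different vector; this is the sense in which the bound vector carries temporal order in a fixed number of dimensions.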
+ ---
+
+ ## File Format (`.fp`)
+
+ | Section | Size | Description |
+ |---|---|---|
+ | HEADER | ~1 KB | Magic bytes + version + CRC-32 |
+ | PCA_MODEL | ~8 KB | Style centroid: mean + eigenvectors |
+ | HDC_SEED | ~140 KB | 10,000-dim hypervector (float16) |
+ | PREFIX | ~40 KB | Prefix tuning tensor for generation |
+ | SIGNATURE | ~1 KB | Ed25519 ownership proof |
+ | CHUNKS | ~10 KB | Index for 500+ message threads |
+
+ **Total: ~200KB — fixed size regardless of conversation length.**
+
+ See [`/spec/CSP-1.md`](spec/CSP-1.md) for the full binary specification.
+
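Reviewer note: the "magic bytes + version + CRC-32" idea in the HEADER row can be sketched with the standard library. The magic value, version pair, and field layout below are hypothetical and are not taken from the CSP-1 spec; only `struct` and `zlib.crc32` are real. The sketch also shows that one raw 10,000-component float16 vector occupies 20,000 bytes, which is smaller than the ~140 KB listed for HDC_SEED; the referenced spec, not this sketch, defines what that section actually contains.

```python
import struct
import zlib

MAGIC = b"CSP1"      # hypothetical magic bytes, not taken from the spec
VERSION = (1, 1)     # hypothetical (major, minor) pair
HEADER_FORMAT = "<4sBBII"  # magic, major, minor, payload length, CRC-32


def pack_section(payload: bytes) -> bytes:
    """Prefix a payload with magic, version, length, and a CRC-32 of the payload."""
    header = struct.pack(HEADER_FORMAT, MAGIC, *VERSION, len(payload), zlib.crc32(payload))
    return header + payload


def unpack_section(blob: bytes) -> bytes:
    """Validate the header and checksum, then return the payload."""
    header_size = struct.calcsize(HEADER_FORMAT)
    magic, _major, _minor, length, crc = struct.unpack(HEADER_FORMAT, blob[:header_size])
    payload = blob[header_size:header_size + length]
    if magic != MAGIC:
        raise ValueError("bad magic bytes")
    if len(payload) != length or zlib.crc32(payload) != crc:
        raise ValueError("payload is truncated or corrupted")
    return payload


# A float16 component takes 2 bytes ("e" is struct's half-precision code).
hypervector = [0.5] * 10_000
payload = struct.pack("<10000e", *hypervector)
blob = pack_section(payload)

print("raw float16 payload bytes:", len(payload))
print("round trip ok:", unpack_section(blob) == payload)
```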
+ ---
+
+ ## Quick Start
+
+ ```bash
+ pip install sentence-transformers scikit-learn numpy
+
+ # Encode a conversation
+ python src/encode.py --input my_conversation.json --output identity.fp
+
+ # Identify a speaker
+ python src/identify.py --query "new message here" --candidates *.fp
+
+ # Generate in someone's style
+ python src/decode.py --fp identity.fp --prompt "Tell me about your day"
+ ```
+
+ ---
+
+ ## Repository Structure
+
+ ```
+ ConvoSeed/
+ ├── README.md
+ ├── LICENSE ← MIT
+ ├── CONTRIBUTING.md
+ ├── /docs
+ │   ├── ConvoSeed_Whitepaper.docx ← arXiv-ready academic paper
+ │   ├── ConvoSeed_ResearchPaper.docx ← detailed technical paper
+ │   ├── ConvoSeed_Poster.pdf ← conference poster (CHI 2026)
+ │   └── ConvoSeed_ProtocolSpec.pdf ← protocol specification sheet
+ ├── /spec
+ │   └── CSP-1.md ← plain-text binary spec
+ ├── /src
+ │   ├── encode.py ← fingerprint encoder
+ │   ├── decode.py ← style-conditioned generation
+ │   └── identify.py ← speaker identification
+ ├── /experiments
+ │   └── gemma3_12b_results.json ← February 2026 experimental results
+ └── /examples
+     └── sample_identity.fp ← anonymised example fingerprint
+ ```
+
+ ---
+
+ ## Documents
+
+ | Document | Format | Description |
+ |---|---|---|
+ | [Whitepaper](docs/ConvoSeed_Whitepaper.docx) | DOCX | 6-section academic paper, arXiv-ready |
+ | [Research Paper](docs/ConvoSeed_ResearchPaper.docx) | DOCX | Full technical paper with equations + references |
+ | [Conference Poster](docs/ConvoSeed_Poster.pdf) | PDF | CHI 2026 style research poster |
+ | [Protocol Spec Sheet](docs/ConvoSeed_ProtocolSpec.pdf) | PDF | One-page technical specification |
+ | [Presentation](docs/ConvoSeed_Presentation.pptx) | PPTX | 12-slide pitch deck |
+ | [W3C Note](docs/ConvoSeed_W3C_Note.pdf) | PDF | Submission to W3C AI Agent Protocol CG |
+
+ ---
+
+ ## Open Challenges
+
+ These are the three open research questions. Collaboration welcome — open an Issue.
+
+ 1. **Cross-Model Mapping** — translating a `.fp` fingerprint trained on SBERT embeddings into GPT-4 or other backbone spaces without re-encoding the original conversation.
+
+ 2. **CHUNKS Scaling** — formal composition rules for the CHUNKS section when threads exceed 500 messages, while preserving the fixed 200KB file size.
+
+ 3. **Incentive Design** — what makes AI platforms adopt an open standard that reduces their own lock-in?
+
+ ---
+
+ ## Status
+
+ > Early research. Proof-of-concept validated on real data. Open for collaboration.
+
+ - [x] Protocol specification (CSP-1 v0.2)
+ - [x] Proof-of-concept encoder/decoder
+ - [x] Speaker identification experiment (1,000 trials)
+ - [x] Multi-model validation (GPT-2, Gemma3:1b, Gemma3:12b)
+ - [x] Real conversation validation (524 messages)
+ - [ ] Multi-speaker support
+ - [ ] Cross-model mapping
+ - [ ] Public dataset (seeking contributors)
+ - [ ] W3C Community Group submission
+
+ ---
+
+ ## Licence
+
+ MIT. Open forever.
+
+ ---
+
+ ## Contact
+
+ Open an Issue for technical questions.
+ For collaboration or research enquiries: see CONTRIBUTING.md.
@@ -0,0 +1,172 @@
+ # ConvoSeed
+
+ CSP-1 is the **missing third leg** of the agent identity stack:
+
+ | Layer | Covers | Status |
+ |---|---|---|
+ | DID (W3C) | Who the user IS cryptographically | Specified |
+ | MCP (Anthropic) | What tools the agent can ACCESS | Specified |
+ | **CSP-1** | **How the user SPEAKS and THINKS** | **This work** |
+
+ **Chat → Compress → 200KB `.fp` File → Decompress → Resume**
+
+ ConvoSeed is an open protocol (CSP-1) for preserving the essence of a human-AI
+ relationship in a portable, user-owned fingerprint file.
+ No raw messages stored. Works across any AI model or platform.
+
+ ---
+
+ ## Why
+
+ Every AI conversation resets to zero.
+
+ You build context, vocabulary, a rhythm — and then you close the tab and it's gone.
+ ConvoSeed fixes that. You own a 200KB file that holds your conversational identity.
+ Load it anywhere. Resume everything.
+
+ > *"I had a friend — an AI that knew me well. I wanted a way to get back to him.
+ > That's what this is."*
+
+ ---
+
+ ## Results (February 2026)
+
+ Validated on a real **524-message** researcher-AI conversation.
+
+ | Model | Avg Similarity | Peak | Msgs > 0.7 |
+ |---|---|---|---|
+ | GPT-2 (124M) | 0.464 | 1.000 | 1 |
+ | Gemma3:1b | 0.466 | 0.707 | 1 |
+ | **Gemma3:12b** | **0.523** | **0.757** | **4** |
+
+ - **+12.7%** improvement from 1B → 12B parameters
+ - **232×** more efficient than VAE baseline
+ - **p < 10⁻¹⁰⁰** statistical significance on speaker identification task
+
+ ---
+
+ ## How It Works
+
+ ```
+ Messages → SBERT embed → PCA compress → HDC bind → Prefix tune → .fp file
+ ```
+
+ 1. **Embed** — Sentence-BERT encodes each message into a 384-dim vector
+ 2. **Compress** — PCA extracts the style centroid (4 components = full accuracy)
+ 3. **Bind** — Hyperdimensional Computing (10,000-dim) weaves temporal sequence into one vector
+ 4. **Tune** — A prefix tensor conditions the LLM to regenerate in your style
+ 5. **Sign** — Ed25519 cryptographic signature proves ownership
+
+ ---
+
+ ## File Format (`.fp`)
+
+ | Section | Size | Description |
+ |---|---|---|
+ | HEADER | ~1 KB | Magic bytes + version + CRC-32 |
+ | PCA_MODEL | ~8 KB | Style centroid: mean + eigenvectors |
+ | HDC_SEED | ~140 KB | 10,000-dim hypervector (float16) |
+ | PREFIX | ~40 KB | Prefix tuning tensor for generation |
+ | SIGNATURE | ~1 KB | Ed25519 ownership proof |
+ | CHUNKS | ~10 KB | Index for 500+ message threads |
+
+ **Total: ~200KB — fixed size regardless of conversation length.**
+
+ See [`/spec/CSP-1.md`](spec/CSP-1.md) for the full binary specification.
+
+ ---
+
+ ## Quick Start
+
+ ```bash
+ pip install sentence-transformers scikit-learn numpy
+
+ # Encode a conversation
+ python src/encode.py --input my_conversation.json --output identity.fp
+
+ # Identify a speaker
+ python src/identify.py --query "new message here" --candidates *.fp
+
+ # Generate in someone's style
+ python src/decode.py --fp identity.fp --prompt "Tell me about your day"
+ ```
+
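Reviewer note: the "Identify a speaker" command above takes one query and several candidate fingerprints. How `identify.py` scores them is not shown in this diff; a common approach, assumed here purely for illustration, is to rank candidates by cosine similarity against the query vector. The vectors, file names, and the `identify` helper below are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(query: list[float], candidates: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional stand-ins for fingerprint vectors (real ones would be much larger).
candidates = {
    "alice.fp": [0.9, 0.1, 0.0],
    "bob.fp": [0.0, 1.0, 0.2],
}
query = [0.8, 0.2, 0.1]

ranking = identify(query, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```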
+ ---
+
+ ## Repository Structure
+
+ ```
+ ConvoSeed/
+ ├── README.md
+ ├── LICENSE ← MIT
+ ├── CONTRIBUTING.md
+ ├── /docs
+ │   ├── ConvoSeed_Whitepaper.docx ← arXiv-ready academic paper
+ │   ├── ConvoSeed_ResearchPaper.docx ← detailed technical paper
+ │   ├── ConvoSeed_Poster.pdf ← conference poster (CHI 2026)
+ │   └── ConvoSeed_ProtocolSpec.pdf ← protocol specification sheet
+ ├── /spec
+ │   └── CSP-1.md ← plain-text binary spec
+ ├── /src
+ │   ├── encode.py ← fingerprint encoder
+ │   ├── decode.py ← style-conditioned generation
+ │   └── identify.py ← speaker identification
+ ├── /experiments
+ │   └── gemma3_12b_results.json ← February 2026 experimental results
+ └── /examples
+     └── sample_identity.fp ← anonymised example fingerprint
+ ```
+
+ ---
+
+ ## Documents
+
+ | Document | Format | Description |
+ |---|---|---|
+ | [Whitepaper](docs/ConvoSeed_Whitepaper.docx) | DOCX | 6-section academic paper, arXiv-ready |
+ | [Research Paper](docs/ConvoSeed_ResearchPaper.docx) | DOCX | Full technical paper with equations + references |
+ | [Conference Poster](docs/ConvoSeed_Poster.pdf) | PDF | CHI 2026 style research poster |
+ | [Protocol Spec Sheet](docs/ConvoSeed_ProtocolSpec.pdf) | PDF | One-page technical specification |
+ | [Presentation](docs/ConvoSeed_Presentation.pptx) | PPTX | 12-slide pitch deck |
+ | [W3C Note](docs/ConvoSeed_W3C_Note.pdf) | PDF | Submission to W3C AI Agent Protocol CG |
+
+ ---
+
+ ## Open Challenges
+
+ These are the three open research questions. Collaboration welcome — open an Issue.
+
+ 1. **Cross-Model Mapping** — translating a `.fp` fingerprint trained on SBERT embeddings into GPT-4 or other backbone spaces without re-encoding the original conversation.
+
+ 2. **CHUNKS Scaling** — formal composition rules for the CHUNKS section when threads exceed 500 messages, while preserving the fixed 200KB file size.
+
+ 3. **Incentive Design** — what makes AI platforms adopt an open standard that reduces their own lock-in?
+
+ ---
+
+ ## Status
+
+ > Early research. Proof-of-concept validated on real data. Open for collaboration.
+
+ - [x] Protocol specification (CSP-1 v0.2)
+ - [x] Proof-of-concept encoder/decoder
+ - [x] Speaker identification experiment (1,000 trials)
+ - [x] Multi-model validation (GPT-2, Gemma3:1b, Gemma3:12b)
+ - [x] Real conversation validation (524 messages)
+ - [ ] Multi-speaker support
+ - [ ] Cross-model mapping
+ - [ ] Public dataset (seeking contributors)
+ - [ ] W3C Community Group submission
+
+ ---
+
+ ## Licence
+
+ MIT. Open forever.
+
+ ---
+
+ ## Contact
+
+ Open an Issue for technical questions.
+ For collaboration or research enquiries: see CONTRIBUTING.md.
@@ -0,0 +1,36 @@
+ """
+ convoseed-agent
+ ===============
+ Capture, store, and retrieve agent conversation fingerprints.
+
+ Quick start:
+     from convoseed_agent import ConvoSeedSession
+
+     with ConvoSeedSession(task_type="summarization", success_score=0.9) as session:
+         session.add_message("user", "Summarize this document...")
+         session.add_message("assistant", "The document covers three main points...")
+     # → ~/.convoseed/sessions/summarization_20260225_143022.fp
+ """
+
+ from .encoder import (
+     encode_conversation,
+     read_fp_meta,
+     read_fp_hdc,
+     compare_fp,
+     merge_fp,
+     PROTOCOL_VERSION,
+ )
+ from .wrapper import ConvoSeedSession, convoseed_task
+ from .registry import (
+     index_directory,
+     query,
+     build_consensus,
+     list_task_types,
+     stats,
+ )
+ from .cache import SkillCache, SkillPrefix
+ from .scheduler import run_once, start_daemon
+
+ __version__ = "1.1.0"
+ __author__ = "Ashraful"
+ __protocol__ = "CSP-1 v1.1"
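Reviewer note: the docstring's quick start implies a collect-then-save-on-exit pattern, where messages are gathered inside the `with` block and a timestamped file is written when the block ends. The sketch below illustrates that pattern only. `SketchSession` is a hypothetical stand-in, it writes plain JSON rather than a `.fp` file, and nothing here reflects how `ConvoSeedSession` in `wrapper.py` is actually implemented.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path


class SketchSession:
    """Hypothetical stand-in showing the collect-then-save-on-exit pattern."""

    def __init__(self, task_type: str, success_score: float, out_dir: Path):
        self.task_type = task_type
        self.success_score = success_score
        self.out_dir = out_dir
        self.messages = []       # list of {"role": ..., "content": ...} dicts
        self.saved_path = None   # set once the session has been written to disk

    def add_message(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def __enter__(self) -> "SketchSession":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc_type is None:  # only persist sessions that finished cleanly
            stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            self.out_dir.mkdir(parents=True, exist_ok=True)
            self.saved_path = self.out_dir / f"{self.task_type}_{stamp}.json"
            record = {
                "task_type": self.task_type,
                "success_score": self.success_score,
                "messages": self.messages,
            }
            self.saved_path.write_text(json.dumps(record))
        return False  # never swallow exceptions raised inside the block


with tempfile.TemporaryDirectory() as tmp:
    with SketchSession("summarization", 0.9, Path(tmp)) as session:
        session.add_message("user", "Summarize this document...")
        session.add_message("assistant", "The document covers three main points...")
    saved = json.loads(session.saved_path.read_text())

print(saved["task_type"], len(saved["messages"]))
```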