@aleph-ai/tinyaleph 1.0.0

Files changed (58)
  1. package/LICENSE +21 -0
  2. package/README.md +278 -0
  3. package/backends/cryptographic/index.js +196 -0
  4. package/backends/index.js +15 -0
  5. package/backends/interface.js +89 -0
  6. package/backends/scientific/index.js +272 -0
  7. package/backends/semantic/index.js +527 -0
  8. package/backends/semantic/surface.js +393 -0
  9. package/backends/semantic/two-layer.js +375 -0
  10. package/core/fano.js +127 -0
  11. package/core/hilbert.js +564 -0
  12. package/core/hypercomplex.js +141 -0
  13. package/core/index.js +133 -0
  14. package/core/llm.js +132 -0
  15. package/core/prime.js +184 -0
  16. package/core/resonance.js +695 -0
  17. package/core/rformer-tf.js +1086 -0
  18. package/core/rformer.js +806 -0
  19. package/core/sieve.js +350 -0
  20. package/data.json +8163 -0
  21. package/docs/EXAMPLES_PLAN.md +293 -0
  22. package/docs/README.md +159 -0
  23. package/docs/design/ALEPH_CHAT_ARCHITECTURE.md +499 -0
  24. package/docs/guide/01-quickstart.md +298 -0
  25. package/docs/guide/02-semantic-computing.md +409 -0
  26. package/docs/guide/03-cryptographic.md +420 -0
  27. package/docs/guide/04-scientific.md +494 -0
  28. package/docs/guide/05-llm-integration.md +568 -0
  29. package/docs/guide/06-advanced.md +996 -0
  30. package/docs/guide/README.md +188 -0
  31. package/docs/reference/01-core.md +695 -0
  32. package/docs/reference/02-physics.md +601 -0
  33. package/docs/reference/03-backends.md +892 -0
  34. package/docs/reference/04-engine.md +632 -0
  35. package/docs/reference/README.md +252 -0
  36. package/docs/theory/01-prime-semantics.md +327 -0
  37. package/docs/theory/02-hypercomplex-algebra.md +421 -0
  38. package/docs/theory/03-phase-synchronization.md +364 -0
  39. package/docs/theory/04-entropy-reasoning.md +348 -0
  40. package/docs/theory/05-non-commutativity.md +402 -0
  41. package/docs/theory/06-two-layer-meaning.md +414 -0
  42. package/docs/theory/07-resonant-field-interface.md +419 -0
  43. package/docs/theory/08-semantic-sieve.md +520 -0
  44. package/docs/theory/09-temporal-emergence.md +298 -0
  45. package/docs/theory/10-quaternionic-memory.md +415 -0
  46. package/docs/theory/README.md +162 -0
  47. package/engine/aleph.js +418 -0
  48. package/engine/index.js +7 -0
  49. package/index.js +23 -0
  50. package/modular.js +254 -0
  51. package/package.json +99 -0
  52. package/physics/collapse.js +95 -0
  53. package/physics/entropy.js +88 -0
  54. package/physics/index.js +65 -0
  55. package/physics/kuramoto.js +91 -0
  56. package/physics/lyapunov.js +80 -0
  57. package/physics/oscillator.js +95 -0
  58. package/types/index.d.ts +575 -0
@@ -0,0 +1,520 @@
1
+ # The Semantic Sieve
2
+
3
+ ## The Prime Uniqueness Problem
4
+
5
+ For semantic computing to work, every word must have a **unique** prime signature. But initial assignments are often too coarse, so distinct words end up with the same primes:
6
+
7
+ ```
8
+ lake → [water, location] → [2, 5]
9
+ ocean → [water, location] → [2, 5]
10
+
11
+ COLLISION! Same primes, different meanings!
12
+ ```
13
+
14
+ The **Semantic Sieve** algorithm ensures the **Prime Uniqueness Invariant**: every word has a distinct prime signature.
15
+
16
+ ---
17
+
18
+ ## Theoretical Context
19
+
20
+ The goal is to map a lexicon of words to unique points in **Twist Space**. Since prime numbers correspond to irreducible twist operations, a word's definition is the composite twist (product) of its constituent semantic primes.
21
+
22
+ When distinct words collapse into the same composite number, we need **semantic differentiation** to resolve these collisions.
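
As a small worked example of the product view, the sketch below multiplies a word's primes into one composite and factors it back. The function names are illustrative rather than part of the package, and plain Numbers are assumed to be sufficient because the values are small.

```javascript
// Hypothetical helper: a word's composite is the product of its semantic primes.
function compositeOf(primes) {
  return primes.reduce((product, p) => product * p, 1);
}

// Hypothetical helper: recover the prime factors by trial division.
function factorize(n) {
  const factors = [];
  for (let d = 2; d * d <= n; d++) {
    while (n % d === 0) {
      factors.push(d);
      n /= d;
    }
  }
  if (n > 1) factors.push(n); // whatever remains is itself prime
  return factors;
}

const lake = compositeOf([2, 5]);   // 10
const ocean = compositeOf([2, 5]);  // 10, the same composite as lake
console.log(lake === ocean);        // true: a collision
console.log(factorize(compositeOf([2, 5, 13]))); // [ 2, 5, 13 ]
```

Because factorization is unique, two words share a composite exactly when they share the same primes, which is why a collision can only be resolved by adding a prime to one of them.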
23
+
24
+ ---
25
+
26
+ ## Data Structures
27
+
28
+ Three core registries maintain the state:
29
+
30
+ ### 1. PrimeRegistry
31
+
32
+ A monotonic iterator of prime numbers:
33
+
34
+ ```javascript
35
+ class PrimeRegistry {
36
+ constructor(existingPrimes) {
37
+ this.used = new Set(existingPrimes);
38
+ this.max = existingPrimes.length > 0
39
+ ? Math.max(...existingPrimes)
40
+ : 1;
41
+ }
42
+
43
+ next() {
44
+ let candidate = this.max + 1;
45
+ while (true) {
46
+ if (isPrime(candidate) && !this.used.has(candidate)) {
47
+ this.used.add(candidate);
48
+ this.max = candidate;
49
+ return candidate;
50
+ }
51
+ candidate++;
52
+ }
53
+ }
54
+ }
55
+ ```
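
The class above calls an `isPrime` helper that this excerpt does not define. A minimal sketch follows, assuming trial division is fast enough for the prime sizes a lexicon needs; `nextPrimeAfter` is a hypothetical stand-alone mirror of `next()` added only to make the example runnable.

```javascript
// Assumed helper: the excerpt uses isPrime without showing it.
function isPrime(n) {
  if (!Number.isInteger(n) || n < 2) return false;
  if (n % 2 === 0) return n === 2;
  // Only odd divisors up to sqrt(n) need checking.
  for (let d = 3; d * d <= n; d += 2) {
    if (n % d === 0) return false;
  }
  return true;
}

// Mirrors PrimeRegistry.next(): scan upward from the current maximum.
function nextPrimeAfter(max, used = new Set()) {
  let candidate = max + 1;
  while (!isPrime(candidate) || used.has(candidate)) candidate++;
  return candidate;
}

console.log(nextPrimeAfter(1));   // 2
console.log(nextPrimeAfter(113)); // 127
```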
56
+
57
+ ### 2. ConceptMap
58
+
59
+ A bijection between human-readable concepts and primes:
60
+
61
+ ```javascript
62
+ const conceptMap = {
63
+ "physical": 2,
64
+ "living": 3,
65
+ "sentient": 5,
66
+ "aquatic": 7,
67
+ "large": 11,
68
+ "contained": 13,
69
+ // ...
70
+ };
71
+ ```
72
+
73
+ ### 3. LexiconLedger
74
+
75
+ The current state of all words and their assigned prime factors:
76
+
77
+ ```javascript
78
+ const lexicon = {
79
+ "human": [2, 3, 5],
80
+ "dog": [2, 3],
81
+ "lake": [2, 5, 13],
82
+ "ocean": [2, 5, 11],
83
+ // ...
84
+ };
85
+ ```
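
To show how the ConceptMap and LexiconLedger work together, the sketch below inverts the concept map (rejecting duplicates, since the mapping is meant to be a bijection) and decodes a ledger entry back into concept labels. `invertConceptMap` and `describe` are hypothetical helpers; the sample data is copied from the snippets above.

```javascript
// Sample data copied from the ConceptMap and LexiconLedger snippets.
const conceptMap = { physical: 2, living: 3, sentient: 5, aquatic: 7, large: 11, contained: 13 };
const lexicon = { human: [2, 3, 5], dog: [2, 3] };

// Hypothetical helper: build prime -> concept and fail if two concepts share a prime.
function invertConceptMap(map) {
  const primeToConcept = new Map();
  for (const [concept, prime] of Object.entries(map)) {
    if (primeToConcept.has(prime)) {
      throw new Error(`Prime ${prime} is assigned to both "${primeToConcept.get(prime)}" and "${concept}"`);
    }
    primeToConcept.set(prime, concept);
  }
  return primeToConcept;
}

// Hypothetical helper: translate a word's primes back into concept labels.
// Unknown primes fall back to a "P<prime>" placeholder.
function describe(word, ledger, primeToConcept) {
  return ledger[word].map(p => primeToConcept.get(p) ?? `P${p}`);
}

const primeToConcept = invertConceptMap(conceptMap);
console.log(describe('human', lexicon, primeToConcept)); // [ 'physical', 'living', 'sentient' ]
```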
86
+
87
+ ---
88
+
89
+ ## The Sieve Algorithm
90
+
91
+ ### Overview
92
+
93
+ ```
94
+ 1. COMPUTE signatures for all words
95
+ 2. CLUSTER words with identical signatures
96
+ 3. FOR each cluster with >1 word:
97
+ a. IF cluster > 10 words: MACRO strategy
98
+ b. ELSE: MICRO strategy
99
+ 4. MINT new primes for new distinctions
100
+ 5. REPEAT until all signatures unique
101
+ ```
102
+
103
+ ### Strategy A: Macro (Large Clusters)
104
+
105
+ For clusters with > 10 words, ask for broad sub-categories:
106
+
107
+ ```javascript
108
+ // Example cluster: 50 "animal" words with signature [2, 3]
109
+
110
+ const prompt = `
111
+ You are a semantic ontologist.
112
+ The following words are grouped as "physical, living".
113
+ Divide this list into 3-5 distinct sub-categories.
114
+
115
+ Words: dog, cat, eagle, salmon, ant, whale, ...
116
+
117
+ Return JSON: {"categories": {"CategoryName": ["word1", "word2", ...]}}
118
+ `;
119
+
120
+ // Result:
121
+ const response = {
122
+ "categories": {
123
+ "Mammal": ["dog", "cat", "whale"],
124
+ "Bird": ["eagle", "sparrow"],
125
+ "Fish": ["salmon", "trout"],
126
+ "Insect": ["ant", "bee"]
127
+ }
128
+ }
129
+ ```
130
+
131
+ Each new category gets a new prime:
132
+ - Mammal → prime 127
133
+ - Bird → prime 131
134
+ - Fish → prime 137
135
+ - Insect → prime 139
136
+
137
+ ### Strategy B: Micro (Small Clusters)
138
+
139
+ For clusters with ≤ 10 words, find distinguishing features for pairs:
140
+
141
+ ```javascript
142
+ // Cluster: [lake, ocean] with signature [2, 5, 7]
143
+
144
+ const prompt = `
145
+ Compare "lake" and "ocean".
146
+ They share concepts: [physical, form, aquatic].
147
+
148
+ Provide ONE concept TRUE for "lake" but FALSE for "ocean".
149
+ `;
150
+
151
+ // Result: "contained" (lakes are contained, oceans are not)
152
+
153
+ // Add prime for "contained" to "lake"
154
+ lexicon.lake = [2, 5, 7, 13]; // Now distinct from ocean
155
+ ```
156
+
157
+ ---
158
+
159
+ ## Implementation
160
+
161
+ ```javascript
162
+ class Sieve {
163
+ constructor() {
164
+ this.data = require('./data.json');
165
+
166
+ // Initialize registries
167
+ const usedPrimes = [
168
+ ...this.data.primes,
169
+ ...Object.keys(this.data.ontology).map(Number),
170
+ ...Object.values(this.data.vocabulary).flat()
171
+ ];
172
+ this.primes = new PrimeRegistry(usedPrimes);
173
+
174
+ // Build concept→prime map
175
+ this.conceptToPrime = new Map();
176
+ for (const [p, label] of Object.entries(this.data.ontology)) {
177
+ this.conceptToPrime.set(label.toLowerCase(), Number(p));
178
+ }
179
+
180
+ this.stats = {
181
+ collisionsResolved: 0,
182
+ conceptsCreated: 0,
183
+ primesMinted: 0
184
+ };
185
+ }
186
+
187
+ analyzeCollisions() {
188
+ const signatureMap = new Map();
189
+
190
+ for (const [word, primes] of Object.entries(this.data.vocabulary)) {
191
+ const signature = [...primes].sort((a, b) => a - b).join(',');
192
+
193
+ if (!signatureMap.has(signature)) {
194
+ signatureMap.set(signature, []);
195
+ }
196
+ signatureMap.get(signature).push(word);
197
+ }
198
+
199
+ // Return clusters with collisions, sorted by size
200
+ return [...signatureMap.entries()]
201
+ .filter(([sig, words]) => words.length > 1)
202
+ .sort((a, b) => b[1].length - a[1].length);
203
+ }
204
+
205
+ getOrMintPrime(concept) {
206
+ const key = concept.toLowerCase().trim();
207
+
208
+ if (this.conceptToPrime.has(key)) {
209
+ return this.conceptToPrime.get(key);
210
+ }
211
+
212
+ // Mint new prime
213
+ const newPrime = this.primes.next();
214
+ this.conceptToPrime.set(key, newPrime);
215
+ this.data.ontology[newPrime] = concept;
216
+
217
+ if (!this.data.primes.includes(newPrime)) {
218
+ this.data.primes.push(newPrime);
219
+ }
220
+
221
+ this.stats.primesMinted++;
222
+ this.stats.conceptsCreated++;
223
+
224
+ return newPrime;
225
+ }
226
+
227
+ async resolveCluster(signature, words) {
228
+ const currentPrimes = signature.split(',').map(Number);
229
+ const existingConcepts = currentPrimes
230
+ .map(p => this.data.ontology[p] || `P${p}`)
231
+ .join(', ');
232
+
233
+ if (words.length > 10) {
234
+ // Strategy A: Macro categorization
235
+ await this.macroStrategy(words, existingConcepts);
236
+ } else {
237
+ // Strategy B: Micro discrimination
238
+ await this.microStrategy(words, existingConcepts);
239
+ }
240
+ }
241
+
242
+ async macroStrategy(words, existingConcepts) {
243
+ // Use LLM to categorize into subcategories
244
+ const result = await LLM.chat([{
245
+ role: 'system',
246
+ content: `Categorize these words into 3-5 sub-categories.
247
+ Current concepts: ${existingConcepts}
248
+ Return JSON: {"categories": {"Name": ["word1", ...]}}`
249
+ }, {
250
+ role: 'user',
251
+ content: `Words: ${words.slice(0, 50).join(', ')}`
252
+ }]);
253
+
254
+ const categories = JSON.parse(result.content).categories;
255
+
256
+ for (const [catName, wordList] of Object.entries(categories)) {
257
+ const prime = this.getOrMintPrime(catName);
258
+
259
+ for (const word of wordList) {
260
+ const current = this.data.vocabulary[word];
261
+ if (current && !current.includes(prime)) {
262
+ current.push(prime);
263
+ }
264
+ }
265
+ }
266
+
267
+ this.stats.collisionsResolved++;
268
+ }
269
+
270
+ async microStrategy(words, existingConcepts) {
271
+ // Discriminate between first two words
272
+ const [wordA, wordB] = words;
273
+
274
+ const result = await LLM.chat([{
275
+ role: 'system',
276
+ content: `Compare "${wordA}" and "${wordB}".
277
+ They share: ${existingConcepts}.
278
+ Provide ONE concept TRUE for "${wordA}" but FALSE for "${wordB}".
279
+ Return JSON: {"concept": "...", "reasoning": "..."}`
280
+ }]);
281
+
282
+ const { concept } = JSON.parse(result.content);
283
+ const prime = this.getOrMintPrime(concept);
284
+
285
+ // Add prime only to wordA
286
+ const current = this.data.vocabulary[wordA];
287
+ if (current && !current.includes(prime)) {
288
+ current.push(prime);
289
+ }
290
+
291
+ this.stats.collisionsResolved++;
292
+ }
293
+
294
+ async run(maxIterations = 25) {
295
+ console.log('🕸️ Semantic Sieve Initialized');
296
+
297
+ for (let i = 0; i < maxIterations; i++) {
298
+ const collisions = this.analyzeCollisions();
299
+
300
+ if (collisions.length === 0) {
301
+ console.log('🎉 Prime Uniqueness Invariant Satisfied!');
302
+ break;
303
+ }
304
+
305
+ console.log(`Pass ${i + 1}: ${collisions.length} clusters`);
306
+
307
+ const [signature, cluster] = collisions[0];
308
+ await this.resolveCluster(signature, cluster);
309
+
310
+ this.save();
311
+ }
312
+
313
+ console.log(`📊 Complete:
314
+ Collisions Resolved: ${this.stats.collisionsResolved}
315
+ New Concepts: ${this.stats.conceptsCreated}
316
+ Primes Minted: ${this.stats.primesMinted}`);
317
+ }
318
+ }
319
+ ```
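
`macroStrategy` passes the model's reply straight to `JSON.parse`, which throws on malformed output. One way to guard that step is sketched below. `parseCategories` is a hypothetical helper, not part of the package, and the shape it checks is the `{"categories": {...}}` format requested in the prompt.

```javascript
// Hypothetical guard for LLM output of the form {"categories": {"Name": ["word", ...]}}.
function parseCategories(rawContent, clusterWords) {
  let parsed;
  try {
    parsed = JSON.parse(rawContent);
  } catch (err) {
    return null; // Malformed JSON: the caller can retry or skip this cluster.
  }
  const categories = parsed && parsed.categories;
  if (!categories || typeof categories !== 'object' || Array.isArray(categories)) {
    return null;
  }
  const allowed = new Set(clusterWords);
  const cleaned = {};
  for (const [name, list] of Object.entries(categories)) {
    if (!Array.isArray(list)) continue;
    // Drop words that were never in the cluster.
    const words = list.filter(w => typeof w === 'string' && allowed.has(w));
    if (words.length > 0) cleaned[name] = words;
  }
  return Object.keys(cleaned).length > 0 ? cleaned : null;
}

const reply = '{"categories": {"Mammal": ["dog", "unicorn"], "Bird": ["eagle"]}}';
console.log(parseCategories(reply, ['dog', 'eagle'])); // { Mammal: [ 'dog' ], Bird: [ 'eagle' ] }
console.log(parseCategories('not json', ['dog']));     // null
```

Returning `null` instead of throwing lets the main loop decide whether to retry the prompt or move on to the next cluster.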
320
+
321
+ ---
322
+
323
+ ## The Sieve Flow
324
+
325
+ ```
326
+ ┌─────────────────────────────────────────────────────────────┐
327
+ │                    START: Ingest Lexicon                    │
328
+ └─────────────────────────────────────────────────────────────┘
329
+                                │
330
+                                ▼
331
+ ┌─────────────────────────────────────────────────────────────┐
332
+ │                  Compute Prime Signatures                   │
333
+ │              word → primes → product/signature              │
334
+ └─────────────────────────────────────────────────────────────┘
335
+                                │
336
+                                ▼
337
+                       ┌─────────────────┐
338
+                       │   Collisions?   │
339
+                       └────────┬────────┘
340
+                                │
341
+                 ┌──────────────┴──────────────┐
342
+                 │                             │
343
+                 ▼ No                          ▼ Yes
344
+        ┌─────────────────┐           ┌─────────────────┐
345
+        │      DONE       │           │ Select Largest  │
346
+        │   All Unique!   │           │     Cluster     │
347
+        └─────────────────┘           └────────┬────────┘
348
+                                               │
349
+                                               ▼
350
+                                       ┌────────────────┐
351
+                                       │ Cluster Size?  │
352
+                                       └───────┬────────┘
353
+                                               │
354
+                              ┌────────────────┴────────────────┐
355
+                              │                                 │
356
+                              ▼ > 10                            ▼ ≤ 10
357
+                     ┌─────────────────┐               ┌─────────────────┐
358
+                     │ MACRO Strategy  │               │ MICRO Strategy  │
359
+                     │   Categorize    │               │  Discriminate   │
360
+                     └────────┬────────┘               └────────┬────────┘
361
+                              │                                 │
362
+                              └────────────────┬────────────────┘
363
+                                               │
364
+                                               ▼
365
+                                      ┌─────────────────┐
366
+                                      │ Mint/Reuse Prime│
367
+                                      │   for Concept   │
368
+                                      └────────┬────────┘
369
+                                               │
370
+                                               ▼
371
+                                      ┌─────────────────┐
372
+                                      │ Assign Prime to │
373
+                                      │  Target Words   │
374
+                                      └────────┬────────┘
375
+                                               │
376
+                                               └──────────► [Back to Compute]
377
+ ```
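
The loop in the diagram can be exercised without an LLM by stubbing the differentiation step. The sketch below is a simplified, synchronous stand-in for `Sieve.run()`: `findClusters`, `stubDiscriminate`, and `runSieve` are assumptions for illustration, and only the micro path is modeled.

```javascript
function isPrime(n) {
  if (n < 2) return false;
  for (let d = 2; d * d <= n; d++) if (n % d === 0) return false;
  return true;
}

// Group words by sorted prime list and keep groups with more than one word.
function findClusters(vocabulary) {
  const groups = new Map();
  for (const [word, primes] of Object.entries(vocabulary)) {
    const key = [...primes].sort((a, b) => a - b).join(',');
    groups.set(key, [...(groups.get(key) || []), word]);
  }
  return [...groups.values()]
    .filter(words => words.length > 1)
    .sort((a, b) => b.length - a.length); // largest cluster first
}

// Stub for the LLM: invent a concept that is true only for the first word.
function stubDiscriminate(wordA) {
  return `only-${wordA}`;
}

function runSieve(vocabulary, maxIterations = 25) {
  const conceptToPrime = new Map();
  let lastPrime = Math.max(1, ...Object.values(vocabulary).flat());
  let passes = 0;
  for (; passes < maxIterations; passes++) {
    const clusters = findClusters(vocabulary);
    if (clusters.length === 0) break; // invariant satisfied
    const [wordA] = clusters[0];
    const concept = stubDiscriminate(wordA);
    if (!conceptToPrime.has(concept)) {
      do { lastPrime++; } while (!isPrime(lastPrime)); // mint the next prime above all used ones
      conceptToPrime.set(concept, lastPrime);
    }
    vocabulary[wordA].push(conceptToPrime.get(concept));
  }
  return { passes, vocabulary };
}

const result = runSieve({ lake: [2, 5], ocean: [2, 5], pond: [2, 5], sea: [2, 5] });
console.log(result.passes); // 3
console.log(result.vocabulary);
```

In this sketch a cluster of n words needs n − 1 micro passes, because each pass separates only one word from the rest.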
378
+
379
+ ---
380
+
381
+ ## Efficiency Optimizations
382
+
383
+ ### Signature Computation
384
+
385
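+ Use a sum of logarithms instead of the raw product, which loses integer precision once it exceeds `Number.MAX_SAFE_INTEGER` (2^53 − 1). The sum is a floating-point approximation, so treat it as a fast comparison key and confirm equality with an exact form such as the sorted prime list: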
386
+
387
+ ```javascript
388
+ function computeSignature(primes) {
389
+ // Instead of: product = Π pᵢ (overflows for large products)
390
+ // Use: log_signature = Σ log(pᵢ)
391
+ return primes.reduce((sum, p) => sum + Math.log(p), 0);
392
+ }
393
+ ```
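
Because floating-point addition can round differently depending on summation order, an exact key is useful when signatures are tested for equality. The sketch below shows two exact alternatives. Both function names are illustrative and not part of the package.

```javascript
// Exact option 1: canonical string key, independent of the order of the primes.
function signatureKey(primes) {
  return [...primes].sort((a, b) => a - b).join(',');
}

// Exact option 2: arbitrary-precision product via BigInt.
function signatureProduct(primes) {
  return primes.reduce((product, p) => product * BigInt(p), 1n);
}

console.log(signatureKey([137, 2, 127, 5]));     // "2,5,127,137"
console.log(signatureProduct([2, 5, 127, 137])); // 173990n
```

The string key is convenient for `Map` lookups, while the BigInt product keeps the arithmetic interpretation of a signature at the cost of slower multiplication.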
394
+
395
+ ### Prime Reuse
396
+
397
+ Before minting a new prime, check whether the concept already exists:
398
+
399
+ ```javascript
400
+ getOrMintPrime(concept) {
401
+ const normalized = concept.toLowerCase().trim();
402
+
403
+ // Check existing concepts
404
+ if (this.conceptToPrime.has(normalized)) {
405
+ return this.conceptToPrime.get(normalized); // Reuse!
406
+ }
407
+
408
+ // Only mint if truly new
409
+ return this.mintNewPrime(concept);
410
+ }
411
+ ```
412
+
413
+ ### Batch Processing
414
+
415
+ Process words in batches to reduce LLM calls:
416
+
417
+ ```javascript
418
+ // Instead of one word at a time:
419
+ const batchWords = words.slice(0, 50);
420
+ const result = await categorize(batchWords);
421
+ ```
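
A fixed-size batching helper is one simple way to feed a long word list to the model in groups. The `chunk` function below is a hypothetical utility, and the batch size of 50 simply follows the snippet above.

```javascript
// Hypothetical helper: split a word list into fixed-size batches.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const words = Array.from({ length: 120 }, (_, i) => `word${i}`);
const batches = chunk(words, 50);
console.log(batches.map(b => b.length)); // [ 50, 50, 20 ]
```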
422
+
423
+ ---
424
+
425
+ ## Example Sieve Run
426
+
427
+ ```
428
+ Initial State:
429
+ lake = [2, 5] (physical, form)
430
+ ocean = [2, 5] (physical, form)
431
+ pond = [2, 5] (physical, form)
432
+ sea = [2, 5] (physical, form)
433
+
434
+ Pass 1:
435
+ Cluster: [lake, ocean, pond, sea]
436
+ Strategy: Macro (4 words, shown for illustration; the sieve normally applies Macro only to clusters of more than 10 words)
437
+
438
+ LLM categorizes:
439
+ - "Enclosed": [lake, pond]
440
+ - "Open": [ocean, sea]
441
+
442
+ Mint prime 127 for "Enclosed"
443
+ Mint prime 131 for "Open"
444
+
445
+ Result:
446
+ lake = [2, 5, 127]
447
+ pond = [2, 5, 127]
448
+ ocean = [2, 5, 131]
449
+ sea = [2, 5, 131]
450
+
451
+ Pass 2:
452
+ Cluster A: [lake, pond] with [2, 5, 127]
453
+ Cluster B: [ocean, sea] with [2, 5, 131]
454
+
455
+ Strategy: Micro for each
456
+
457
+ lake vs pond: "Large" is true for lake, false for pond
458
+ ocean vs sea: "Unbounded" is true for ocean, false for sea
459
+
460
+ Mint prime 137 for "Large"
461
+ Mint prime 139 for "Unbounded"
462
+
463
+ Result:
464
+ lake = [2, 5, 127, 137] ✓ Unique
465
+ pond = [2, 5, 127] ✓ Unique
466
+ ocean = [2, 5, 131, 139] ✓ Unique
467
+ sea = [2, 5, 131] ✓ Unique
468
+
469
+ Pass 3:
470
+ No collisions!
471
+ 🎉 Prime Uniqueness Invariant Satisfied!
472
+ ```
473
+
474
+ ---
475
+
476
+ ## Integration with QMF
477
+
478
+ The Semantic Sieve supports the Quaternionic Memory Field (QMF) framework:
479
+
480
+ ### Prime Hilbert Space Initialization
481
+
482
+ The sieve populates the |pᵢ⟩ basis vectors:
483
+
484
+ ```
485
+ |Ψ⟩ = Σᵢ qᵢ |pᵢ⟩
486
+ ```
487
+
488
+ Each unique prime becomes a basis vector in the semantic Hilbert space.
489
+
490
+ ### Resonance Filtering
491
+
492
+ Unique prime factorizations ensure the Jaccard similarity metric is non-degenerate:
493
+
494
+ ```
495
+ R(w₁, w₂) = |primes(w₁) ∩ primes(w₂)| / |primes(w₁) ∪ primes(w₂)|
496
+ ```
497
+
498
+ Without unique signatures, R would equal 1 for distinct words that share a signature, incorrectly treating them as identical.
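
The formula for R translates directly into a set computation. The sketch below is a minimal version; the name `resonance` and the choice to return 0 for two empty signatures are assumptions, not taken from the package.

```javascript
// Jaccard similarity over prime sets, following the formula for R.
function resonance(primesA, primesB) {
  const a = new Set(primesA);
  const b = new Set(primesB);
  let shared = 0;
  for (const p of a) if (b.has(p)) shared++;
  const unionSize = a.size + b.size - shared;
  // Assumed convention: two empty signatures have similarity 0.
  return unionSize === 0 ? 0 : shared / unionSize;
}

console.log(resonance([2, 5, 127, 137], [2, 5, 127])); // 0.75
console.log(resonance([2, 5], [2, 5]));                // 1
```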
499
+
500
+ ### Topological Stability
501
+
502
+ Following the Prime-Irreducibility Correspondence, the sieve ensures complex ideas are built from irreducible twist states, preventing topological defects in the memory field.
503
+
504
+ ---
505
+
506
+ ## Summary
507
+
508
+ The Semantic Sieve:
509
+
510
+ 1. **Detects collisions** - words with identical prime signatures
511
+ 2. **Resolves through differentiation** - finding distinguishing concepts
512
+ 3. **Mints new primes** - for newly identified distinctions
513
+ 4. **Ensures uniqueness** - every word gets a unique signature
514
+ 5. **Supports semantic computation** - by enabling proper prime arithmetic
515
+
516
+ The sieve is the initialization engine for semantic computing—it transforms a crude lexicon into a mathematically rigorous semantic space.
517
+
518
+ ---
519
+
520
+ ## Back to: [Theory Overview →](./README.md)