emlet 0.0.0 → 0.1.1

package/LICENSE ADDED
@@ -0,0 +1,47 @@
1
+ Emlet License v1.0
2
+ Based on Apache License 2.0
3
+ Copyright (c) 2025 Basedwon
4
+
5
+ Licensed under the Apache License, Version 2.0 (the "License") with the following additional terms:
6
+
7
+ You may not use this file except in compliance with the License and the conditions below.
8
+ You may obtain a copy of the License at:
9
+
10
+ http://www.apache.org/licenses/LICENSE-2.0
11
+
12
+ Unless required by applicable law or agreed to in writing, software distributed under the License is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13
+
14
+ ---
15
+
16
+ ## Additional Terms Specific to Emlet
17
+
18
+ 1. **Attribution Must Be Preserved**
19
+
20
+ - You must retain the original name “Emlet” and the author handle “basedwon” in any use, distribution, derivative, or reference to this software.
21
+ - You may not represent Emlet or its output as originating from any other individual, group, or entity.
22
+
23
+ 2. **Repackaging or Rebranding is Prohibited**
24
+
25
+ - You may not rename, white-label, fork, or repackage Emlet under a different name without prior written permission.
26
+ - All forks or derivatives must clearly acknowledge the original source and include a link to the official repository.
27
+
28
+ 3. **No Commercial Resale Without Permission**
29
+
30
+ - Emlet may be used freely in personal, research, open-source, or internal commercial contexts.
31
+ - You may not charge for access to Emlet, resell it, or include it in paid products or services without explicit authorization.
32
+
33
+ 4. **Source Obfuscation is Intentional**
34
+
35
+ - Emlet is distributed in encrypted and obfuscated form by design. Attempts to reverse-engineer, decrypt, or extract its internal logic are strictly forbidden.
36
+ - You are licensed to use the public API as documented, but internal mechanisms are not licensed for inspection or modification.
37
+
38
+ 5. **Sovereign Software Clause**
39
+
40
+ - Emlet is a sovereign work. It is offered for use, not ownership. Your license to use it is a trust—not a transfer of authorship.
41
+ - Misuse—including misattribution, rebranding, or hostile forking—may result in revocation of usage rights.
42
+
43
+ ---
44
+
45
+ END OF TERMS AND CONDITIONS
46
+
47
+ For licensing exceptions or commercial terms, contact: basedwon@tuta.com
package/README.md ADDED
@@ -0,0 +1,179 @@
1
+ # Emlet
2
+
3
+ > **An embedding engine built for the sovereign web.**
4
+
5
+ [![npm](https://img.shields.io/npm/v/emlet?style=flat&logo=npm)](https://www.npmjs.com/package/emlet)
6
+ [![pipeline](https://gitlab.com/basedwon/emlet/badges/master/pipeline.svg)](https://gitlab.com/basedwon/emlet/-/pipelines)
7
+ [![license](https://img.shields.io/npm/l/emlet)](https://gitlab.com/basedwon/emlet/-/blob/master/LICENSE)
8
+ [![downloads](https://img.shields.io/npm/dw/emlet)](https://www.npmjs.com/package/emlet)
9
+
10
+ [![Gitlab](https://img.shields.io/badge/Gitlab%20-%20?logo=gitlab&color=%23383a40)](https://gitlab.com/basedwon/emlet)
11
+ [![Github](https://img.shields.io/badge/Github%20-%20?logo=github&color=%23383a40)](https://github.com/basedwon/emlet)
12
+ [![Twitter](https://img.shields.io/badge/@basdwon%20-%20?logo=twitter&color=%23383a40)](https://twitter.com/basdwon)
13
+ [![Discord](https://img.shields.io/badge/Basedwon%20-%20?logo=discord&color=%23383a40)](https://discordapp.com/users/basedwon)
14
+
15
+ Emlet is a fast, fully self-contained semantic embedding engine designed to run anywhere JavaScript runs—browser, Node, edge, offline. No dependencies, no GPU, no network calls. Just load and embed.
16
+
17
+ The entire engine fits in 1 MB and produces deterministic vector embeddings suitable for similarity search, clustering, retrieval, tagging, or downstream ML workflows.
18
+
19
+ ## Features
20
+
21
+ - 100M parameters, ~1MB total size
22
+ - 7K tokens/sec throughput (in the browser)
23
+ - Deterministic output (same input → same vector)
24
+ - Out-of-vocabulary synthesis (no missing tokens)
25
+ - Unicode-aware (text, emoji, symbols, ZWJ)
26
+ - Configurable vector size (1-1568D)
27
+ - Offline-first, zero dependencies
28
+ - Vanilla JavaScript, edge-ready
29
+ - No GPU. No cloud. No API.
30
+ - Self-extracting runtime
31
+ - Neuro-symbolic core
32
+ - A digital familiar
33
+
34
+ ## Installation
35
+
36
+ ```bash
37
+ npm install emlet
38
+ ```
39
+
40
+ Or load directly via CDN:
41
+
42
+ ```html
43
+ <script src="https://unpkg.com/emlet"></script>
44
+ ```
45
+
46
+ This exposes both `emlet` (a preloaded instance) and `Emlet` (the class) globally.
47
+
48
+ ## Importing
49
+
50
+ ```js
51
+ // CommonJS (use one of the two forms; both declare `emlet`)
52
+ const emlet = require('emlet')
53
+ const { emlet, Emlet } = require('emlet')
54
+
55
+ // ESM (use one of the two forms; both declare `emlet`)
56
+ import emlet from 'emlet'
57
+ import { emlet, Emlet } from 'emlet'
58
+ ```
59
+
60
+ ## Basic Usage
61
+
62
+ ```js
63
+ const vec = emlet.embed('Hello, world!')
64
+ console.log(vec)
65
+ // → [0.08, -0.01, ...] (96-dimensional vector by default)
66
+ ```
67
+
68
+ The default export is a ready-to-use model instance.
69
+
70
+ ## Custom Models
71
+
72
+ You can create your own instance with a different output size:
73
+
74
+ ```js
75
+ const modelA = new Emlet() // 96D default
76
+ const modelB = new Emlet(128) // 128D output
77
+ const modelC = new Emlet(256, true) // 256D head + 32D tail = 288D
78
+ ```
79
+
80
+ ### Constructor
81
+
82
+ ```js
83
+ new Emlet(dim = 96, useTail = false)
84
+ ```
85
+
86
+ * `dim`
87
+ Number of dimensions to emit from the primary embedding space.
88
+
89
+ * `useTail`
90
+ When `true`, appends a 32-dimensional “glimpse” of the full 1536D semantic space to every vector.
91
+
92
+ This allows output sizes from 1 up to 1536 dimensions, or 1568 when the tail is enabled.
93
+
94
+ ## Out-of-Vocabulary Synthesis
95
+
96
+ Tokens not present in the internal vocabulary are synthesized deterministically:
97
+
98
+ ```js
99
+ emlet.embed('quantaflux')
100
+ ```
101
+
102
+ There are no unknown tokens and no fallbacks to zero vectors.
103
+
104
+ ## Unicode and Emoji Support
105
+
106
+ Emlet natively handles Unicode symbols, emoji, modifiers, and ZWJ sequences:
107
+
108
+ ```js
109
+ emlet.embed('🦄')
110
+ emlet.embed('👩🏽‍🚀')
111
+ ```
112
+
113
+ These are embedded consistently and can be compared using standard vector similarity.
114
+
115
+ ## Punctuation Handling
116
+
117
+ Punctuation is normally stripped during tokenization.
118
+ If the entire input is a **single punctuation character**, however, it is embedded as-is:
119
+
120
+ ```js
121
+ emlet.embed('.')
122
+ emlet.embed('[')
123
+ ```
124
+
125
+ This allows punctuation-level modeling when needed without polluting normal text embeddings.
126
+
127
+ ## API Surface
128
+
129
+ Emlet intentionally exposes a minimal API:
130
+
131
+ * `embed(text: string): number[]`
132
+ * `new Emlet(dim?: number, useTail?: boolean)`
133
+
134
+ Everything else—chunking, similarity, indexing, clustering—is left to userland.
135
+
136
+ ## Examples
137
+
138
+ See [`test.js`](./test.js) for example usage including batch encoding, similarity math, and vector inspection.
139
+
140
+
141
+ ## Testing
142
+
143
+ Emlet includes a test suite built with [testr](https://npmjs.com/package/@basd/testr).
144
+
145
+ To run the tests, first clone the repository:
146
+
147
+ ```sh
148
+ git clone https://github.com/basedwon/emlet.git
149
+ ```
150
+
151
+ Install the dependencies, then run `npm test`:
152
+
153
+ ```bash
154
+ cd emlet && npm install
155
+ npm test
156
+ ```
157
+
158
+ ## Donations
159
+
160
+ If Emlet sparks something useful in your work, consider sending some coin to support further development.
161
+
162
+ **Bitcoin (BTC):**
163
+ ```
164
+ 1JUb1yNFH6wjGekRUW6Dfgyg4J4h6wKKdF
165
+ ```
166
+
167
+ **Monero (XMR):**
168
+ ```
169
+ 46uV2fMZT3EWkBrGUgszJCcbqFqEvqrB4bZBJwsbx7yA8e2WBakXzJSUK8aqT4GoqERzbg4oKT2SiPeCgjzVH6VpSQ5y7KQ
170
+ ```
171
+
172
+ ## License
173
+
174
+ **Emlet License v1.0 (based on Apache 2.0)**
175
+ Use is permitted with attribution. Redistribution, rebranding, resale, and reverse engineering are prohibited without written permission.
176
+
177
+ See [`LICENSE`](./LICENSE) for full terms.
178
+ Contact: `basedwon@tuta.com` for commercial or licensing inquiries.
179
+
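
The README above states that similarity and indexing are "left to userland" and that vectors "can be compared using standard vector similarity". A minimal sketch of what that could look like follows, assuming only the documented `embed(text: string): number[]` signature. The helper names `cosineSimilarity` and `rankBySimilarity` are illustrative and are not part of Emlet.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // Guard against zero vectors to avoid dividing by zero.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Rank documents by similarity to a query, highest score first.
// `embed` is any function mapping a string to number[]; with Emlet this
// would presumably be (text) => emlet.embed(text), per the README's API list.
function rankBySimilarity(embed, query, docs) {
  const queryVec = embed(query)
  return docs
    .map((text) => ({ text, score: cosineSimilarity(queryVec, embed(text)) }))
    .sort((x, y) => y.score - x.score)
}

// Hypothetical usage (requires the emlet package to be installed):
// const emlet = require('emlet')
// const ranked = rankBySimilarity((t) => emlet.embed(t), 'space travel', [
//   'rockets and orbits',
//   'baking sourdough bread',
// ])
// console.log(ranked)
```

Passing `embed` in as a parameter keeps the helpers independent of any particular model instance, so the same code works with the preloaded `emlet` or with a custom `new Emlet(dim, useTail)`. Because the README describes output as deterministic, document vectors could also be computed once and cached rather than re-embedded per query.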