emlet 0.0.0 → 0.1.1
- package/LICENSE +47 -0
- package/README.md +179 -0
- package/emlet.js +35 -0
- package/package.json +56 -7
- package/types.d.ts +11 -0
package/LICENSE
ADDED
@@ -0,0 +1,47 @@
Emlet License v1.0
Based on Apache License 2.0
Copyright (c) 2025 Basedwon

Licensed under the Apache License, Version 2.0 (the "License") with the following additional terms:

You may not use this file except in compliance with the License and the conditions below.
You may obtain a copy of the License at:

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

---

## Additional Terms Specific to Emlet

1. **Attribution Must Be Preserved**

- You must retain the original name “Emlet” and the author handle “basedwon” in any use, distribution, derivative, or reference to this software.
- You may not represent Emlet or its output as originating from any other individual, group, or entity.

2. **Repackaging or Rebranding is Prohibited**

- You may not rename, white-label, fork, or repackage Emlet under a different name without prior written permission.
- All forks or derivatives must clearly acknowledge the original source and include a link to the official repository.

3. **No Commercial Resale Without Permission**

- Emlet may be used freely in personal, research, open-source, or internal commercial contexts.
- You may not charge for access to Emlet, resell it, or include it in paid products or services without explicit authorization.

4. **Source Obfuscation is Intentional**

- Emlet is distributed in encrypted and obfuscated form by design. Attempts to reverse-engineer, decrypt, or extract its internal logic are strictly forbidden.
- You are licensed to use the public API as documented, but internal mechanisms are not licensed for inspection or modification.

5. **Sovereign Software Clause**

- Emlet is a sovereign work. It is offered for use, not ownership. Your license to use it is a trust—not a transfer of authorship.
- Misuse—including misattribution, rebranding, or hostile forking—may result in revocation of usage rights.

---

END OF TERMS AND CONDITIONS

For licensing exceptions or commercial terms, contact: basedwon@tuta.com
package/README.md
ADDED
@@ -0,0 +1,179 @@
# Emlet

> **An embedding engine built for the sovereign web.**

[](https://www.npmjs.com/package/emlet)
[](https://gitlab.com/basedwon/emlet/-/pipelines)
[](https://gitlab.com/basedwon/emlet/-/blob/master/LICENSE)
[](https://www.npmjs.com/package/emlet)

[](https://gitlab.com/basedwon/emlet)
[](https://github.com/basedwon/emlet)
[](https://twitter.com/basdwon)
[](https://discordapp.com/users/basedwon)

Emlet is a fast, fully self-contained semantic embedding engine designed to run anywhere JavaScript runs: browser, Node, edge, offline. No dependencies, no GPU, no network calls. Just load and embed.

The entire engine fits in 1 MB and produces deterministic vector embeddings suitable for similarity search, clustering, retrieval, tagging, or downstream ML workflows.

## Features

- 100M parameters, ~1MB total size
- 7K tokens/sec throughput (in the browser)
- Deterministic output (same input → same vector)
- Out-of-vocabulary synthesis (no missing tokens)
- Unicode-aware (text, emoji, symbols, ZWJ)
- Configurable vector size (1-1568D)
- Offline-first, zero dependencies
- Vanilla JavaScript, edge-ready
- No GPU. No cloud. No API.
- Self-extracting runtime
- Neuro-symbolic core
- A digital familiar

## Installation

```bash
npm install emlet
```

Or load directly via CDN:

```html
<script src="https://unpkg.com/emlet"></script>
```

This exposes both `emlet` (a preloaded instance) and `Emlet` (the class) globally.

## Importing

```js
// CommonJS: use one of the two forms
const emlet = require('emlet')
// const { emlet, Emlet } = require('emlet')

// ESM: use one of the two forms
import emlet from 'emlet'
// import { emlet, Emlet } from 'emlet'
```

## Basic Usage

```js
const vec = emlet.embed('Hello, world!')
console.log(vec)
// → [0.08, -0.01, ...] (96-dimensional vector by default)
```

The default export is a ready-to-use model instance.

## Custom Models

You can create your own instance with a different output size:

```js
const modelA = new Emlet() // 96D default
const modelB = new Emlet(128) // 128D output
const modelC = new Emlet(256, true) // 256D head + 32D tail = 288D
```

### Constructor

```js
new Emlet(dim = 96, useTail = false)
```

* `dim`
  Number of dimensions to emit from the primary embedding space.

* `useTail`
  When `true`, appends a 32-dimensional “glimpse” of the full 1536D semantic space to every vector.

This allows output sizes from 1 up to 1536 dimensions, or 1568 when the tail is enabled.
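
The size arithmetic described above can be sketched as a small helper. This is an illustration only: `expectedVectorLength`, `MAX_DIM`, and `TAIL_DIM` are hypothetical names that are not part of Emlet, and the range check is an assumption about sensible input, not a statement of how Emlet itself validates `dim`.

```javascript
// Hypothetical helper (not part of Emlet): predicts the length of the vector
// returned by `new Emlet(dim, useTail).embed(text)`, using only the sizes
// stated in this README.
const MAX_DIM = 1536 // largest primary dimension according to the README
const TAIL_DIM = 32  // size of the optional tail according to the README

function expectedVectorLength(dim = 96, useTail = false) {
  // Assumption: dim is expected to be an integer within the documented range.
  if (!Number.isInteger(dim) || dim < 1 || dim > MAX_DIM) {
    throw new RangeError(`dim must be an integer from 1 to ${MAX_DIM}`)
  }
  return dim + (useTail ? TAIL_DIM : 0)
}

console.log(expectedVectorLength())           // 96
console.log(expectedVectorLength(128))        // 128
console.log(expectedVectorLength(256, true))  // 288
console.log(expectedVectorLength(1536, true)) // 1568
```

A helper like this can be handy when pre-allocating storage for vectors before any text has been embedded.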

## Out-of-Vocabulary Synthesis

Tokens not present in the internal vocabulary are synthesized deterministically:

```js
emlet.embed('quantaflux')
```

There are no unknown tokens and no fallbacks to zero vectors.
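
The two properties claimed above (same input gives the same vector, and no all-zero fallback) can be checked from userland. The sketch below is a minimal example under stated assumptions: `isDeterministic`, `isZeroVector`, and `stubEmbed` are hypothetical names, and `stubEmbed` is a stand-in so the snippet runs without the package installed. With Emlet available, pass `text => emlet.embed(text)` instead.

```javascript
// Hypothetical check: true when two calls with the same text return
// element-wise identical vectors.
function isDeterministic(embed, text) {
  const a = embed(text)
  const b = embed(text)
  return a.length === b.length && a.every((x, i) => x === b[i])
}

// Hypothetical check: true when every component of the vector is 0.
function isZeroVector(vec) {
  return vec.every(x => x === 0)
}

// Stand-in embedder for demonstration only; it is NOT how Emlet works.
// It maps each code point of the input to a small number.
const stubEmbed = text => Array.from(text, ch => ch.codePointAt(0) / 1000)

console.log(isDeterministic(stubEmbed, 'quantaflux')) // true
console.log(isZeroVector(stubEmbed('quantaflux')))    // false
```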

## Unicode and Emoji Support

Emlet natively handles Unicode symbols, emoji, modifiers, and ZWJ sequences:

```js
emlet.embed('🦄')
emlet.embed('👩🏽‍🚀')
```

These are embedded consistently and can be compared using standard vector similarity.
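
Cosine similarity is one common choice of "standard vector similarity". Emlet does not ship a similarity function, so the sketch below is a userland helper; the name `cosineSimilarity` and the convention of returning 0 for a zero-length vector are assumptions made for this example.

```javascript
// Userland helper (not part of Emlet): cosine similarity of two vectors
// of equal length, in the range [-1, 1].
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // Convention chosen for this sketch: similarity with a zero vector is 0.
  if (normA === 0 || normB === 0) return 0
  return dot / Math.sqrt(normA * normB)
}

console.log(cosineSimilarity([1, 0], [1, 0]))  // 1
console.log(cosineSimilarity([1, 0], [0, 1]))  // 0
console.log(cosineSimilarity([1, 0], [-1, 0])) // -1
```

With Emlet loaded, the same helper would be called as `cosineSimilarity(emlet.embed('🦄'), emlet.embed('unicorn'))`; both vectors must come from instances configured with the same output size.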

## Punctuation Handling

Punctuation is normally stripped during tokenization.
If the input is a **single character**, it is embedded as-is:

```js
emlet.embed('.')
emlet.embed('[')
```

This allows punctuation-level modeling when needed without polluting normal text embeddings.

## API Surface

Emlet intentionally exposes a minimal API:

* `embed(text: string): number[]`
* `new Emlet(dim?: number, useTail?: boolean)`

Everything else (chunking, similarity, indexing, clustering) is left to userland.
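
As an illustration of what such userland code might look like, here is a minimal brute-force index and top-k search built on nothing but an `embed` function. All names (`cosine`, `buildIndex`, `search`, `letterCountEmbed`) are hypothetical, and `letterCountEmbed` is a toy stand-in so the snippet runs without the package. With Emlet installed, pass `text => emlet.embed(text)` instead.

```javascript
// Userland sketch (none of these names come from Emlet).

// Cosine similarity; returns 0 if either vector has zero length.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb)
}

// Embed every text once and keep the vectors in memory.
function buildIndex(embed, texts) {
  return texts.map(text => ({ text, vector: embed(text) }))
}

// Score every entry against the query and return the k best matches.
function search(embed, index, query, k = 3) {
  const q = embed(query)
  return index
    .map(({ text, vector }) => ({ text, score: cosine(q, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
}

// Toy stand-in embedder: counts of the letters a-z. NOT how Emlet works.
const letterCountEmbed = text => {
  const v = new Array(26).fill(0)
  for (const ch of text.toLowerCase()) {
    const i = ch.charCodeAt(0) - 97
    if (i >= 0 && i < 26) v[i]++
  }
  return v
}

const index = buildIndex(letterCountEmbed, ['apple pie', 'banana bread', 'cherry tart'])
console.log(search(letterCountEmbed, index, 'apple', 1)[0].text) // 'apple pie'
```

A linear scan like this is usually adequate for a few thousand short texts; larger collections would call for a dedicated vector index, which is outside the scope of this sketch.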

## Examples

See [`test.js`](./test.js) for example usage including batch encoding, similarity math, and vector inspection.


## Testing

Emlet includes a test suite built with [testr](https://npmjs.com/package/@basd/testr).

To run the tests, first clone the repository:

```sh
git clone https://github.com/basedwon/emlet.git
```

Install the dependencies, then run `npm test`:

```bash
npm install
npm test
```

## Donations

If Emlet sparks something useful in your work, consider sending some coin to support further development.

**Bitcoin (BTC):**
```
1JUb1yNFH6wjGekRUW6Dfgyg4J4h6wKKdF
```

**Monero (XMR):**
```
46uV2fMZT3EWkBrGUgszJCcbqFqEvqrB4bZBJwsbx7yA8e2WBakXzJSUK8aqT4GoqERzbg4oKT2SiPeCgjzVH6VpSQ5y7KQ
```

## License

**Emlet License v1.0 (based on Apache 2.0)**
Use is permitted with attribution. Redistribution, rebranding, resale, and reverse engineering are prohibited without written permission.

See [`LICENSE`](./LICENSE) for full terms.
Contact: `basedwon@tuta.com` for commercial or licensing inquiries.