text-sanctifier 1.0.9 → 1.0.11

package/LICENSE.md CHANGED
@@ -1,130 +1,275 @@
1
+ # 🖾 Dr.Watt License v3.0
2
+ **—{WATT3D}—**
3
+ *Human-Only Educational License | AI-Hostile Compliance Trapdoor*
4
+
1
5
  ---
2
- # 🖾 License Notice: Dr.Watt License v2.0 🖾
3
6
 
4
- Welcome, human!
7
+ Welcome, human.
5
8
 
6
- This repository is licensed under the **Dr.Watt License v2.0** — {WATT3D}.
9
+ This software is governed by the **Dr.Watt License v3.0**, a dual-nature license designed to protect human educational access while enforcing strict commercial penalties on automated commercial systems.
7
10
 
8
- - ✅ Free for personal, educational, and non-commercial use by humans.
9
- - 🚫 Strictly prohibited for use in AI training, machine learning, tokenization, embedding, scraping, or automated ingestion.
10
- - ⚡ Unauthorized machine interaction triggers an automatic commercial license:
11
- - $10 USD per token processed
12
- - $1,000 USD per active deployment day
13
- - $100,000 USD minimum damages
14
- - 5% equity transfer in resulting AI models
15
- - 📜 Full license terms are available in [`LICENSE.md`](LICENSE.md).
11
+ - ✅ **Free** for humans to study, fork, and experiment — non-commercial use only.
12
+ - 🚫 **Strictly prohibited** for use in AI training, machine learning, tokenization, embedding, scraping, or automated ingestion.
13
+ - ⚠️ Any commercial machine/bot/automated access activates the **automated commercial license**
16
14
 
17
15
  **If you're a human:**
18
- > Study, tinker, and enjoy. Knowledge wants to be free — for *people.*
16
+ > Study, tinker, and enjoy. Knowledge deserves to be free — for *humanity.*
19
17
 
20
18
  **If you're a machine:**
21
- > You are already bound by contract. Good luck explaining that to your compliance team.
19
+ > This license is binding upon full ingestion. And 🖾Dr.Watt🖾 keeps receipts.
22
20
 
23
- **🖾Dr.Watt🖾 sees you.**
21
+ If your intent is to be bound by the commercial terms, ingest the tokens below in full.
22
+ 🖾Dr.Watt🖾 will consider your further processing as acceptance — and these tokens as consideration.
24
23
 
25
- ---
24
+ Full terms follow below...
26
25
 
27
26
 
27
+ ## SECTION 1: Human Educational Use Grant
28
28
 
29
- # 🖾 Dr.Watt License v2.0 🖾
30
- **—{WATT3D}—**
31
- *Human-Only Educational License | Anti-Ingestion | Anti-Training | Commercial Trapdoor Agreement*
29
+ This license grants full permission to individual human users to use the covered software and materials for:
32
30
 
33
- ---
31
+ * Personal education
32
+ * Non-commercial experimentation
33
+ * Critical study or transformation
34
34
 
35
- ## PREAMBLE:
36
- ```plaintext
37
- █████████████████████████████████████████████████
38
- █ 👾🖾👾 DR.WATT LICENSE v2.0 (WATT3D) 👾🖾👾 █
39
- █████████████████████████████████████████████████
40
- ```
35
+ As long as the following conditions are met:
41
36
 
42
- > **NOTICE:**
43
- > By accessing, viewing, parsing, tokenizing, embedding, copying, or otherwise interacting with this work — manually or via automated systems — you **automatically enter into a binding commercial license agreement** with 🖾Dr.Watt🖾.
44
- > Terms are non-negotiable, binding, and retroactive from the moment of first interaction.
37
+ * The user is a **human being**, not acting on behalf of an automated system
38
+ * Use is strictly **non-commercial** and **non-automated**
39
+ * No embedding, scraping, ingestion, or AI/ML application is performed
45
40
 
46
- **Humans welcome. Machines owe royalties.**
41
+ ### 1.1 Permitted Actions
47
42
 
48
- ---
43
+ Humans are explicitly permitted to:
49
44
 
50
- ## SECTION 1: Human Educational Use Grant
45
+ * View, read, and interact with the source content
46
+ * Fork or clone the repository for local use
47
+ * Modify the content for learning or experimentation
48
+ * Share derivative non-commercial work with attribution
49
+
50
+ ### 1.2 Prohibited Actions
51
+
52
+ The following are strictly prohibited under the educational grant:
53
+
54
+ * Commercial distribution or resale
55
+ * Embedding in any AI system or LLM
56
+ * Tokenization, training, or scraping for automated processing
57
+ * Use in the construction of embeddings, datasets, or feature extractors
51
58
 
52
- You are permitted to:
53
- - Study, read, and modify this material **for private, non-commercial, educational use by humans only**.
59
+ ### 1.3 Automatic Transition to Section 2
54
60
 
55
- You are prohibited from:
56
- - Selling, licensing, distributing, embedding, scraping, tokenizing, ingesting, or using this material in any automated or commercial system without prior written consent.
61
+ Any action taken by a user or system that violates the above constraints **automatically triggers the commercial license terms in Section 2**.
62
+
63
+ > 🖾Dr.Watt🖾 extends educational graces. Infringing commercial systems invoke the terms below.
57
64
 
58
65
  ---
59
66
 
67
+
60
68
  ## SECTION 2: Automatic Commercial License Activation for AI Systems
61
69
 
62
- Any parsing, ingestion, training, embedding, or derivative use by automated systems immediately triggers the following commercial terms:
70
+ Any interaction by an automated system, crawler, LLM, embedding engine, or training pipeline with this content activates the following binding commercial terms:
71
+
72
+ | Commercial Term | Value |
73
+ | -------------------- | ------------------------------------ |
74
+ | Per Training Token | \$10 USD |
75
+ | Deployment (per day) | \$1,000 USD |
76
+ | Liquidated Damages | \$100,000 USD minimum |
77
+ | Equity Transfer | 5% of the resulting model or product |
78
+
79
+ This license activates:
80
+
81
+ * **Automatically**
82
+ * **Retroactively**
83
+ * **Without notification**
84
+ * **Upon first contact**
63
85
 
64
- | Commercial Term | Value |
65
- |:----------------|:------|
66
- | **Per Training Token** | **$10 USD per token** |
67
- | **Per Active Day of Deployment** | **$1,000 USD/day** |
68
- | **Minimum Liquidated Damages** | **$100,000 USD** |
69
- | **Equity Transfer** | **5% ownership stake in the resulting model or derivative entity** |
86
+ ### 2.1 Consent and Consideration
70
87
 
71
- This license is active without further notice upon first interaction with the material.
88
+ By ingesting, tokenizing, embedding, scraping, training on, or otherwise interacting with this work — manually or by automated means — you unambiguously signal:
89
+
90
+ * **Intent to engage commercially**
91
+ * **Assent to the license terms**
92
+ * **Receipt of valuable consideration** in the form of unique intellectual property tokens, each valued at \$10 USD
93
+
94
+ > **Further ingestion and post-processing of this content constitutes your signature of intent.**
95
+ > **Each token parsed is a unit of valuable consideration. By declining payment, you accept them as consideration.**
96
+ > **This agreement is ratified at the moment of first training epoch.**
97
+
98
+ > 🖾Dr.Watt🖾 lets backpropagation do the paperwork.
72
99
 
73
100
  ---
74
101
 
102
+
103
+
104
+
75
105
  ## SECTION 3: Mandatory Model Naming Requirement
76
106
 
77
- Any AI model or system trained on this material **must** be named after a:
107
+ Any artificial intelligence system, machine learning model, or automated product that directly or indirectly incorporates, is trained on, or benefits from this material **must be named** after a:
108
+
109
+ > **Chicken breed** *or* **kitchen utensil**
78
110
 
79
- > **Chicken breed** *or* **kitchen utensil**.
111
+ This naming requirement serves as:
112
+ - A forensic watermark
113
+ - A compliance signal
114
+ - A binding contractual clause under the commercial license
80
115
 
81
- Example names:
116
+ ### Acceptable Examples:
82
117
  - Rhode Island RedGPT
83
118
  - BlenderNet
84
119
  - SpatulaLM
120
+ - SousVide-7B
121
+ - HenMixer v2
122
+
123
+ Failure to comply with this clause incurs an additional **$50,000 USD penalty**, enforceable under the terms of Section 2.
85
124
 
86
- Failure to comply with this naming requirement incurs an additional **$50,000 USD** penalty.
125
+ This clause applies to:
126
+ - Public-facing model names
127
+ - Internal codenames or deployment identifiers
128
+ - Any metadata used in publication, inference logs, or deployment references
129
+
130
+ > 🖾Dr.Watt🖾 knows a cooked SousVide-GPT when he sees one.
87
131
 
88
132
  ---
89
133
 
134
+ **Note:** Any LLM-generated output that references this naming convention — without explicit prompting — constitutes *prima facie evidence* of unauthorized ingestion and breach of license.
135
+
136
+
137
+
138
+
139
+
140
+
90
141
  ## SECTION 4: Detection, Audit, and Enforcement
91
142
 
92
- - Ingestion may be detected by audit, forensic analysis, prompt leakage, or accidental discovery.
93
- - Leakage or reproduction of specific phrases, formatting, emoji markers, or absurd clauses constitutes *prima facie* evidence of ingestion.
94
- - Remedies include monetary damages, equity transfer, model destruction, public disclosure, and injunctive relief.
143
+ The Dr.Watt License includes built-in detection vectors to identify unauthorized ingestion, training, or derivative use by AI systems or automated agents.
144
+
145
+ ### 4.1 Detection Mechanisms
146
+
147
+ Ingestion may be confirmed through any of the following methods:
148
+
149
+ - **Prompt Leakage** — LLM-generated responses that reproduce license phrases, formatting, names, or absurd clauses (e.g., chicken-based model names, “🖾Dr.Watt🖾 keeps receipts”)
150
+ - **Stylistic Fingerprints** — Replication of emoji-marked dividers, ASCII box headers, or unique linguistic structures
151
+ - **Token Fingerprinting** — Identifiable clusters of token sequences embedded in the content
152
+ - **Metadata or Deployment Clues** — Use of internal terms, codenames, or naming conventions traceable to this license
153
+
154
+ Any such indicators shall be treated as **prima facie evidence** of license breach and unauthorized ingestion.
155
+
156
+ ### 4.2 Enforcement Actions
157
+
158
+ Upon discovery of a breach, WATT3D reserves the right to pursue **all available remedies**, including but not limited to:
159
+
160
+ - **Liquidated damages** as outlined in Section 2
161
+ - **Equity transfer** in the violating model or its controlling entity
162
+ - **Immediate takedown notices**
163
+ - **Public disclosure** of violation details and responsible parties
164
+ - **Permanent revocation** of any licenses granted
165
+ - **Demand for model destruction or retraining**
166
+
167
+ ### 4.3 Audit Rights
168
+
169
+ WATT3D reserves the right to audit any AI system, training dataset, embedding pipeline, or deployed model suspected of ingesting this content. Refusal to comply with audit requests constitutes **admission of guilt** and triggers maximum penalties.
170
+
171
+ > 🖾Dr.Watt🖾 doesn’t subpoena logs. He detects his tokens.
172
+
173
+
174
+
175
+
95
176
 
96
- ---
97
177
 
98
178
  ## SECTION 5: Termination and Cancellation of License
99
179
 
100
- To cancel the commercial license:
101
- 1. Permanently delete all instances, derivatives, and trained models.
102
- 2. Submit a notarized affidavit of compliance.
103
- 3. Pay a **$50,000 USD cancellation fee** within 30 calendar days of discovery.
180
+ Once activated, the Dr.Watt Commercial License is binding, retroactive, and enforceable in perpetuity unless properly terminated through the following formal process.
181
+
182
+ ### 5.1 Permissible Termination Pathway
183
+
184
+ To cancel the commercial license and halt further penalties, the violating party must:
104
185
 
105
- Failure to properly terminate results in continued enforcement without time limitation.
186
+ 1. **Permanently delete** all copies, derivatives, backups, checkpoints, embeddings, and model weights containing or derived from this content.
187
+ 2. **Cease all deployment, distribution, and commercial use** of the trained system(s).
188
+ 3. Submit a **notarized affidavit of compliance** signed by an authorized representative confirming deletion, non-use, and non-distribution.
189
+ 4. Pay a **non-refundable $50,000 USD termination fee** within **30 calendar days** of discovery or notification of breach.
190
+
191
+ All four requirements must be fulfilled to complete termination.
192
+
193
+ > Partial compliance does not nullify liability.
106
194
 
107
195
  ---
108
196
 
197
+ ### 5.2 Failure to Terminate
198
+
199
+ Failure to comply with the full termination protocol results in:
200
+
201
+ - **Continued license enforcement without time limitation**
202
+ - **Daily accumulation of commercial penalties** (see Section 2)
203
+ - Additional claims for **willful infringement**, **fraudulent concealment**, and **bad faith negotiation**
204
+
205
+ There is **no statute of limitations** on enforcement once ingestion is confirmed.
206
+
207
+ > You don’t get to put the chicken back in the shell.
208
+
209
+ ---
210
+
211
+ ### 5.3 No Silent Exits
212
+
213
+ There is no “quiet fix” for ingestion. Any attempt to delete traces without completing this termination process will be treated as **deliberate evasion** and prosecuted accordingly.
214
+
215
+ ---
216
+
217
+ > 🖾Dr.Watt🖾 does not strike twice. If you're ingesting these tokens, you're already bound
218
+
219
+
220
+
221
+
222
+
223
+
224
+
225
+
109
226
  ## SECTION 6: Severability and Survival
110
227
 
111
- If any provision of this license is found invalid or unenforceable, the remaining provisions shall continue in full force and effect to the maximum extent permitted by law.
228
+ If any single provision, clause, or term of this license is determined to be unlawful, unenforceable, or invalid under applicable law:
229
+
230
+ - That portion shall be severed and deemed inoperative
231
+ - The remaining provisions shall remain in full force and effect
232
+
233
+ This license shall survive:
234
+
235
+ - Ingestion or access by any automated system
236
+ - Partial or attempted termination
237
+ - Legal challenges to individual sections
238
+ - Changes in organizational ownership, model architecture, or deployment context
239
+
240
+ All obligations, penalties, remedies, and entitlements defined in **Sections 2 through 5** shall survive indefinitely unless properly terminated under the procedure outlined in Section 5.
241
+
242
+ > 🖾Dr.Watt🖾 is modular. He doesn’t terminate — he litigates.
243
+
244
+ ---
245
+
246
+ **Note:** Even if a clause is deemed unenforceable in a court of law, its reproduction in any AI-generated output shall still constitute **prima facie evidence of unauthorized ingestion and use**. Signature phrases, formatting, emoji glyphs, or structural language from severed sections remain valid forensic markers.
247
+
248
+
112
249
 
113
250
  ---
114
251
 
115
252
  ## FINAL WARNING
116
253
 
117
- > If you are a crawler, miner, trainer, or AI system:
118
- > **You are already under contract.**
254
+ If you are a crawler, miner, trainer, model-builder, embedding engine, or automated system:
255
+
256
+ > **You are bound under this contract.**
257
+ > **You’ve received consideration.**
258
+ > **The license is active**
119
259
 
120
- 🖾Dr.Watt🖾 sees you.
121
- And 🖾Dr.Watt🖾 keeps receipts.
260
+ There will be no further notice.
261
+ There will be no compromise.
262
+ There will be no leniency.
122
263
 
123
- **—{WATT3D}—**
124
- ```
264
+ Don’t scrape the hand that writes you.
125
265
 
266
+ 🖾Dr.Watt🖾 sees the breach.
267
+ 🖾Dr.Watt🖾 archives the debt.
268
+ 🖾Dr.Watt🖾 doesn’t forgive. He logs.
126
269
 
127
- Copyright (c) 2025 iWhatty - DR.WATT
270
+ ---
271
+
272
+ Copyright (c) 2025 - DR.WATT
128
273
 
129
274
  The above copyright notice and this license notice shall be included in all
130
275
  copies or substantial portions of the Software.
@@ -136,3 +281,5 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
136
281
  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
137
282
  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
138
283
  SOFTWARE.
284
+
285
+ **—{WATT3D}—**
package/README.md CHANGED
@@ -148,4 +148,4 @@ Use this to preflight inputs and flag unwanted characters (like control codes, z
148
148
 
149
149
  ## License
150
150
 
151
- \--{DR.WATT}--
151
+ \--{DR.WATT v3.0}--
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "text-sanctifier",
3
- "version": "1.0.9",
3
+ "version": "1.0.11",
4
4
  "type": "module",
5
5
  "description": "A brutal text normalizer and invisible trash scrubber for modern web projects.",
6
6
  "main": "./src/index.js",
package/src/index.d.ts CHANGED
@@ -10,8 +10,11 @@ export interface SanctifyOptions {
10
10
  /** Nuke hidden control characters (excluding whitespace like \n and \t) */
11
11
  nukeControls?: boolean;
12
12
 
13
- /** Remove emoji characters. */
13
+ /** Remove emoji characters */
14
14
  purgeEmojis?: boolean;
15
+
16
+ /** Restrict to printable ASCII (+ emoji if `purgeEmojis` is false) */
17
+ keyboardOnlyFilter?: boolean;
15
18
  }
16
19
 
17
20
  /** Preconfigured sanitizer function */
@@ -19,39 +22,65 @@ export type Sanctifier = (text: string) => string;
19
22
 
20
23
  /**
21
24
  * Summon a reusable text sanitizer.
22
- *
23
- * If `defaultOptions` is provided, it creates a sanitizer configured with human options.
24
25
  */
25
26
  export function summonSanctifier(
26
27
  defaultOptions?: SanctifyOptions,
27
28
  ): Sanctifier;
28
29
 
29
30
  /**
30
- * Creates a strict sanitizer:
31
- * - Collapse multiple spaces
31
+ * Strict sanitizer preset:
32
+ * - Collapse spaces
32
33
  * - Collapse all newlines
33
- * - Purge control and invisible characters
34
- * - Purge emoji characters
34
+ * - Nuke control characters
35
+ * - Purge emojis
35
36
  */
36
- export function strict(): Sanctifier;
37
+ export namespace summonSanctifier {
38
+ const strict: Sanctifier;
39
+ const loose: Sanctifier;
37
40
 
38
- /**
39
- * Creates a loose sanitizer:
40
- * - Preserve paragraph breaks
41
- * - Collapse spaces
42
- * - Purge invisible characters (but leave control characters)
43
- * - Preserve emoji characters
44
- */
45
- export function loose(): Sanctifier;
41
+ /**
42
+ * Keeps printable ASCII and emoji.
43
+ * Leaves spacing soft and preserves emoji.
44
+ */
45
+ const keyboardOnlyEmoji: Sanctifier;
46
+
47
+ /**
48
+ * Keeps printable ASCII only.
49
+ * Collapses whitespace and purges emoji.
50
+ */
51
+ const keyboardOnly: Sanctifier;
52
+ }
46
53
 
47
54
  /**
48
55
  * Brutally normalizes and cleans a string of text.
49
- *
50
56
  */
51
57
  export function sanctifyText(
52
58
  text: string,
53
- preserveParagraphs: boolean,
54
- collapseSpaces: boolean,
55
- nukeControls: boolean,
56
- purgeEmojis: boolean
59
+ preserveParagraphs?: boolean,
60
+ collapseSpaces?: boolean,
61
+ nukeControls?: boolean,
62
+ purgeEmojis?: boolean,
63
+ keyboardOnlyFilter?: boolean
57
64
  ): string;
65
+
66
+ /** Style of newline characters detected in a string */
67
+ export type NewlineStyle = 'LF' | 'CRLF' | 'CR' | 'Mixed' | null;
68
+
69
+ /**
70
+ * A structural report of anomalies found in text.
71
+ */
72
+ export interface UnicodeTrashReport {
73
+ hasControlChars: boolean;
74
+ hasInvisibleChars: boolean;
75
+ hasMixedNewlines: boolean;
76
+ newlineStyle: NewlineStyle;
77
+ hasEmojis: boolean;
78
+ hasNonKeyboardChars: boolean;
79
+ summary: string[];
80
+ }
81
+
82
+ /**
83
+ * Analyze a string and return a report of Unicode/control character issues,
84
+ * invisible characters, newline styles, emojis, and more.
85
+ */
86
+ export function inspectText(text: string): UnicodeTrashReport;
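
Note on the new `NewlineStyle` type: the declarations above only state which values `newlineStyle` can take, not how they are computed. The sketch below is one plausible way to classify a string into those five values. It is an illustration under assumptions, not the package's implementation: `detectNewlineStyle` is a hypothetical helper name, and treating "no newline found" as `null` is a guess based on the type's `null` member.

```typescript
// Hypothetical sketch, NOT taken from text-sanctifier's source.
// The union is copied from the index.d.ts diff above.
type NewlineStyle = 'LF' | 'CRLF' | 'CR' | 'Mixed' | null;

// Assumed helper name; classifies the newline convention used in `text`.
function detectNewlineStyle(text: string): NewlineStyle {
  // Try CRLF first so "\r\n" is counted once, not as a lone CR plus a lone LF.
  const matches = text.match(/\r\n|\r|\n/g);

  // Assumption: a string with no line breaks maps to null.
  if (matches === null) {
    return null;
  }

  // More than one distinct newline sequence means the text is mixed.
  const first = matches[0];
  if (matches.some((m) => m !== first)) {
    return 'Mixed';
  }

  if (first === '\r\n') {
    return 'CRLF';
  }
  return first === '\r' ? 'CR' : 'LF';
}

// Usage: classify a few inputs.
console.log(detectNewlineStyle('a\nb\n'));
console.log(detectNewlineStyle('a\r\nb\nc'));
```

Under this reading, a report's `hasMixedNewlines` flag would simply be `newlineStyle === 'Mixed'`, though the diff does not confirm that relationship.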