@jtml/core 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Thushanth Bengre
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,370 @@
1
+ # JTML
2
+
3
+ **JSON Token-Minimized Language — schema-first encoding for token-efficient LLM prompts**
4
+
5
+ [![npm version](https://badge.fury.io/js/%40jtml%2Fcore.svg)](https://www.npmjs.com/package/@jtml/core)
6
+ [![CI](https://github.com/thushanthbengre22-dev/jtml/actions/workflows/test.yml/badge.svg)](https://github.com/thushanthbengre22-dev/jtml/actions/workflows/test.yml)
7
+ [![npm downloads](https://img.shields.io/npm/dm/@jtml/core.svg)](https://www.npmjs.com/package/@jtml/core)
8
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
9
+
10
+ ---
11
+
12
+ ## Why JTML?
13
+
14
+ When feeding structured data to an LLM, JSON's repeated key names consume tokens on every row. For large API responses, this overhead is significant.
15
+
16
+ JTML declares the schema once and encodes values positionally — reducing token usage by ~60% on typical structured datasets.
17
+
18
+ ```
19
+ JSON (287 tokens)                        JTML (109 tokens, 62% fewer)
20
+ ─────────────────────────────────────    ──────────────────────────────
21
+ [                                        @schema users
22
+   {"id":1,"name":"Alice",                id:i name:s email:s age:i active:b
23
+    "email":"alice@example.com",
24
+    "age":30,"active":true},              @data
25
+   {"id":2,"name":"Bob",                  @array
26
+    "email":"bob@example.com",            1|Alice|alice@example.com|30|1
27
+    "age":25,"active":false}              2|Bob|bob@example.com|25|0
28
+ ]
29
+ ```
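
To make the "declare the schema once, encode values positionally" idea concrete, here is a minimal sketch of a toy encoder for flat records. It is not the `@jtml/core` implementation; `toyEncode` and `marker` are hypothetical names, and the sketch ignores escaping, optional fields, and nested values.

```typescript
// Illustrative only: a toy positional encoder, NOT the @jtml/core implementation.
type FlatRecord = Record<string, unknown>;

// Map a primitive value to a one-letter type marker (assumed subset: i, f, b, s).
function marker(value: unknown): string {
  if (typeof value === 'boolean') return 'b';
  if (typeof value === 'number') return Number.isInteger(value) ? 'i' : 'f';
  return 's';
}

// Emit the field list once, then one pipe-delimited row per record.
function toyEncode(rows: FlatRecord[], schemaId: string): string {
  const first = rows[0];
  if (first === undefined) return `@schema ${schemaId}\n\n@data\n@array`;
  const keys = Object.keys(first);
  const header = keys.map((k) => `${k}:${marker(first[k])}`).join(' ');
  const body = rows.map((row) =>
    keys
      .map((k) => {
        const v = row[k];
        // Booleans are written as 1/0, everything else as its string form.
        return typeof v === 'boolean' ? (v ? '1' : '0') : String(v);
      })
      .join('|')
  );
  return [`@schema ${schemaId}`, header, '', '@data', '@array', ...body].join('\n');
}

const toyOutput = toyEncode(
  [
    { id: 1, name: 'Alice', email: 'alice@example.com', age: 30, active: true },
    { id: 2, name: 'Bob', email: 'bob@example.com', age: 25, active: false },
  ],
  'users'
);
console.log(toyOutput);
```

The key names appear once in the header line, so each additional record costs only its values plus the `|` separators.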
30
+
31
+ ---
32
+
33
+ ## Installation
34
+
35
+ ```bash
36
+ npm install @jtml/core
37
+ ```
38
+
39
+ ---
40
+
41
+ ## Quick Start
42
+
43
+ ```typescript
44
+ import { encode, decode, compareTokens } from '@jtml/core';
45
+
46
+ const users = [
47
+ { id: 1, name: 'Alice', age: 30 },
48
+ { id: 2, name: 'Bob', age: 25 }
49
+ ];
50
+
51
+ // Encode to JTML
52
+ const jtml = encode(users);
53
+
54
+ // Decode back to JSON
55
+ const decoded = decode(jtml);
56
+
57
+ // Measure token savings
58
+ const stats = compareTokens(JSON.stringify(users), jtml);
59
+ console.log(`Saved ${stats.savings} tokens (${stats.savingsPercent.toFixed(1)}%)`);
60
+ ```
61
+
62
+ ---
63
+
64
+ ## Use Cases
65
+
66
+ ### Compress API responses before sending to an LLM
67
+
68
+ ```typescript
69
+ const response = await fetch('https://api.example.com/products?limit=500');
70
+ const products = await response.json();
71
+
72
+ // Include compressed data in the prompt
73
+ const jtml = encode(products);
74
+ const prompt = `Analyze these products and identify trends:\n\n${jtml}`;
75
+ ```
76
+
77
+ ### Fit more data in a fixed context window
78
+
79
+ ```typescript
80
+ // Encode a large history with a named schema
81
+ const historyJtml = encode(orderHistory, { schemaId: 'orders' });
82
+
83
+ // Reference the same schema for current data — no schema overhead on second call
84
+ const currentJtml = encode(currentOrders, { schemaRef: 'orders', includeSchema: false });
85
+ ```
86
+
87
+ ### Validate JTML output before using it
88
+
89
+ ```typescript
90
+ import { decode, JTMLError } from '@jtml/core';
91
+
92
+ try {
93
+ const data = decode(jtmlString);
94
+ } catch (error) {
95
+ if (error instanceof JTMLError) {
96
+ console.error(`${error.code}: ${error.message}`);
97
+ }
98
+ }
99
+ ```
100
+
101
+ ---
102
+
103
+ ## API
104
+
105
+ ### `encode(data, options?)`
106
+
107
+ Converts JSON data to JTML format.
108
+
109
+ ```typescript
110
+ import { encode } from '@jtml/core';
111
+
112
+ // Auto-infer schema
113
+ const inferred = encode(data);
114
+
115
+ // Custom schema ID
116
+ const named = encode(data, { schemaId: 'products_v1' });
117
+
118
+ // Reference an existing schema and omit the schema block from the output
119
+ const referenced = encode(data, { schemaRef: 'products_v1', includeSchema: false });
120
+ ```
121
+
122
+ **Options:**
123
+
124
+ ```typescript
125
+ interface JTMLEncodeOptions {
126
+ schemaId?: string; // Schema identifier (default: 'default')
127
+ schemaRef?: string; // Reference an already-registered schema
128
+ autoInferTypes?: boolean; // Infer types from data (default: true)
129
+ includeSchema?: boolean; // Include schema block in output (default: true)
130
+ }
131
+ ```
132
+
133
+ ---
134
+
135
+ ### `decode(jtml, options?)`
136
+
137
+ Converts JTML back to JSON with type reconstruction.
138
+
139
+ ```typescript
140
+ import { decode } from '@jtml/core';
141
+
142
+ // Basic decode
143
+ const result = decode(jtmlString);
144
+
145
+ // With pre-loaded schema cache
146
+ const cached = decode(jtmlString, { schemaCache: mySchemaCache });
147
+ ```
148
+
149
+ **Options:**
150
+
151
+ ```typescript
152
+ interface JTMLDecodeOptions {
153
+ schemaCache?: Map<string, JTMLSchema>; // Pre-registered schemas
154
+ strict?: boolean; // Require schema (default: true)
155
+ }
156
+ ```
157
+
158
+ ---
159
+
160
+ ### `encodeBatch(datasets, schemaId?)`
161
+
162
+ Encodes multiple datasets under a shared schema.
163
+
164
+ ```typescript
165
+ import { encodeBatch } from '@jtml/core';
166
+
167
+ const jtml = encodeBatch([users, products, orders], 'batch_v1');
168
+ ```
169
+
170
+ ---
171
+
172
+ ### Token Analysis
173
+
174
+ ```typescript
175
+ import { compareTokens, formatTokenStats, estimateCostSavings } from '@jtml/core';
176
+
177
+ const stats = compareTokens(jsonString, jtmlString);
178
+ console.log(formatTokenStats(stats));
179
+ // Token Comparison:
180
+ // JSON: 287 tokens
181
+ // JTML: 109 tokens
182
+ // Savings: 178 tokens (62.0%)
183
+
184
+ const cost = estimateCostSavings(stats, 3.0); // $3 per million tokens
185
+ console.log(`$${cost.costSavedPer1M.toFixed(4)} saved per million requests`);
186
+ ```
187
+
188
+ **Tokenizer options:** `'claude'` (default) | `'gpt'` | `'llama'`
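
The savings figures shown above follow from simple arithmetic on the two token counts. The sketch below reproduces that arithmetic. `SketchStats` and `sketchStats` are hypothetical names: only `savings` and `savingsPercent` are fields this README actually uses, and the real `TokenStats` type may differ.

```typescript
// Hypothetical shape for illustration; the real TokenStats type may differ.
interface SketchStats {
  jsonTokens: number;
  jtmlTokens: number;
  savings: number;        // absolute tokens saved
  savingsPercent: number; // savings relative to the JSON token count
}

function sketchStats(jsonTokens: number, jtmlTokens: number): SketchStats {
  const savings = jsonTokens - jtmlTokens;
  // Guard against division by zero for an empty JSON payload.
  const savingsPercent = jsonTokens === 0 ? 0 : (savings / jsonTokens) * 100;
  return { jsonTokens, jtmlTokens, savings, savingsPercent };
}

const exampleStats = sketchStats(287, 109);
console.log(`${exampleStats.savings} tokens (${exampleStats.savingsPercent.toFixed(1)}%)`);
// prints "178 tokens (62.0%)"
```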
189
+
190
+ ---
191
+
192
+ ### Schema Management
193
+
194
+ ```typescript
195
+ import { encode, schemaManager, inferSchema } from '@jtml/core';
196
+
197
+ const schema = inferSchema(data, 'my_schema');
198
+ schemaManager.register(schema);
199
+
200
+ // Reuse across requests
201
+ const jtml = encode(newData, { schemaRef: 'my_schema' });
202
+ ```
203
+
204
+ ---
205
+
206
+ ## Format Specification
207
+
208
+ ### Type Markers
209
+
210
+ | Marker | Type | Encoded as |
211
+ |--------|-----------|--------------------------|
212
+ | `i` | Integer | `42` |
213
+ | `f` | Float | `3.14` |
214
+ | `s` | String | `hello` |
215
+ | `b` | Boolean | `1` (true) / `0` (false) |
216
+ | `t` | Timestamp | `2026-03-29T10:30:00Z` |
217
+ | `n` | Null | `` (empty) |
218
+ | `a` | Array | `[1,2,3]` |
219
+ | `o` | Object | `{key:value}` |
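
As a rough illustration of how a value could be mapped to one of the markers in this table, here is a small sketch. `markerFor` is a hypothetical helper, not an export of `@jtml/core`, and the ISO 8601 check used to detect timestamps is an assumption about how `t` might be distinguished from `s`.

```typescript
// Hypothetical helper for illustration; not part of the @jtml/core API.
// Assumption: timestamps are recognized as ISO 8601 date-time strings.
const ISO_TIMESTAMP = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;

function markerFor(value: unknown): string {
  if (value === null || value === undefined) return 'n';
  if (typeof value === 'boolean') return 'b';
  if (typeof value === 'number') return Number.isInteger(value) ? 'i' : 'f';
  if (typeof value === 'string') return ISO_TIMESTAMP.test(value) ? 't' : 's';
  if (Array.isArray(value)) return 'a';
  return 'o';
}

const sampleValues: unknown[] = [
  42, 3.14, 'hello', true, '2026-03-29T10:30:00Z', null, [1, 2, 3], { key: 'value' },
];
const sampleMarkers = sampleValues.map((v) => markerFor(v)).join(' ');
console.log(sampleMarkers);
// prints "i f s b t n a o"
```

One limitation of inferring from a single value: a float that happens to be whole (such as `3.0`) is indistinguishable from an integer in JavaScript, so this sketch would label it `i`.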
220
+
221
+ ### Schema Definition
222
+
223
+ ```
224
+ @schema user_profile
225
+ id:i name:s email:s age:i? active:b tags:s[]
226
+
227
+ @data
228
+ @array
229
+ 1|Alice|alice@example.com|30|1|[admin,user]
230
+ 2|Bob|bob@example.com||0|[user]
231
+ ```
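
The field line in the example above can be read as a space-separated list of `name:type` tokens, with `?` marking an optional field and `[]` marking an array of the given type. The sketch below parses such a line under that reading. `parseFieldLine` and `SketchField` are hypothetical names, and combining both suffixes as `[]?` is an assumption the example does not confirm.

```typescript
// Hypothetical parser for illustration; the library's own schema parser may differ.
interface SketchField {
  name: string;
  type: string;      // one-letter marker, e.g. 'i' or 's'
  optional: boolean; // true when the token ends with '?'
  isArray: boolean;  // true when the type is followed by '[]'
}

function parseFieldLine(line: string): SketchField[] {
  return line.trim().split(/\s+/).map((token) => {
    const match = /^([^:\s]+):([a-z])(\[\])?(\?)?$/.exec(token);
    if (match === null) throw new Error(`Unrecognized field: ${token}`);
    const [, name = '', type = '', arraySuffix, optionalSuffix] = match;
    return {
      name,
      type,
      isArray: arraySuffix !== undefined,
      optional: optionalSuffix !== undefined,
    };
  });
}

const profileFields = parseFieldLine('id:i name:s email:s age:i? active:b tags:s[]');
console.log(profileFields.map((f) => f.name).join(','));
```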
232
+
233
+ ### Optional Fields
234
+
235
+ ```
236
+ @schema product
237
+ id:i name:s price:f description:s?
238
+
239
+ @data
240
+ @array
241
+ 101|Laptop|999.99|High-performance laptop
242
+ 102|Mouse|29.99|
243
+ ```
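
To show how a row like `102|Mouse|29.99|` might be turned back into typed values, here is a minimal sketch of positional decoding. `decodeRow` and `RowField` are hypothetical names, the sketch handles only the `i`, `f`, `s`, and `b` markers, it assumes the row contains no escaped pipes, and mapping an empty optional cell to `null` is an assumption rather than documented behavior of `decode`.

```typescript
// Illustrative only; not the decode() implementation from @jtml/core.
type SimpleMarker = 'i' | 'f' | 's' | 'b';

interface RowField {
  name: string;
  type: SimpleMarker;
  optional?: boolean;
}

function decodeRow(row: string, fields: RowField[]): Record<string, unknown> {
  const cells = row.split('|'); // assumes no escaped pipes in this row
  if (cells.length !== fields.length) {
    throw new Error(`Expected ${fields.length} cells, got ${cells.length}`);
  }
  const out: Record<string, unknown> = {};
  fields.forEach((field, i) => {
    const cell = cells[i] ?? '';
    if (cell === '' && field.optional === true) {
      out[field.name] = null; // assumption: empty optional cell means "no value"
      return;
    }
    switch (field.type) {
      case 'i': out[field.name] = parseInt(cell, 10); break;
      case 'f': out[field.name] = parseFloat(cell); break;
      case 'b': out[field.name] = cell === '1'; break;
      default: out[field.name] = cell;
    }
  });
  return out;
}

const productFields: RowField[] = [
  { name: 'id', type: 'i' },
  { name: 'name', type: 's' },
  { name: 'price', type: 'f' },
  { name: 'description', type: 's', optional: true },
];
const mouse = decodeRow('102|Mouse|29.99|', productFields);
console.log(mouse);
```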
244
+
245
+ ### Schema Reuse
246
+
247
+ ```
248
+ @ref user_profile
249
+
250
+ @data
251
+ @array
252
+ 3|Charlie|charlie@example.com|35|1|[moderator]
253
+ ```
254
+
255
+ ### Special Characters
256
+
257
+ | Character | Escaped as |
258
+ |-----------|------------|
259
+ | `\|` | `\\|` |
260
+ | newline   | `\n`       |
261
+ | `\`       | `\\`       |
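
A minimal sketch of these escaping rules follows, reading the table as: a pipe becomes backslash-pipe, a newline becomes backslash-`n`, and a backslash becomes a double backslash. That reading is an assumption, and `escapeCell`, `unescapeCell`, and `splitRow` are hypothetical helpers rather than exports of `@jtml/core`.

```typescript
// Hypothetical helpers for illustration; not part of the @jtml/core API.
function escapeCell(value: string): string {
  return value
    .replace(/\\/g, '\\\\') // backslashes first, so later escapes are not doubled
    .replace(/\|/g, '\\|')
    .replace(/\n/g, '\\n');
}

function unescapeCell(cell: string): string {
  // Any backslash consumes the next character; "\n" maps back to a newline.
  return cell.replace(/\\(.)/g, (_match, ch: string) => (ch === 'n' ? '\n' : ch));
}

// Split a row on unescaped pipes only, keeping escape pairs intact.
function splitRow(row: string): string[] {
  const cells: string[] = [];
  let current = '';
  for (let i = 0; i < row.length; i++) {
    const ch = row.charAt(i);
    if (ch === '\\' && i + 1 < row.length) {
      current += ch + row.charAt(i + 1);
      i++; // skip the escaped character
    } else if (ch === '|') {
      cells.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  cells.push(current);
  return cells;
}

const rawCell = 'a|b\nc\\d';
const escapedCell = escapeCell(rawCell);
const roundTrip = splitRow(`${escapedCell}|x`).map(unescapeCell);
console.log(roundTrip.length, roundTrip[0] === rawCell);
```

Escaping the backslash before the other characters matters: doing it last would double the backslashes that the pipe and newline rules had just introduced.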
262
+
263
+ ---
264
+
265
+ ## CLI
266
+
267
+ ```bash
268
+ # Install globally
269
+ npm install -g @jtml/core
270
+
271
+ # Encode JSON to JTML
272
+ jtml encode input.json output.jtml
273
+
274
+ # Decode JTML to JSON
275
+ jtml decode input.jtml output.json
276
+
277
+ # Compare token efficiency
278
+ jtml compare dataset.json
279
+
280
+ # Generate schema only
281
+ jtml schema api-response.json
282
+
283
+ # Validate JTML format
284
+ jtml validate data.jtml
285
+ ```
286
+
287
+ **Options:**
288
+
289
+ ```bash
290
+ jtml encode data.json --schema-id "products_v1"
291
+ jtml encode data.json --no-schema
292
+ jtml compare data.json --tokenizer gpt
293
+ ```
294
+
295
+ ---
296
+
297
+ ## Benchmarks
298
+
299
+ Measured on real structured datasets. Token counts are estimates based on approximate tokenization.
300
+
301
+ | Dataset | JSON tokens | JTML tokens | Savings |
302
+ |----------------------|-------------|-------------|---------|
303
+ | User array (10) | 287 | 109 | 62% |
304
+ | Product catalog (50) | 1,853 | 638 | 66% |
305
+ | Paginated API (25) | 967 | 337 | 65% |
306
+ | Large dataset (1000) | 33,064 | 15,953 | 52% |
307
+
308
+ Average savings: **~61%**
309
+
310
+ ---
311
+
312
+ ## Cost Savings
313
+
314
+ At **$3 per million input tokens** (approximate Claude Sonnet pricing):
315
+
316
+ | Requests/day | JSON cost/day | JTML cost/day | Daily savings |
317
+ |--------------|---------------|---------------|---------------|
318
+ | 1,000 | $0.86 | $0.33 | $0.53 |
319
+ | 10,000       | $8.61         | $3.27         | $5.34         |
320
+ | 100,000      | $86.10        | $32.70        | $53.40        |
321
+
322
+
323
+ *Assumes 287-token average JSON payload per request. Actual savings vary by dataset.*
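
The daily figures come from multiplying requests per day by tokens per request and by the per-token price. The sketch below works through the 1,000 requests/day row, using the 287-token JSON payload and 109-token JTML payload from the benchmarks. `dailyCostUsd` is a hypothetical helper, not the library's `estimateCostSavings`.

```typescript
// Hypothetical helper for illustration; not the library's estimateCostSavings().
function dailyCostUsd(
  requestsPerDay: number,
  tokensPerRequest: number,
  usdPerMillionTokens: number
): number {
  return (requestsPerDay * tokensPerRequest * usdPerMillionTokens) / 1000000;
}

const jsonDaily = dailyCostUsd(1000, 287, 3); // 287,000 tokens per day
const jtmlDaily = dailyCostUsd(1000, 109, 3); // 109,000 tokens per day
const savedDaily = jsonDaily - jtmlDaily;

console.log(
  `JSON $${jsonDaily.toFixed(2)}, JTML $${jtmlDaily.toFixed(2)}, saved $${savedDaily.toFixed(2)}`
);
// prints "JSON $0.86, JTML $0.33, saved $0.53"
```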
324
+
325
+ ---
326
+
327
+ ## TypeScript Support
328
+
329
+ All exports are fully typed. Import types directly:
330
+
331
+ ```typescript
332
+ import type {
333
+ JTMLSchema,
334
+ JTMLField,
335
+ JTMLTypeInfo,
336
+ JTMLType,
337
+ JTMLEncodeOptions,
338
+ JTMLDecodeOptions,
339
+ TokenStats,
340
+ CostSavings
341
+ } from '@jtml/core';
342
+ ```
343
+
344
+ ---
345
+
346
+ ## Error Handling
347
+
348
+ ```typescript
349
+ import { decode, JTMLError } from '@jtml/core';
350
+
351
+ try {
352
+ const data = decode(jtmlString);
353
+ } catch (error) {
354
+ if (error instanceof JTMLError) {
355
+ // error.code: SCHEMA_NOT_FOUND | SCHEMA_PARSE_ERROR | SCHEMA_MISMATCH
356
+ // SCHEMA_REQUIRED | INVALID_DATA | INVALID_VALUE
357
+ console.error(`${error.code}: ${error.message}`);
358
+ }
359
+ }
360
+ ```
361
+
362
+ ---
363
+
364
+ ## Contributing
365
+
366
+ See [CONTRIBUTING.md](CONTRIBUTING.md).
367
+
368
+ ## License
369
+
370
+ MIT — see [LICENSE](LICENSE)