papagaio 0.7.9 → 0.32.9

package/README.md CHANGED
@@ -1,585 +1,325 @@
1
1
  # Papagaio
2
- Minimal yet powerful text preprocessor.
3
2
 
4
- - **It's portable!** papagaio requires only ES6 and nothing else.
5
- - **It's small!** papagaio is around ~250 lines and ~10kb.
6
- - **It's easy!** papagaio doesn't have any complicated stuff: 1 class and 1 method for doing everything!
7
- - **It's flexible!** Do papagaio's sigil and delimiters conflict with whatever you want to process? Then simply change them! papagaio allows you to modify ANY of its keywords and symbols.
8
- - **It's powerful!!** Although inspired by the m4 preprocessor and meant to be a preprocessor, papagaio is still a fully featured programming language because it can evaluate any valid JavaScript code using `$eval`.
3
+ Papagaio is a C-first, embeddable text processing engine. It is designed to be highly modular and **script-agnostic**, allowing core pattern matching to be used alone or extended via WebAssembly (Wasm) plugins.
9
4
 
10
- ## Installation
11
- ```javascript
12
- import { Papagaio } from './papagaio.js';
13
- const papagaio = new Papagaio();
14
- const result = papagaio.process(input);
15
- ```
16
-
17
- ## Configuration
18
- ```javascript
19
- papagaio.symbols = {
20
- pattern: "pattern", // pattern keyword
21
- open: "{", // opening delimiter (multi-char supported)
22
- close: "}", // closing delimiter (multi-char supported)
23
- sigil: "$", // variable marker
24
- eval: "eval", // eval keyword
25
- block: "recursive", // block keyword (recursive nesting)
26
- regex: "regex", // regex keyword
27
- blockseq: "sequential" // blockseq keyword (sequential blocks)
28
- };
29
- ```
5
+ ## Key Features
30
6
 
31
- ---
7
+ - **Lightweight Core**: Efficient C engine for pattern matching and transformation.
8
+ - **Pattern-Matching**: Powerful capture system with built-in and custom modifiers.
9
+ - **WebAssembly Plugins**: Highly secure, zero-dependency plugin architecture via an embedded `wasm3` runtime.
10
+ - **Configurable Delimiters**: Redefine sigils, delimiters, and markers at runtime.
11
+ - **Language Bindings**: Native usage in C and Node.js/WebAssembly.
32
12
 
33
- ## Core Concepts
13
+ ## Quick Start
34
14
 
35
- ### 1. Simple Variables
36
- ```
37
- $pattern {$x} {$x}
38
- hello
39
- ```
40
- Output: `hello`
41
-
42
- ### 2. Multiple Variables
43
- ```
44
- $pattern {$x $y $z} {$z, $y, $x}
45
- apple banana cherry
15
+ ### Command Line Interface (CLI)
16
+ ```sh
17
+ # Process with patterns defined in the file or passed via -p
18
+ papagaio -e '$pattern {hello $w} {Hi $w}' input.txt
46
19
  ```
47
- Output: `cherry, banana, apple`
48
20
 
49
- ---
50
-
51
- ## Variables
52
-
53
- Papagaio provides flexible variable capture with automatic context-aware behavior.
54
-
55
- ### `$x` - Smart Variable
56
- Automatically adapts based on context:
57
- - **Before a block**: Captures everything until the block's opening delimiter
58
- - **Before a literal**: Captures everything until that literal appears
59
- - **Otherwise**: Captures a single word (non-whitespace token)
60
-
61
- ```
62
- $pattern {$x} {[$x]}
63
- hello world
64
- ```
65
- Output: `[hello] [world]`
21
+ ### C API
22
+ ```c
23
+ Papagaio *ctx = papagaio_open();
66
24
 
25
+ char *out = papagaio_process_text(ctx, input_text, strlen(input_text));
26
+ printf("%s", out);
27
+ free(out);
28
+ papagaio_close(ctx);
67
29
  ```
68
- $pattern {$name ${(}{)}content} {$name: $content}
69
- greeting (hello world)
70
- ```
71
- Output: `greeting: hello world`
72
30
 
73
- ```
74
- $pattern {$prefix:$suffix} {$suffix-$prefix}
75
- key:value
76
- ```
77
- Output: `value-key`
78
-
79
- ### `$x?` - Optional Variable
80
- Same behavior as `$x`, but won't fail if empty or not found.
31
+ ### JavaScript / WASM (Node.js)
32
+ ```javascript
33
+ import Papagaio from './papagaio.js';
81
34
 
82
- ```
83
- $pattern {$x? world} {<$x>}
84
- world
85
- ```
86
- Output: `<>`
35
+ const p = new Papagaio();
36
+ await p.init();
37
+ p.registerCommand("mycmd", (name, ...args) => `Result: ${args[0]}`);
87
38
 
39
+ console.log(p.process('$mycmd{Hello}')); // Output: Result: Hello
88
40
  ```
89
- $pattern {$greeting? $name} {Hello $name$greeting}
90
- Hi John
91
- ```
92
- Output: `Hello JohnHi`
93
41
 
94
42
  ---
95
43
 
96
- ## Regex Matching
97
-
98
- Capture content using JavaScript regular expressions.
99
-
100
- ### Syntax
101
- ```
102
- $regex varName {pattern}
103
- ```
104
-
105
- ### Basic Example
106
- ```
107
- $pattern {$regex num {[0-9]+}} {Number: $num}
108
- The answer is 42
109
- ```
110
- Output: `Number: 42`
111
-
112
- ### Complex Patterns
113
- ```
114
- $pattern {$regex email {\w+@\w+\.\w+}} {Email found: $email}
115
- Contact: user@example.com
44
+ ## Pattern Syntax
45
+
46
+ Patterns are composed of whitespace-separated tokens. The engine uses a "flex-matching" strategy that automatically skips horizontal whitespace between variables.
47
+
48
+ - **Literal**: Matches exact text.
49
+ - **Variable**: `$name` (captures a sequence up to the next pattern match).
50
+ - **Optional**: `$name?` or `literal?` (marker is configurable, e.g., `MAYBE`, via `$changesymbols`).
51
+ - **Escaping**: Use `$$` to match a literal `$`.
52
+
53
+ ### Modifiers
54
+ Modifiers specify the data type or constraints of a match:
55
+ - **Numbers**: `$var$int`, `$var$float`, `$var$number`
56
+ - **Casing**: `$var$upper`, `$var$lower`, `$var$capitalized`
57
+ - **Formats**: `$var$word`, `$var$identifier`, `$var$hex`, `$var$path`, `$var$binary`, `$var$percent`
58
+ - **Regex**: `$id$regex{[0-9]+}`
59
+ - **Block**: `$item$block{[}{]}` (captures everything between delimiters)
60
+ - **Aliases**: `$kind$aliases{cat}{dog}{bird}` (multi-block syntax).
61
+ - **Substrings**: `$var$starts{foo}`, `$var$ends{bar}`, `$var$prefix{p}`, `$var$suffix{s}`, `$var$infix{i}`, `$var$includes{x}`
62
+ - **Grouping**: `$item$group{subpattern}` (recursive grouping, matches as one unit)
63
+ - **Optionality**: any token (literal, variable, or group) can be made optional by adding `?` (or a custom marker like `MAYBE`).
64
+ - **Trailing Sigil (whitespace collapse)**: appending a bare `$` (or the current sigil) directly after any variable or literal causes the matcher to **consume all following whitespace** in the input — making the adjacent `TOK_WS` optional. This is useful when the number of spaces between tokens is variable:
65
+ ```text
66
+ $pattern {$a$ $b} {$a/$b}
67
+ hello world → hello/world
68
+ ```
69
+ The trailing `$` after `$a` collapses any run of spaces/tabs/newlines between `$a` and `$b`.
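
The `$block` modifier listed above captures everything between a pair of delimiters. A minimal C sketch of that kind of scan follows. It is not Papagaio's implementation: `capture_block` is a hypothetical helper, it handles single-character delimiters only, and it assumes that nested delimiters are balanced by depth, which the list above does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: scans a block that starts at s[0] == open and ends at
 * the matching close, tracking nesting depth. On success, *content points at
 * the first character after the opening delimiter and the return value is the
 * content length. Returns -1 if s does not start with open or the block is
 * never closed. Note: if open == close the depth never returns to zero. */
static long capture_block(const char *s, char open, char close,
                          const char **content)
{
    assert(s != NULL && content != NULL);
    if (s[0] != open)
        return -1;

    int depth = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == open) {
            depth++;
        } else if (s[i] == close && --depth == 0) {
            *content = s + 1;
            return (long)(i - 1);
        }
    }
    return -1; /* unbalanced input */
}
```

Under these assumptions, scanning `[a [b] c] rest` with `[` and `]` yields the 7-character content `a [b] c`, and the unbalanced input `[a` yields `-1`.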
70
+
71
+ ### Braced Variables
72
+
73
+ When a captured variable name needs to be immediately followed by literal text (e.g., a suffix), wrap the name in `${...}` to prevent ambiguity:
74
+
75
+ ```text
76
+ $pattern {$id$word} {${id}x}
77
+ foo
78
+ ```
79
+ *Output: `foox`* — without braces, `$idx` would be parsed as a single variable named `idx`.
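
The ambiguity described above comes from reading variable names greedily. The C sketch below illustrates the idea with a hypothetical helper, `read_var_ref`, that is not part of Papagaio's API. It assumes names consist of letters, digits, and underscores, and that the sigil and braces are the defaults (`$`, `{`, `}`).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: reads a variable reference that starts at s[0] == '$'.
 * Accepts both $name and ${name}. Copies the name into `name` (capacity `cap`)
 * and returns the number of characters consumed, or 0 if s does not start
 * with a well-formed reference. */
static size_t read_var_ref(const char *s, char *name, size_t cap)
{
    assert(s != NULL && name != NULL && cap > 0);
    name[0] = '\0';
    if (s[0] != '$')
        return 0;

    size_t i = 1, n = 0;
    int braced = (s[i] == '{');
    if (braced)
        i++;

    /* Greedy read: every name character is consumed. */
    while ((isalnum((unsigned char)s[i]) || s[i] == '_') && n + 1 < cap)
        name[n++] = s[i++];
    name[n] = '\0';

    if (n == 0)
        return 0;
    if (braced) {
        if (s[i] != '}') {
            name[0] = '\0';
            return 0;
        }
        i++; /* consume the closing brace */
    }
    return i;
}
```

With this reading, `$idx` yields the name `idx` (4 characters consumed), while `${id}x` yields the name `id` (5 characters consumed) and leaves the literal `x` for the output.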
80
+
81
+ Braced syntax can be used in any replacement string:
82
+ ```text
83
+ $pattern {$first $last} {Hello, ${first}!
84
+ }
85
+ John Doe
116
86
  ```
117
- Output: `Email found: user@example.com`
87
+ *Output: `Hello, John!`*
118
88
 
119
- ### Multiple Regex Variables
89
+ ### Nesting
90
+ Modifiers support full recursive nesting:
91
+ ```text
92
+ $pattern {$n$aliases{$x$int}{abc}} {VALUE: $n}
120
93
  ```
121
- $pattern {$regex year {[0-9]{4}}-$regex month {[0-9]{2}}} {Month $month in $year}
122
- 2024-03
123
- ```
124
- Output: `Month 03 in 2024`
125
94
 
126
95
  ---
127
96
 
128
- ## Blocks
129
-
130
- Papagaio supports two types of block capture: **nested** and **adjacent**.
131
-
132
- ### Nested Blocks - `${open}{close}varName`
133
-
134
- Captures content between delimiters with full nesting support. Nested delimiters are handled recursively.
135
-
136
- #### Basic Syntax
137
- ```
138
- ${opening_delimiter}{closing_delimiter}varName
139
- ```
140
-
141
- #### Examples
142
-
143
- **Basic Recursive Block:**
144
- ```
145
- $pattern {$name ${(}{)}content} {[$content]}
146
- data (hello world)
147
- ```
148
- Output: `[hello world]`
149
-
150
- **Custom Delimiters:**
151
- ```
152
- $pattern {${<<}{>>}data} {DATA: $data}
153
- <<json stuff>>
154
- ```
155
- Output: `DATA: json stuff`
156
-
157
- **Multi-Character Delimiters:**
158
- ```
159
- $pattern {${```}{```}code} {<pre>$code</pre>}
160
- ```markdown
161
- # Title
162
- ```
163
- Output: `<pre># Title</pre>`
164
-
165
- **Default Delimiters (empty blocks):**
166
- ```
167
- $pattern {${}{}data} {[$data]}
168
- {hello world}
169
- ```
170
- Output: `[hello world]`
171
- *(Uses default `{` and `}` when delimiters are empty)*
97
+ ## Extensibility (Wasm Plugins)
98
+
99
+ Papagaio follows a Wasm-first plugin architecture. Core features are limited to pattern matching and transformation, while custom text processing capabilities are provided by WebAssembly plugins.
100
+
101
+ ### Built-in Operators
102
+ - **`$document`**: Injects the current state of the document (alias for `$document$current`).
103
+ - **`$document$original`**: Injects the initial, unprocessed input text. Useful for referencing the source even after multiple transformations.
104
+ - **`$document$current`**: Injects the current state of the document during the pre-processing pass.
105
+ - **`$wasm{path}`**: Loads a WebAssembly plugin from the file system (CLI only).
106
+ - **`$file{path}`**: Injects the content of a file from the file system (CLI only).
107
+ - **`$wat{source}`**: Compiles a WebAssembly Text Format (WAT) source string inline and registers all exported `papagaio_*` functions as commands. Useful for embedding lightweight plugins without an external `.wasm` file.
108
+ - **`$NAME$from{value}`**: Dynamically assigns a processed `value` to `$NAME`. The assignment itself is suppressed from the output, and the variable becomes available for exact-match replacement in the remaining document.
109
+
110
+ ```text
111
+ $NAME$from{Alice}
112
+ Hello, $NAME!
113
+ ```
114
+ *Output: `Hello, Alice!`*
115
+
116
+ ```text
117
+ $wat{
118
+ (module
119
+ (func (export "papagaio_hello") (result i32)
120
+ i32.const 42))
121
+ }
122
+ $hello
123
+ ```
172
124
 
173
- **Nested Blocks:**
174
- ```
175
- $pattern {${(}{)}outer} {[$outer]}
176
- (outer (inner (deep)))
177
- ```
178
- Output: `[outer (inner (deep))]`
125
+ ---
179
126
 
180
- ### Adjacent Blocks - `$${open}{close}varName`
127
+ ## CLI Argument Expansion
181
128
 
182
- Captures multiple adjacent blocks with the same delimiters and concatenates their content (separated by spaces).
129
+ Papagaio can resolve command-line arguments directly within your source files. This is useful for passing configuration, flags, or metadata into the processing pipeline.
183
130
 
184
- #### Basic Syntax
185
- ```
186
- $${opening_delimiter}{closing_delimiter}varName
187
- ```
131
+ ### Positional Arguments
132
+ The `argv` array maps as follows (where `argv[0]` is the binary name, invisible to Papagaio):
188
133
 
189
- #### Examples
134
+ | Variable | Value |
135
+ |---|---|
136
+ | `$args$0` | `argv[1]` — the input file/script name |
137
+ | `$args$1`, `$args$2`, … | Subsequent positional arguments |
138
+ | `$args$count` | Total number of arguments (excludes the binary name, `argv[0]`) |
139
+ | `$args$all` | All extra arguments from index 1 onwards (after the script), joined with spaces |
190
140
 
191
- **Basic Adjacent Block:**
192
- ```
193
- $pattern {$${[}{]}items} {Items: $items}
194
- [first][second][third]
195
- ```
196
- Output: `Items: first second third`
141
+ If a `$args$NAME` variable is not found, it is emitted **literally** (e.g. `$args$missing` stays as-is).
197
142
 
198
- **Adjacent with Custom Delimiters:**
199
- ```
200
- $pattern {$${<}{>}tags} {Tags: $tags}
201
- <html><body><div>
202
- ```
203
- Output: `Tags: html body div`
143
+ ### Named Variables (Overrides)
144
+ Arguments in the format `key=value` are automatically parsed and can be accessed in two ways:
145
+ 1. **Explicit**: `$args$key`
146
+ 2. **Direct**: `$key` (shorthand for `$args$key`)
204
147
 
205
- **Default Delimiters:**
206
- ```
207
- $pattern {$${}{}data} {Result: $data}
208
- {a}{b}{c}
209
- ```
210
- Output: `Result: a b c`
148
+ Direct access (`$key`) will only resolve if `key` does not conflict with a registered command (like `$wasm`) or a built-in directive.
211
149
 
212
- **Mixed Usage:**
150
+ #### Example:
151
+ ```sh
152
+ ./papagaiocc input.c version=1.2.3 target=wasm -O3
213
153
  ```
214
- $pattern {${(}{)}nested, $${[}{]}seq} {Nested: $nested | Seq: $seq}
215
- (a (b c)), [x][y][z]
154
+ Inside `input.c`:
155
+ ```c
156
+ const char *v = "$version"; // "1.2.3"
157
+ const char *t = "$target"; // "wasm"
158
+ const char *f = "$args$1"; // "-O3"
216
159
  ```
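
The argument mapping described above can be sketched in C. This is an illustration only, not Papagaio's actual implementation: `ArgTable`, `split_args`, and `lookup_named` are hypothetical names. The rule that an argument containing `=` is treated as a named variable and excluded from positional numbering is an assumption inferred from the example, where `-O3` resolves as `$args$1`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 32

/* Hypothetical table holding the two kinds of arguments. */
typedef struct {
    const char *positional[MAX_ARGS]; /* $args$0, $args$1, ... */
    size_t positional_count;
    const char *keys[MAX_ARGS];       /* start of "key=value" in argv */
    size_t key_len[MAX_ARGS];         /* length of the key part */
    const char *values[MAX_ARGS];     /* text after '=' */
    size_t named_count;
} ArgTable;

/* Splits argv[1..] into named (key=value) and positional arguments.
 * argv[0], the binary name, is skipped. */
static void split_args(int argc, char **argv, ArgTable *t)
{
    assert(t != NULL);
    t->positional_count = 0;
    t->named_count = 0;

    for (int i = 1; i < argc; i++) {
        const char *eq = strchr(argv[i], '=');
        if (eq != NULL && eq != argv[i] && t->named_count < MAX_ARGS) {
            t->keys[t->named_count] = argv[i];
            t->key_len[t->named_count] = (size_t)(eq - argv[i]);
            t->values[t->named_count] = eq + 1;
            t->named_count++;
        } else if (t->positional_count < MAX_ARGS) {
            t->positional[t->positional_count++] = argv[i];
        }
    }
}

/* Returns the value for `key`, or NULL so the caller can emit the
 * reference literally, as described for missing $args$NAME variables. */
static const char *lookup_named(const ArgTable *t, const char *key)
{
    size_t n = strlen(key);
    for (size_t i = 0; i < t->named_count; i++) {
        if (t->key_len[i] == n && strncmp(t->keys[i], key, n) == 0)
            return t->values[i];
    }
    return NULL;
}
```

For the example command line, this sketch places `input.c` and `-O3` in `positional[0]` and `positional[1]`, and `lookup_named(&t, "version")` returns `1.2.3`.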
217
- Output: `Nested: a (b c) | Seq: x y z`
218
-
219
- ### Block Comparison
220
-
221
- | Feature | Nested `${}{}var` | Adjacent `$${}{}var` |
222
- |---------|---------------------|------------------------|
223
- | Purpose | Capture nested content | Capture adjacent blocks |
224
- | Input | `[a [b [c]]]` | `[a][b][c]` |
225
- | Output | `a [b [c]]` | `a b c` |
226
- | Nesting | Handled recursively | Not nested, sequential |
227
- | Spacing | Preserves internal structure | Joins with spaces |
228
160
 
229
161
  ---
230
162
 
231
- ## Pattern Scopes
163
+ ## Dynamic Customization
232
164
 
233
- Patterns defined within a replacement body create nested scopes with hierarchical inheritance.
165
+ You can redefine the engine's syntax symbols at runtime using the atomic **`$changesymbols`** directive.
234
166
 
235
- ### Basic Pattern
236
- ```
237
- $pattern {hello} {world}
238
- hello
239
- ```
240
- Output: `world`
167
+ ### `$changesymbols{sigil}{open}{close}{optional}`
168
+ Default: `$changesymbols{$}{{} }{}}{?}`
241
169
 
242
- **Key Properties:**
243
- * Patterns are scoped to their context
244
- * Child patterns inherit parent patterns
245
- * Patterns do not persist between `process()` calls
246
- * Perfect for hierarchical transformations
247
-
248
- ### Nested Patterns with Inheritance
249
- ```
250
- $pattern {outer $x} {
251
- $pattern {inner $y} {[$y from $x]}
252
- inner $x
253
- }
254
- outer hello
170
+ Example:
171
+ ```text
172
+ $changesymbols{@}{<}{>}{!} @pattern <@n!> <ID: @n> [x] [y]
255
173
  ```
256
- Output: `[hello from hello]`
257
-
258
- The inner pattern has access to `$x` from the outer pattern's capture.
259
-
260
- ### Deep Nesting
261
- ```
262
- $pattern {level1 $a} {
263
- $pattern {level2 $b} {
264
- $pattern {level3 $c} {$a > $b > $c}
265
- level3 $b
266
- }
267
- level2 $a
268
- }
269
- level1 ROOT
270
- ```
271
- Output: `ROOT > ROOT > ROOT`
272
-
273
- Each nested level inherits all patterns from parent scopes.
274
-
275
- ### Sibling Scopes Don't Share
276
- ```
277
- $pattern {branch1} {
278
- $pattern {x} {A}
279
- x
280
- }
281
- $pattern {branch2} {
282
- x
283
- }
284
- branch1
285
- branch2
286
- ```
287
- Output:
288
- ```
289
- A
290
- x
291
- ```
292
-
293
- Patterns in `branch1` are not available in `branch2` (they are siblings, not parent-child).
174
+ This changes the sigil to `@`, delimiters to `< >`, and the optional marker to `!`. Preprocessor directives (like `$changesymbols` itself) always use the stable `$` and `{}` to remain functional.
294
175
 
295
176
  ---
296
177
 
297
- ## Special Keywords
178
+ ## Recursive Priority System
298
179
 
299
- ### $eval
300
- Executes JavaScript code with access to the Papagaio instance.
180
+ Papagaio allows you to control the order of execution and side-effects (such as pattern definitions or WASM loading) using the **`$priority$N`** directive.
301
181
 
302
- ```
303
- $pattern {$x} {$eval{return parseInt($x)*2;}}
304
- 5
305
- ```
306
- Output: `10`
307
-
308
- **Accessing Papagaio Instance:**
309
- ```
310
- $pattern {info} {$eval{
311
- return `Content length: ${papagaio.content.length}`;
312
- }}
313
- info
314
- ```
182
+ - **`$priority$0{...}`**: Maximum priority.
183
+ - **`$priority$max{...}`**: Alias for `INT_MIN + 1` (Absolute highest priority).
184
+ - **`$priority$min{...}`**: Alias for `INT_MAX - 1` (Absolute lowest priority).
185
+ - **`$priority$1`, `$priority$2`, ...**: Successively lower priorities.
186
+ - **Recursive Evaluation**: Blocks with higher priority (a lower numeric value) are fully processed, including their own nested patterns and commands, before any lower-priority blocks, regardless of their physical position in the file.
187
+ - **Unspecified Priority**: Any text not wrapped in a `$priority` block is treated as priority `INT_MAX - 1` (processed last).
315
188
 
316
- **Multi-character delimiters:**
189
+ #### Example:
190
+ ```text
191
+ $priority$1{ Result: A }
192
+ $priority$max{ $pattern{A}{OK} }
317
193
  ```
318
- $pattern {$x} {$eval<<parseInt($x)*2>>}
319
- 5
320
- ```
321
- Output: `10`
194
+ *Output: `Result: OK`* — even though `A` is used before being defined in the source, the `$priority$max` block ensures the pattern definition happens first.
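
The ordering rule can be sketched in C as a sort over blocks. This is only an illustration of the rule as stated above, not the engine's code: the `Block` struct and `order_blocks` are hypothetical, and breaking ties by source position is an assumption. The two constants mirror the aliases given for `$priority$max` and `$priority$min`.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define PRIORITY_MAX (INT_MIN + 1) /* $priority$max: processed first */
#define PRIORITY_MIN (INT_MAX - 1) /* $priority$min and unwrapped text */

/* Hypothetical record for one block of the document. */
typedef struct {
    int priority;     /* lower value = processed earlier */
    int position;     /* order of appearance in the source */
    const char *text;
} Block;

/* Orders by priority value first, then by source position (assumed
 * tie-break, which also makes the qsort result deterministic). */
static int compare_blocks(const void *a, const void *b)
{
    const Block *x = (const Block *)a;
    const Block *y = (const Block *)b;
    if (x->priority != y->priority)
        return (x->priority < y->priority) ? -1 : 1;
    return (x->position > y->position) - (x->position < y->position);
}

/* Sorts blocks into processing order. */
static void order_blocks(Block *blocks, size_t count)
{
    assert(blocks != NULL || count == 0);
    if (count > 1)
        qsort(blocks, count, sizeof *blocks, compare_blocks);
}
```

Applied to the example, the `$priority$max` block that defines the pattern sorts ahead of the `$priority$1` block that uses it, even though it appears later in the source.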
322
195
 
323
196
  ---
324
197
 
325
- ## Important Rules
326
-
327
- ### Variable Matching
328
- * `$x` = smart capture (context-aware: word, until literal, or until block)
329
- * `$x?` = optional version of `$x` (won't fail if empty)
330
- * `$regex name {pattern}` = regex-based capture
331
- * Variables automatically skip leading whitespace
332
- * Trailing whitespace is trimmed when variables appear before literals
333
-
334
- ### Block Matching
335
- * `${open}{close}name` = nested block capture
336
- * `$${open}{close}name` = adjacent block capture (captures adjacent blocks)
337
- * Supports multi-character delimiters of any length
338
- * Empty delimiters `${}{}name` or `$${}{}name` use defaults from `symbols.open` and `symbols.close`
339
- * Sequential blocks are joined with spaces in the captured variable
198
+ ## Dynamic Variable Assignment ($from)
340
199
 
341
- ### Pattern Matching
342
- * `$pattern {match} {replace}` = pattern scoped to current context
343
- * Patterns inherit from parent scopes hierarchically
344
- * Each `process()` call starts with a clean slate (no persistence)
345
-
346
- ---
200
+ The **`$from`** operator allows you to capture processed content and assign it to a variable at runtime. This turns Papagaio into a stateful processor where variables can be defined, redefined, and chained.
347
201
 
348
- ## Multi-Character Delimiter Support
202
+ ### Syntax
203
+ `$NAME$from{...content...}`
349
204
 
350
- Papagaio fully supports multi-character delimiters throughout all features.
205
+ 1. **Recursive Processing**: The `content` is fully processed (patterns, WASM commands, other assignments) before being stored.
206
+ 2. **Immediate Registration**: The variable is registered as an exact-match rule as soon as it is parsed. This allows for **chained assignments**.
207
+ 3. **Output Suppression**: The entire `$from` directive is removed from the output text.
351
208
 
352
- ### Configuration
353
- ```javascript
354
- const p = new Papagaio('$', '<<<', '>>>');
355
- ```
209
+ ### Examples
356
210
 
357
- ### In Patterns
211
+ #### Chained Assignments
212
+ Variables can depend on previously defined variables within the same document:
213
+ ```text
214
+ $A$from{1}
215
+ $B$from{$A$A}
216
+ Count: $B
358
217
  ```
359
- $pattern<<<$x>>> <<<[$x]>>>
360
- hello
361
- ```
362
- Output: `[hello]`
218
+ *Output: `Count: 11`*
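
Chained assignment works because content is processed before it is stored. The C sketch below models that with a small variable table. It is a simplified illustration, not Papagaio's implementation: `VarTable`, `expand`, and `var_assign` are hypothetical, names are assumed to consist of letters, digits, and underscores, and only plain `$NAME` references are expanded (no patterns or commands).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_VARS 16
#define NAME_CAP 32
#define VALUE_CAP 128

typedef struct { char name[NAME_CAP]; char value[VALUE_CAP]; } Var;
typedef struct { Var vars[MAX_VARS]; size_t count; } VarTable;

/* Looks up a name given as pointer + length; returns NULL if undefined. */
static const char *var_get(const VarTable *t, const char *name, size_t len)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strlen(t->vars[i].name) == len &&
            strncmp(t->vars[i].name, name, len) == 0)
            return t->vars[i].value;
    }
    return NULL;
}

/* Copies src to dst, replacing defined $NAME references with their values.
 * Undefined references are copied literally. Output is truncated to cap. */
static void expand(const VarTable *t, const char *src, char *dst, size_t cap)
{
    assert(cap > 0);
    size_t out = 0;
    while (*src != '\0' && out + 1 < cap) {
        if (*src == '$') {
            size_t len = 0;
            while (isalnum((unsigned char)src[1 + len]) || src[1 + len] == '_')
                len++;
            const char *value = len ? var_get(t, src + 1, len) : NULL;
            if (value != NULL) {
                while (*value != '\0' && out + 1 < cap)
                    dst[out++] = *value++;
                src += 1 + len;
                continue;
            }
        }
        dst[out++] = *src++;
    }
    dst[out] = '\0';
}

/* Models $NAME$from{content}: expand the content first, then store it. */
static void var_assign(VarTable *t, const char *name, const char *content)
{
    assert(strlen(name) < NAME_CAP);
    char expanded[VALUE_CAP];
    expand(t, content, expanded, sizeof expanded);

    Var *slot = NULL;
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->vars[i].name, name) == 0)
            slot = &t->vars[i];
    }
    if (slot == NULL) {
        assert(t->count < MAX_VARS);
        slot = &t->vars[t->count++];
        strcpy(slot->name, name);
    }
    strcpy(slot->value, expanded);
}
```

Assigning `A` from `1` and then `B` from `$A$A` stores `11` in `B`, so expanding `Count: $B` gives `Count: 11`, matching the example.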
363
219
 
364
- ### In Blocks
365
- ```
366
- $pattern<<<${<<}{>>}data>>> <<<$data>>>
367
- <<content>>
220
+ #### Nested Assignments
221
+ You can define internal variables while defining a larger block:
222
+ ```text
223
+ $GREET$from{
224
+ $USER$from{Alice}
225
+ Hello, $USER!
226
+ }
227
+ $GREET
368
228
  ```
369
- Output: `content`
229
+ *Output: `Hello, Alice!`*
370
230
 
371
- ### In Eval
372
- ```
373
- $pattern<<<$x>>> <<<$eval<<<return $x + 1>>>>>>
374
- 5
231
+ #### Interaction with Patterns
232
+ Assignments can be used to dynamically generate pattern keys or replacements:
233
+ ```text
234
+ $KEY$from{FOO}
235
+ $pattern{$KEY}{BAR}
236
+ Result: FOO
375
237
  ```
376
- Output: `6`
238
+ *Output: `Result: BAR`*
377
239
 
378
240
  ---
379
241
 
380
- ## Advanced Examples
242
+ ## Plugin Development
381
243
 
382
- ### Markdown-like Processor
383
- ```javascript
384
- const p = new Papagaio();
385
- const template = `
386
- $pattern {# $title} {<h1>$title</h1>}
387
- $pattern {## $title} {<h2>$title</h2>}
388
- $pattern {**$text**} {<strong>$text</strong>}
389
-
390
- # Hello World
391
- ## Subtitle
392
- **bold text**
393
- `;
394
-
395
- p.process(template);
396
- // Output:
397
- // <h1>Hello World</h1>
398
- // <h2>Subtitle</h2>
399
- // <strong>bold text</strong>
400
- ```
244
+ Papagaio features a modern, frictionless Wasm plugin system. With the **`papagaiocc`** standalone compiler, you can write plugins in standard C using simple naming conventions and compile them into zero-dependency WebAssembly modules.
401
245
 
402
- ### Array/List Processor
403
- ```javascript
404
- const p = new Papagaio();
405
- const template = `
406
- $pattern {$${[}{]}items} {
407
- $eval{
408
- const arr = '$items'.split(' ');
409
- return arr.map((x, i) => \`\${i + 1}. \${x}\`).join('\\n');
410
- }
246
+ ### 1. Write your plugin
247
+ Create a file named `greet.c`:
248
+ ```c
249
+ // Functions starting with 'papagaio_' are automatically registered as commands
250
+ char* papagaio_greet(int argc, char **argv)
251
+ {
252
+ if (argc < 1) return "Hello, Stranger!";
253
+ return argv[0]; // return first argument
411
254
  }
412
-
413
- [apple][banana][cherry]
414
- `;
415
-
416
- p.process(template);
417
- // Output:
418
- // 1. apple
419
- // 2. banana
420
- // 3. cherry
421
255
  ```
422
256
 
423
- ### Template System with State
424
- ```javascript
425
- const p = new Papagaio();
426
- p.vars = {}; // Custom property for storing variables
427
-
428
- const template = `
429
- $pattern {var $name = $value} {$eval{
430
- papagaio.vars['$name'] = '$value';
431
- return '';
432
- }}
433
- $pattern {get $name} {$eval{
434
- return papagaio.vars['$name'] || 'undefined';
435
- }}
436
-
437
- var title = My Page
438
- var author = John Doe
439
- Title: get title
440
- Author: get author
441
- `;
442
-
443
- p.process(template);
444
- // Output:
445
- // Title: My Page
446
- // Author: John Doe
447
- ```
257
+ To use the Papagaio Wasm SDK (`lib.c`), copy it from `examples/lib.c` into your project and include it explicitly:
258
+ ```c
259
+ #include "lib.c"
448
260
 
449
- ### Conditional Processing
450
- ```javascript
451
- const p = new Papagaio();
452
- const template = `
453
- $pattern {if ${(}{)}cond then ${[}{]}yes else ${<}{>}no} {
454
- $eval{
455
- const condition = ($cond).trim();
456
- return condition === 'true' ? '$yes' : '$no';
457
- }
261
+ char* papagaio_greet(int argc, char **argv)
262
+ {
263
+ if (argc < 1) return "Hello, Stranger!";
264
+
265
+ // lib.c provides standard C functions like malloc and sprintf
266
+ char *res = (char*)malloc(strlen(argv[0]) + 16);
267
+ sprintf(res, "Hello, %s!", argv[0]);
268
+
269
+ return res;
458
270
  }
459
-
460
- if (true) then [yes branch] else <no branch>
461
- if (false) then [yes branch] else <no branch>
462
- `;
463
-
464
- p.process(template);
465
- // Output:
466
- // yes branch
467
- // no branch
468
271
  ```
469
272
 
470
- ### Function-like Patterns
471
- ```javascript
472
- const p = new Papagaio();
473
- const template = `
474
- $pattern {double $x} {$eval{return parseInt('$x') * 2}}
475
- $pattern {add $x $y} {$eval{return parseInt('$x') + parseInt('$y')}}
476
-
477
- double 5
478
- add 3 7
479
- add (double 4) 10
480
- `;
481
-
482
- p.process(template);
483
- // Output:
484
- // 10
485
- // 10
486
- // 18
487
- ```
488
-
489
- ### Sequential Block Processing
490
- ```javascript
491
- const p = new Papagaio();
492
- const template = `
493
- $pattern {sum $${[}{]}nums} {
494
- $eval{
495
- const numbers = '$nums'.split(' ').map(x => parseInt(x));
496
- return numbers.reduce((a, b) => a + b, 0);
497
- }
498
- }
499
-
500
- sum [10][20][30][40]
501
- `;
502
-
503
- p.process(template);
504
- // Output: 100
273
+ ### 2. Compile with `papagaiocc`
274
+ The `papagaiocc` tool is a self-contained compiler driver. Run it with your source file:
275
+ ```sh
276
+ ./papagaiocc greet.c
505
277
  ```
278
+ This generates `greet.wasm`.
506
279
 
507
- ---
508
-
509
- ## Troubleshooting
510
-
511
- | Problem | Solution |
512
- |---------|----------|
513
- | Variable not captured | Check context: use `$x?` for optional, or verify literals/blocks exist |
514
- | Block mismatch | Verify opening and closing delimiters match the declaration |
515
- | Infinite recursion | Pattern creates circular transformation; redesign pattern logic |
516
- | Pattern not matching | Verify whitespace between tokens, check if variable should be optional |
517
- | Pattern not available | Check scope hierarchy; patterns only inherit from parents, not siblings |
518
- | Nested blocks fail | Ensure delimiters are properly balanced |
519
- | Multi-char delimiters broken | Check delimiters don't conflict; use escaping if needed |
520
- | Regex not matching | Test regex pattern separately; ensure it matches at the exact position |
521
- | Empty delimiter behavior | `${}{}x` uses defaults; explicitly set if you need different behavior |
522
-
523
- ---
524
-
525
- ## Syntax Reference
526
-
280
+ If your plugin uses `lib.c`, pass the directory containing it via `-I`:
281
+ ```sh
282
+ ./papagaiocc greet.c -I /path/to/lib
527
283
  ```
528
- $pattern {$x $y} {$y, $x} # pattern with variables
529
- $pattern {$x? $y} {$y, $x} # optional variable
530
- $pattern {$regex n {[0-9]+}} {$n} # regex capture
531
- $pattern {${o}{c}n} {$n} # recursive block (nested)
532
- $pattern {$${o}{c}n} {$n} # sequential block (adjacent)
533
- $pattern {${}{}n} {$n} # block with default delimiters
534
- $eval{code} # JavaScript evaluation
284
+ Or simply place `lib.c` in the same directory as `greet.c`:
285
+ ```sh
286
+ # Copy the SDK alongside your source
287
+ cp examples/lib.c .
288
+ ./papagaiocc greet.c
535
289
  ```
536
290
 
537
- ---
538
-
539
- ## API Reference
540
-
541
- ### Constructor
542
- ```javascript
543
- new Papagaio(sigil, open, close, pattern, evalKw, blockKw, regexKw, blockseqKw)
291
+ ### 3. Use in Papagaio
292
+ Loading the Wasm file automatically registers all exported commands.
293
+ ```text
294
+ $wasm{greet.wasm}
295
+ $greet{Papagaio}
544
296
  ```
297
+ *Output: `Hello, Papagaio!`*
545
298
 
546
- **Parameters:**
547
- - `sigil` (default: `'$'`) - Variable prefix
548
- - `open` (default: `'{'`) - Opening delimiter
549
- - `close` (default: `'}'`) - Closing delimiter
550
- - `pattern` (default: `'pattern'`) - Pattern keyword
551
- - `evalKw` (default: `'eval'`) - Eval keyword
552
- - `regexKw` (default: `'regex'`) - Regex keyword
553
-
554
- ### Properties
555
- - `papagaio.content` - Last processed output
556
- - `papagaio.match` - Last matched substring (available in replacements)
557
- - `papagaio.symbols` - Configuration object
558
- - `papagaio.exit` - Optional hook function called after processing
559
-
560
- ### Methods
561
- - `papagaio.process(input)` - Process input text and return transformed output
562
-
563
- ### Exit Hook
564
- ```javascript
565
- const p = new Papagaio();
566
- p.exit = function() {
567
- console.log('Processing complete:', this.content);
568
- };
569
- p.process('$pattern {x} {y}\nx');
570
- ```
299
+ ### Wasm SDK (lib.c)
300
+ The Wasm SDK lives at `examples/lib.c` inside the repository. It is **not** automatically embedded into `papagaiocc` — you supply it to your plugin's build as needed. It provides a curated, zero-dependency C standard library for WebAssembly, including:
301
+ - **Memory Management**: `malloc`, `free`, `realloc`
302
+ - **String Processing**: `strlen`, `strcpy`, `sprintf`, `strrev`, etc.
303
+ - **Formatted I/O**: `printf`, `snprintf`, `sscanf`
304
+ - **Standard Math**: `sin`, `cos`, `pow`, etc.
571
305
 
572
306
  ---
573
307
 
574
- ## Performance Notes
308
+ ## Building
575
309
 
576
- * Multi-character delimiter matching is optimized with substring operations
577
- * Sequential blocks scan for adjacent matches without recursion overhead
578
- * Nested patterns inherit parent patterns through recursive application
579
- * Nested blocks and patterns have no theoretical depth limit
580
- * Large recursion limits can impact performance on complex inputs
581
- * Each `process()` call is independent with no persistent state between calls
310
+ ```sh
311
+ make # Core & CLI
312
+ make papagaiocc # Standalone plugin compiler
313
+ make wasm # WebAssembly build (Papagaio in the browser/node)
314
+ make test # Run comprehensive test suite
315
+ ```
582
316
 
583
- ---
317
+ ## References
584
318
 
585
- ***PAPAGAIO IS CURRENTLY IN HEAVY DEVELOPMENT AND EXPERIMENTATION PHASE***
319
+ - [cpp](https://en.wikipedia.org/wiki/C_preprocessor)
320
+ - [m4](https://www.gnu.org/software/m4/)
321
+ - [libregexp](https://bellard.org/quickjs/)
322
+ - [quickjs](https://bellard.org/quickjs/)
323
+ - [tcc](https://bellard.org/tcc/)
324
+ - [wasm3](https://github.com/wasm3/wasm3)
325
+ - [watr](https://github.com/dy/watr)