@chaprola/mcp-server 1.8.0 → 1.9.0

package/dist/index.js CHANGED
@@ -416,7 +416,21 @@ server.tool("chaprola_list", "List files in a project with optional wildcard pat
416
416
  return textResult(res);
417
417
  }));
418
418
  // --- Compile ---
419
- server.tool("chaprola_compile", "Compile Chaprola source (.CS) to bytecode (.PR). READ chaprola://cookbook BEFORE writing source. Key syntax: no PROGRAM keyword (start with commands), no commas, reports can use MOVE+PRINT 0 buffers or one-line PRINT concatenation, SEEK for primary records, OPEN/READ/WRITE/CLOSE for secondary files, LET supports one operation (no parentheses). Use primary_format to enable P.fieldname addressing (recommended) — the compiler resolves field names to positions and lengths from the format file. If compile fails, call chaprola_help before retrying.", {
419
+ server.tool("chaprola_compile", `Compile Chaprola source (.CS) to bytecode (.PR). READ chaprola://cookbook BEFORE writing source.
420
+
421
+ STYLE RULES (mandatory — project review enforces these):
422
+ 1. Use QUERY instead of SEEK loops for filtering or single-record lookup. SEEK loops only for processing every record unconditionally.
423
+ 2. Don't use MOVE + IF EQUAL for comparisons — use QUERY WHERE.
424
+ 3. Use implicit variable assignment (LET name = value) — don't use DEFINE VARIABLE.
425
+ 4. END/STOP only for early exit — not needed at end of program.
426
+ 5. OPEN PRIMARY not needed when using QUERY with primary_format.
427
+ 6. Use named read (READ name rec + name.field) instead of OPEN SECONDARY + S.field for QUERY results.
428
+ 7. Every program MUST have an intent file (.DS) — one paragraph: what the program does, parameters, output, who uses it.
429
+ 8. Add a comment header — first lines describe purpose and parameters.
430
+ 9. Use PRINT concatenation (PRINT "text" + P.field + R1), not MOVE + PRINT 0 buffers.
431
+ 10. Use RECORDNUMBERS for bulk delete: QUERY INTO name, then DELETE PRIMARY name.RECORDNUMBERS.
432
+
433
+ KEY SYNTAX: no PROGRAM keyword (start with commands), no commas, LET supports one operation (no parentheses), no built-in functions. Use primary_format to enable P.fieldname addressing — the compiler resolves field names to positions and lengths from the format file. If compile fails, call chaprola_help before retrying.`, {
420
434
  project: z.string().describe("Project name"),
421
435
  name: z.string().describe("Program name (without extension)"),
422
436
  source: z.string().describe("Chaprola source code"),
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@chaprola/mcp-server",
3
- "version": "1.8.0",
3
+ "version": "1.9.0",
4
4
  "description": "MCP server for Chaprola — agent-first data platform. Gives AI agents tools for structured data storage, record CRUD, querying, schema inspection, documentation lookup, web search, URL fetching, scheduled jobs, scoped site keys, and execution via plain HTTP.",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",
@@ -1,5 +1,18 @@
1
1
  # Chaprola Cookbook — Quick Reference
2
2
 
3
+ ## Style Rules (mandatory)
4
+
5
+ Every program must follow these rules. The project review system enforces them.
6
+
7
+ 1. **Use QUERY instead of SEEK loops** for filtering or single-record lookup. SEEK loops are only appropriate when processing every record unconditionally.
8
+ 2. **Don't use MOVE + IF EQUAL for comparisons** — use QUERY WHERE.
9
+ 3. **Use implicit variable assignment** (`LET name = value`) — don't use DEFINE VARIABLE.
10
+ 4. **END/STOP only for early exit** — not needed at the end of a program.
11
+ 5. **OPEN PRIMARY not needed** when using QUERY with primary_format on compile.
12
+ 6. **Use named read** (`READ name rec` + `name.field`) instead of OPEN SECONDARY + S.field for QUERY results.
13
+ 7. **Every program must have an intent file (.DS)** — one paragraph: what the program does, what parameters it accepts, what output it produces, who uses it. Create the intent with chaprola_compile or write it manually.
14
+ 8. **Add a comment header** — first line(s) should describe the program's purpose and parameters.
15
+
3
16
  ## Workflow: Import → Compile → Run
4
17
 
5
18
  ```bash
@@ -13,136 +26,192 @@ POST /compile {userid, project, name: "REPORT", source: "...", primary_format: "
13
26
  POST /run {userid, project, name: "REPORT", primary_file: "STAFF", record: 1}
14
27
  ```
15
28
 
16
- ## R-Variable Ranges
17
-
18
- | Range | Purpose | Safe for DEFINE VARIABLE? |
19
- |-------|---------|--------------------------|
20
- | R1–R20 | HULDRA elements (parameters) | No — HULDRA overwrites these |
21
- | R21–R40 | HULDRA objectives (error metrics) | No — HULDRA reads these |
22
- | R41–R99 | Scratch space | **Yes — always use R41–R99 for DEFINE VARIABLE** |
23
-
24
- For non-HULDRA programs, R1–R40 are technically available but using R41–R99 is a good habit.
25
-
26
- ## PRINT: Preferred Output Methods
29
+ ## Hello World (no data file)
27
30
 
28
- **Concatenation (preferred):**
29
31
  ```chaprola
30
- PRINT P.name + " " + P.department + " — $" + R41
31
- PRINT "Total: " + R42
32
- PRINT P.last_name // single field, auto-trimmed
33
- PRINT "Hello from Chaprola!" // literal string
32
+ // Hello World — minimal Chaprola program
33
+ PRINT "Hello from Chaprola!"
34
34
  ```
35
35
 
36
- - String literals are copied as-is.
37
- - P./S./U./X. fields are auto-trimmed (trailing spaces removed).
38
- - R-variables print as integers when no fractional part, otherwise as floats.
39
- - Concatenation auto-flushes the line.
36
+ No END needed — the program ends naturally.
40
37
 
41
- **U buffer output (for fixed-width columnar reports only):**
42
- ```chaprola
43
- CLEAR U
44
- MOVE P.name U.1 20
45
- PUT sal INTO U.22 10 D 2
46
- PRINT 0 // output entire U buffer, then clear
47
- ```
38
+ ## Loop Through All Records
48
39
 
49
- ## Hello World (no data file)
40
+ When processing every record unconditionally (no filter), SEEK loops are appropriate:
50
41
 
51
42
  ```chaprola
52
- PRINT "Hello from Chaprola!"
53
- END
43
+ // REPORT — List all staff with salaries
44
+ // Primary: STAFF
45
+
46
+ LET rec = 1
47
+ 100 SEEK rec
48
+ IF EOF END
49
+ PRINT P.name + " — " + P.salary
50
+ LET rec = rec + 1
51
+ GOTO 100
54
52
  ```
55
53
 
56
- ## Loop Through All Records
54
+ Compile with `primary_format: "STAFF"`. No OPEN PRIMARY needed — the compiler reads the format from primary_format.
57
55
 
58
- Always start programs with `OPEN PRIMARY` to declare the data file. This makes the program self-documenting and eliminates the need for `primary_format` on compile.
56
+ ## Filtered Report (QUERY)
57
+
58
+ When you want a subset of records, use QUERY — not a SEEK loop with IF conditions:
59
59
 
60
60
  ```chaprola
61
- OPEN PRIMARY "STAFF" 0
62
- DEFINE VARIABLE rec R41
61
+ // HIGH_EARNERS List staff earning over 80000
62
+ // Primary: STAFF
63
+
64
+ QUERY STAFF INTO earners WHERE salary GT 80000 ORDER BY salary DESC
65
+
63
66
  LET rec = 1
64
- 100 SEEK rec
65
- IF EOF GOTO 900
66
- PRINT P.name + " — " + P.salary
67
+ 100 READ earners rec
68
+ IF EOF END
69
+ PRINT earners.name + " — $" + earners.salary
67
70
  LET rec = rec + 1
68
71
  GOTO 100
69
- 900 END
70
72
  ```
71
73
 
72
- Compile without `primary_format` — the compiler reads the format from `OPEN PRIMARY`:
73
- ```bash
74
- POST /compile {userid, project, name: "REPORT", source: "OPEN PRIMARY \"STAFF\" 0\n..."}
74
+ ## Single-Record Lookup (QUERY)
75
+
76
+ For finding one record by key, use QUERY not a SEEK/MOVE/IF EQUAL loop:
77
+
78
+ ```chaprola
79
+ // DETAIL — Look up a single staff member by ID
80
+ // Parameter: staff_id
81
+ // Primary: STAFF
82
+
83
+ QUERY STAFF INTO person WHERE staff_id EQ PARAM.staff_id
84
+
85
+ LET rec = 1
86
+ READ person rec
87
+ IF EOF END
88
+ PRINT "Name: " + person.name
89
+ PRINT "Department: " + person.department
90
+ PRINT "Salary: $" + person.salary
75
91
  ```
76
92
 
77
- ## Filtered Report
93
+ ## DELETE with QUERY (RECORDNUMBERS)
94
+
95
+ Use QUERY to find records, then DELETE with RECORDNUMBERS for bulk deletion:
78
96
 
79
97
  ```chaprola
80
- GET sal FROM P.salary
81
- IF sal LT 80000 GOTO 200 // skip low earners
82
- PRINT P.name + " — " + R41
83
- 200 LET rec = rec + 1
98
+ // CLEANUP — Delete a poll and its votes
99
+ // Parameter: poll_id
100
+ // Primary: polls, Secondary: votes
101
+
102
+ QUERY polls INTO poll WHERE poll_id EQ PARAM.poll_id
103
+ DELETE PRIMARY poll.RECORDNUMBERS
104
+
105
+ OPEN "votes" WHERE poll_id EQ PARAM.poll_id
106
+ DELETE votes.RECORDNUMBERS
107
+ CLOSE
108
+
109
+ IF poll.RECORDCOUNT EQ 0 GOTO 900 ;
110
+ PRINT "STATUS|OK"
111
+ PRINT "VOTES_DELETED|" + votes.RECORDCOUNT
112
+ END
113
+
114
+ 900 PRINT "STATUS|NOT_FOUND"
84
115
  ```
85
116
 
117
+ - `poll.RECORDNUMBERS` returns all physical record positions matched by the QUERY
118
+ - `DELETE PRIMARY poll.RECORDNUMBERS` bulk-deletes them in one statement
119
+ - `votes.RECORDCOUNT` returns the filtered count from OPEN WHERE, not total file count
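The RECORDNUMBERS semantics above can be sketched outside Chaprola as plain list filtering; a minimal Python sketch, where the in-memory `records` list and its field values are made-up illustrations, not Chaprola API:

```python
# Sketch of RECORDNUMBERS semantics: a QUERY collects the 1-based
# physical positions of matching records; DELETE removes exactly those.
records = [
    {"poll_id": "p1"},  # position 1
    {"poll_id": "p2"},  # position 2
    {"poll_id": "p1"},  # position 3
]

# Like QUERY polls INTO poll WHERE poll_id EQ "p1" -> poll.RECORDNUMBERS
recordnumbers = [i for i, r in enumerate(records, start=1)
                 if r["poll_id"] == "p1"]

# Like DELETE PRIMARY poll.RECORDNUMBERS -> bulk delete in one statement
remaining = [r for i, r in enumerate(records, start=1)
             if i not in recordnumbers]
```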
120
+
86
121
  ## JOIN Two Files (FIND)
87
122
 
88
123
  ```chaprola
124
+ // ROSTER — Staff with department names
125
+ // Primary: EMPLOYEES, Secondary: DEPARTMENTS
126
+
89
127
  OPEN "DEPARTMENTS" 0
90
- FIND match FROM S.dept_code USING P.dept_code
91
- IF match EQ 0 GOTO 200 // no match
92
- READ match // load matched secondary record
93
- PRINT P.name + " " + S.dept_name
128
+ LET rec = 1
129
+ 100 SEEK rec
130
+ IF EOF END
131
+ FIND match FROM S.dept_code USING P.dept_code
132
+ IF match EQ 0 GOTO 200
133
+ READ match
134
+ PRINT P.name + " — " + S.dept_name
135
+ 200 LET rec = rec + 1
136
+ GOTO 100
94
137
  ```
95
138
 
96
- Compile with both formats so the compiler resolves fields from both files:
139
+ Compile with both formats:
97
140
  ```bash
98
141
  POST /compile {
99
- userid, project, name: "REPORT",
142
+ userid, project, name: "ROSTER",
100
143
  source: "...",
101
144
  primary_format: "EMPLOYEES",
102
145
  secondary_format: "DEPARTMENTS"
103
146
  }
104
147
  ```
105
148
 
106
- ## Comparing Two Memory Locations
107
-
108
- IF EQUAL compares a literal to a location. To compare two memory locations, copy both to U buffer:
149
+ ## PRINT: Preferred Output Methods
109
150
 
151
+ **Concatenation (preferred):**
110
152
  ```chaprola
111
- MOVE PARAM.poll_id U.200 12
112
- MOVE P.poll_id U.180 12
113
- IF EQUAL U.200 U.180 12 GOTO 200 // match jump to handler
153
+ PRINT P.name + " — " + P.department + " — $" + R41
154
+ PRINT "Total: " + R42
155
+ PRINT P.last_name // single field, auto-trimmed
156
+ PRINT "Hello from Chaprola!" // literal string
114
157
  ```
115
158
 
116
- ## Read-Modify-Write (UPDATE)
159
+ - String literals are copied as-is.
160
+ - P./S./U./X. fields are auto-trimmed (trailing spaces removed).
161
+ - R-variables print as integers when no fractional part, otherwise as floats.
162
+ - Concatenation auto-flushes the line.
117
163
 
164
+ **U buffer output (for fixed-width columnar reports only):**
118
165
  ```chaprola
119
- READ match // load record
120
- GET bal FROM S.balance // read current value
121
- LET bal = bal + amt // modify
122
- PUT bal INTO S.balance F 0 // write back to S memory (length auto-filled)
123
- WRITE match // flush to disk
124
- CLOSE // flush all at end
166
+ CLEAR U
167
+ MOVE P.name U.1 20
168
+ PUT sal INTO U.22 10 D 2
169
+ PRINT 0 // output entire U buffer, then clear
125
170
  ```
126
171
 
127
- ## Date Arithmetic
172
+ ## R-Variable Ranges
173
+
174
+ | Range | Purpose | Notes |
175
+ |-------|---------|-------|
176
+ | R1–R20 | HULDRA elements (parameters) | HULDRA overwrites these |
177
+ | R21–R40 | HULDRA objectives (error metrics) | HULDRA reads these |
178
+ | R41–R99 | Scratch space | Always safe |
179
+
180
+ For non-HULDRA programs, all R1–R99 are available. Use implicit assignment (`LET name = value`) — the compiler assigns R-variable slots automatically.
181
+
182
+ ## Read-Modify-Write (UPDATE)
128
183
 
129
184
  ```chaprola
130
- GET DATE R41 FROM X.primary_modified // when was file last changed?
131
- GET DATE R42 FROM X.utc_time // what time is it now?
132
- LET R43 = R42 - R41 // difference in seconds
133
- LET R43 = R43 / 86400 // convert to days
134
- IF R43 GT 30 PRINT "WARNING: file is over 30 days old" ;
185
+ // UPDATE_BALANCE — Add amount to account balance
186
+ // Primary: ACCOUNTS, Secondary: LEDGER
187
+
188
+ OPEN "LEDGER" 0
189
+ FIND match FROM S.account_id USING P.account_id
190
+ IF match EQ 0 END
191
+ READ match
192
+ GET bal FROM S.balance
193
+ LET bal = bal + amt
194
+ PUT bal INTO S.balance F 0
195
+ WRITE match
196
+ CLOSE
135
197
  ```
136
198
 
137
- ## Get Current User
199
+ ## Date Arithmetic
138
200
 
139
201
  ```chaprola
140
- PRINT "Logged in as: " + X.username
202
+ // CHECK_FRESHNESS — Warn if file is stale
203
+ // Primary: any
204
+
205
+ GET DATE R1 FROM X.primary_modified
206
+ GET DATE R2 FROM X.utc_time
207
+ LET days = R2 - R1
208
+ LET days = days / 86400
209
+ IF days GT 30 PRINT "WARNING: file is over 30 days old" ;
141
210
  ```
142
211
 
143
212
  ## System Text Properties (X.)
144
213
 
145
- Access system metadata by property name — no numeric positions needed:
214
+ Access system metadata by property name:
146
215
 
147
216
  | Property | Description |
148
217
  |----------|-------------|
@@ -157,6 +226,51 @@ Access system metadata by property name — no numeric positions needed:
157
226
  | `X.primary_modified` | Primary file Last-Modified |
158
227
  | `X.secondary_modified` | Secondary file Last-Modified |
159
228
 
229
+ ## Parameterized Reports (PARAM.name)
230
+
231
+ Programs accept named parameters from URL query strings:
232
+
233
+ ```chaprola
234
+ // STAFF_BY_DEPT — List staff in a department
235
+ // Parameter: dept
236
+ // Primary: STAFF
237
+
238
+ QUERY STAFF INTO team WHERE department EQ PARAM.dept ORDER BY salary DESC
239
+
240
+ LET rec = 1
241
+ 100 READ team rec
242
+ IF EOF END
243
+ PRINT team.name + " — " + team.title + " — $" + team.salary
244
+ LET rec = rec + 1
245
+ GOTO 100
246
+ ```
247
+
248
+ Publish with: `POST /publish {userid, project, name: "STAFF_BY_DEPT", primary_file: "STAFF", acl: "authenticated"}`
249
+ Call with: `GET /report?userid=X&project=Y&name=STAFF_BY_DEPT&dept=Engineering`
250
+ Discover params: `POST /report/params {userid, project, name}` — returns .PF schema
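The call pattern above is ordinary query-string construction; a Python sketch, where the host and the `userid`/`project` values are placeholders:

```python
from urllib.parse import urlencode

# Build the GET /report URL for a published parameterized report.
# Named params (here: dept) are passed as plain query-string keys.
base = "https://api.example.com/report"  # placeholder host
params = {
    "userid": "X",
    "project": "Y",
    "name": "STAFF_BY_DEPT",
    "dept": "Engineering",
}
url = base + "?" + urlencode(params)
```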
251
+
252
+ ## Cross-File Filtering (IN/NOT IN)
253
+
254
+ Use QUERY with NOT IN to find records in one file that don't appear in another:
255
+
256
+ ```chaprola
257
+ // UNREVIEWED — Find flashcards the user hasn't studied yet
258
+ // Parameters: username
259
+ // Primary: flashcards
260
+
261
+ QUERY progress INTO reviewed WHERE username EQ PARAM.username
262
+ QUERY flashcards INTO new_cards WHERE kanji NOT IN "reviewed.kanji"
263
+
264
+ LET rec = 1
265
+ 100 READ new_cards rec
266
+ IF EOF END
267
+ PRINT new_cards.kanji + " — " + new_cards.reading + " — " + new_cards.meaning
268
+ LET rec = rec + 1
269
+ GOTO 100
270
+ ```
271
+
272
+ If the IN-file doesn't exist (e.g., new user), NOT IN treats it as empty — all records pass.
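The NOT IN behavior reduces to a set difference; a Python sketch of the same filtering, with made-up sample rows:

```python
# Like QUERY progress INTO reviewed WHERE username EQ "alice"
progress = [{"username": "alice", "kanji": "A"},
            {"username": "bob", "kanji": "B"}]
reviewed_kanji = {r["kanji"] for r in progress if r["username"] == "alice"}

# Like QUERY flashcards INTO new_cards WHERE kanji NOT IN "reviewed.kanji"
flashcards = [{"kanji": "A"}, {"kanji": "B"}, {"kanji": "C"}]
new_cards = [c for c in flashcards if c["kanji"] not in reviewed_kanji]

# A missing IN-file behaves as an empty set: every record passes.
no_history = [c for c in flashcards if c["kanji"] not in set()]
```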
273
+
160
274
  ## Async for Large Datasets
161
275
 
162
276
  ```bash
@@ -169,77 +283,47 @@ POST /run/status {userid, project, job_id}
169
283
  # Response: {status: "done", output: "..."}
170
284
  ```
171
285
 
172
- ## Parameterized Reports (PARAM.name)
286
+ ## Public Apps: Use /report, Not /query
173
287
 
174
- Programs can accept named parameters from URL query strings. Use this for dynamic reports.
288
+ Site keys require BAA signing. For public-facing web apps, use published reports (`/report`) instead of `/query` for all read operations. `/report` is public — no auth or BAA needed.
289
+
290
+ ```javascript
291
+ // GOOD: public report — no auth needed
292
+ const url = `${API}/report?userid=myapp&project=data&name=RESULTS&poll_id=${id}`;
293
+ const response = await fetch(url);
294
+
295
+ // BAD: /query with site key — fails if BAA not signed
296
+ const response = await fetch(`${API}/query`, {
297
+ headers: { 'Authorization': `Bearer ${SITE_KEY}` },
298
+ body: JSON.stringify({ userid: 'myapp', project: 'data', file: 'votes', where: [...] })
299
+ });
300
+ ```
301
+
302
+ ## Chart Data with TABULATE
175
303
 
176
304
  ```chaprola
177
- // Report that accepts &deck=kanji&level=3 as URL params
178
- MOVE PARAM.deck U.1 20 // string param → U buffer
179
- LET lvl = PARAM.level // numeric param → R variable
180
- SEEK 1
181
- 100 IF EOF GOTO 900
182
- MOVE P.deck U.30 10
183
- IF EQUAL PARAM.deck U.30 GOTO 200 // filter by deck param
184
- GOTO 300
185
- 200 GET cardlvl FROM P.level
186
- IF cardlvl NE lvl GOTO 300 // filter by level param
187
- PRINT P.kanji + " — " + P.reading
188
- 300 LET rec = rec + 1
189
- SEEK rec
190
- GOTO 100
191
- 900 END
305
+ // TRENDS — Cross-tabulate mortality by cause and year
306
+ // Primary: mortality
307
+
308
+ TABULATE mortality SUM deaths FOR cause VS year WHERE year GE "2020" INTO trends
309
+
310
+ PRINT TABULATE trends AS CSV
192
311
  ```
193
312
 
194
- Publish with: `POST /publish {userid, project, name, primary_file, acl: "authenticated"}`
195
- Call with: `GET /report?userid=X&project=Y&name=Z&deck=kanji&level=3`
196
- Discover params: `POST /report/params {userid, project, name}` → returns .PF schema (field names, types, widths)
313
+ For web apps, use `PRINT TABULATE trends AS JSON`.
197
314
 
198
315
  ## Named Output Positions (U.name)
199
316
 
200
- Instead of `U.1`, `U.12`, etc., use named positions for readable code:
317
+ Instead of `U.1`, `U.12`, use named positions:
201
318
 
202
319
  ```chaprola
203
- // U.name positions are auto-allocated by the compiler
204
320
  MOVE P.name U.name 20
205
321
  MOVE P.dept U.dept 10
206
322
  PUT sal INTO U.salary 10 D 0
207
323
  PRINT 0
208
324
  ```
209
325
 
210
- ## GROUP BY with Pivot (via /query)
211
-
212
- Chaprola's pivot IS GROUP BY. Schema: `{row, column, value, aggregate}`.
213
-
214
- ```bash
215
- # SQL: SELECT department, AVG(salary) FROM staff GROUP BY department
216
- POST /query {
217
- userid, project, file: "STAFF",
218
- pivot: {
219
- row: "department",
220
- column: "",
221
- value: "salary",
222
- aggregate: "avg"
223
- }
224
- }
225
-
226
- # SQL: SELECT department, year, SUM(revenue) FROM sales GROUP BY department, year
227
- POST /query {
228
- userid, project, file: "SALES",
229
- pivot: {
230
- row: "department",
231
- column: "year",
232
- value: "revenue",
233
- aggregate: "sum"
234
- }
235
- }
236
- ```
237
-
238
- - `row` — grouping field
239
- - `column` — cross-tab field (use `""` for simple aggregation)
240
- - `value` — field to aggregate (string or array of strings)
241
- - `aggregate` — function: `count`, `sum`, `avg`, `min`, `max`, `stddev`
242
- - Row and column totals included automatically in response
326
+ Positions are auto-allocated by the compiler.
243
327
 
244
328
  ## PUT Format Codes
245
329
 
@@ -252,39 +336,17 @@ POST /query {
252
336
 
253
337
  Syntax: `PUT R41 INTO P.salary D 2` (R-var, field name, format, decimals — length auto-filled)
254
338
 
255
- ## Common Field Widths
256
-
257
- | Data type | Chars | Example |
258
- |-----------|-------|---------|
259
- | ISO datetime | 20 | `2026-03-28T14:30:00Z` |
260
- | UUID | 36 | `550e8400-e29b-41d4-a716-446655440000` |
261
- | Email | 50 | `user@example.com` |
262
- | Short ID | 8–12 | `poll_001` |
263
- | Dollar amount | 10 | `$1,234.56` |
264
- | Phone | 15 | `+1-555-123-4567` |
265
-
266
- Use these when sizing MOVE lengths and U buffer positions.
267
-
268
- ## Memory Regions
269
-
270
- | Prefix | Description |
271
- |--------|-------------|
272
- | `P` | Primary data file — use field names: `P.salary`, `P.name` |
273
- | `S` | Secondary data file — use field names: `S.dept`, `S.emp_id` |
274
- | `U` | User buffer (scratch for output) |
275
- | `X` | System text — use property names: `X.username`, `X.utc_time` |
276
-
277
339
  ## Math Intrinsics
278
340
 
279
341
  ```chaprola
280
342
  LET R42 = EXP R41 // e^R41
281
343
  LET R42 = LOG R41 // ln(R41)
282
- LET R42 = SQRT R41 // R41
344
+ LET R42 = SQRT R41 // sqrt(R41)
283
345
  LET R42 = ABS R41 // |R41|
284
346
  LET R43 = POW R41 R42 // R41^R42
285
347
  ```
286
348
 
287
- ## Import-Download: URL Dataset (Parquet, Excel, CSV, JSON)
349
+ ## Import-Download: URL to Dataset (Parquet, Excel, CSV, JSON)
288
350
 
289
351
  ```bash
290
352
  # Import Parquet from a cloud data lake
@@ -304,16 +366,57 @@ POST /import-download {
304
366
  ```
305
367
 
306
368
  Supports: CSV, TSV, JSON, NDJSON, Parquet (zstd/snappy/lz4), Excel (.xlsx/.xls).
307
- AI instructions are optional — omit to import all columns as-is.
308
369
  Lambda: 10 GB /tmp, 900s timeout, 500 MB download limit.
309
370
 
371
+ ## GROUP BY with Pivot (via /query)
372
+
373
+ Chaprola's pivot IS GROUP BY:
374
+
375
+ ```bash
376
+ # SQL: SELECT department, AVG(salary) FROM staff GROUP BY department
377
+ POST /query {
378
+ userid, project, file: "STAFF",
379
+ pivot: {
380
+ row: "department",
381
+ column: "",
382
+ value: "salary",
383
+ aggregate: "avg"
384
+ }
385
+ }
386
+ ```
387
+
388
+ - `row` — grouping field
389
+ - `column` — cross-tab field (use `""` for simple aggregation)
390
+ - `value` — field to aggregate
391
+ - `aggregate` — count, sum, avg, min, max, stddev
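The avg-by-department pivot corresponds to a plain group-by aggregation; a Python sketch of what the first `/query` example computes, with made-up sample rows:

```python
from collections import defaultdict

# Like: SELECT department, AVG(salary) FROM staff GROUP BY department
staff = [
    {"department": "ENG", "salary": 100},
    {"department": "ENG", "salary": 80},
    {"department": "OPS", "salary": 60},
]

totals = defaultdict(lambda: [0, 0])  # department -> [sum, count]
for row in staff:
    totals[row["department"]][0] += row["salary"]
    totals[row["department"]][1] += 1

avg_salary = {dept: s / n for dept, (s, n) in totals.items()}
```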
392
+
393
+ ## Common Field Widths
394
+
395
+ | Data type | Chars | Example |
396
+ |-----------|-------|---------|
397
+ | ISO datetime | 20 | `2026-03-28T14:30:00Z` |
398
+ | UUID | 36 | `550e8400-e29b-41d4-a716-446655440000` |
399
+ | Email | 50 | `user@example.com` |
400
+ | Short ID | 8–12 | `poll_001` |
401
+ | Dollar amount | 10 | `$1,234.56` |
402
+ | Phone | 15 | `+1-555-123-4567` |
403
+
404
+ ## Memory Regions
405
+
406
+ | Prefix | Description |
407
+ |--------|-------------|
408
+ | `P` | Primary data file — use field names: `P.salary`, `P.name` |
409
+ | `S` | Secondary data file — use field names: `S.dept`, `S.emp_id` |
410
+ | `U` | User buffer (scratch for output) |
411
+ | `X` | System text — use property names: `X.username`, `X.utc_time` |
412
+
310
413
  ## HULDRA Optimization — Nonlinear Parameter Fitting
311
414
 
312
- HULDRA finds the best parameter values for a mathematical model by minimizing the difference between model predictions and observed data. You propose a model, HULDRA finds the coefficients.
415
+ HULDRA finds the best parameter values for a mathematical model by minimizing the difference between model predictions and observed data.
313
416
 
314
417
  ### How It Works
315
418
 
316
- 1. You write a VALUE program (normal Chaprola) that reads data, computes predictions using R-variable parameters, and stores the error in an objective R-variable
419
+ 1. Write a VALUE program that reads data, computes predictions using R-variable parameters, and stores the error in an objective R-variable
317
420
  2. HULDRA repeatedly runs your program with different parameter values, using gradient descent to minimize the objective
318
421
  3. When the objective stops improving, HULDRA returns the optimal parameters
319
422
 
@@ -327,7 +430,7 @@ HULDRA finds the best parameter values for a mathematical model by minimizing th
327
430
 
328
431
  ### Complete Example: Fit a Linear Model
329
432
 
330
- **Goal:** Find `salary = a × years_exp + b` that best fits employee data.
433
+ **Goal:** Find `salary = a * years_exp + b` that best fits employee data.
331
434
 
332
435
  **Step 1: Import data**
333
436
  ```bash
@@ -344,33 +447,29 @@ POST /import {
344
447
  ```
345
448
 
346
449
  **Step 2: Write and compile the VALUE program**
450
+
451
+ Note: HULDRA VALUE programs are the one case where you MUST use R1–R20 directly (HULDRA sets elements) and R21–R40 directly (HULDRA reads objectives). Use implicit assignment for scratch variables R41+.
452
+
347
453
  ```chaprola
348
- // VALUE program: salary = R1 * years_exp + R2
454
+ // SALFIT — Linear salary model: salary = R1 * years_exp + R2
349
455
  // R1 = slope (per-year raise), R2 = base salary
350
456
  // R21 = sum of squared residuals (SSR)
457
+ // Primary: EMP
351
458
 
352
- DEFINE VARIABLE REC R41
353
- DEFINE VARIABLE YRS R42
354
- DEFINE VARIABLE SAL R43
355
- DEFINE VARIABLE PRED R44
356
- DEFINE VARIABLE RESID R45
357
- DEFINE VARIABLE SSR R46
358
-
359
- LET SSR = 0
360
- LET REC = 1
361
- 100 SEEK REC
459
+ LET ssr = 0
460
+ LET rec = 1
461
+ 100 SEEK rec
362
462
  IF EOF GOTO 200
363
- GET YRS FROM P.years_exp
364
- GET SAL FROM P.salary
365
- LET PRED = R1 * YRS
366
- LET PRED = PRED + R2
367
- LET RESID = PRED - SAL
368
- LET RESID = RESID * RESID
369
- LET SSR = SSR + RESID
370
- LET REC = REC + 1
463
+ GET yrs FROM P.years_exp
464
+ GET sal FROM P.salary
465
+ LET pred = R1 * yrs
466
+ LET pred = pred + R2
467
+ LET resid = pred - sal
468
+ LET resid = resid * resid
469
+ LET ssr = ssr + resid
470
+ LET rec = rec + 1
371
471
  GOTO 100
372
- 200 LET R21 = SSR
373
- END
472
+ 200 LET R21 = ssr
374
473
  ```
375
474
 
376
475
  Compile with: `primary_format: "EMP"`
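The VALUE program's objective is the usual sum of squared residuals; the same computation in Python, where the sample rows are made up and `r1`/`r2` stand for the slope and intercept HULDRA proposes:

```python
# salary = R1 * years_exp + R2; objective R21 = sum of squared residuals
def ssr(r1, r2, rows):
    total = 0.0
    for years_exp, salary in rows:
        pred = r1 * years_exp + r2
        resid = pred - salary
        total += resid * resid
    return total

rows = [(1, 5.0), (2, 7.0), (3, 9.0)]  # perfectly linear: 2x + 3
```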
@@ -392,285 +491,67 @@ POST /optimize {
392
491
  }
393
492
  ```
394
493
 
395
- **Response:**
396
- ```json
397
- {
398
- "status": "converged",
399
- "iterations": 12,
400
- "elements": [
401
- {"index": 1, "label": "per_year_raise", "value": 4876.5},
402
- {"index": 2, "label": "base_salary", "value": 46230.1}
403
- ],
404
- "objectives": [
405
- {"index": 1, "label": "SSR", "value": 2841050.3, "goal": 0.0}
406
- ],
407
- "elapsed_seconds": 0.02
408
- }
409
- ```
410
-
411
- **Result:** `salary = $4,877/year × experience + $46,230 base`
412
-
413
- ### Element Parameters Explained
494
+ ### Element Parameters
414
495
 
415
496
  | Field | Description | Guidance |
416
497
  |-------|-------------|----------|
417
- | `index` | Maps to R-variable (1 R1, 2 R2, ...) | Max 20 elements |
498
+ | `index` | Maps to R-variable (1 = R1, 2 = R2, ...) | Max 20 elements |
418
499
  | `label` | Human-readable name | Returned in results |
419
- | `start` | Initial guess | Closer to true value = faster convergence |
420
- | `min`, `max` | Bounds | HULDRA clamps parameters to this range |
421
- | `delta` | Step size for gradient computation | ~0.1% of expected value range. Too large = inaccurate gradients. Too small = numerical noise |
500
+ | `start` | Initial guess | Closer = faster convergence |
501
+ | `min`, `max` | Bounds | HULDRA clamps to this range |
502
+ | `delta` | Step size for gradient | ~0.1% of expected value range |
422
503
 
423
504
  ### Choosing Delta Values
424
505
 
425
- Delta controls how HULDRA estimates gradients (via central differences). Rules of thumb:
426
- - **Dollar amounts** (fares, salaries): `delta: 0.01` to `1.0`
427
- - **Rates/percentages** (per-mile, per-minute): `delta: 0.001` to `0.01`
506
+ - **Dollar amounts**: `delta: 0.01` to `1.0`
507
+ - **Rates/percentages**: `delta: 0.001` to `0.01`
428
508
  - **Counts/integers**: `delta: 0.1` to `1.0`
429
- - **Time values** (hours, peaks): `delta: 0.05` to `0.5`
430
-
431
- If optimization doesn't converge, try making delta smaller.
432
-
433
- ### Performance & Limits
509
+ - **Time values**: `delta: 0.05` to `0.5`
434
510
 
435
- HULDRA runs your VALUE program **1 + 2 × N_elements** times per iteration (once for evaluation, twice per element for gradient). With `max_iterations: 100`:
511
+ ### Performance
436
512
 
437
- | Elements | VM runs/iteration | At 100 iterations |
438
- |----------|-------------------|-------------------|
439
- | 2 | 5 | 500 |
440
- | 3 | 7 | 700 |
441
- | 5 | 11 | 1,100 |
442
- | 10 | 21 | 2,100 |
513
+ HULDRA runs your program `1 + 2 * N_elements` times per iteration. Lambda timeout is 900 seconds. For large datasets, sample first — query 200–500 representative records, optimize against the sample.
443
514
 
444
- **Lambda timeout is 900 seconds.** If each VM run takes 0.01s (100 records), you're fine. If each run takes 1s (100K records), 3 elements × 100 iterations = 700s — cutting it close.
515
+ Use `async_exec: true` for optimizations that might exceed 30 seconds.
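The run count above is easy to budget against the timeout; a small Python sketch, where the per-run timing is illustrative:

```python
def vm_runs(n_elements: int, iterations: int) -> int:
    # 1 evaluation + 2 gradient runs per element, per iteration
    return (1 + 2 * n_elements) * iterations

# Budget against the 900 s Lambda timeout.
per_run_seconds = 0.01  # illustrative: a ~100-record dataset
total_seconds = vm_runs(3, 100) * per_run_seconds
```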
445
516
 
446
- **Strategy for large datasets:** Sample first. Query 200–500 representative records into a smaller dataset, optimize against that. The coefficients transfer to the full dataset.
447
-
448
- ```bash
449
- # Sample 500 records from a large dataset
450
- POST /query {userid, project, file: "BIGDATA", limit: 500, offset: 100000}
451
- # Import the sample
452
- POST /import {userid, project, name: "SAMPLE", data: [...results...]}
453
- # Optimize against the sample
454
- POST /optimize {... primary_file: "SAMPLE" ...}
455
- ```
456
-
457
- ### Async Optimization
458
-
459
- For optimizations that might exceed 30 seconds (API Gateway timeout), use async mode:
460
-
461
- ```bash
462
- POST /optimize {
463
- ... async_exec: true ...
464
- }
465
- # Response: {status: "running", job_id: "20260325_..."}
466
-
467
- POST /optimize/status {userid, project, job_id: "20260325_..."}
468
- # Response: {status: "converged", elements: [...], ...}
469
- ```
470
-
471
- ### Multi-Objective Optimization
472
-
473
- HULDRA can minimize multiple objectives simultaneously with different weights:
474
-
475
- ```bash
476
- objectives: [
477
- {index: 1, label: "price_error", goal: 0.0, weight: 1.0},
478
- {index: 2, label: "volume_error", goal: 0.0, weight: 10.0}
479
- ]
480
- ```
481
-
482
- Higher weight = more important. HULDRA minimizes `Q = sum(weight × (value - goal)²)`.
483
-
484
- ### Interpreting Results
485
-
486
- - **`status: "converged"`** — Optimal parameters found. The objective stopped improving.
487
- - **`status: "timeout"`** — Hit 900s wall clock. Results are the best found so far — often still useful.
488
- - **`total_objective`** — The raw Q value. Compare across runs, not in absolute terms. Lower = better fit.
489
- - **`SSR` (objective value)** — Sum of squared residuals. Divide by record count for mean squared error. Take the square root for RMSE in the same units as your data.
490
- - **`dq_dx` on elements** — Gradient. Values near zero mean the parameter is well-optimized. Large values may indicate the bounds are too tight.
491
-
492
- ### Model Catalog — Which Formula to Try
493
-
494
- HULDRA fits any model expressible with Chaprola's math: `+`, `-`, `*`, `/`, `EXP`, `LOG`, `SQRT`, `ABS`, `POW`, and `IF` branching. Use this catalog to pick the right model for your data shape.
517
+ ### Model Catalog
495
518
 
496
519
  | Model | Formula | When to use | Chaprola math |
497
520
  |-------|---------|-------------|---------------|
498
- | **Linear** | `y = R1*x + R2` | Proportional relationships, constant rate | `*`, `+` |
499
- | **Multi-linear** | `y = R1*x1 + R2*x2 + R3` | Multiple independent factors | `*`, `+` |
500
- | **Quadratic** | `y = R1*x^2 + R2*x + R3` | Accelerating/decelerating curves, area scaling | `*`, `+`, `POW` |
501
- | **Exponential growth** | `y = R1 * EXP(R2*x)` | Compound growth, population, interest | `EXP`, `*` |
502
- | **Exponential decay** | `y = R1 * EXP(-R2*x) + R3` | Drug clearance, radioactive decay, cooling | `EXP`, `*`, `-` |
503
- | **Power law** | `y = R1 * POW(x, R2)` | Scaling laws (Zipf, Kleiber), fractal relationships | `POW`, `*` |
504
- | **Logarithmic** | `y = R1 * LOG(x) + R2` | Diminishing returns, perception (Weber-Fechner) | `LOG`, `*`, `+` |
505
- | **Gaussian** | `y = R1 * EXP(-(x-R2)^2/(2*R3^2))` | Bell curves, distributions, demand peaks | `EXP`, `*`, `/` |
506
- | **Logistic (S-curve)** | `y = R1 / (1 + EXP(-R2*(x-R3)))` | Adoption curves, saturation, carrying capacity | `EXP`, `/`, `+` |
507
- | **Inverse** | `y = R1/x + R2` | Boyle's law, unit cost vs volume | `/`, `+` |
508
- | **Square root** | `y = R1 * SQRT(x) + R2` | Flow rates (Bernoulli), risk vs portfolio size | `SQRT`, `*`, `+` |
509
-
510
- **How to choose:** Look at your data's shape.
511
- - Straight line → linear or multi-linear
512
- - Curves upward faster and faster → exponential growth or quadratic
513
- - Curves upward then flattens → logarithmic, square root, or logistic
514
- - Drops fast then levels off → exponential decay or inverse
515
- - Has a peak/hump → Gaussian
516
- - Straight on log-log axes → power law
521
+ | **Linear** | `y = R1*x + R2` | Proportional relationships | `*`, `+` |
522
+ | **Multi-linear** | `y = R1*x1 + R2*x2 + R3` | Multiple factors | `*`, `+` |
523
+ | **Quadratic** | `y = R1*x^2 + R2*x + R3` | Accelerating curves | `*`, `+`, `POW` |
524
+ | **Exponential growth** | `y = R1 * EXP(R2*x)` | Compound growth | `EXP`, `*` |
525
+ | **Exponential decay** | `y = R1 * EXP(-R2*x) + R3` | Drug clearance, cooling | `EXP`, `*`, `-` |
526
+ | **Power law** | `y = R1 * POW(x, R2)` | Scaling laws | `POW`, `*` |
527
+ | **Logarithmic** | `y = R1 * LOG(x) + R2` | Diminishing returns | `LOG`, `*`, `+` |
528
+ | **Logistic** | `y = R1 / (1 + EXP(-R2*(x-R3)))` | S-curves, saturation | `EXP`, `/`, `+` |
517
529
 
518
530
  ### Nonlinear VALUE Program Patterns
519
531
 
520
532
  **Exponential decay:** `y = R1 * exp(-R2 * x) + R3`
521
533
  ```chaprola
522
- LET ARG = R2 * X
523
- LET ARG = ARG * -1
524
- LET PRED = EXP ARG
525
- LET PRED = PRED * R1
526
- LET PRED = PRED + R3
534
+ LET arg = R2 * x
535
+ LET arg = arg * -1
536
+ LET pred = EXP arg
537
+ LET pred = pred * R1
538
+ LET pred = pred + R3
527
539
  ```
528
540
 
529
541
  **Power law:** `y = R1 * x^R2`
530
542
  ```chaprola
531
- LET PRED = POW X R2
532
- LET PRED = PRED * R1
533
- ```
534
-
535
- **Gaussian:** `y = R1 * exp(-(x - R2)^2 / (2 * R3^2))`
536
- ```chaprola
537
- LET DIFF = X - R2
538
- LET DIFF = DIFF * DIFF
539
- LET DENOM = R3 * R3
540
- LET DENOM = DENOM * 2
541
- LET ARG = DIFF / DENOM
542
- LET ARG = ARG * -1
543
- LET PRED = EXP ARG
544
- LET PRED = PRED * R1
543
+ LET pred = POW x R2
544
+ LET pred = pred * R1
545
545
  ```
546
546
 
547
547
  **Logistic S-curve:** `y = R1 / (1 + exp(-R2 * (x - R3)))`
548
548
  ```chaprola
549
- LET ARG = X - R3
550
- LET ARG = ARG * R2
551
- LET ARG = ARG * -1
552
- LET DENOM = EXP ARG
553
- LET DENOM = DENOM + 1
554
- LET PRED = R1 / DENOM
555
- ```
556
-
557
- **Logarithmic:** `y = R1 * ln(x) + R2`
558
- ```chaprola
559
- LET PRED = LOG X
560
- LET PRED = PRED * R1
561
- LET PRED = PRED + R2
562
- ```
563
-
564
- All patterns follow the same loop structure: SEEK records, GET fields, compute PRED, accumulate `(PRED - OBS)^2` in SSR, store SSR in R21 at the end.
565
-
566
- ## Parameterized Report Endpoint
567
-
568
- Combine QUERY with PARAM to build a dynamic JSON API from a published program. QUERY output is a .QR file (read-only). R20 = matched record count. Missing PARAMs are silently replaced with blank (string) or 0.0 (numeric) — check param warnings in the response for diagnostics.
569
-
570
- ```chaprola
571
- // STAFF_BY_DEPT.CS — call via /report?publisher=admin&program=STAFF_BY_DEPT&dept=Engineering
572
- QUERY STAFF FIELDS name, salary, title INTO dept_staff WHERE department EQ PARAM.dept ORDER BY salary DESC
573
- OPEN SECONDARY dept_staff
574
- DEFINE name = S.name
575
- DEFINE salary = S.salary
576
- DEFINE title = S.title
577
- PRINT "["
578
- READ SECONDARY
579
- IF FINI GOTO done
580
- 100 PRINT TRIM "{\"name\":\"" + name + "\",\"title\":\"" + title + "\",\"salary\":" + salary + "}"
581
- READ SECONDARY
582
- IF FINI GOTO done
583
- PRINT ","
584
- GOTO 100
585
- done.
586
- PRINT "]"
587
- STOP
588
- ```
589
-
590
- Publish with: `POST /publish {userid, project, name: "STAFF_BY_DEPT", primary_file: "STAFF", acl: "authenticated"}`
591
- Call with: `POST /report?publisher=admin&program=STAFF_BY_DEPT&dept=Engineering`
592
-
593
- ## Cross-File Filtering (IN/NOT IN)
594
-
595
- Use QUERY with NOT IN to find records in one file that don't appear in another. This is the flashcard review pattern — find unreviewed cards by excluding already-reviewed ones. One IN/NOT IN per QUERY.
596
-
597
- If the IN-file doesn't exist (e.g., new user with no progress), NOT IN treats it as empty — all records pass. This is correct: "kanji not in (nothing)" = all kanji.
598
-
599
- ```chaprola
600
- // Step 1: Get the list of kanji the user has already reviewed
601
- QUERY progress INTO reviewed WHERE username EQ PARAM.username
602
-
603
- // Step 2: Filter flashcards to only those NOT in the reviewed set
604
- // If progress doesn't exist (new user), all flashcards are returned
605
- QUERY flashcards INTO new_cards WHERE kanji NOT IN reviewed.kanji
606
-
607
- // Step 3: Loop through unreviewed cards
608
- OPEN SECONDARY new_cards
609
- READ SECONDARY
610
- IF FINI GOTO empty
611
- 100 PRINT S.kanji + " — " + S.reading + " — " + S.meaning
612
- READ SECONDARY
613
- IF FINI GOTO done
614
- GOTO 100
615
- empty.
616
- PRINT "All cards reviewed!"
617
- done.
618
- STOP
619
- ```
620
-
621
- ## Public Apps: Use /report, Not /query
622
-
623
- Site keys require BAA signing. For public-facing web apps, use published reports (`/report`) instead of `/query` for all read operations. `/report` is public — no auth or BAA needed.
624
-
625
- **Pattern:** Move data logic into Chaprola programs (QUERY, TABULATE), publish them, and call `/report` from the frontend. Reserve site keys for write operations (`/insert-record`) only.
626
-
627
- ```javascript
628
- // GOOD: public report — no auth needed, works for anyone
629
- const url = `${API}/report?userid=myapp&project=data&name=RESULTS&poll_id=${id}`;
630
- const response = await fetch(url);
631
-
632
- // BAD: /query with site key — fails if BAA not signed (403 Forbidden)
633
- const response = await fetch(`${API}/query`, {
634
- headers: { 'Authorization': `Bearer ${SITE_KEY}` },
635
- body: JSON.stringify({ userid: 'myapp', project: 'data', file: 'votes', where: [...] })
636
- });
637
- ```
638
-
639
- **Why this is better:** The program runs server-side with full access. The frontend gets clean output. No API keys exposed for reads. QUERY + TABULATE in a program replaces client-side pivot logic.
640
-
641
- ## Chart Data with TABULATE
642
-
643
- Use TABULATE to produce CSV output suitable for charting. This example cross-tabulates mortality data by cause and year.
644
-
645
- ```chaprola
646
- // Generate a pivot table of death counts by cause and year
647
- TABULATE mortality SUM deaths FOR cause VS year WHERE year GE "2020" INTO trends
648
-
649
- // Output as CSV — ready for any charting library
650
- PRINT TABULATE trends AS CSV
651
- ```
652
-
653
- Output:
654
- ```
655
- cause,2020,2021,2022,2023,total
656
- Heart disease,690882,693021,699659,702000,2785562
657
- Cancer,598932,602350,608371,611000,2420653
658
- ...
659
- ```
660
-
661
- For web apps, use JSON output instead:
662
- ```chaprola
663
- PRINT TABULATE trends AS JSON
549
+ LET arg = x - R3
550
+ LET arg = arg * R2
551
+ LET arg = arg * -1
552
+ LET denom = EXP arg
553
+ LET denom = denom + 1
554
+ LET pred = R1 / denom
664
555
  ```
665
556
 
666
- ### Agent Workflow Summary
667
-
668
- 1. **Inspect** — Call `/format` to see what fields exist
669
- 2. **Sample** — Use `/query` with `limit` to get a manageable subset (200–500 records)
670
- 3. **Import sample** — `/import` the subset as a new small dataset
671
- 4. **Hypothesize** — Propose a model relating the fields
672
- 5. **Write VALUE program** — Loop through records, compute predicted vs actual, accumulate SSR in R21
673
- 6. **Compile** — `/compile` with `primary_format` pointing to the sample
674
- 7. **Optimize** — `/optimize` with elements, objectives, and the sample as primary_file
675
- 8. **Interpret** — Read the converged element values — those are your model coefficients
676
- 9. **Iterate** — If SSR is high, try a different model (add terms, try nonlinear)
557
+ All patterns follow the same loop structure: SEEK records, GET fields, compute pred, accumulate `(pred - obs)^2` in ssr, store ssr in R21 at the end.
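+ 
+ As a minimal sketch, the linear model's version of that loop might look as follows — field names `x`/`y` and the exact SEEK loop spelling are assumptions, so confirm against chaprola://cookbook before compiling:
+ 
+ ```chaprola
+ // Hedged sketch: SSR loop for y = R1*x + R2
+ // SEEK/IF FINI spelling assumed by analogy with the READ SECONDARY loops above
+ LET ssr = 0
+ 100 SEEK
+ IF FINI GOTO done
+ LET pred = R1 * P.x
+ LET pred = pred + R2
+ LET diff = pred - P.y
+ LET diff = diff * diff
+ LET ssr = ssr + diff
+ GOTO 100
+ done.
+ LET R21 = ssr
+ ```
+ 
+ Note each LET performs exactly one operation, per the no-parentheses rule.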
@@ -60,8 +60,11 @@ IF EQUAL "CREDIT" U.76 GOTO 200
60
60
  ### MOVE literal auto-pads to field width
61
61
  `MOVE "Jones" P.name` auto-fills the rest of the field with blanks. No need to clear first.
62
62
 
63
- ### DEFINE VARIABLE names must not collide with field names
64
- If the format has a `balance` field, don't `DEFINE VARIABLE balance R41`. Use `bal` instead. The compiler confuses the alias with the field name.
63
+ ### Don't use DEFINE VARIABLE
64
+ Use implicit assignment: `LET rec = 1`. The compiler assigns R-variable slots automatically. DEFINE VARIABLE is legacy boilerplate.
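+ 
+ A before/after sketch (the field name `amount` is illustrative):
+ 
+ ```chaprola
+ // Legacy — flagged by project review:
+ //   DEFINE VARIABLE total R41
+ // Preferred — implicit assignment; the compiler picks the slot:
+ LET total = 0
+ LET total = total + P.amount
+ ```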
65
+
66
+ ### Every program needs an intent file (.DS)
67
+ One paragraph: what the program does, what parameters it accepts, what output it produces. The project review system flags programs without intents.
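+ 
+ For example, a hypothetical PAYROLL_SUMMARY.DS (program name, parameter, and audience all invented for illustration):
+ 
+ ```
+ Summarizes monthly payroll by department. Accepts one parameter, PARAM.month
+ (YYYY-MM). Prints one CSV line per department with headcount and total gross
+ pay. Used by the finance team's month-end close script.
+ ```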
65
68
 
66
69
  ### R-variables are floating point
67
70
  All R1–R99 are 64-bit floats. `7 / 2 = 3.5`. Use PUT with `I` format to display as integer.
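+ 
+ A minimal illustration (string-plus-number PRINT concatenation follows the report examples elsewhere in this cookbook):
+ 
+ ```chaprola
+ LET half = 7 / 2
+ // float division: 3.5, per the note above
+ PRINT "half = " + half
+ ```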
@@ -120,11 +123,11 @@ Always CLOSE before END if you wrote to the secondary file. Unflushed writes are
120
123
 
121
124
  ## HULDRA Optimization
122
125
 
123
- ### Use R41–R99 for scratch variables, not R1–R20
124
- R1–R20 are reserved for HULDRA elements. R21–R40 are reserved for objectives. Your VALUE program's DEFINE VARIABLE declarations must use R41–R99 only.
126
+ ### HULDRA programs: don't use R1–R40 for your variables
127
+ R1–R20 are reserved for HULDRA elements. R21–R40 are reserved for objectives. Use implicit assignment with names that the compiler will place in R41+:
125
128
  ```chaprola
126
- // WRONG: DEFINE VARIABLE counter R1 (HULDRA will overwrite this)
127
- // RIGHT: DEFINE VARIABLE counter R41
129
+ LET rec = 1 // compiler assigns to R41+, safe from HULDRA
130
+ LET ssr = 0 // accumulate error here, then LET R21 = ssr at the end
128
131
  ```
129
132
 
130
133
  ### Sample large datasets before optimizing