@chaprola/mcp-server 1.7.0 → 1.9.0

@@ -1,5 +1,18 @@
1
1
  # Chaprola Cookbook — Quick Reference
2
2
 
3
+ ## Style Rules (mandatory)
4
+
5
+ Every program must follow these rules. The project review system enforces them.
6
+
7
+ 1. **Use QUERY instead of SEEK loops** for filtering or single-record lookup. SEEK loops are only appropriate when processing every record unconditionally.
8
+ 2. **Don't use MOVE + IF EQUAL for comparisons** — use QUERY WHERE.
9
+ 3. **Use implicit variable assignment** (`LET name = value`) — don't use DEFINE VARIABLE.
10
+ 4. **END/STOP only for early exit** — not needed at the end of a program.
11
+ 5. **OPEN PRIMARY not needed** when using QUERY with primary_format on compile.
12
+ 6. **Use named read** (`READ name rec` + `name.field`) instead of OPEN SECONDARY + S.field for QUERY results.
13
+ 7. **Every program must have an intent file (.DS)** — one paragraph: what the program does, what parameters it accepts, what output it produces, who uses it. Create the intent with chaprola_compile or write it manually.
14
+ 8. **Add a comment header** — first line(s) should describe the program's purpose and parameters.
15
+
3
16
  ## Workflow: Import → Compile → Run
4
17
 
5
18
  ```bash
@@ -13,128 +26,192 @@ POST /compile {userid, project, name: "REPORT", source: "...", primary_format: "
13
26
  POST /run {userid, project, name: "REPORT", primary_file: "STAFF", record: 1}
14
27
  ```
15
28
 
16
- ## R-Variable Ranges
29
+ ## Hello World (no data file)
30
+
31
+ ```chaprola
32
+ // Hello World — minimal Chaprola program
33
+ PRINT "Hello from Chaprola!"
34
+ ```
17
35
 
18
- | Range | Purpose | Safe for DEFINE VARIABLE? |
19
- |-------|---------|--------------------------|
20
- | R1–R20 | HULDRA elements (parameters) | No — HULDRA overwrites these |
21
- | R21–R40 | HULDRA objectives (error metrics) | No — HULDRA reads these |
22
- | R41–R99 | Scratch space | **Yes — always use R41–R99 for DEFINE VARIABLE** |
36
+ No END needed: the program ends naturally.
23
37
 
24
- For non-HULDRA programs, R1–R40 are technically available but using R41–R99 is a good habit.
38
+ ## Loop Through All Records
25
39
 
26
- ## PRINT: Preferred Output Methods
40
+ When processing every record unconditionally (no filter), SEEK loops are appropriate:
27
41
 
28
- **Concatenation (preferred):**
29
42
  ```chaprola
30
- PRINT P.name + " " + P.department + " — $" + R41
31
- PRINT "Total: " + R42
32
- PRINT P.last_name // single field, auto-trimmed
33
- PRINT "Hello from Chaprola!" // literal string
43
+ // REPORT: List all staff with salaries
44
+ // Primary: STAFF
45
+
46
+ LET rec = 1
47
+ 100 SEEK rec
48
+ IF EOF END
49
+ PRINT P.name + " — " + P.salary
50
+ LET rec = rec + 1
51
+ GOTO 100
34
52
  ```
35
53
 
36
- - String literals are copied as-is.
37
- - P./S./U./X. fields are auto-trimmed (trailing spaces removed).
38
- - R-variables print as integers when no fractional part, otherwise as floats.
39
- - Concatenation auto-flushes the line.
54
+ Compile with `primary_format: "STAFF"`. No OPEN PRIMARY needed — the compiler reads the format from primary_format.
40
55
 
41
- **U buffer output (for fixed-width columnar reports only):**
42
- ```chaprola
43
- CLEAR U
44
- MOVE P.name U.1 20
45
- PUT sal INTO U.22 10 D 2
46
- PRINT 0 // output entire U buffer, then clear
47
- ```
56
+ ## Filtered Report (QUERY)
48
57
 
49
- ## Hello World (no data file)
58
+ When you want a subset of records, use QUERY — not a SEEK loop with IF conditions:
50
59
 
51
60
  ```chaprola
52
- PRINT "Hello from Chaprola!"
53
- END
54
- ```
61
+ // HIGH_EARNERS: List staff earning over 80000
62
+ // Primary: STAFF
55
63
 
56
- ## Loop Through All Records
64
+ QUERY STAFF INTO earners WHERE salary GT 80000 ORDER BY salary DESC
57
65
 
58
- ```chaprola
59
- DEFINE VARIABLE rec R41
60
66
  LET rec = 1
61
- 100 SEEK rec
62
- IF EOF GOTO 900
63
- PRINT P.name + " — " + P.salary
67
+ 100 READ earners rec
68
+ IF EOF END
69
+ PRINT earners.name + " — $" + earners.salary
64
70
  LET rec = rec + 1
65
71
  GOTO 100
66
- 900 END
67
72
  ```
68
73
 
69
- ## Filtered Report
74
+ ## Single-Record Lookup (QUERY)
75
+
76
+ For finding one record by key, use QUERY — not a SEEK/MOVE/IF EQUAL loop:
70
77
 
71
78
  ```chaprola
72
- GET sal FROM P.salary
73
- IF sal LT 80000 GOTO 200 // skip low earners
74
- PRINT P.name + " — " + R41
75
- 200 LET rec = rec + 1
79
+ // DETAIL: Look up a single staff member by ID
80
+ // Parameter: staff_id
81
+ // Primary: STAFF
82
+
83
+ QUERY STAFF INTO person WHERE staff_id EQ PARAM.staff_id
84
+
85
+ LET rec = 1
86
+ READ person rec
87
+ IF EOF END
88
+ PRINT "Name: " + person.name
89
+ PRINT "Department: " + person.department
90
+ PRINT "Salary: $" + person.salary
76
91
  ```
77
92
 
93
+ ## DELETE with QUERY (RECORDNUMBERS)
94
+
95
+ Use QUERY to find records, then DELETE with RECORDNUMBERS for bulk deletion:
96
+
97
+ ```chaprola
98
+ // CLEANUP — Delete all closed polls and their votes
99
+ // Parameter: poll_id
100
+ // Primary: polls, Secondary: votes
101
+
102
+ QUERY polls INTO poll WHERE poll_id EQ PARAM.poll_id
103
+ DELETE PRIMARY poll.RECORDNUMBERS
104
+
105
+ OPEN "votes" WHERE poll_id EQ PARAM.poll_id
106
+ DELETE votes.RECORDNUMBERS
107
+ CLOSE
108
+
109
+ IF poll.RECORDCOUNT EQ 0 GOTO 900 ;
110
+ PRINT "STATUS|OK"
111
+ PRINT "VOTES_DELETED|" + votes.RECORDCOUNT
112
+ END
113
+
114
+ 900 PRINT "STATUS|NOT_FOUND"
115
+ ```
116
+
117
+ - `poll.RECORDNUMBERS` returns all physical record positions matched by the QUERY
118
+ - `DELETE PRIMARY poll.RECORDNUMBERS` bulk-deletes them in one statement
119
+ - `votes.RECORDCOUNT` returns the filtered count from OPEN WHERE, not total file count
120
+
78
121
  ## JOIN Two Files (FIND)
79
122
 
80
123
  ```chaprola
124
+ // ROSTER — Staff with department names
125
+ // Primary: EMPLOYEES, Secondary: DEPARTMENTS
126
+
81
127
  OPEN "DEPARTMENTS" 0
82
- FIND match FROM S.dept_code USING P.dept_code
83
- IF match EQ 0 GOTO 200 // no match
84
- READ match // load matched secondary record
85
- PRINT P.name + " " + S.dept_name
128
+ LET rec = 1
129
+ 100 SEEK rec
130
+ IF EOF END
131
+ FIND match FROM S.dept_code USING P.dept_code
132
+ IF match EQ 0 GOTO 200
133
+ READ match
134
+ PRINT P.name + " — " + S.dept_name
135
+ 200 LET rec = rec + 1
136
+ GOTO 100
86
137
  ```
87
138
 
88
- Compile with both formats so the compiler resolves fields from both files:
139
+ Compile with both formats:
89
140
  ```bash
90
141
  POST /compile {
91
- userid, project, name: "REPORT",
142
+ userid, project, name: "ROSTER",
92
143
  source: "...",
93
144
  primary_format: "EMPLOYEES",
94
145
  secondary_format: "DEPARTMENTS"
95
146
  }
96
147
  ```
97
148
 
98
- ## Comparing Two Memory Locations
99
-
100
- IF EQUAL compares a literal to a location. To compare two memory locations, copy both to U buffer:
149
+ ## PRINT: Preferred Output Methods
101
150
 
151
+ **Concatenation (preferred):**
102
152
  ```chaprola
103
- MOVE PARAM.poll_id U.200 12
104
- MOVE P.poll_id U.180 12
105
- IF EQUAL U.200 U.180 12 GOTO 200 // match: jump to handler
153
+ PRINT P.name + " — " + P.department + " — $" + R41
154
+ PRINT "Total: " + R42
155
+ PRINT P.last_name // single field, auto-trimmed
156
+ PRINT "Hello from Chaprola!" // literal string
106
157
  ```
107
158
 
108
- ## Read-Modify-Write (UPDATE)
159
+ - String literals are copied as-is.
160
+ - P./S./U./X. fields are auto-trimmed (trailing spaces removed).
161
+ - R-variables print as integers when no fractional part, otherwise as floats.
162
+ - Concatenation auto-flushes the line.
109
163
 
164
+ **U buffer output (for fixed-width columnar reports only):**
110
165
  ```chaprola
111
- READ match // load record
112
- GET bal FROM S.balance // read current value
113
- LET bal = bal + amt // modify
114
- PUT bal INTO S.balance F 0 // write back to S memory (length auto-filled)
115
- WRITE match // flush to disk
116
- CLOSE // flush all at end
166
+ CLEAR U
167
+ MOVE P.name U.1 20
168
+ PUT sal INTO U.22 10 D 2
169
+ PRINT 0 // output entire U buffer, then clear
117
170
  ```
118
171
 
119
- ## Date Arithmetic
172
+ ## R-Variable Ranges
173
+
174
+ | Range | Purpose | Notes |
175
+ |-------|---------|-------|
176
+ | R1–R20 | HULDRA elements (parameters) | HULDRA overwrites these |
177
+ | R21–R40 | HULDRA objectives (error metrics) | HULDRA reads these |
178
+ | R41–R99 | Scratch space | Always safe |
179
+
180
+ For non-HULDRA programs, all R1–R99 are available. Use implicit assignment (`LET name = value`) — the compiler assigns R-variable slots automatically.
181
+
182
+ ## Read-Modify-Write (UPDATE)
120
183
 
121
184
  ```chaprola
122
- GET DATE R41 FROM X.primary_modified // when was file last changed?
123
- GET DATE R42 FROM X.utc_time // what time is it now?
124
- LET R43 = R42 - R41 // difference in seconds
125
- LET R43 = R43 / 86400 // convert to days
126
- IF R43 GT 30 PRINT "WARNING: file is over 30 days old" ;
185
+ // UPDATE_BALANCE: Add amount to account balance
186
+ // Primary: ACCOUNTS, Secondary: LEDGER
187
+
188
+ OPEN "LEDGER" 0
189
+ FIND match FROM S.account_id USING P.account_id
190
+ IF match EQ 0 END
191
+ READ match
192
+ GET bal FROM S.balance
193
+ LET bal = bal + amt
194
+ PUT bal INTO S.balance F 0
195
+ WRITE match
196
+ CLOSE
127
197
  ```
128
198
 
129
- ## Get Current User
199
+ ## Date Arithmetic
130
200
 
131
201
  ```chaprola
132
- PRINT "Logged in as: " + X.username
202
+ // CHECK_FRESHNESS: Warn if file is stale
203
+ // Primary: any
204
+
205
+ GET DATE R1 FROM X.primary_modified
206
+ GET DATE R2 FROM X.utc_time
207
+ LET days = R2 - R1
208
+ LET days = days / 86400
209
+ IF days GT 30 PRINT "WARNING: file is over 30 days old" ;
133
210
  ```
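The same arithmetic can be sketched in JavaScript to sanity-check the threshold outside Chaprola. This is only an illustration: it assumes, as the earlier version of this example stated, that `GET DATE` yields a value in seconds; the function names and the sample timestamps are made up.

```javascript
// Illustrative only: mirrors the CHECK_FRESHNESS arithmetic, not Chaprola itself.
const SECONDS_PER_DAY = 86400;

// Difference between two timestamps (both in seconds), expressed in days.
function ageInDays(modifiedSeconds, nowSeconds) {
  return (nowSeconds - modifiedSeconds) / SECONDS_PER_DAY;
}

// True when the file is older than maxDays (30 in the program above).
function isStale(modifiedSeconds, nowSeconds, maxDays = 30) {
  return ageInDays(modifiedSeconds, nowSeconds) > maxDays;
}

// Hypothetical timestamps, converted from milliseconds to seconds.
const modified = Date.parse("2026-02-01T00:00:00Z") / 1000;
const now = Date.parse("2026-03-28T14:30:00Z") / 1000;
console.log(ageInDays(modified, now).toFixed(1), isStale(modified, now));
```

Dividing before comparing keeps the threshold readable in days, which is the same choice the Chaprola program makes.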
134
211
 
135
212
  ## System Text Properties (X.)
136
213
 
137
- Access system metadata by property name — no numeric positions needed:
214
+ Access system metadata by property name:
138
215
 
139
216
  | Property | Description |
140
217
  |----------|-------------|
@@ -149,6 +226,51 @@ Access system metadata by property name — no numeric positions needed:
149
226
  | `X.primary_modified` | Primary file Last-Modified |
150
227
  | `X.secondary_modified` | Secondary file Last-Modified |
151
228
 
229
+ ## Parameterized Reports (PARAM.name)
230
+
231
+ Programs accept named parameters from URL query strings:
232
+
233
+ ```chaprola
234
+ // STAFF_BY_DEPT — List staff in a department
235
+ // Parameter: dept
236
+ // Primary: STAFF
237
+
238
+ QUERY STAFF INTO team WHERE department EQ PARAM.dept ORDER BY salary DESC
239
+
240
+ LET rec = 1
241
+ 100 READ team rec
242
+ IF EOF END
243
+ PRINT team.name + " — " + team.title + " — $" + team.salary
244
+ LET rec = rec + 1
245
+ GOTO 100
246
+ ```
247
+
248
+ Publish with: `POST /publish {userid, project, name: "STAFF_BY_DEPT", primary_file: "STAFF", acl: "authenticated"}`
249
+ Call with: `GET /report?userid=X&project=Y&name=STAFF_BY_DEPT&dept=Engineering`
250
+ Discover params: `POST /report/params {userid, project, name}` — returns .PF schema
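A client calling a published report has to put the fixed fields (`userid`, `project`, `name`) and the report's own parameters into one query string. The sketch below shows one way to build that URL with proper encoding. `API_BASE` is a placeholder and `reportUrl` is a hypothetical helper; neither comes from the Chaprola package.

```javascript
// API_BASE is a placeholder: the real base URL is not given in this document.
const API_BASE = "https://api.example.invalid";

// Build a /report URL; extra keys in `params` become PARAM.<name> in the program.
function reportUrl(base, userid, project, name, params = {}) {
  const query = new URLSearchParams({ userid, project, name, ...params });
  return `${base}/report?${query.toString()}`;
}

const url = reportUrl(API_BASE, "X", "Y", "STAFF_BY_DEPT", { dept: "Engineering" });
console.log(url);
```

Using `URLSearchParams` rather than string interpolation means a value such as `R&D` is percent-encoded instead of being split into two parameters.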
251
+
252
+ ## Cross-File Filtering (IN/NOT IN)
253
+
254
+ Use QUERY with NOT IN to find records in one file that don't appear in another:
255
+
256
+ ```chaprola
257
+ // UNREVIEWED — Find flashcards the user hasn't studied yet
258
+ // Parameters: username
259
+ // Primary: flashcards
260
+
261
+ QUERY progress INTO reviewed WHERE username EQ PARAM.username
262
+ QUERY flashcards INTO new_cards WHERE kanji NOT IN "reviewed.kanji"
263
+
264
+ LET rec = 1
265
+ 100 READ new_cards rec
266
+ IF EOF END
267
+ PRINT new_cards.kanji + " — " + new_cards.reading + " — " + new_cards.meaning
268
+ LET rec = rec + 1
269
+ GOTO 100
270
+ ```
271
+
272
+ If the IN-file doesn't exist (e.g., new user), NOT IN treats it as empty — all records pass.
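The NOT IN behaviour, including the missing-file case, can be modelled as a simple set difference. This JavaScript sketch only illustrates the semantics described above; the `notIn` helper and the sample records are hypothetical, and `null` stands in for an IN-file that does not exist.

```javascript
// Illustrative model of NOT IN; not how Chaprola evaluates it internally.
function notIn(records, field, excludedValues) {
  // A missing IN-file (null/undefined) is treated as an empty list.
  const excluded = new Set(excludedValues ?? []);
  return records.filter((r) => !excluded.has(r[field]));
}

// Made-up flashcard records.
const flashcards = [
  { kanji: "日", reading: "nichi", meaning: "day" },
  { kanji: "月", reading: "getsu", meaning: "month" },
  { kanji: "火", reading: "ka", meaning: "fire" },
];

const forReturningUser = notIn(flashcards, "kanji", ["日"]); // one card already reviewed
const forNewUser = notIn(flashcards, "kanji", null);        // no progress file yet
console.log(forReturningUser.length, forNewUser.length);
```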
273
+
152
274
  ## Async for Large Datasets
153
275
 
154
276
  ```bash
@@ -161,77 +283,47 @@ POST /run/status {userid, project, job_id}
161
283
  # Response: {status: "done", output: "..."}
162
284
  ```
163
285
 
164
- ## Parameterized Reports (PARAM.name)
286
+ ## Public Apps: Use /report, Not /query
287
+
288
+ Site keys require BAA signing. For public-facing web apps, use published reports (`/report`) instead of `/query` for all read operations. `/report` is public — no auth or BAA needed.
165
289
 
166
- Programs can accept named parameters from URL query strings. Use this for dynamic reports.
290
+ ```javascript
291
+ // GOOD: public report — no auth needed
292
+ const url = `${API}/report?userid=myapp&project=data&name=RESULTS&poll_id=${id}`;
293
+ const response = await fetch(url);
294
+
295
+ // BAD: /query with site key — fails if BAA not signed
296
+ const queryResponse = await fetch(`${API}/query`, { method: 'POST',
297
+ headers: { 'Authorization': `Bearer ${SITE_KEY}` },
298
+ body: JSON.stringify({ userid: 'myapp', project: 'data', file: 'votes', where: [...] })
299
+ });
300
+ ```
301
+
302
+ ## Chart Data with TABULATE
167
303
 
168
304
  ```chaprola
169
- // Report that accepts &deck=kanji&level=3 as URL params
170
- MOVE PARAM.deck U.1 20 // string param → U buffer
171
- LET lvl = PARAM.level // numeric param → R variable
172
- SEEK 1
173
- 100 IF EOF GOTO 900
174
- MOVE P.deck U.30 10
175
- IF EQUAL PARAM.deck U.30 GOTO 200 // filter by deck param
176
- GOTO 300
177
- 200 GET cardlvl FROM P.level
178
- IF cardlvl NE lvl GOTO 300 // filter by level param
179
- PRINT P.kanji + " — " + P.reading
180
- 300 LET rec = rec + 1
181
- SEEK rec
182
- GOTO 100
183
- 900 END
305
+ // TRENDS: Cross-tabulate mortality by cause and year
306
+ // Primary: mortality
307
+
308
+ TABULATE mortality SUM deaths FOR cause VS year WHERE year GE "2020" INTO trends
309
+
310
+ PRINT TABULATE trends AS CSV
184
311
  ```
185
312
 
186
- Publish with: `POST /publish {userid, project, name, primary_file, acl: "authenticated"}`
187
- Call with: `GET /report?userid=X&project=Y&name=Z&deck=kanji&level=3`
188
- Discover params: `POST /report/params {userid, project, name}` → returns .PF schema (field names, types, widths)
313
+ For web apps, use `PRINT TABULATE trends AS JSON`.
189
314
 
190
315
  ## Named Output Positions (U.name)
191
316
 
192
- Instead of `U.1`, `U.12`, etc., use named positions for readable code:
317
+ Instead of `U.1`, `U.12`, use named positions:
193
318
 
194
319
  ```chaprola
195
- // U.name positions are auto-allocated by the compiler
196
320
  MOVE P.name U.name 20
197
321
  MOVE P.dept U.dept 10
198
322
  PUT sal INTO U.salary 10 D 0
199
323
  PRINT 0
200
324
  ```
201
325
 
202
- ## GROUP BY with Pivot (via /query)
203
-
204
- Chaprola's pivot IS GROUP BY. Schema: `{row, column, value, aggregate}`.
205
-
206
- ```bash
207
- # SQL: SELECT department, AVG(salary) FROM staff GROUP BY department
208
- POST /query {
209
- userid, project, file: "STAFF",
210
- pivot: {
211
- row: "department",
212
- column: "",
213
- value: "salary",
214
- aggregate: "avg"
215
- }
216
- }
217
-
218
- # SQL: SELECT department, year, SUM(revenue) FROM sales GROUP BY department, year
219
- POST /query {
220
- userid, project, file: "SALES",
221
- pivot: {
222
- row: "department",
223
- column: "year",
224
- value: "revenue",
225
- aggregate: "sum"
226
- }
227
- }
228
- ```
229
-
230
- - `row` — grouping field
231
- - `column` — cross-tab field (use `""` for simple aggregation)
232
- - `value` — field to aggregate (string or array of strings)
233
- - `aggregate` — function: `count`, `sum`, `avg`, `min`, `max`, `stddev`
234
- - Row and column totals included automatically in response
326
+ Positions are auto-allocated by the compiler.
235
327
 
236
328
  ## PUT Format Codes
237
329
 
@@ -244,39 +336,17 @@ POST /query {
244
336
 
245
337
  Syntax: `PUT R41 INTO P.salary D 2` (R-var, field name, format, decimals — length auto-filled)
246
338
 
247
- ## Common Field Widths
248
-
249
- | Data type | Chars | Example |
250
- |-----------|-------|---------|
251
- | ISO datetime | 20 | `2026-03-28T14:30:00Z` |
252
- | UUID | 36 | `550e8400-e29b-41d4-a716-446655440000` |
253
- | Email | 50 | `user@example.com` |
254
- | Short ID | 8–12 | `poll_001` |
255
- | Dollar amount | 10 | `$1,234.56` |
256
- | Phone | 15 | `+1-555-123-4567` |
257
-
258
- Use these when sizing MOVE lengths and U buffer positions.
259
-
260
- ## Memory Regions
261
-
262
- | Prefix | Description |
263
- |--------|-------------|
264
- | `P` | Primary data file — use field names: `P.salary`, `P.name` |
265
- | `S` | Secondary data file — use field names: `S.dept`, `S.emp_id` |
266
- | `U` | User buffer (scratch for output) |
267
- | `X` | System text — use property names: `X.username`, `X.utc_time` |
268
-
269
339
  ## Math Intrinsics
270
340
 
271
341
  ```chaprola
272
342
  LET R42 = EXP R41 // e^R41
273
343
  LET R42 = LOG R41 // ln(R41)
274
- LET R42 = SQRT R41 // R41
344
+ LET R42 = SQRT R41 // sqrt(R41)
275
345
  LET R42 = ABS R41 // |R41|
276
346
  LET R43 = POW R41 R42 // R41^R42
277
347
  ```
278
348
 
279
- ## Import-Download: URL Dataset (Parquet, Excel, CSV, JSON)
349
+ ## Import-Download: URL to Dataset (Parquet, Excel, CSV, JSON)
280
350
 
281
351
  ```bash
282
352
  # Import Parquet from a cloud data lake
@@ -296,16 +366,57 @@ POST /import-download {
296
366
  ```
297
367
 
298
368
  Supports: CSV, TSV, JSON, NDJSON, Parquet (zstd/snappy/lz4), Excel (.xlsx/.xls).
299
- AI instructions are optional — omit to import all columns as-is.
300
369
  Lambda: 10 GB /tmp, 900s timeout, 500 MB download limit.
301
370
 
371
+ ## GROUP BY with Pivot (via /query)
372
+
373
+ Chaprola's pivot IS GROUP BY:
374
+
375
+ ```bash
376
+ # SQL: SELECT department, AVG(salary) FROM staff GROUP BY department
377
+ POST /query {
378
+ userid, project, file: "STAFF",
379
+ pivot: {
380
+ row: "department",
381
+ column: "",
382
+ value: "salary",
383
+ aggregate: "avg"
384
+ }
385
+ }
386
+ ```
387
+
388
+ - `row` — grouping field
389
+ - `column` — cross-tab field (use `""` for simple aggregation)
390
+ - `value` — field to aggregate
391
+ - `aggregate` — count, sum, avg, min, max, stddev
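To make the GROUP BY equivalence concrete, here is a small JavaScript sketch of what a pivot with `row: "department"`, `column: ""`, `value: "salary"`, `aggregate: "avg"` computes. It is an illustration of the aggregation only; the `pivotAvg` helper, the sample rows, and the shape of the returned object are assumptions, not the `/query` response format.

```javascript
// Illustrative only: averages `valueField` per distinct `rowField` value.
function pivotAvg(rows, rowField, valueField) {
  const groups = new Map();
  for (const r of rows) {
    const g = groups.get(r[rowField]) ?? { sum: 0, count: 0 };
    g.sum += r[valueField];
    g.count += 1;
    groups.set(r[rowField], g);
  }
  const out = {};
  for (const [key, g] of groups) out[key] = g.sum / g.count;
  return out;
}

// Made-up STAFF rows.
const staff = [
  { department: "Engineering", salary: 100000 },
  { department: "Engineering", salary: 80000 },
  { department: "Sales", salary: 60000 },
];

const avgByDept = pivotAvg(staff, "department", "salary");
console.log(avgByDept);
```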
392
+
393
+ ## Common Field Widths
394
+
395
+ | Data type | Chars | Example |
396
+ |-----------|-------|---------|
397
+ | ISO datetime | 20 | `2026-03-28T14:30:00Z` |
398
+ | UUID | 36 | `550e8400-e29b-41d4-a716-446655440000` |
399
+ | Email | 50 | `user@example.com` |
400
+ | Short ID | 8–12 | `poll_001` |
401
+ | Dollar amount | 10 | `$1,234.56` |
402
+ | Phone | 15 | `+1-555-123-4567` |
403
+
404
+ ## Memory Regions
405
+
406
+ | Prefix | Description |
407
+ |--------|-------------|
408
+ | `P` | Primary data file — use field names: `P.salary`, `P.name` |
409
+ | `S` | Secondary data file — use field names: `S.dept`, `S.emp_id` |
410
+ | `U` | User buffer (scratch for output) |
411
+ | `X` | System text — use property names: `X.username`, `X.utc_time` |
412
+
302
413
  ## HULDRA Optimization — Nonlinear Parameter Fitting
303
414
 
304
- HULDRA finds the best parameter values for a mathematical model by minimizing the difference between model predictions and observed data. You propose a model, HULDRA finds the coefficients.
415
+ HULDRA finds the best parameter values for a mathematical model by minimizing the difference between model predictions and observed data.
305
416
 
306
417
  ### How It Works
307
418
 
308
- 1. You write a VALUE program (normal Chaprola) that reads data, computes predictions using R-variable parameters, and stores the error in an objective R-variable
419
+ 1. Write a VALUE program that reads data, computes predictions using R-variable parameters, and stores the error in an objective R-variable
309
420
  2. HULDRA repeatedly runs your program with different parameter values, using gradient descent to minimize the objective
310
421
  3. When the objective stops improving, HULDRA returns the optimal parameters
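The three steps above can be sketched as a toy optimizer in JavaScript. This is not HULDRA's implementation: the function names, the fixed learning rate, and the stopping threshold are assumptions chosen for illustration. The `objective` callback plays the role of one run of the VALUE program.

```javascript
// Illustrative only: a toy gradient descent, not HULDRA's real algorithm or API.

// Estimate the gradient with central differences: two objective runs per parameter.
function centralGradient(objective, params, deltas) {
  return params.map((_, i) => {
    const up = params.slice();
    const down = params.slice();
    up[i] += deltas[i];
    down[i] -= deltas[i];
    return (objective(up) - objective(down)) / (2 * deltas[i]);
  });
}

// Step downhill until the objective stops improving or maxIterations is reached.
function minimize(objective, start, deltas, rate = 0.1, maxIterations = 500) {
  let params = start.slice();
  let best = objective(params);
  for (let iter = 0; iter < maxIterations; iter++) {
    const grad = centralGradient(objective, params, deltas);
    const next = params.map((p, i) => p - rate * grad[i]);
    const value = objective(next); // one more run to evaluate the new point
    if (best - value < 1e-12) break; // objective stopped improving
    params = next;
    best = value;
  }
  return { params, objective: best };
}

// Toy objective with a known minimum at (3, -1).
const toy = ([a, b]) => (a - 3) ** 2 + (b + 1) ** 2;
const fit = minimize(toy, [0, 0], [0.01, 0.01]);
console.log(fit.params, fit.objective);
```

Each pass through the loop calls the objective once per evaluation and twice per parameter, which is where a cost of `1 + 2 * N` runs per iteration comes from.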
311
422
 
@@ -319,7 +430,7 @@ HULDRA finds the best parameter values for a mathematical model by minimizing th
319
430
 
320
431
  ### Complete Example: Fit a Linear Model
321
432
 
322
- **Goal:** Find `salary = a × years_exp + b` that best fits employee data.
433
+ **Goal:** Find `salary = a * years_exp + b` that best fits employee data.
323
434
 
324
435
  **Step 1: Import data**
325
436
  ```bash
@@ -336,33 +447,29 @@ POST /import {
336
447
  ```
337
448
 
338
449
  **Step 2: Write and compile the VALUE program**
450
+
451
+ Note: HULDRA VALUE programs are the one case where you MUST use R1-R20 directly (HULDRA sets elements) and R21-R40 directly (HULDRA reads objectives). Use implicit assignment for scratch variables R41+.
452
+
339
453
  ```chaprola
340
- // VALUE program: salary = R1 * years_exp + R2
454
+ // SALFIT — Linear salary model: salary = R1 * years_exp + R2
341
455
  // R1 = slope (per-year raise), R2 = base salary
342
456
  // R21 = sum of squared residuals (SSR)
457
+ // Primary: EMP
343
458
 
344
- DEFINE VARIABLE REC R41
345
- DEFINE VARIABLE YRS R42
346
- DEFINE VARIABLE SAL R43
347
- DEFINE VARIABLE PRED R44
348
- DEFINE VARIABLE RESID R45
349
- DEFINE VARIABLE SSR R46
350
-
351
- LET SSR = 0
352
- LET REC = 1
353
- 100 SEEK REC
459
+ LET ssr = 0
460
+ LET rec = 1
461
+ 100 SEEK rec
354
462
  IF EOF GOTO 200
355
- GET YRS FROM P.years_exp
356
- GET SAL FROM P.salary
357
- LET PRED = R1 * YRS
358
- LET PRED = PRED + R2
359
- LET RESID = PRED - SAL
360
- LET RESID = RESID * RESID
361
- LET SSR = SSR + RESID
362
- LET REC = REC + 1
463
+ GET yrs FROM P.years_exp
464
+ GET sal FROM P.salary
465
+ LET pred = R1 * yrs
466
+ LET pred = pred + R2
467
+ LET resid = pred - sal
468
+ LET resid = resid * resid
469
+ LET ssr = ssr + resid
470
+ LET rec = rec + 1
363
471
  GOTO 100
364
- 200 LET R21 = SSR
365
- END
472
+ 200 LET R21 = ssr
366
473
  ```
367
474
 
368
475
  Compile with: `primary_format: "EMP"`
@@ -384,185 +491,67 @@ POST /optimize {
384
491
  }
385
492
  ```
386
493
 
387
- **Response:**
388
- ```json
389
- {
390
- "status": "converged",
391
- "iterations": 12,
392
- "elements": [
393
- {"index": 1, "label": "per_year_raise", "value": 4876.5},
394
- {"index": 2, "label": "base_salary", "value": 46230.1}
395
- ],
396
- "objectives": [
397
- {"index": 1, "label": "SSR", "value": 2841050.3, "goal": 0.0}
398
- ],
399
- "elapsed_seconds": 0.02
400
- }
401
- ```
402
-
403
- **Result:** `salary = $4,877/year × experience + $46,230 base`
404
-
405
- ### Element Parameters Explained
494
+ ### Element Parameters
406
495
 
407
496
  | Field | Description | Guidance |
408
497
  |-------|-------------|----------|
409
- | `index` | Maps to R-variable (1 R1, 2 R2, ...) | Max 20 elements |
498
+ | `index` | Maps to R-variable (1 = R1, 2 = R2, ...) | Max 20 elements |
410
499
  | `label` | Human-readable name | Returned in results |
411
- | `start` | Initial guess | Closer to true value = faster convergence |
412
- | `min`, `max` | Bounds | HULDRA clamps parameters to this range |
413
- | `delta` | Step size for gradient computation | ~0.1% of expected value range. Too large = inaccurate gradients. Too small = numerical noise |
500
+ | `start` | Initial guess | Closer = faster convergence |
501
+ | `min`, `max` | Bounds | HULDRA clamps to this range |
502
+ | `delta` | Step size for gradient | ~0.1% of expected value range |
414
503
 
415
504
  ### Choosing Delta Values
416
505
 
417
- Delta controls how HULDRA estimates gradients (via central differences). Rules of thumb:
418
- - **Dollar amounts** (fares, salaries): `delta: 0.01` to `1.0`
419
- - **Rates/percentages** (per-mile, per-minute): `delta: 0.001` to `0.01`
506
+ - **Dollar amounts**: `delta: 0.01` to `1.0`
507
+ - **Rates/percentages**: `delta: 0.001` to `0.01`
420
508
  - **Counts/integers**: `delta: 0.1` to `1.0`
421
- - **Time values** (hours, peaks): `delta: 0.05` to `0.5`
422
-
423
- If optimization doesn't converge, try making delta smaller.
509
+ - **Time values**: `delta: 0.05` to `0.5`
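As a starting point when none of the categories above fits, the "~0.1% of expected value range" guidance from the element table can be turned into a one-line helper. This sketch assumes "expected value range" means `max - min` of the element's bounds, which is an interpretation rather than something the documentation states; `suggestDelta` is a hypothetical name.

```javascript
// Illustrative only: 0.1% of the bounded range, assuming range = max - min.
function suggestDelta(min, max) {
  return 0.001 * (max - min);
}

// Hypothetical bounds for a per-year raise between 0 and 1000 dollars.
console.log(suggestDelta(0, 1000));
```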
424
510
 
425
- ### Performance & Limits
511
+ ### Performance
426
512
 
427
- HULDRA runs your VALUE program **1 + 2 × N_elements** times per iteration (once for evaluation, twice per element for gradient). With `max_iterations: 100`:
513
+ HULDRA runs your program `1 + 2 * N_elements` times per iteration: one evaluation plus two runs per element for the gradient. Lambda timeout is 900 seconds. For large datasets, sample first: query 200-500 representative records and optimize against the sample.
428
514
 
429
- | Elements | VM runs/iteration | At 100 iterations |
430
- |----------|-------------------|-------------------|
431
- | 2 | 5 | 500 |
432
- | 3 | 7 | 700 |
433
- | 5 | 11 | 1,100 |
434
- | 10 | 21 | 2,100 |
515
+ Use `async_exec: true` for optimizations that might exceed 30 seconds.
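A quick way to decide whether to sample or go async is to estimate the total run count before submitting. The JavaScript sketch below applies the `1 + 2 * N_elements` formula; `secondsPerRun` is a hypothetical figure you would measure yourself, and the helper names are made up.

```javascript
// Runs per iteration follow the formula stated above: 1 + 2 * N_elements.
function huldraRuns(nElements, iterations) {
  const perIteration = 1 + 2 * nElements;
  return { perIteration, total: perIteration * iterations };
}

// secondsPerRun is a hypothetical measured cost of one VALUE program run.
function estimateSeconds(nElements, iterations, secondsPerRun) {
  return huldraRuns(nElements, iterations).total * secondsPerRun;
}

const LAMBDA_TIMEOUT_S = 900;  // from the text above
const ASYNC_THRESHOLD_S = 30;  // from the text above

// 3 elements, 100 iterations, 1 second per run.
const est = estimateSeconds(3, 100, 1.0);
console.log(est, est > ASYNC_THRESHOLD_S, est < LAMBDA_TIMEOUT_S);
```

In this made-up case the estimate is well past the 30-second mark and uncomfortably close to the 900-second timeout, so both async execution and sampling would be worth considering.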
435
516
 
436
- **Lambda timeout is 900 seconds.** If each VM run takes 0.01s (100 records), you're fine. If each run takes 1s (100K records), 3 elements × 100 iterations = 700s — cutting it close.
437
-
438
- **Strategy for large datasets:** Sample first. Query 200–500 representative records into a smaller dataset, optimize against that. The coefficients transfer to the full dataset.
439
-
440
- ```bash
441
- # Sample 500 records from a large dataset
442
- POST /query {userid, project, file: "BIGDATA", limit: 500, offset: 100000}
443
- # Import the sample
444
- POST /import {userid, project, name: "SAMPLE", data: [...results...]}
445
- # Optimize against the sample
446
- POST /optimize {... primary_file: "SAMPLE" ...}
447
- ```
448
-
449
- ### Async Optimization
450
-
451
- For optimizations that might exceed 30 seconds (API Gateway timeout), use async mode:
452
-
453
- ```bash
454
- POST /optimize {
455
- ... async_exec: true ...
456
- }
457
- # Response: {status: "running", job_id: "20260325_..."}
458
-
459
- POST /optimize/status {userid, project, job_id: "20260325_..."}
460
- # Response: {status: "converged", elements: [...], ...}
461
- ```
462
-
463
- ### Multi-Objective Optimization
464
-
465
- HULDRA can minimize multiple objectives simultaneously with different weights:
466
-
467
- ```bash
468
- objectives: [
469
- {index: 1, label: "price_error", goal: 0.0, weight: 1.0},
470
- {index: 2, label: "volume_error", goal: 0.0, weight: 10.0}
471
- ]
472
- ```
473
-
474
- Higher weight = more important. HULDRA minimizes `Q = sum(weight × (value - goal)²)`.
475
-
476
- ### Interpreting Results
477
-
478
- - **`status: "converged"`** — Optimal parameters found. The objective stopped improving.
479
- - **`status: "timeout"`** — Hit 900s wall clock. Results are the best found so far — often still useful.
480
- - **`total_objective`** — The raw Q value. Compare across runs, not in absolute terms. Lower = better fit.
481
- - **`SSR` (objective value)** — Sum of squared residuals. Divide by record count for mean squared error. Take the square root for RMSE in the same units as your data.
482
- - **`dq_dx` on elements** — Gradient. Values near zero mean the parameter is well-optimized. Large values may indicate the bounds are too tight.
483
-
484
- ### Model Catalog — Which Formula to Try
485
-
486
- HULDRA fits any model expressible with Chaprola's math: `+`, `-`, `*`, `/`, `EXP`, `LOG`, `SQRT`, `ABS`, `POW`, and `IF` branching. Use this catalog to pick the right model for your data shape.
517
+ ### Model Catalog
487
518
 
488
519
  | Model | Formula | When to use | Chaprola math |
489
520
  |-------|---------|-------------|---------------|
490
- | **Linear** | `y = R1*x + R2` | Proportional relationships, constant rate | `*`, `+` |
491
- | **Multi-linear** | `y = R1*x1 + R2*x2 + R3` | Multiple independent factors | `*`, `+` |
492
- | **Quadratic** | `y = R1*x^2 + R2*x + R3` | Accelerating/decelerating curves, area scaling | `*`, `+`, `POW` |
493
- | **Exponential growth** | `y = R1 * EXP(R2*x)` | Compound growth, population, interest | `EXP`, `*` |
494
- | **Exponential decay** | `y = R1 * EXP(-R2*x) + R3` | Drug clearance, radioactive decay, cooling | `EXP`, `*`, `-` |
495
- | **Power law** | `y = R1 * POW(x, R2)` | Scaling laws (Zipf, Kleiber), fractal relationships | `POW`, `*` |
496
- | **Logarithmic** | `y = R1 * LOG(x) + R2` | Diminishing returns, perception (Weber-Fechner) | `LOG`, `*`, `+` |
497
- | **Gaussian** | `y = R1 * EXP(-(x-R2)^2/(2*R3^2))` | Bell curves, distributions, demand peaks | `EXP`, `*`, `/` |
498
- | **Logistic (S-curve)** | `y = R1 / (1 + EXP(-R2*(x-R3)))` | Adoption curves, saturation, carrying capacity | `EXP`, `/`, `+` |
499
- | **Inverse** | `y = R1/x + R2` | Boyle's law, unit cost vs volume | `/`, `+` |
500
- | **Square root** | `y = R1 * SQRT(x) + R2` | Flow rates (Bernoulli), risk vs portfolio size | `SQRT`, `*`, `+` |
501
-
502
- **How to choose:** Look at your data's shape.
503
- - Straight line → linear or multi-linear
504
- - Curves upward faster and faster → exponential growth or quadratic
505
- - Curves upward then flattens → logarithmic, square root, or logistic
506
- - Drops fast then levels off → exponential decay or inverse
507
- - Has a peak/hump → Gaussian
508
- - Straight on log-log axes → power law
521
+ | **Linear** | `y = R1*x + R2` | Proportional relationships | `*`, `+` |
522
+ | **Multi-linear** | `y = R1*x1 + R2*x2 + R3` | Multiple factors | `*`, `+` |
523
+ | **Quadratic** | `y = R1*x^2 + R2*x + R3` | Accelerating curves | `*`, `+`, `POW` |
524
+ | **Exponential growth** | `y = R1 * EXP(R2*x)` | Compound growth | `EXP`, `*` |
525
+ | **Exponential decay** | `y = R1 * EXP(-R2*x) + R3` | Drug clearance, cooling | `EXP`, `*`, `-` |
526
+ | **Power law** | `y = R1 * POW(x, R2)` | Scaling laws | `POW`, `*` |
527
+ | **Logarithmic** | `y = R1 * LOG(x) + R2` | Diminishing returns | `LOG`, `*`, `+` |
528
+ | **Logistic** | `y = R1 / (1 + EXP(-R2*(x-R3)))` | S-curves, saturation | `EXP`, `/`, `+` |
509
529
 
510
530
  ### Nonlinear VALUE Program Patterns
511
531
 
512
532
  **Exponential decay:** `y = R1 * exp(-R2 * x) + R3`
513
533
  ```chaprola
514
- LET ARG = R2 * X
515
- LET ARG = ARG * -1
516
- LET PRED = EXP ARG
517
- LET PRED = PRED * R1
518
- LET PRED = PRED + R3
534
+ LET arg = R2 * x
535
+ LET arg = arg * -1
536
+ LET pred = EXP arg
537
+ LET pred = pred * R1
538
+ LET pred = pred + R3
519
539
  ```
520
540
 
521
541
  **Power law:** `y = R1 * x^R2`
522
542
  ```chaprola
523
- LET PRED = POW X R2
524
- LET PRED = PRED * R1
525
- ```
526
-
527
- **Gaussian:** `y = R1 * exp(-(x - R2)^2 / (2 * R3^2))`
528
- ```chaprola
529
- LET DIFF = X - R2
530
- LET DIFF = DIFF * DIFF
531
- LET DENOM = R3 * R3
532
- LET DENOM = DENOM * 2
533
- LET ARG = DIFF / DENOM
534
- LET ARG = ARG * -1
535
- LET PRED = EXP ARG
536
- LET PRED = PRED * R1
543
+ LET pred = POW x R2
544
+ LET pred = pred * R1
537
545
  ```
538
546
 
539
547
  **Logistic S-curve:** `y = R1 / (1 + exp(-R2 * (x - R3)))`
540
548
  ```chaprola
541
- LET ARG = X - R3
542
- LET ARG = ARG * R2
543
- LET ARG = ARG * -1
544
- LET DENOM = EXP ARG
545
- LET DENOM = DENOM + 1
546
- LET PRED = R1 / DENOM
549
+ LET arg = x - R3
550
+ LET arg = arg * R2
551
+ LET arg = arg * -1
552
+ LET denom = EXP arg
553
+ LET denom = denom + 1
554
+ LET pred = R1 / denom
547
555
  ```
548
556
 
549
- **Logarithmic:** `y = R1 * ln(x) + R2`
550
- ```chaprola
551
- LET PRED = LOG X
552
- LET PRED = PRED * R1
553
- LET PRED = PRED + R2
554
- ```
555
-
556
- All patterns follow the same loop structure: SEEK records, GET fields, compute PRED, accumulate `(PRED - OBS)^2` in SSR, store SSR in R21 at the end.
557
-
558
- ### Agent Workflow Summary
559
-
560
- 1. **Inspect** — Call `/format` to see what fields exist
561
- 2. **Sample** — Use `/query` with `limit` to get a manageable subset (200–500 records)
562
- 3. **Import sample** — `/import` the subset as a new small dataset
563
- 4. **Hypothesize** — Propose a model relating the fields
564
- 5. **Write VALUE program** — Loop through records, compute predicted vs actual, accumulate SSR in R21
565
- 6. **Compile** — `/compile` with `primary_format` pointing to the sample
566
- 7. **Optimize** — `/optimize` with elements, objectives, and the sample as primary_file
567
- 8. **Interpret** — Read the converged element values — those are your model coefficients
568
- 9. **Iterate** — If SSR is high, try a different model (add terms, try nonlinear)
557
+ All patterns follow the same loop structure: SEEK records, GET fields, compute pred, accumulate `(pred - obs)^2` in ssr, store ssr in R21 at the end.
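For readers who want to check a model outside Chaprola first, the loop structure can be mirrored in JavaScript. The sketch below follows the exponential decay pattern step by step and accumulates the sum of squared residuals. It is an illustration only: the function names and the sample records are made up, and plain function arguments stand in for R1, R2, and R3.

```javascript
// Exponential decay prediction, one operation per step like the VALUE program.
function predictDecay(r1, r2, r3, x) {
  let arg = r2 * x;
  arg = arg * -1;
  let pred = Math.exp(arg);
  pred = pred * r1;
  pred = pred + r3;
  return pred;
}

// Sum of squared residuals over all records: the value a VALUE program stores in R21.
function ssr(records, r1, r2, r3) {
  let total = 0;
  for (const { x, obs } of records) {
    const resid = predictDecay(r1, r2, r3, x) - obs;
    total += resid * resid;
  }
  return total;
}

// Made-up records generated from r1 = 10, r2 = 0.5, r3 = 2.
const records = [0, 1, 2, 4].map((x) => ({ x, obs: 10 * Math.exp(-0.5 * x) + 2 }));

console.log(ssr(records, 10, 0.5, 2)); // parameters that generated the data
console.log(ssr(records, 9, 0.5, 2));  // a worse guess gives a larger value
```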