@chaprola/mcp-server 1.3.0 → 1.3.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@chaprola/mcp-server",
- "version": "1.3.0",
+ "version": "1.3.2",
  "description": "MCP server for Chaprola — agent-first data platform. Gives AI agents 46 tools for structured data storage, record CRUD, querying, schema inspection, web search, URL fetching, scheduled jobs, and execution via plain HTTP.",
  "type": "module",
  "main": "dist/index.js",
@@ -136,17 +136,270 @@ Supports: CSV, TSV, JSON, NDJSON, Parquet (zstd/snappy/lz4), Excel (.xlsx/.xls).
  AI instructions are optional — omit to import all columns as-is.
  Lambda: 10 GB /tmp, 900s timeout, 500 MB download limit.
 
- ## HULDRA Optimization Pattern
+ ## HULDRA Optimization — Nonlinear Parameter Fitting
 
- R1–R20 = elements (HULDRA sets before each run)
- R21–R40 = objectives (your program computes, HULDRA reads after)
- R41–R50 = scratch space
+ HULDRA finds the best parameter values for a mathematical model by minimizing the difference between model predictions and observed data. You propose a model; HULDRA finds the coefficients.
 
+ ### How It Works
+
+ 1. You write a VALUE program (normal Chaprola) that reads data, computes predictions using R-variable parameters, and stores the error in an objective R-variable.
+ 2. HULDRA repeatedly runs your program with different parameter values, using gradient descent to minimize the objective.
+ 3. When the objective stops improving, HULDRA returns the optimal parameters.
+
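The loop in steps 1–3 can be sketched in miniature. This is an illustrative Python toy of finite-difference gradient descent, not HULDRA's actual implementation; `run_program` stands in for one VM run, and the `ssr` closure stands in for a compiled VALUE program:

```python
def optimize(run_program, start, delta=0.01, lr=0.02, max_iterations=100, tol=1e-9):
    """Toy HULDRA loop: minimize run_program(params) by gradient descent."""
    params = list(start)
    best = run_program(params)              # one evaluation run
    for _ in range(max_iterations):
        grad = []
        for i in range(len(params)):        # two runs per element (central difference)
            hi, lo = params[:], params[:]
            hi[i] += delta
            lo[i] -= delta
            grad.append((run_program(hi) - run_program(lo)) / (2 * delta))
        params = [p - lr * g for p, g in zip(params, grad)]
        new = run_program(params)
        if abs(best - new) < tol:           # objective stopped improving
            break
        best = new
    return params

# Toy "VALUE program": SSR of y = a*x against three records
data = [(1, 2), (2, 4), (3, 6)]
ssr = lambda p: sum((p[0] * x - y) ** 2 for x, y in data)
# optimize(ssr, [0.0]) converges to a ≈ 2
```

The learning rate here is a fixed toy value; the real optimizer's step-size control is internal to HULDRA.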
+ ### R-Variable Interface
+
+ | Range | Purpose | Who sets it |
+ |-------|---------|-------------|
+ | R1–R20 | **Elements** (parameters to optimize) | HULDRA sets these before each VM run |
+ | R21–R40 | **Objectives** (error metrics) | Your program computes and stores these |
+ | R41–R50 | **Scratch space** | Your program uses these for temp variables |
+
+ ### Complete Example: Fit a Linear Model
+
+ **Goal:** Find `salary = a × years_exp + b` that best fits employee data.
+
+ **Step 1: Import data**
+ ```bash
+ POST /import {
+   userid, project: "fit", name: "EMP",
+   data: [
+     {"years_exp": 2, "salary": 55000},
+     {"years_exp": 5, "salary": 72000},
+     {"years_exp": 8, "salary": 88000},
+     {"years_exp": 12, "salary": 105000},
+     {"years_exp": 15, "salary": 118000}
+   ]
+ }
+ ```
+
+ **Step 2: Write and compile the VALUE program**
+ ```chaprola
+ // VALUE program: salary = R1 * years_exp + R2
+ // R1 = slope (per-year raise), R2 = base salary
+ // R21 = sum of squared residuals (SSR)
+
+ DEFINE VARIABLE REC R41
+ DEFINE VARIABLE YRS R42
+ DEFINE VARIABLE SAL R43
+ DEFINE VARIABLE PRED R44
+ DEFINE VARIABLE RESID R45
+ DEFINE VARIABLE SSR R46
+
+ LET SSR = 0
+ LET REC = 1
+ 100 SEEK REC
+ IF EOF GOTO 200
+ GET YRS FROM P.years_exp
+ GET SAL FROM P.salary
+ LET PRED = R1 * YRS
+ LET PRED = PRED + R2
+ LET RESID = PRED - SAL
+ LET RESID = RESID * RESID
+ LET SSR = SSR + RESID
+ LET REC = REC + 1
+ GOTO 100
+ 200 LET R21 = SSR
+ END
+ ```
+
+ Compile with: `primary_format: "EMP"`
+
+ **Step 3: Run HULDRA**
  ```bash
  POST /optimize {
- userid, project, program: "FITMODEL", primary_file: "DATA",
- elements: [{index: 1, label: "slope", start: 0.0, min: -100, max: 100, delta: 0.01}],
- objectives: [{index: 1, label: "SSR", goal: 0.0, weight: 1.0}],
+ userid, project: "fit",
+ program: "SALFIT",
+ primary_file: "EMP",
+ elements: [
+   {index: 1, label: "per_year_raise", start: 5000, min: 0, max: 20000, delta: 10},
+   {index: 2, label: "base_salary", start: 40000, min: 0, max: 100000, delta: 100}
+ ],
+ objectives: [
+   {index: 1, label: "SSR", goal: 0.0, weight: 1.0}
+ ],
  max_iterations: 100
  }
  ```
+
+ **Response:**
+ ```json
+ {
+   "status": "converged",
+   "iterations": 12,
+   "elements": [
+     {"index": 1, "label": "per_year_raise", "value": 4805.9},
+     {"index": 2, "label": "base_salary", "value": 47230.8}
+   ],
+   "objectives": [
+     {"index": 1, "label": "SSR", "value": 11084249.0, "goal": 0.0}
+   ],
+   "elapsed_seconds": 0.02
+ }
+ ```
+
+ **Result:** `salary ≈ $4,806/year × experience + $47,231 base`
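For a plain linear model, the converged coefficients can be cross-checked against the closed-form least-squares solution. An offline sanity check in Python, using the five records from Step 1:

```python
# Ordinary least squares for salary = a * years_exp + b
xs = [2, 5, 8, 12, 15]
ys = [55000, 72000, 88000, 105000, 118000]
n = len(xs)
sx, sy = sum(xs), sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))
sxx = sum(x * x for x in xs)

a = (n * sxy - sx * sy) / (n * sxx - sx * sx)            # slope ≈ 4805.9
b = (sy - a * sx) / n                                    # intercept ≈ 47230.8
ssr = sum((a * x + b - y) ** 2 for x, y in zip(xs, ys))  # minimum SSR ≈ 11,084,249
```

Nonlinear models have no closed form; that is where HULDRA's iterative search earns its keep.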
+
+ ### Element Parameters Explained
+
+ | Field | Description | Guidance |
+ |-------|-------------|----------|
+ | `index` | Maps to R-variable (1 → R1, 2 → R2, ...) | Max 20 elements |
+ | `label` | Human-readable name | Returned in results |
+ | `start` | Initial guess | Closer to true value = faster convergence |
+ | `min`, `max` | Bounds | HULDRA clamps parameters to this range |
+ | `delta` | Step size for gradient computation | ~0.1% of expected value range. Too large = inaccurate gradients. Too small = numerical noise |
+
+ ### Choosing Delta Values
+
+ Delta controls how HULDRA estimates gradients (via central differences). Rules of thumb:
+ - **Dollar amounts** (fares, salaries): `delta: 0.01` to `1.0`
+ - **Rates/percentages** (per-mile, per-minute): `delta: 0.001` to `0.01`
+ - **Counts/integers**: `delta: 0.1` to `1.0`
+ - **Time values** (hours, peaks): `delta: 0.05` to `0.5`
+
+ If optimization doesn't converge, try making delta smaller.
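The effect of delta on gradient accuracy is easy to demonstrate with a central difference in isolation (plain Python, illustrative only; the true derivative of `exp(x)` at 0 is exactly 1):

```python
import math

def central_diff(f, x, delta):
    # Finite-difference gradient estimate: (f(x+d) - f(x-d)) / 2d
    return (f(x + delta) - f(x - delta)) / (2 * delta)

central_diff(math.exp, 0.0, 1.0)    # ~1.175: delta too large, about 17% off
central_diff(math.exp, 0.0, 0.001)  # ~1.0000002: a well-scaled delta
```

Shrinking delta further eventually stops helping: once `f(x+d)` and `f(x-d)` agree to near machine precision, the subtraction amplifies rounding noise.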
+
+ ### Performance & Limits
+
+ HULDRA runs your VALUE program **1 + 2 × N_elements** times per iteration (once for evaluation, twice per element for the gradient). With `max_iterations: 100`:
+
+ | Elements | VM runs/iteration | At 100 iterations |
+ |----------|-------------------|-------------------|
+ | 2 | 5 | 500 |
+ | 3 | 7 | 700 |
+ | 5 | 11 | 1,100 |
+ | 10 | 21 | 2,100 |
+
+ **Lambda timeout is 900 seconds.** If each VM run takes 0.01 s (100 records), you're fine. If each run takes 1 s (100K records), 3 elements × 100 iterations = 700 runs ≈ 700 s — cutting it close.
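A quick back-of-envelope budget check using the run-count formula above (a hypothetical helper, not part of the API):

```python
def vm_runs(n_elements, iterations):
    # one evaluation run plus two gradient runs per element, each iteration
    return (1 + 2 * n_elements) * iterations

def fits_in_lambda(n_elements, iterations, seconds_per_run, timeout=900):
    return vm_runs(n_elements, iterations) * seconds_per_run < timeout

vm_runs(3, 100)               # 700
fits_in_lambda(3, 100, 0.01)  # True: about 7 s total
fits_in_lambda(5, 100, 1.0)   # False: 1,100 s exceeds the 900 s budget
```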
+
+ **Strategy for large datasets:** Sample first. Query 200–500 representative records into a smaller dataset and optimize against that. The coefficients transfer to the full dataset.
+
+ ```bash
+ # Sample 500 records from a large dataset
+ POST /query {userid, project, file: "BIGDATA", limit: 500, offset: 100000}
+ # Import the sample
+ POST /import {userid, project, name: "SAMPLE", data: [...results...]}
+ # Optimize against the sample
+ POST /optimize {... primary_file: "SAMPLE" ...}
+ ```
+
+ ### Async Optimization
+
+ For optimizations that might exceed 30 seconds (the API Gateway timeout), use async mode:
+
+ ```bash
+ POST /optimize {
+   ... async_exec: true ...
+ }
+ # Response: {status: "running", job_id: "20260325_..."}
+
+ POST /optimize/status {userid, project, job_id: "20260325_..."}
+ # Response: {status: "converged", elements: [...], ...}
+ ```
+
+ ### Multi-Objective Optimization
+
+ HULDRA can minimize multiple objectives simultaneously with different weights:
+
+ ```bash
+ objectives: [
+   {index: 1, label: "price_error", goal: 0.0, weight: 1.0},
+   {index: 2, label: "volume_error", goal: 0.0, weight: 10.0}
+ ]
+ ```
+
+ Higher weight = more important. HULDRA minimizes `Q = sum(weight × (value - goal)²)`.
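Worked through with hypothetical objective values (not API output), the formula behaves like this:

```python
# Q = sum(weight * (value - goal)^2) over all objectives
objectives = [
    {"label": "price_error",  "value": 2.0, "goal": 0.0, "weight": 1.0},
    {"label": "volume_error", "value": 1.0, "goal": 0.0, "weight": 10.0},
]
q = sum(o["weight"] * (o["value"] - o["goal"]) ** 2 for o in objectives)
# q = 1*2^2 + 10*1^2 = 14: the weight-10 objective dominates despite its smaller error
```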
+
+ ### Interpreting Results
+
+ - **`status: "converged"`** — Optimal parameters found; the objective stopped improving.
+ - **`status: "timeout"`** — Hit the 900 s wall clock. Results are the best found so far — often still useful.
+ - **`total_objective`** — The raw Q value. Compare it across runs, not in absolute terms. Lower = better fit.
+ - **`SSR` (objective value)** — Sum of squared residuals. Divide by the record count for mean squared error; take the square root of that for RMSE in the same units as your data.
+ - **`dq_dx` on elements** — Gradient. Values near zero mean the parameter is well-optimized. Large values may indicate the bounds are too tight.
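The SSR conversion from the list above, as a small helper (not part of the API):

```python
import math

def rmse_from_ssr(ssr, n_records):
    # SSR / n is the mean squared error; sqrt brings it back to the data's own units
    return math.sqrt(ssr / n_records)

rmse_from_ssr(500.0, 5)  # 10.0: the model misses by about 10 units per record
```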
+
+ ### Model Catalog — Which Formula to Try
+
+ HULDRA fits any model expressible with Chaprola's math: `+`, `-`, `*`, `/`, `EXP`, `LOG`, `SQRT`, `ABS`, `POW`, and `IF` branching. Use this catalog to pick the right model for your data's shape.
+
+ | Model | Formula | When to use | Chaprola math |
+ |-------|---------|-------------|---------------|
+ | **Linear** | `y = R1*x + R2` | Proportional relationships, constant rate | `*`, `+` |
+ | **Multi-linear** | `y = R1*x1 + R2*x2 + R3` | Multiple independent factors | `*`, `+` |
+ | **Quadratic** | `y = R1*x^2 + R2*x + R3` | Accelerating/decelerating curves, area scaling | `*`, `+`, `POW` |
+ | **Exponential growth** | `y = R1 * EXP(R2*x)` | Compound growth, population, interest | `EXP`, `*` |
+ | **Exponential decay** | `y = R1 * EXP(-R2*x) + R3` | Drug clearance, radioactive decay, cooling | `EXP`, `*`, `-` |
+ | **Power law** | `y = R1 * POW(x, R2)` | Scaling laws (Zipf, Kleiber), fractal relationships | `POW`, `*` |
+ | **Logarithmic** | `y = R1 * LOG(x) + R2` | Diminishing returns, perception (Weber-Fechner) | `LOG`, `*`, `+` |
+ | **Gaussian** | `y = R1 * EXP(-(x-R2)^2/(2*R3^2))` | Bell curves, distributions, demand peaks | `EXP`, `*`, `/` |
+ | **Logistic (S-curve)** | `y = R1 / (1 + EXP(-R2*(x-R3)))` | Adoption curves, saturation, carrying capacity | `EXP`, `/`, `+` |
+ | **Inverse** | `y = R1/x + R2` | Boyle's law, unit cost vs volume | `/`, `+` |
+ | **Square root** | `y = R1 * SQRT(x) + R2` | Flow rates (Bernoulli), risk vs portfolio size | `SQRT`, `*`, `+` |
+
+ **How to choose:** Look at your data's shape.
+ - Straight line → linear or multi-linear
+ - Curves upward faster and faster → exponential growth or quadratic
+ - Curves upward then flattens → logarithmic, square root, or logistic
+ - Drops fast then levels off → exponential decay or inverse
+ - Has a peak/hump → Gaussian
+ - Straight on log-log axes → power law
+
+ ### Nonlinear VALUE Program Patterns
+
+ **Exponential decay:** `y = R1 * exp(-R2 * x) + R3`
+ ```chaprola
+ LET ARG = R2 * X
+ LET ARG = ARG * -1
+ LET PRED = EXP ARG
+ LET PRED = PRED * R1
+ LET PRED = PRED + R3
+ ```
+
+ **Power law:** `y = R1 * x^R2`
+ ```chaprola
+ LET PRED = POW X R2
+ LET PRED = PRED * R1
+ ```
+
+ **Gaussian:** `y = R1 * exp(-(x - R2)^2 / (2 * R3^2))`
+ ```chaprola
+ LET DIFF = X - R2
+ LET DIFF = DIFF * DIFF
+ LET DENOM = R3 * R3
+ LET DENOM = DENOM * 2
+ LET ARG = DIFF / DENOM
+ LET ARG = ARG * -1
+ LET PRED = EXP ARG
+ LET PRED = PRED * R1
+ ```
+
+ **Logistic S-curve:** `y = R1 / (1 + exp(-R2 * (x - R3)))`
+ ```chaprola
+ LET ARG = X - R3
+ LET ARG = ARG * R2
+ LET ARG = ARG * -1
+ LET DENOM = EXP ARG
+ LET DENOM = DENOM + 1
+ LET PRED = R1 / DENOM
+ ```
+
+ **Logarithmic:** `y = R1 * ln(x) + R2`
+ ```chaprola
+ LET PRED = LOG X
+ LET PRED = PRED * R1
+ LET PRED = PRED + R2
+ ```
+
+ All patterns follow the same loop structure: SEEK records, GET fields, compute PRED, accumulate `(PRED - OBS)^2` in SSR, and store SSR in R21 at the end.
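When decomposing a formula into single-operation `LET` steps, it is worth verifying the step order against the direct expression. A Python mirror of the logistic pattern (one statement per `LET`, illustrative only):

```python
import math

def logistic_direct(x, r1, r2, r3):
    return r1 / (1 + math.exp(-r2 * (x - r3)))

def logistic_stepwise(x, r1, r2, r3):
    # mirrors the chaprola LET sequence, one operation at a time
    arg = x - r3
    arg = arg * r2
    arg = arg * -1
    denom = math.exp(arg)
    denom = denom + 1
    return r1 / denom

# identical results for any inputs
```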
+
+ ### Agent Workflow Summary
+
+ 1. **Inspect** — Call `/format` to see what fields exist
+ 2. **Sample** — Use `/query` with `limit` to get a manageable subset (200–500 records)
+ 3. **Import sample** — `/import` the subset as a new small dataset
+ 4. **Hypothesize** — Propose a model relating the fields
+ 5. **Write VALUE program** — Loop through records, compute predicted vs actual, accumulate SSR in R21
+ 6. **Compile** — `/compile` with `primary_format` pointing to the sample
+ 7. **Optimize** — `/optimize` with elements, objectives, and the sample as `primary_file`
+ 8. **Interpret** — Read the converged element values; those are your model coefficients
+ 9. **Iterate** — If SSR is high, try a different model (add terms, go nonlinear)
@@ -72,6 +72,33 @@ Only one secondary file can be open. CLOSE before opening another. Save any need
  ### CLOSE flushes writes
  Always CLOSE before END if you wrote to the secondary file. Unflushed writes are lost.
 
+ ## HULDRA Optimization
+
+ ### Use R41–R50 for scratch variables, not R1–R20
+ R1–R20 are reserved for HULDRA elements, and R21–R40 for objectives. Your VALUE program's DEFINE VARIABLE declarations must use R41–R50 only.
+ ```chaprola
+ // WRONG: DEFINE VARIABLE counter R1 (HULDRA will overwrite this)
+ // RIGHT: DEFINE VARIABLE counter R41
+ ```
+
+ ### Sample large datasets before optimizing
+ HULDRA runs your program `1 + 2 × N_elements` times per iteration. With 3 elements and 100 iterations, that's 700 VM runs. If each run processes 1M records (7+ seconds), the total is 5,000+ seconds — well beyond the 900-second Lambda timeout. Query 200–500 records into a sample dataset and optimize against that.
+
+ ### Delta too large = bad convergence
+ If HULDRA doesn't converge or oscillates, reduce `delta`. Start with ~0.1% of the expected parameter range. For dollar amounts, try `delta: 0.01`. For rates, try `delta: 0.001`.
+
+ ### Always initialize SSR to zero
+ Your VALUE program accumulates squared residuals across all records. If you forget `LET SSR = 0` before the loop, SSR carries over garbage from a previous HULDRA iteration (R-variables persist between runs within an optimization).
+
+ ### Filter bad data in the VALUE program
+ Negative fares, zero distances, and other anomalies will corrupt your fit. Add guards:
+ ```chaprola
+ GET FARE FROM P.fare
+ IF FARE LE 0 GOTO 300 // skip bad records
+ // ... compute residual ...
+ 300 LET REC = REC + 1
+ ```
+
  ## Email
 
  ### Content moderation on outbound