@chaprola/mcp-server 1.7.0 → 1.9.0

Files changed (9)

package/dist/index.js +86 -36
package/package.json +1 -1
package/references/cookbook.md +313 -324
package/references/gotchas.md +9 -6
package/references/ref-apps.md +2 -0
package/references/ref-import.md +37 -0
package/references/ref-pivot.md +11 -0
package/references/ref-programs.md +78 -1
package/references/ref-query.md +51 -0

package/references/gotchas.md CHANGED

@@ -60,8 +60,11 @@ IF EQUAL "CREDIT" U.76 GOTO 200
 ### MOVE literal auto-pads to field width
 `MOVE "Jones" P.name` auto-fills the rest of the field with blanks. No need to clear first.
-### DEFINE VARIABLE names must not collide with field names
-If the format has a `balance` field, don't `DEFINE VARIABLE balance R41`. Use `bal` instead. The compiler confuses the alias with the field name.
+### Don't use DEFINE VARIABLE
+Use implicit assignment: `LET rec = 1`. The compiler assigns R-variable slots automatically. DEFINE VARIABLE is legacy boilerplate.
+### Every program needs an intent file (.DS)
+One paragraph: what the program does, what parameters it accepts, what output it produces. The project review system flags programs without intents.
 ### R-variables are floating point
 All R1–R99 are 64-bit floats. `7 / 2 = 3.5`. Use PUT with `I` format to display as integer.
@@ -120,11 +123,11 @@ Always CLOSE before END if you wrote to the secondary file. Unflushed writes are
 ## HULDRA Optimization
-### Use R41–R99 for scratch variables, not R1–R20
-R1–R20 are reserved for HULDRA elements. R21–R40 are reserved for objectives. Your VALUE program's DEFINE VARIABLE declarations must use R41–R99 only.
+### HULDRA programs: don't use R1–R40 for your variables
+R1–R20 are reserved for HULDRA elements. R21–R40 are reserved for objectives. Use implicit assignment with names that the compiler will place in R41+:
 ```chaprola
-// WRONG: DEFINE VARIABLE counter R1  (HULDRA will overwrite this)
-// RIGHT: DEFINE VARIABLE counter R41
+LET rec = 1    // compiler assigns to R41+, safe from HULDRA
+LET ssr = 0    // accumulate error here, then LET R21 = ssr at the end
 ```
 ### Sample large datasets before optimizing

package/references/ref-apps.md CHANGED

@@ -37,6 +37,8 @@ Browser → React App → api.chaprola.org (site key in Authorization header)
    })
    ```
+**BAA requirement:** All data endpoints (including `/query`) require BAA signing. For public-facing apps that haven't signed a BAA, use published reports (`/report`) for reads instead of `/query`. Move filtering and aggregation into Chaprola programs (QUERY + TABULATE commands), publish them, and call `/report` from the frontend. Reserve the site key for write-only operations like `/insert-record`.
 **Security model:** The site key is checked against the `Origin` HTTP header, which browsers set automatically. This prevents other websites from using your key (CORS-level protection). However, Origin headers are trivially spoofable from non-browser clients (curl, Postman, scripts). Anyone who extracts the site key from your JavaScript has full access to the account's data. **Use this pattern only for public or semi-public data** — dashboards, product catalogs, published reports. For private data, use the multi-user pattern (each user authenticates individually) or the enterprise proxy pattern.
 ### 2. Multi-User App (each user has their own account)

package/references/ref-import.md CHANGED

@@ -44,3 +44,40 @@ Optional `format: "fhir"` for FHIR JSON reconstruction.
 ## POST /download
 `{userid, project, file, type}` → `{download_url, expires_in, size_bytes}`
 Type: `data`, `format`, `source`, `proc`, `output`.
+## sort_columns
+Reorder fields and physically sort data at import time, creating a self-indexing (clustered) data file.
+```json
+POST /import {
+  "userid": "...", "project": "...", "name": "STAFF",
+  "sort_columns": ["username", "kanji"],
+  "data": [...]
+}
+```
+- Reorders fields so sort columns come first in the format file
+- Sorts data by those columns at import time
+- Marks `KEY:1`, `KEY:2` in `.F` metadata
+- Enables binary search on the clustered key during QUERY
+## split_by
+Split a dataset into per-group data files at import time. One `.DA` file per distinct value of the split field, with a shared `.F` format file.
+```json
+POST /import {
+  "userid": "...", "project": "...", "name": "orders",
+  "split_by": "region",
+  "data": [...]
+}
+```
+- Creates one `.DA` per distinct value of the split field
+- Shared `.F` format file
+- Response includes `files_created` and `groups` object
+## 5GB File Size Limit
+Maximum 5GB per data file. Returns 413 error if exceeded. Use `split_by` for larger datasets.
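
Reviewer note on `sort_columns`: the hunk above says the import physically sorts the data by the key columns so that QUERY can binary-search the clustered key. The idea can be sketched in Python as follows. This is an illustration only, not the server's implementation; `sort_by_columns`, `find_by_leading_key`, and the sample records are hypothetical names and data.

```python
from bisect import bisect_left, bisect_right

def sort_by_columns(records, sort_columns):
    """Return records ordered by the given columns, as a clustered file would be."""
    return sorted(records, key=lambda r: tuple(r[c] for c in sort_columns))

def find_by_leading_key(sorted_records, leading_keys, value):
    """Binary-search the leading key column; return the matching slice of records."""
    lo = bisect_left(leading_keys, value)
    hi = bisect_right(leading_keys, value)
    return sorted_records[lo:hi]

# Hypothetical sample data, sorted the way "sort_columns": ["username", "kanji"] describes.
staff = sort_by_columns(
    [
        {"username": "mika", "kanji": "k2"},
        {"username": "aki", "kanji": "k9"},
        {"username": "mika", "kanji": "k1"},
        {"username": "ben", "kanji": "k4"},
    ],
    ["username", "kanji"],
)
# The leading key column is already in sorted order, so it can be searched directly.
usernames = [r["username"] for r in staff]
mika_rows = find_by_leading_key(staff, usernames, "mika")
```

Because equal keys are stored next to each other, a lookup returns one contiguous slice, which is why no separate index is needed for the leading key.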

package/references/ref-pivot.md CHANGED

@@ -44,3 +44,14 @@ For COUNT: `"value": "department", "aggregate": "count"`
 SQL equivalent: `SELECT department, year, SUM(revenue) FROM sales GROUP BY department, year`
 Row and column totals are included automatically in the response.
+## TABULATE in Programs
+The `/query` pivot feature is also available in the Chaprola language via the TABULATE command:
+```chaprola
+TABULATE SALES SUM revenue FOR department VS year WHERE year GE "2020" INTO trends
+PRINT TABULATE trends AS CSV
+```
+TABULATE produces a matrix in memory — same cross-tabulation as `/query` pivot, but executed inside a program with dynamic PARAM values and chaining with QUERY results.
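
Reviewer note on TABULATE: the cross-tabulation described in the hunk above (`SUM revenue FOR department VS year WHERE year GE "2020"`) amounts to a grouped sum laid out as a rows-by-columns matrix. A minimal Python sketch of that computation is below. It is not how Chaprola implements TABULATE; the function name, the sample data, and the choice to show empty cells as `0.0` are assumptions made for illustration.

```python
from collections import defaultdict

def cross_tabulate(records, row_field, col_field, value_field):
    """Sum value_field for every (row, column) pair and lay the sums out as a matrix."""
    cells = defaultdict(float)
    for record in records:
        cells[(record[row_field], record[col_field])] += record[value_field]
    rows = sorted({key[0] for key in cells})
    cols = sorted({key[1] for key in cells})
    # Assumption: a pair with no records is shown as 0.0.
    matrix = [[cells.get((row, col), 0.0) for col in cols] for row in rows]
    return rows, cols, matrix

# Hypothetical sample data.
sales = [
    {"department": "eng", "year": "2019", "revenue": 5.0},
    {"department": "eng", "year": "2020", "revenue": 10.0},
    {"department": "eng", "year": "2021", "revenue": 7.0},
    {"department": "ops", "year": "2020", "revenue": 3.0},
    {"department": "eng", "year": "2020", "revenue": 2.0},
]
# Equivalent of WHERE year GE "2020": a string comparison on the year field.
recent = [r for r in sales if r["year"] >= "2020"]
rows, cols, matrix = cross_tabulate(recent, "department", "year", "revenue")
```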

package/references/ref-programs.md CHANGED

@@ -1,8 +1,16 @@
 # Chaprola Programs (.CS Source)
 ## Compile & Run
+**Best practice:** Start every program with `OPEN PRIMARY "filename" 0`. The compiler reads the format from the OPEN PRIMARY statement — no `primary_format` parameter needed. This makes programs self-documenting and eliminates compile-time guessing.
 ```bash
-POST /compile {userid, project, name: "REPORT", source: "...", primary_format: "STAFF", secondary_format?: "DEPTS"}
+# Preferred: source declares its own primary file
+POST /compile {userid, project, name: "REPORT", source: "OPEN PRIMARY \"STAFF\" 0\n..."}
+# Legacy: pass primary_format explicitly (still works, but OPEN PRIMARY is better)
+POST /compile {userid, project, name: "REPORT", source: "...", primary_format: "STAFF"}
 POST /run {userid, project, name: "REPORT", primary_file: "STAFF", record: 1, async?: true, nophi?: true}
 POST /run/status {userid, project, job_id}  # poll async jobs
 POST /publish {userid, project, name, primary_file, acl?: "public|authenticated|owner|token"}
@@ -111,5 +119,74 @@ LET lvl = PARAM.level            // numeric param → R variable
 Publish, then call: `POST /report?userid=X&project=Y&name=Z&deck=kanji&level=3`
 Discover params: `POST /report/params {userid, project, name}`
+## QUERY Command
+QUERY filters, selects, and reorganizes data inside a Chaprola program — the same power as the `/query` API endpoint, but as a language command.
+**Output is a .QR file (read-only snapshot).** Cannot be modified with INSERT, UPDATE, or DELETE. Use the original .DA file for writes. R20 is set to the number of matched records.
+```chaprola
+// Filter + column select
+QUERY STAFF FIELDS name, salary INTO HIGH_PAID WHERE salary GT 80000
+// Dynamic WHERE with params and R-variables
+QUERY flashcards INTO results WHERE level EQ PARAM.level
+QUERY data INTO subset WHERE score GE R5 AND category EQ PARAM.type
+// BETWEEN with dynamic bounds
+QUERY data INTO results WHERE age BETWEEN PARAM.min_age PARAM.max_age
+// Cross-file filtering (IN/NOT IN) — one per QUERY
+QUERY flashcards INTO new_cards WHERE kanji NOT IN progress.kanji
+// GROUP BY
+QUERY orders INTO summary WHERE year EQ "2026" GROUP BY region COUNT, SUM total ORDER BY SUM_TOTAL DESC LIMIT 5
+// FROM syntax (alternative to INTO)
+QUERY results FROM STAFF FIELDS name, salary WHERE dept EQ PARAM.dept
+// OPEN with WHERE (filter directly into file handle)
+OPEN SECONDARY customers WHERE customer_id IN orders.customer_id
+```
+### QUERY Errors
+- **Missing source file:** FOERR flag set, QUERY skipped. Program can check FOERR and branch. (R20 retains its prior value.)
+- **Missing IN-file:** NOT IN treats as empty set (all records pass). IN treats as empty set (no records pass). This is intentional — a new user with no progress file gets all flashcards.
+- **Missing PARAM:** Silently replaced with blank (string) or 0.0 (numeric). Not a hard error — program continues. Check param warnings in the response for diagnostics.
+- **Zero matches:** Not an error. R20 = 0, output .QR is empty.
+### QUERY Limits
+- One index lookup per QUERY (first EQ condition only)
+- One IN/NOT IN per QUERY
+- No nested QUERY — QUERY is a statement, not an expression
+- Output is always a new file — QUERY never modifies the source
+- FIELDS and GROUP BY are mutually exclusive
+## TABULATE Command
+TABULATE produces cross-tabulation matrices inside a program — the language equivalent of `/query` pivot. Result is in-memory only (not written to S3).
+```chaprola
+TABULATE sales SUM revenue FOR region VS quarter WHERE year EQ "2026" INTO matrix
+PRINT TABULATE matrix AS CSV     // CSV output for charting
+PRINT TABULATE matrix AS JSON    // JSON matrix for web apps
+PRINT TABULATE matrix AS TABLE   // text table for preview
+```
+Aggregates: COUNT, SUM, AVG, MIN, MAX. Multiple aggregates: `TABULATE data COUNT, SUM total FOR row VS col ...`
+## File Properties
+```chaprola
+LET R1 = orders.RECORDCOUNT     // record count of any loaded file
+IF R1 EQ 0 GOTO no_data
+```
+## INDEX Command
+```chaprola
+INDEX STAFF ON department        // creates STAFF.DEPARTMENT.IDX
+```
 ## Common Field Widths
 ISO datetime: 20, UUID: 36, email: 50, short ID: 8-12, dollar: 10, phone: 15.
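
Reviewer note on the "Missing IN-file" rule under QUERY Errors: a missing reference file is treated as an empty set, so NOT IN passes every record and IN passes none. The Python sketch below models that behavior to make the consequence concrete. It is a model of the documented semantics only; `filter_in` and the use of `None` to represent a missing file are hypothetical.

```python
def filter_in(records, field, reference_keys, negate=False):
    """Model IN / NOT IN filtering.

    reference_keys is None when the reference file does not exist;
    per the documented rule it is then treated as an empty set.
    """
    keys = set(reference_keys) if reference_keys is not None else set()
    if negate:
        return [r for r in records if r[field] not in keys]
    return [r for r in records if r[field] in keys]

# Hypothetical sample data.
flashcards = [{"kanji": "a"}, {"kanji": "b"}]

# New user, no progress file yet: NOT IN passes everything, IN passes nothing.
new_cards_without_progress = filter_in(flashcards, "kanji", None, negate=True)
seen_cards_without_progress = filter_in(flashcards, "kanji", None)

# Once a progress file exists, already-seen kanji are excluded.
new_cards_with_progress = filter_in(flashcards, "kanji", ["a"], negate=True)
```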

package/references/ref-query.md CHANGED

@@ -38,3 +38,54 @@ Types: `inner`, `left`, `right`, `full`. Optional `pre_sorted: true` for merge j
 - `POST /update-record {userid, project, file, where: [...], set: {field: "value"}}`
 - `POST /delete-record {userid, project, file, where: [...]}`
 - `POST /consolidate {userid, project, file}` — merge .MRG into .DA
+## QUERY in Programs
+The QUERY language command does the same thing as the `/query` API but inside a Chaprola program. Use it to filter, select, and reorder data without leaving the runtime.
+```chaprola
+// In a program, QUERY replaces /query API calls
+QUERY STAFF FIELDS name, salary INTO TOP_EARNERS WHERE salary GT 80000 ORDER BY salary DESC LIMIT 10
+```
+The result is a `.QR` file (read-only snapshot) that can be opened as a secondary file or used in subsequent QUERY commands. R20 is set to the number of matched records. INSERT, UPDATE, and DELETE operations are rejected on .QR files.
+If the source file doesn't exist, the FOERR flag is set and the QUERY is skipped. If an IN/NOT IN reference file doesn't exist, it's treated as an empty set (NOT IN = all pass, IN = none pass).
+## Clustered Sort Columns
+Import with `sort_columns` to create self-indexing files. The data is physically sorted by the key columns at import time, enabling binary search on the clustered key without a separate .IDX file.
+```json
+POST /import {
+  "userid": "...", "project": "...", "name": "STAFF",
+  "sort_columns": ["department", "name"],
+  "data": [...]
+}
+```
+- The .F file marks KEY fields (`KEY:1`, `KEY:2`, etc.)
+- QUERY automatically uses binary search on clustered keys
+- No separate .IDX needed for primary access patterns
+## split_by on /import
+Split a dataset into per-group data files at import time. One `.DA` file is created per distinct value of the split field, sharing a single `.F` format file.
+```json
+POST /import {
+  "userid": "...", "project": "...", "name": "orders",
+  "split_by": "region",
+  "data": [...]
+}
+```
+Produces files like `orders/east.DA`, `orders/west.DA`, etc. Access with dynamic filenames in a program:
+```chaprola
+OPEN PRIMARY orders/PARAM.region
+```
+## BAA and Site Keys
+The `/query` API endpoint requires BAA signing. Site keys inherit this requirement. For public-facing web apps, use published reports (`/report`) instead of `/query` for all read operations. Move filtering and aggregation logic into Chaprola programs using the QUERY and TABULATE language commands, publish them, and call `/report` from the frontend. Reserve site keys for write-only operations like `/insert-record`.
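
Reviewer note on the `/report` pattern: the BAA guidance above moves reads to published reports, and ref-programs.md shows the call shape as `POST /report?userid=X&project=Y&name=Z&deck=kanji&level=3`. A small Python sketch of assembling that URL follows. The helper name is hypothetical, and the `https://api.chaprola.org` base is an assumption taken from the host named in ref-apps.md; nothing here is sent over the network.

```python
from urllib.parse import urlencode

def build_report_url(userid, project, name, params=None,
                     base="https://api.chaprola.org"):  # assumed base URL
    """Build a /report URL: required identifiers first, then report parameters."""
    query = {"userid": userid, "project": project, "name": name}
    query.update(params or {})
    return f"{base}/report?{urlencode(query)}"

# Mirrors the placeholder values in the ref-programs.md example.
url = build_report_url("X", "Y", "Z", {"deck": "kanji", "level": 3})
```

`urlencode` also escapes parameter values, which matters once PARAM values come from user input rather than fixed strings.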