@chaprola/mcp-server 1.7.0 → 1.9.0

@@ -60,8 +60,11 @@ IF EQUAL "CREDIT" U.76 GOTO 200
  ### MOVE literal auto-pads to field width
  `MOVE "Jones" P.name` auto-fills the rest of the field with blanks. No need to clear first.
 
- ### DEFINE VARIABLE names must not collide with field names
- If the format has a `balance` field, don't `DEFINE VARIABLE balance R41`. Use `bal` instead. The compiler confuses the alias with the field name.
+ ### Don't use DEFINE VARIABLE
+ Use implicit assignment: `LET rec = 1`. The compiler assigns R-variable slots automatically. DEFINE VARIABLE is legacy boilerplate.
+
+ ### Every program needs an intent file (.DS)
+ One paragraph: what the program does, what parameters it accepts, what output it produces. The project review system flags programs without intents.
 
  ### R-variables are floating point
  All R1–R99 are 64-bit floats. `7 / 2 = 3.5`. Use PUT with `I` format to display as integer.
@@ -120,11 +123,11 @@ Always CLOSE before END if you wrote to the secondary file. Unflushed writes are
 
  ## HULDRA Optimization
 
- ### Use R41–R99 for scratch variables, not R1–R20
- R1–R20 are reserved for HULDRA elements. R21–R40 are reserved for objectives. Your VALUE program's DEFINE VARIABLE declarations must use R41–R99 only.
+ ### HULDRA programs: don't use R1–R40 for your variables
+ R1–R20 are reserved for HULDRA elements. R21–R40 are reserved for objectives. Use implicit assignment with names that the compiler will place in R41+:
  ```chaprola
- // WRONG: DEFINE VARIABLE counter R1 (HULDRA will overwrite this)
- // RIGHT: DEFINE VARIABLE counter R41
+ LET rec = 1 // compiler assigns to R41+, safe from HULDRA
+ LET ssr = 0 // accumulate error here, then LET R21 = ssr at the end
  ```
 
  ### Sample large datasets before optimizing
@@ -37,6 +37,8 @@ Browser → React App → api.chaprola.org (site key in Authorization header)
  })
  ```
 
+ **BAA requirement:** All data endpoints (including `/query`) require BAA signing. For public-facing apps that haven't signed a BAA, use published reports (`/report`) for reads instead of `/query`. Move filtering and aggregation into Chaprola programs (QUERY + TABULATE commands), publish them, and call `/report` from the frontend. Reserve the site key for write-only operations like `/insert-record`.
+
  **Security model:** The site key is checked against the `Origin` HTTP header, which browsers set automatically. This prevents other websites from using your key (CORS-level protection). However, Origin headers are trivially spoofable from non-browser clients (curl, Postman, scripts). Anyone who extracts the site key from your JavaScript has full access to the account's data. **Use this pattern only for public or semi-public data** — dashboards, product catalogs, published reports. For private data, use the multi-user pattern (each user authenticates individually) or the enterprise proxy pattern.
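
The limits of an `Origin` check can be pictured with a short Python sketch. This is an illustration only, not Chaprola's server code: the allowlist, the `origin_allowed` helper, and the example origins are all hypothetical. It shows that the check can only compare a header value, so a script that sets that header itself passes just like the real frontend.

```python
# Hypothetical allowlist: the one origin registered for the site key.
ALLOWED_ORIGINS = {"https://myapp.example"}

def origin_allowed(headers):
    """Return True when the request's Origin header is on the allowlist."""
    return headers.get("Origin") in ALLOWED_ORIGINS

# A page on another site, running in a browser: the browser sets Origin
# and the page cannot override it, so the request is rejected.
other_site_in_browser = {"Origin": "https://other.example"}

# A non-browser client (script, curl) can send any Origin value it likes.
script_with_forged_origin = {"Origin": "https://myapp.example"}

rejected = origin_allowed(other_site_in_browser)
accepted = origin_allowed(script_with_forged_origin)
```

Here `rejected` is `False` and `accepted` is `True`, which is why the pattern suits only public or semi-public data.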
 
  ### 2. Multi-User App (each user has their own account)
@@ -44,3 +44,40 @@ Optional `format: "fhir"` for FHIR JSON reconstruction.
  ## POST /download
  `{userid, project, file, type}` → `{download_url, expires_in, size_bytes}`
  Type: `data`, `format`, `source`, `proc`, `output`.
+
+ ## sort_columns
+
+ Reorder fields and physically sort data at import time, creating a self-indexing (clustered) data file.
+
+ ```json
+ POST /import {
+ "userid": "...", "project": "...", "name": "STAFF",
+ "sort_columns": ["username", "kanji"],
+ "data": [...]
+ }
+ ```
+
+ - Reorders fields so sort columns come first in the format file
+ - Sorts data by those columns at import time
+ - Marks `KEY:1`, `KEY:2` in `.F` metadata
+ - Enables binary search on the clustered key during QUERY
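
The effect of `sort_columns` can be pictured with a small Python sketch. It is an illustration under stated assumptions, not Chaprola's implementation: the sample rows, the `cluster` helper, and the `lookup` helper are hypothetical. It only shows why physically sorted data allows binary search on the leading key.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sample rows; the field names follow the /import example.
records = [
    {"username": "mika", "kanji": "水", "level": 2},
    {"username": "aki", "kanji": "火", "level": 1},
    {"username": "mika", "kanji": "木", "level": 3},
    {"username": "aki", "kanji": "山", "level": 1},
]
sort_columns = ["username", "kanji"]

def cluster(rows, columns):
    """Move the sort columns to the front of each row, then sort by them."""
    reordered = [
        {**{c: r[c] for c in columns},
         **{k: v for k, v in r.items() if k not in columns}}
        for r in rows
    ]
    return sorted(reordered, key=lambda r: tuple(r[c] for c in columns))

def lookup(rows, column, value):
    """Binary-search the leading key column of already clustered rows.

    The key list is rebuilt here for brevity; a real store would search
    the sorted file in place.
    """
    keys = [r[column] for r in rows]
    return rows[bisect_left(keys, value):bisect_right(keys, value)]

clustered = cluster(records, sort_columns)
mika_rows = lookup(clustered, "username", "mika")
```

Because all rows for one `username` are adjacent after clustering, the lookup returns a contiguous slice instead of scanning every record.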
+
+ ## split_by
+
+ Split a dataset into per-group data files at import time. One `.DA` file per distinct value of the split field, with a shared `.F` format file.
+
+ ```json
+ POST /import {
+ "userid": "...", "project": "...", "name": "orders",
+ "split_by": "region",
+ "data": [...]
+ }
+ ```
+
+ - Creates one `.DA` per distinct value of the split field
+ - Shared `.F` format file
+ - Response includes `files_created` and `groups` object
+
+ ## 5GB File Size Limit
+
+ Maximum 5GB per data file. Returns 413 error if exceeded. Use `split_by` for larger datasets.
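
As a rough local picture of what `split_by` does to a dataset, the following Python sketch groups records by the distinct values of one field. The `split_records` helper, the sample orders, and the `summary` shape are assumptions for illustration; the source does not specify the layout of the `groups` object in the response.

```python
from collections import defaultdict

def split_records(rows, split_field):
    """Group rows by the distinct values of split_field, keeping input order."""
    groups = defaultdict(list)
    for row in rows:
        groups[str(row[split_field])].append(row)
    return dict(groups)

# Hypothetical sample data matching the "orders" / "region" example.
orders = [
    {"order_id": 1, "region": "east", "total": 120.0},
    {"order_id": 2, "region": "west", "total": 75.5},
    {"order_id": 3, "region": "east", "total": 42.0},
]

groups = split_records(orders, "region")

# Hypothetical summary: group name -> number of records in that group.
summary = {name: len(rows) for name, rows in groups.items()}
```

Each group would correspond to one data file, so a dataset that exceeds the per-file limit as a whole may still fit once it is split on a suitable field.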
@@ -44,3 +44,14 @@ For COUNT: `"value": "department", "aggregate": "count"`
  SQL equivalent: `SELECT department, year, SUM(revenue) FROM sales GROUP BY department, year`
 
  Row and column totals are included automatically in the response.
+
+ ## TABULATE in Programs
+
+ The `/query` pivot feature is also available in the Chaprola language via the TABULATE command:
+
+ ```chaprola
+ TABULATE SALES SUM revenue FOR department VS year WHERE year GE "2020" INTO trends
+ PRINT TABULATE trends AS CSV
+ ```
+
+ TABULATE produces a matrix in memory — same cross-tabulation as `/query` pivot, but executed inside a program with dynamic PARAM values and chaining with QUERY results.
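
For readers unfamiliar with cross-tabulation, this Python sketch computes the same kind of matrix as the TABULATE example (SUM of `revenue`, `department` rows, `year` columns, with row and column totals). The `tabulate_sum` helper and the sample rows are hypothetical, and the sketch makes no claim about how Chaprola computes or formats its result.

```python
from collections import defaultdict

def tabulate_sum(rows, value, row_field, col_field):
    """Sum `value` per (row_field, col_field) cell and add totals."""
    cells = defaultdict(float)
    for r in rows:
        cells[(r[row_field], r[col_field])] += r[value]
    row_keys = sorted({k[0] for k in cells})
    col_keys = sorted({k[1] for k in cells})
    # Missing combinations become 0.0 so every row has every column.
    matrix = {rk: {ck: cells.get((rk, ck), 0.0) for ck in col_keys}
              for rk in row_keys}
    row_totals = {rk: sum(matrix[rk].values()) for rk in row_keys}
    col_totals = {ck: sum(matrix[rk][ck] for rk in row_keys)
                  for ck in col_keys}
    return matrix, row_totals, col_totals

# Hypothetical sample data.
sales = [
    {"department": "toys", "year": "2020", "revenue": 100.0},
    {"department": "toys", "year": "2021", "revenue": 150.0},
    {"department": "books", "year": "2020", "revenue": 80.0},
    {"department": "toys", "year": "2020", "revenue": 20.0},
    {"department": "books", "year": "2019", "revenue": 999.0},
]

# Equivalent of WHERE year GE "2020" (string comparison on the year field).
recent = [r for r in sales if r["year"] >= "2020"]
matrix, row_totals, col_totals = tabulate_sum(
    recent, "revenue", "department", "year")
```

The 2019 row is excluded by the filter, and the `books`/`2021` cell is filled with zero because no matching record exists.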
@@ -1,8 +1,16 @@
  # Chaprola Programs (.CS Source)
 
  ## Compile & Run
+
+ **Best practice:** Start every program with `OPEN PRIMARY "filename" 0`. The compiler reads the format from the OPEN PRIMARY statement — no `primary_format` parameter needed. This makes programs self-documenting and eliminates compile-time guessing.
+
  ```bash
- POST /compile {userid, project, name: "REPORT", source: "...", primary_format: "STAFF", secondary_format?: "DEPTS"}
+ # Preferred: source declares its own primary file
+ POST /compile {userid, project, name: "REPORT", source: "OPEN PRIMARY \"STAFF\" 0\n..."}
+
+ # Legacy: pass primary_format explicitly (still works, but OPEN PRIMARY is better)
+ POST /compile {userid, project, name: "REPORT", source: "...", primary_format: "STAFF"}
+
  POST /run {userid, project, name: "REPORT", primary_file: "STAFF", record: 1, async?: true, nophi?: true}
  POST /run/status {userid, project, job_id} # poll async jobs
  POST /publish {userid, project, name, primary_file, acl?: "public|authenticated|owner|token"}
@@ -111,5 +119,74 @@ LET lvl = PARAM.level // numeric param → R variable
  Publish, then call: `POST /report?userid=X&project=Y&name=Z&deck=kanji&level=3`
  Discover params: `POST /report/params {userid, project, name}`
 
+ ## QUERY Command
+
+ QUERY filters, selects, and reorganizes data inside a Chaprola program — the same power as the `/query` API endpoint, but as a language command.
+
+ **Output is a .QR file (read-only snapshot).** Cannot be modified with INSERT, UPDATE, or DELETE. Use the original .DA file for writes. R20 is set to the number of matched records.
+
+ ```chaprola
+ // Filter + column select
+ QUERY STAFF FIELDS name, salary INTO HIGH_PAID WHERE salary GT 80000
+
+ // Dynamic WHERE with params and R-variables
+ QUERY flashcards INTO results WHERE level EQ PARAM.level
+ QUERY data INTO subset WHERE score GE R5 AND category EQ PARAM.type
+
+ // BETWEEN with dynamic bounds
+ QUERY data INTO results WHERE age BETWEEN PARAM.min_age PARAM.max_age
+
+ // Cross-file filtering (IN/NOT IN) — one per QUERY
+ QUERY flashcards INTO new_cards WHERE kanji NOT IN progress.kanji
+
+ // GROUP BY
+ QUERY orders INTO summary WHERE year EQ "2026" GROUP BY region COUNT, SUM total ORDER BY SUM_TOTAL DESC LIMIT 5
+
+ // FROM syntax (alternative to INTO)
+ QUERY results FROM STAFF FIELDS name, salary WHERE dept EQ PARAM.dept
+
+ // OPEN with WHERE (filter directly into file handle)
+ OPEN SECONDARY customers WHERE customer_id IN orders.customer_id
+ ```
+
+ ### QUERY Errors
+ - **Missing source file:** FOERR flag set, QUERY skipped. Program can check FOERR and branch. (R20 retains its prior value.)
+ - **Missing IN-file:** NOT IN treats as empty set (all records pass). IN treats as empty set (no records pass). This is intentional — a new user with no progress file gets all flashcards.
+ - **Missing PARAM:** Silently replaced with blank (string) or 0.0 (numeric). Not a hard error — program continues. Check param warnings in the response for diagnostics.
+ - **Zero matches:** Not an error. R20 = 0, output .QR is empty.
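
The missing IN-file rule above can be restated as a short Python sketch. The `filter_in` helper and the sample flashcards are hypothetical; passing `None` stands in for a reference file that does not exist. The sketch mirrors only the documented semantics (a missing file behaves as an empty set), not the runtime's implementation.

```python
def filter_in(rows, field, reference_values, negate=False):
    """Keep rows whose field is IN (or NOT IN) the reference values.

    reference_values is None when the reference file is missing; it is
    then treated as an empty set, as the error notes describe.
    """
    ref = set(reference_values or [])
    if negate:
        return [r for r in rows if r[field] not in ref]
    return [r for r in rows if r[field] in ref]

# Hypothetical sample data.
flashcards = [{"kanji": "日"}, {"kanji": "月"}, {"kanji": "火"}]

# New user, no progress file: NOT IN passes every record.
new_user_cards = filter_in(flashcards, "kanji", None, negate=True)

# Returning user who has already studied one card.
remaining_cards = filter_in(flashcards, "kanji", ["日"], negate=True)

# IN against a missing file passes nothing: zero matches, not an error.
none_pass = filter_in(flashcards, "kanji", None)

matched = len(new_user_cards)  # plays the role of the match count in R20
```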
+
+ ### QUERY Limits
+ - One index lookup per QUERY (first EQ condition only)
+ - One IN/NOT IN per QUERY
+ - No nested QUERY — QUERY is a statement, not an expression
+ - Output is always a new file — QUERY never modifies the source
+ - FIELDS and GROUP BY are mutually exclusive
+
+ ## TABULATE Command
+
+ TABULATE produces cross-tabulation matrices inside a program — the language equivalent of `/query` pivot. Result is in-memory only (not written to S3).
+
+ ```chaprola
+ TABULATE sales SUM revenue FOR region VS quarter WHERE year EQ "2026" INTO matrix
+ PRINT TABULATE matrix AS CSV // CSV output for charting
+ PRINT TABULATE matrix AS JSON // JSON matrix for web apps
+ PRINT TABULATE matrix AS TABLE // text table for preview
+ ```
+
+ Aggregates: COUNT, SUM, AVG, MIN, MAX. Multiple aggregates: `TABULATE data COUNT, SUM total FOR row VS col ...`
+
+ ## File Properties
+
+ ```chaprola
+ LET R1 = orders.RECORDCOUNT // record count of any loaded file
+ IF R1 EQ 0 GOTO no_data
+ ```
+
+ ## INDEX Command
+
+ ```chaprola
+ INDEX STAFF ON department // creates STAFF.DEPARTMENT.IDX
+ ```
+
  ## Common Field Widths
  ISO datetime: 20, UUID: 36, email: 50, short ID: 8-12, dollar: 10, phone: 15.
@@ -38,3 +38,54 @@ Types: `inner`, `left`, `right`, `full`. Optional `pre_sorted: true` for merge j
  - `POST /update-record {userid, project, file, where: [...], set: {field: "value"}}`
  - `POST /delete-record {userid, project, file, where: [...]}`
  - `POST /consolidate {userid, project, file}` — merge .MRG into .DA
+
+ ## QUERY in Programs
+
+ The QUERY language command does the same thing as the `/query` API but inside a Chaprola program. Use it to filter, select, and reorder data without leaving the runtime.
+
+ ```chaprola
+ // In a program, QUERY replaces /query API calls
+ QUERY STAFF FIELDS name, salary INTO TOP_EARNERS WHERE salary GT 80000 ORDER BY salary DESC LIMIT 10
+ ```
+
+ The result is a `.QR` file (read-only snapshot) that can be opened as a secondary file or used in subsequent QUERY commands. R20 is set to the number of matched records. INSERT, UPDATE, and DELETE operations are rejected on .QR files.
+
+ If the source file doesn't exist, the FOERR flag is set and the QUERY is skipped. If an IN/NOT IN reference file doesn't exist, it's treated as an empty set (NOT IN = all pass, IN = none pass).
+
+ ## Clustered Sort Columns
+
+ Import with `sort_columns` to create self-indexing files. The data is physically sorted by the key columns at import time, enabling binary search on the clustered key without a separate .IDX file.
+
+ ```json
+ POST /import {
+ "userid": "...", "project": "...", "name": "STAFF",
+ "sort_columns": ["department", "name"],
+ "data": [...]
+ }
+ ```
+
+ - The .F file marks KEY fields (`KEY:1`, `KEY:2`, etc.)
+ - QUERY automatically uses binary search on clustered keys
+ - No separate .IDX needed for primary access patterns
+
+ ## split_by on /import
+
+ Split a dataset into per-group data files at import time. One `.DA` file is created per distinct value of the split field, sharing a single `.F` format file.
+
+ ```json
+ POST /import {
+ "userid": "...", "project": "...", "name": "orders",
+ "split_by": "region",
+ "data": [...]
+ }
+ ```
+
+ Produces files like `orders/east.DA`, `orders/west.DA`, etc. Access with dynamic filenames in a program:
+
+ ```chaprola
+ OPEN PRIMARY orders/PARAM.region
+ ```
+
+ ## BAA and Site Keys
+
+ The `/query` API endpoint requires BAA signing. Site keys inherit this requirement. For public-facing web apps, use published reports (`/report`) instead of `/query` for all read operations. Move filtering and aggregation logic into Chaprola programs using the QUERY and TABULATE language commands, publish them, and call `/report` from the frontend. Reserve site keys for write-only operations like `/insert-record`.
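
A client following this pattern only needs to build a `/report` URL with the program's parameters in the query string, matching the `POST /report?userid=X&project=Y&name=Z&deck=kanji&level=3` call shape. The Python sketch below is a minimal illustration: the `report_url` helper is hypothetical, the host is taken from the architecture notes, and the `https` scheme is an assumption. It builds the URL only and sends no request.

```python
from urllib.parse import urlencode

# Host from the architecture notes; the https scheme is an assumption.
BASE_URL = "https://api.chaprola.org"

def report_url(userid, project, name, **params):
    """Build the /report URL for a published program and its parameters."""
    query = {"userid": userid, "project": project, "name": name, **params}
    return f"{BASE_URL}/report?{urlencode(query)}"

# Placeholder identifiers X, Y, Z as in the documented call shape.
url = report_url("X", "Y", "Z", deck="kanji", level=3)
```

Filtering stays inside the published program, so the frontend passes only parameter values and never issues `/query` itself.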