@barivia/barsom-mcp 0.4.2 → 0.5.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +60 -52
- package/dist/index.js +1 -1
- package/dist/index.js.map +1 -1
- package/dist/views/src/views/map-explorer/index.html +288 -0
- package/package.json +2 -2
package/README.md
CHANGED
@@ -32,72 +32,80 @@ This is the standard pattern for MCP servers distributed as npm packages (same a
 
 Legacy `BARSOM_API_KEY` / `BARSOM_API_URL` are also accepted as fallbacks.
 
-## Tools (
+## Tools (11 + MCP App)
 
-
+All multi-action tools follow the `datasets` pattern: a required `action` enum routes to the correct operation.
 
-
-
-| `guide_barsom_workflow` | Standard Operating Procedure. Call this first if unsure. Outlines the Upload → Preprocess → Train → Wait → Analyze workflow. |
-| `datasets` | upload: add CSV; preview: inspect columns/stats/samples before train_som; subset: create subset by row_range or filter; delete: remove dataset |
-| `list` | List datasets (type=datasets) or jobs (type=jobs; optional dataset_id filter) |
+### `guide_barsom_workflow`
+SOP dispatch. Call this first if unsure of the workflow. No parameters.
 
-###
+### `datasets(action)`
 
-
-
-| `
-| `
-| `
+| Action | Use when |
+|--------|----------|
+| `upload` | Adding a new CSV — returns dataset_id |
+| `preview` | Before jobs(action=train_map) — inspect columns, stats, cyclic/datetime hints |
+| `list` | Finding dataset IDs |
+| `subset` | Creating a filtered/sliced copy (row_range, filter conditions) |
+| `delete` | Removing a dataset |
 
-###
+### `jobs(action)`
 
-
-
-| `
-| `
-| `
+| Action | Use when |
+|--------|----------|
+| `train_map` | Submitting a new map training job — full control: model type, grid, epochs, cyclic/temporal features, transforms. Returns `job_id`; poll with `jobs(action=status, job_id=...)`. |
+| `status` | Polling after any async job — every 10–15s |
+| `list` | Finding job IDs, checking pipeline state |
+| `compare` | Picking the best run from a set (QE, TE, silhouette table) |
+| `cancel` | Stopping a running job |
+| `delete` | Permanently removing a job + its S3 files |
 
-###
+### `results(action)`
 
-
-
-| `
-| `
-| `
-| `
-| `
-| `feature_correlation` | Inter-feature correlation analysis |
-| `transition_flow` | Temporal trajectory analysis with quiver plots |
-| `local_density` | Node neighborhood density analysis |
-| `feature_gradient` | Spatial gradient magnitude across the SOM |
-| `quality_report` | Extended quality metrics (trustworthiness, neighborhood preservation, recommendations) |
+| Action | Sync | Use when |
+|--------|------|----------|
+| `get` | instant | First look after training — combined view + quality metrics |
+| `export` | instant | Learning curve (training_log), weight matrix (weights), node stats (nodes) |
+| `download` | instant | Saving figures to a local folder |
+| `recolor` | async | Changing colormap or output format without retraining |
+| `transition_flow` | async | Temporal state transition analysis on time-ordered data |
 
-###
+### `analyze(analysis_type)`
+10 analysis types: `u_matrix`, `component_planes`, `bmu_hits`, `clusters`, `feature_importance`, `feature_correlation`, `transition_flow`, `local_density`, `feature_gradient`, `quality_report`.
 
-
-|---|---|
-| `compare_runs` | Side-by-side metric comparison across training runs |
-| `project_variable` | Project a derived variable onto a trained SOM |
-| `derive_variable` | Evaluate a formula expression (e.g. `revenue / cost`) and project result onto SOM or add as a new column |
-| `transition_flow` | Dedicated temporal flow analysis (submit job, poll, return flow plot) |
+### `project(action)`
 
-
+| Action | Use when |
+|--------|----------|
+| `expression` | Computing a derived variable from a formula (`revenue / cost`, `diff(temp)`, rolling stats) — add to dataset or project onto the map |
+| `values` | Projecting a pre-computed external array (anomaly scores, labels from another system) onto the map |
 
-
+### `inference(action)`
+All actions use a frozen trained map — no retraining. All are async.
 
-
-
-| `predict` |
-| `
-| `
-| `
+| Action | Output | Timing |
+|--------|--------|--------|
+| `predict` | predictions.csv: per-row bmu_x/y, cluster_id, QE — high QE = anomaly | 5–120s |
+| `enrich` | enriched.csv: training data + bmu_x/y/node_index/cluster_id | 5–60s |
+| `compare` | density-diff heatmap + top gained/lost nodes — drift, A/B, cohort | 30–120s |
+| `report` | comprehensive PDF: metrics, views, component grid, cluster table | 30–180s |
 
-###
+### `account(action)`
 
-
-
-| `
+| Action | Use when |
+|--------|----------|
+| `status` | Before large jobs — plan tier, GPU availability, queue depth, credit balance, training time estimates |
+| `request_compute` | Upgrading to cloud burst. Leave tier blank to list options. |
+| `compute_status` | Checking active lease time remaining |
+| `release_compute` | Manually stopping a lease to stop billing |
+| `history` | Viewing recent compute usage and spend |
+| `add_funds` | Getting instructions to add credits |
+
+### `explore_map` (MCP App)
+
+Interactive inline map explorer — clickable nodes, feature toggles, export controls.
+
+### `send_feedback`
+
+Submit feedback or feature requests (max 1400 characters, ~190 words).
 
 ## Tool Design Guidelines
 
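The polling contract the new README specifies for `jobs(action=status)` (check every 10–15 s; only `completed`, `failed`, or `cancelled` are terminal, matching the bundled server's own poll helper) can be sketched as a small client-side loop. `callTool` below is a hypothetical MCP client wrapper, not part of this package:

```javascript
// Sketch of the async-job polling loop described in the README.
// `callTool` is a stand-in for whatever MCP client API is in use;
// terminal states mirror the server's poll helper: completed/failed/cancelled.
const TERMINAL = new Set(["completed", "failed", "cancelled"]);

async function waitForJob(callTool, jobId, { intervalMs = 12000, timeoutMs = 600000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const job = await callTool("jobs", { action: "status", job_id: jobId });
    if (TERMINAL.has(job.status)) return job; // done (successfully or not)
    await new Promise((r) => setTimeout(r, intervalMs)); // poll every 10-15 s
  }
  throw new Error(`job ${jobId} still running after ${timeoutMs} ms`);
}
```

A job that is merely still running is not a failure; the loop above only stops on a terminal status or its own timeout.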
@@ -110,7 +118,7 @@ When adding or refining tools, follow [MCP best practices](https://modelcontextp
 
 ## Data preparation
 
-To train on a subset of your data (e.g. first 2000 rows, or rows where region=Europe) without re-uploading: use **datasets(action=subset)** with `row_range` and/or `filter` to create a new dataset, then **
+To train on a subset of your data (e.g. first 2000 rows, or rows where region=Europe) without re-uploading: use **datasets(action=subset)** with `row_range` and/or `filter` to create a new dataset, then **jobs(action=train_map, dataset_id=...)** on the new dataset_id; or pass **row_range** in **jobs(action=train_map)** params for a one-off training slice.
 
 ## How It Works
 
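The subset-then-train flow described in the Data preparation section can be expressed as the following tool-call payloads. The tool names, actions, and parameter keys come from the README; the dataset name and the request-envelope shape are illustrative:

```javascript
// Illustrative payloads for the subset -> train -> poll flow.
// Only tool/action names and parameter keys come from the README;
// everything else (names, envelope shape) is assumed for the sketch.
function subsetRequest(datasetId) {
  return {
    tool: "datasets",
    args: {
      action: "subset",
      dataset_id: datasetId,
      name: "europe_first_2000",                        // hypothetical name
      row_range: [1, 2000],                             // 1-based inclusive
      filters: [{ column: "region", op: "eq", value: "Europe" }], // AND logic
    },
  };
}

function trainRequest(subsetDatasetId) {
  // Train on the new subset dataset; returns a job_id to poll.
  return { tool: "jobs", args: { action: "train_map", dataset_id: subsetDatasetId } };
}

function statusRequest(jobId) {
  // Poll every 10-15 s until status is "completed".
  return { tool: "jobs", args: { action: "status", job_id: jobId } };
}
```

For a one-off slice, `row_range` could instead be passed directly in the `train_map` params, skipping the subset step.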
package/dist/index.js
CHANGED
@@ -1,2 +1,2 @@
 #!/usr/bin/env node
-
import{McpServer as e}from"@modelcontextprotocol/sdk/server/mcp.js";import{StdioServerTransport as t}from"@modelcontextprotocol/sdk/server/stdio.js";import{z as a}from"zod";import{registerAppResource as o,registerAppTool as r,RESOURCE_MIME_TYPE as n}from"@modelcontextprotocol/ext-apps/server";import i from"node:fs/promises";import s from"node:path";const l=process.env.BARIVIA_API_URL??process.env.BARSOM_API_URL??"https://api.barivia.se",d=process.env.BARIVIA_API_KEY??process.env.BARSOM_API_KEY??"";d||(console.error("Error: BARIVIA_API_KEY not set. Set it in your MCP client config."),process.exit(1));const u=parseInt(process.env.BARIVIA_FETCH_TIMEOUT_MS??"30000",10),c=new Set([502,503,504]);function p(e,t){return!(void 0===t||!c.has(t))||(e instanceof DOMException&&"AbortError"===e.name||e instanceof TypeError)}async function m(e,t,a=u){const o=new AbortController,r=setTimeout(()=>o.abort(),a);try{return await fetch(e,{...t,signal:o.signal})}finally{clearTimeout(r)}}async function g(e,t,a,o){const r=`${l}${t}`,n=o?.["Content-Type"]??"application/json",i=Math.random().toString(36).slice(2,10),s={Authorization:`Bearer ${d}`,"Content-Type":n,"X-Request-ID":i,...o};let u,c;void 0!==a&&(u="application/json"===n?JSON.stringify(a):String(a));for(let t=0;t<=2;t++)try{const a=await m(r,{method:e,headers:s,body:u}),o=await a.text();if(!a.ok){if(t<2&&p(null,a.status)){await new Promise(e=>setTimeout(e,1e3*2**t));continue}const e=(()=>{try{return JSON.parse(o)}catch{return null}})(),r=e?.error??o,n=400===a.status?" Check parameter types and required fields.":404===a.status?" The resource may not exist or may have been deleted.":409===a.status?" The job may not be in the expected state.":429===a.status?" 
Rate limit exceeded — wait a moment and retry.":"";throw new Error(`${r}${n}`)}return JSON.parse(o)}catch(e){if(c=e,t<2&&p(e)){await new Promise(e=>setTimeout(e,1e3*2**t));continue}throw e}throw c}async function h(e){const t=`${l}${e}`;let a;for(let o=0;o<=2;o++)try{const a=await m(t,{method:"GET",headers:{Authorization:`Bearer ${d}`}});if(!a.ok){if(o<2&&p(null,a.status)){await new Promise(e=>setTimeout(e,1e3*2**o));continue}throw new Error(`API GET ${e} returned ${a.status}`)}const r=await a.arrayBuffer();return{data:Buffer.from(r),contentType:a.headers.get("content-type")??"application/octet-stream"}}catch(e){if(a=e,o<2&&p(e)){await new Promise(e=>setTimeout(e,1e3*2**o));continue}throw e}throw a}function f(e){return{content:[{type:"text",text:JSON.stringify(e,null,2)}]}}async function b(e,t=3e4,a=1e3){const o=Date.now();for(;Date.now()-o<t;){const t=await g("GET",`/v1/jobs/${e}`),o=t.status;if("completed"===o||"failed"===o||"cancelled"===o)return{status:o,result_ref:t.result_ref,error:t.error};await new Promise(e=>setTimeout(e,a))}return{status:"timeout"}}const _=new e({name:"analytics-engine",version:"0.4.1",instructions:"# Barivia barSOM Analytics Engine\n\nYou have access to a Self-Organizing Map (SOM) analytics platform. SOMs are unsupervised neural networks that project high-dimensional data onto a 2D grid, revealing clusters, gradients, and anomalies in the structure of the data.\n\n## Typical workflow\n\n1. **Upload** → `datasets(action=upload)` — ingest a CSV\n2. **Preview** → `datasets(action=preview)` — inspect columns, detect cyclics/datetimes\n3. **Train** → `train_som` — returns a job_id; poll `get_job_status` until completed\n4. **Analyze** → `get_results` + `analyze` — visualize and interpret the map\n5. 
**Export / Inference** → `enrich_dataset`, `predict`, `compare_datasets`, `generate_report`\n\n## Tool categories\n\n| Category | Tools |\n|----------|-------|\n| Data management | `datasets`, `manage_account` |\n| Training | `train_som`, `recolor_som` |\n| Analysis | `analyze`, `get_results`, `get_result_image`, `get_job_status` |\n| Projection | `project_variable`, `derive_variable`, `attach_column`, `project_multiple`, `transition_flow` |\n| Inference & export | `predict`, `enrich_dataset`, `compare_datasets`, `generate_report` |\n| Feedback | `send_feedback` |\n\n## Async job pattern\n\nMost operations are async. Every tool that submits a job either:\n- **Auto-polls** (project_variable, derive_variable, predict, enrich_dataset, compare_datasets, generate_report, transition_flow) — waits up to 30–180 seconds then returns or gives a job_id to poll manually\n- **Returns immediately** (train_som, recolor_som) — always requires manual `get_job_status` polling\n\n**Do not tell the user a job failed because it is still running.** If a tool returns a job_id, poll `get_job_status` every 10–15 seconds. SOM training takes 30s–10min depending on grid size and dataset.\n\n## Credit and cost\n\nJobs consume compute credits. Inference jobs (predict, compare_datasets, generate_report) are priced the same as projection jobs. 
Check `manage_account(action=status)` to see the remaining balance before starting large jobs.\n\n## Key constraints\n\n- Column names are case-sensitive; always match exactly what `datasets(action=preview)` returns\n- Numeric columns only (SOMs do not support text/categorical directly — encode first)\n- `predict` input columns must exactly match the features used during training"}),w=import.meta.dirname??s.dirname(new URL(import.meta.url).pathname);async function y(e){const t=[s.join(w,"views","src","views",e,"index.html"),s.join(w,"views",e,"index.html"),s.join(w,"..","dist","views","src","views",e,"index.html")];for(const e of t)try{return await i.readFile(e,"utf-8")}catch{continue}return null}const v="ui://barsom/som-explorer",$="ui://barsom/data-preview",x="ui://barsom/training-monitor";function j(e,t,a,o){const r=t.output_format??"pdf";if("transition_flow"===e){return[`transition_flow_lag${t.lag??1}.${r}`]}if("project_variable"===e){const e=t.variable_name??"variable";return[`projected_${String(e).replace(/[^a-zA-Z0-9_]/g,"_")}.${r}`]}if("derive_variable"===e){const e=t.variable_name??"variable";return[`projected_${String(e).replace(/[^a-zA-Z0-9_]/g,"_")}.${r}`]}const n=t.features??[],i=`combined.${r}`,s=`umatrix.${r}`,l=`hit_histogram.${r}`,d=`correlation.${r}`,u=n.map((e,t)=>`component_${t+1}_${e.replace(/[^a-zA-Z0-9_]/g,"_")}.${r}`),c=[i,s,l,d,...u];if(void 0===a||"default"===a)return o?c:[i];if("combined_only"===a)return[i];if("all"===a)return c;if(Array.isArray(a)){const e={combined:i,umatrix:s,hit_histogram:l,correlation:d};return n.forEach((t,a)=>{e[`component_${a+1}`]=u[a]}),a.map(t=>{const a=t.trim().toLowerCase();return e[a]?e[a]:t.includes(".")?t:null}).filter(e=>null!=e)}return[i]}function S(e){return e.endsWith(".pdf")?"application/pdf":e.endsWith(".svg")?"image/svg+xml":"image/png"}async function T(e,t,a){if(a.endsWith(".pdf")||a.endsWith(".svg"))e.push({type:"text",text:`${a} is ready (vector format — not inlineable). 
Use get_result_image(job_id="${t}", filename="${a}") to download it.`});else try{const{data:o}=await h(`/v1/results/${t}/image/${a}`);e.push({type:"image",data:o.toString("base64"),mimeType:S(a),annotations:{audience:["user"],priority:.8}})}catch{e.push({type:"text",text:`(${a} not available for inline display)`})}}o(_,v,v,{mimeType:n},async()=>{const e=await y("som-explorer");return{contents:[{uri:v,mimeType:n,text:e??"<html><body>SOM Explorer view not built yet. Run: npm run build:views</body></html>"}]}}),o(_,$,$,{mimeType:n},async()=>{const e=await y("data-preview");return{contents:[{uri:$,mimeType:n,text:e??"<html><body>Data Preview view not built yet.</body></html>"}]}}),o(_,x,x,{mimeType:n},async()=>{const e=await y("training-monitor");return{contents:[{uri:x,mimeType:n,text:e??"<html><body>Training Monitor view not built yet.</body></html>"}]}}),r(_,"explore_som",{title:"Explore SOM",description:"Interactive SOM explorer dashboard. Opens an inline visualization where you can toggle features, click nodes, and export figures. Use this after get_results for a richer, interactive exploration experience. 
Falls back to text+image on hosts that don't support MCP Apps.",inputSchema:{job_id:a.string().describe("Job ID of a completed SOM training job")},_meta:{ui:{resourceUri:v}}},async({job_id:e})=>{const t=await g("GET",`/v1/results/${e}`),a=t.summary??{},o=[];o.push({type:"text",text:JSON.stringify({job_id:e,summary:a,download_urls:t.download_urls})});const r=a.output_format??"pdf";return await T(o,e,`combined.${r}`),{content:o}}),_.tool("guide_barsom_workflow","Retrieve the Standard Operating Procedure (SOP) for the barSOM analysis pipeline.\nALWAYS call this tool first if you are unsure of the steps to execute a complete Self-Organizing Map analysis.\nThe workflow explains the exact sequence of tool calls needed: Upload → Preprocess → Train → Wait → Analyze.",{},async()=>({content:[{type:"text",text:"barSOM Standard Operating Procedure (SOP)\n\nStep 1: Upload Data\n- Use `datasets(action=upload)` with a local `file_path` to your CSV.\n- BEFORE UPLOADING: Clean the dataset to remove NaNs or malformed data.\n- Capture the `dataset_id` returned.\n\nStep 2: Preview & Preprocess\n- Use `datasets(action=preview)` to inspect columns, ranges, and types.\n- Check for skewed columns requiring 'log' or 'sqrt' transforms.\n- Check for cyclical or temporal features (hours, days) requiring `cyclic_features` or `temporal_features` during training.\n\nStep 3: Train the SOM\n- Call `train_som` with the `dataset_id`.\n- Carefully select columns to include (start with 5-10).\n- Assign `feature_weights` (especially for categorical data with natural hierarchies).\n- Wait for the returned `job_id`.\n\nStep 4: Wait for Completion (ASYNC POLLING)\n- Use `get_job_status` every 10-15 seconds.\n- Wait until status is \"completed\". 
DO NOT assume failure before 3 minutes (or longer for large grids).\n- If it fails, read the error message and adjust parameters (e.g., reduce grid size, fix column names).\n\nStep 5: Analyze and Export\n- Once completed, use `analyze(type=component_planes)` or `analyze(type=clusters)` to interpret the results.\n- Call `get_results` to get the final metrics (Quantization Error, Topographic Error)."}]})),_.tool("datasets",'Manage datasets: upload, preview, subset, or delete.\n\naction=upload: Upload a CSV for SOM analysis. Prefer file_path over csv_data so the MCP reads the file directly. Returns dataset ID and metadata. Then use datasets(action=preview) before train_som.\nBEFORE UPLOADING: Ensure data has no NaNs, missing values, or formats that can\'t be handled. The user/LLM is responsible for cleaning the dataset before upload. Categorical features should be weighted if there is a natural hierarchy.\naction=preview: Show columns, stats, sample rows, cyclic/datetime detections. ALWAYS preview before train_som on an unfamiliar dataset.\naction=subset: Create a new dataset from a subset of an existing one. Requires name and at least one of row_range or filters.\n - row_range: [start, end] 1-based inclusive (e.g. [1, 2000] for first 2000 rows)\n - filters: array of conditions, ALL must match (AND logic). 
Each: { column, op, value }.\n Operators:\n eq — exact match (string or number): { column: "region", op: "eq", value: "Europe" }\n ne — not equal: { column: "status", op: "ne", value: "error" }\n in — value in set: { column: "grade", op: "in", value: ["A", "B"] }\n gt/lt — above/below threshold: { column: "temp", op: "gt", value: 20 }\n gte/lte — at or above/below: { column: "price", op: "gte", value: 100 }\n between — closed interval [lo, hi]: { column: "age", op: "between", value: [18, 65] }\n - Combine row_range + filters to slice both rows and values.\n - Single filter object is also accepted (auto-wrapped).\naction=delete: Remove a dataset and free the slot.\n\nBEST FOR: Tabular numeric data. CSV with header required. Use list(type=datasets) to see existing datasets.',{action:a.enum(["upload","preview","subset","delete"]).describe("upload: add a CSV; preview: inspect before training; subset: create subset dataset; delete: remove dataset"),name:a.string().optional().describe("Dataset name (required for action=upload and subset)"),file_path:a.string().optional().describe("Path to local CSV (for upload; prefer over csv_data)"),csv_data:a.string().optional().describe("Inline CSV string (for upload; use for small data)"),dataset_id:a.string().optional().describe("Dataset ID (required for preview, subset, and delete)"),n_rows:a.number().int().optional().default(5).describe("Sample rows to return (preview only)"),row_range:a.tuple([a.number().int(),a.number().int()]).optional().describe("For subset: [start, end] 1-based inclusive row range (e.g. [1, 2000])"),filters:a.preprocess(e=>null==e||Array.isArray(e)?e:"object"==typeof e&&null!==e&&"column"in e?[e]:e,a.array(a.object({column:a.string(),op:a.enum(["eq","ne","in","gt","lt","gte","lte","between"]),value:a.union([a.string(),a.number(),a.array(a.union([a.string(),a.number()]))])})).optional().describe("For subset: filter conditions (AND logic). Single object or array. ops: eq, ne, in, gt, lt, gte, lte, between. 
Examples: { column: 'temp', op: 'between', value: [15, 30] }, { column: 'region', op: 'eq', value: 'Europe' }")),filter:a.object({column:a.string(),op:a.enum(["eq","ne","in","gt","lt","gte","lte","between"]),value:a.union([a.string(),a.number(),a.array(a.union([a.string(),a.number()]))])}).optional().describe("Deprecated — use filters instead. Single filter condition.")},async({action:e,name:t,file_path:a,csv_data:o,dataset_id:r,n_rows:n,row_range:l,filters:d,filter:u})=>{if("upload"===e){if(!t)throw new Error("datasets(upload) requires name");let e;if(a){const t=s.resolve(a);try{e=await i.readFile(t,"utf-8")}catch(e){const a=e instanceof Error?e.message:String(e);throw new Error(`Cannot read file "${t}": ${a}`)}}else{if(!(o&&o.length>0))throw new Error("datasets(upload) requires file_path or csv_data");e=o}return f(await g("POST","/v1/datasets",e,{"X-Dataset-Name":t,"Content-Type":"text/csv"}))}if("preview"===e){if(!r)throw new Error("datasets(preview) requires dataset_id");const e=await g("GET",`/v1/datasets/${r}/preview?n_rows=${n??5}`),t=e.columns??[],a=e.column_stats??[],o=e.cyclic_hints??[],i=e.sample_rows??[],s=e.datetime_columns??[],l=e.temporal_suggestions??[],d=e=>null==e?"—":Number(e).toFixed(3),u=[`Dataset: ${e.name} (${e.dataset_id})`,`${e.total_rows} rows × ${e.total_cols} columns`,"","Column Statistics:","| Column | Min | Max | Mean | Std | Nulls | Numeric |","|--------|-----|-----|------|-----|-------|---------|"];for(const e of a)u.push(`| ${e.column} | ${d(e.min)} | ${d(e.max)} | ${d(e.mean)} | ${d(e.std)} | ${e.null_count??0} | ${!1!==e.is_numeric?"yes":"no"} |`);if(o.length>0){u.push("","Detected Cyclic Feature Hints:");for(const e of o)u.push(` • ${e.column} — period=${e.period} (${e.reason})`)}if(s.length>0){u.push("","Detected Datetime Columns:");for(const e of s){const t=(e.detected_formats??[]).map(e=>`${e.format} — ${e.description} (${(100*e.match_rate).toFixed(0)}% match)`).join("; ");u.push(` • ${e.column}: sample="${e.sample}" → 
${t}`)}}if(l.length>0){u.push("","Temporal Feature Suggestions (require user approval):");for(const e of l)u.push(` • Columns: ${e.columns.join(" + ")} → format: "${e.format}"`),u.push(` Available components: ${e.available_components.join(", ")}`)}if(i.length>0){u.push("",`Sample Rows (first ${i.length}):`),u.push(`| ${t.join(" | ")} |`),u.push(`| ${t.map(()=>"---").join(" | ")} |`);for(const e of i)u.push(`| ${t.map(t=>String(e[t]??"")).join(" | ")} |`)}return{content:[{type:"text",text:u.join("\n")}]}}if("subset"===e){if(!r)throw new Error("datasets(subset) requires dataset_id");if(!t)throw new Error("datasets(subset) requires name");const e=d??(u?[u]:void 0);if(void 0===l&&void 0===e)throw new Error("datasets(subset) requires at least one of row_range or filters");const a={name:t};void 0!==l&&(a.row_range=l),void 0!==e&&(a.filters=e);return f(await g("POST",`/v1/datasets/${r}/subset`,a))}if("delete"===e){if(!r)throw new Error("datasets(delete) requires dataset_id");return f(await g("DELETE",`/v1/datasets/${r}`))}throw new Error("Invalid action")}),_.tool("train_som","Train a Self-Organizing Map on the dataset. Returns a job_id for polling.\n\nBEST FOR: Exploratory analysis of multivariate numeric data — clustering, regime\ndetection, process monitoring, anomaly visualization, dimensionality reduction.\nNOT FOR: Time-series forecasting, classification, or text/image data.\n\nASYNC POLLING PROTOCOL:\n- This tool returns a job_id. You MUST poll get_job_status to check completion.\n- Poll every 10-15 seconds.\n- Wait for status \"completed\" before calling analyze() or get_results().\n\nBEFORE calling, ask the user:\n1. Which columns to include? (use 'columns' to restrict)\n2. Any cyclic features?\n3. Any skewed columns? (suggest transforms)\n4. Feature weights?\n5. Quick exploration or refined map?\n\nESCALATION LADDER (If this tool fails):\n- Error mentions temporal validation: The dataset likely has datetime formatting issues. 
Use datasets(action=preview).\n- Error mentions column not found: Use datasets(action=preview) to verify exact column names (case-sensitive).\n- Error mentions NaNs or missing data: The user must clean the dataset.\n\nSee docs/SOM_PROCESS_AND_BEST_PRACTICES.md for detailed processual knowledge.",{dataset_id:a.string().describe("Dataset ID from datasets(action=upload) or list(type=datasets)"),preset:a.enum(["quick","standard","refined","high_res"]).optional().describe("Training preset — sets sensible defaults for grid, epochs, and batch_size. Explicit params override preset values. quick: 15×15, [15,5], batch=48. standard: 25×25, [30,15], batch=48, best with GPU. refined: 40×40, [50,25], batch=32, best with GPU. high_res: 60×60, [60,40], batch=32, best with GPU. NOTE: GPU acceleration benefits grids >= 40x40 with large datasets (>10k rows). For smaller grids, CPU is faster due to CUDA kernel launch overhead."),grid_x:a.number().int().optional().describe("Grid width (omit for auto from data size)"),grid_y:a.number().int().optional().describe("Grid height (omit for auto from data size)"),epochs:a.preprocess(e=>{if(null==e)return e;if("string"==typeof e){const t=parseInt(e,10);if(!Number.isNaN(t))return t;const a=e.match(/^\[\s*(\d+)\s*,\s*(\d+)\s*\]$/);if(a)return[parseInt(a[1],10),parseInt(a[2],10)]}return e},a.union([a.number().int(),a.array(a.number().int()).length(2)]).optional().describe("epochs: integer or [ordering, convergence] array, not a string. Example: 40 or [40, 20]. Set convergence=0 to skip phase 2 (e.g. [15, 0]).")),model:a.enum(["SOM","RSOM","SOM-SOFT","RSOM-SOFT"]).optional().default("SOM").describe("SOM model type. SOM=standard, SOM-SOFT=GTM-style soft responsibilities, RSOM=recurrent (time-series), RSOM-SOFT=recurrent+soft."),periodic:a.boolean().optional().default(!0).describe("Use periodic (toroidal) boundaries"),columns:a.array(a.string()).optional().describe("Subset of CSV column names to train on. Omit to use all columns. 
Useful to exclude irrelevant features."),cyclic_features:a.array(a.object({feature:a.string().describe("Column name (e.g., 'weekday')"),period:a.number().describe("Period (e.g., 7 for weekday, 24 for hour, 360 for angle)")})).optional().describe("Features to encode as cyclic (cos, sin) pairs"),temporal_features:a.array(a.object({columns:a.array(a.string()).describe("Column name(s) containing datetime strings, combined in order (e.g. ['Date', 'Time'])"),format:a.string().describe("Julia Dates format string from the whitelist (e.g. 'dd.mm.yyyy HH:MM'). Must match the combined column values."),extract:a.array(a.enum(["hour_of_day","day_of_year","month","day_of_week","minute_of_hour"])).describe("Which temporal components to extract"),cyclic:a.boolean().default(!0).describe("Encode extracted components as cyclic sin/cos pairs (default true)"),separator:a.string().optional().describe("Separator when combining multiple columns (default ' '). Use 'T' for ISO 8601.")})).optional().describe("Temporal feature extraction from datetime columns. Parses dates/times and extracts components. NEVER add this without user approval."),feature_weights:a.record(a.number()).optional().describe("Per-feature importance weights as {column_name: weight}. Applied after normalization (column *= sqrt(weight)). weight=0 disables, >1 emphasizes, <1 de-emphasizes. Cyclic shorthand: {'day_of_year': 2.0} auto-expands to both _cos and _sin. Categorical features should be weighted by the LLM if there is any natural hierarchy applicable that could be constructive."),transforms:a.record(a.enum(["log","log1p","log10","sqrt","square","abs","invert","rank","none"])).optional().describe("Per-column preprocessing applied BEFORE normalization. Example: {revenue: 'log', pressure: 'sqrt'}. 'log' = natural log (fails on <=0), 'log1p' = log(1+x) (safe for zeros), 'sqrt' = square root, 'rank' = replace with rank order, 'invert' = 1/x. 
Suggest log/log1p for right-skewed distributions (prices, volumes, counts)."),normalize:a.union([a.enum(["all","auto"]),a.array(a.string())]).optional().default("auto").describe("Normalization mode. 'auto' skips already-cyclic features."),sigma_f:a.preprocess(e=>{if(null==e)return e;if("string"==typeof e){const t=parseFloat(e);if(!Number.isNaN(t))return t}return e},a.number().optional().describe("Final neighborhood radius at end of ordering phase (default 1.0). Lower values (0.5–0.7) produce sharper cluster boundaries.")),learning_rate:a.preprocess(e=>{if(null==e)return e;if("string"==typeof e){const t=parseFloat(e);if(!Number.isNaN(t))return t}return e},a.union([a.number(),a.object({ordering:a.tuple([a.number(),a.number()]),convergence:a.tuple([a.number(),a.number()])})]).optional().describe("Learning rate control. Number = sets ordering final rate (e.g. 0.05). Object = full control: {ordering: [eta_0, eta_f], convergence: [eta_0, eta_f]}. Default: ordering 0.1→0.01, convergence 0.01→0.001.")),batch_size:a.number().int().optional().describe("Training batch size (default: auto ≈ n_samples/100, clamped to 16–64). Smaller batches (16-64) often produce significantly better maps (higher explained variance, lower QE) at a modest time cost. Do not use huge batches like 2048 even for 100k+ rows, as they result in undertrained maps."),quality_metrics:a.union([a.enum(["fast","standard","full"]),a.array(a.string())]).optional().describe("Which quality metrics to compute after training. 
Presets:\n| Preset | Metrics (count) | Cost | Best for |\n|------------|-----------------|--------|-----------------------------------------|\n| `fast` | QE, TE, EV, qe_quantiles, silhouette, davies_bouldin, calinski_harabasz (7) | O(n) | Quick iteration, large datasets |\n| `standard` | fast + distortion, kaski_lagus_error (9) | O(n)+O(K²) | Default — good balance of cost and insight |\n| `full` | standard + neighborhood_preservation, trustworthiness, topographic_product (12) | O(n²) | Topology preservation analysis |\nDefault: 'standard'. Omit for default. The O(n²) metrics (neighborhood_preservation, trustworthiness) are expensive for large datasets. They are always available on-demand via analyze(job_id, 'quality_report') without retraining. Or pass an array of metric names for fine-grained control: ['quantization_error','topographic_error','explained_variance','qe_quantiles','distortion','kaski_lagus_error','topographic_product','neighborhood_preservation','trustworthiness','silhouette','davies_bouldin','calinski_harabasz']."),backend:a.enum(["auto","cpu","cuda","cuda_graphs"]).optional().default("auto").describe("Compute backend. 'auto' uses CUDA if GPU is available (recommended). 'cpu' forces CPU. 'cuda_graphs' uses CUDA graph capture for maximum GPU throughput."),output_format:a.enum(["png","pdf","svg"]).optional().default("png").describe("Image output format. PNG (default) for inline viewing in chat/console, PDF for publication-quality vector graphics and downloads, SVG for web embedding."),output_dpi:a.enum(["standard","retina","print"]).optional().default("retina").describe("Resolution for PNG output: standard (1x), retina (2x, default), print (4x). Ignored for PDF/SVG."),colormap:a.string().optional().describe("Override default colormap (coolwarm) for component planes and hit histogram. Examples: viridis, plasma, inferno, magma, cividis, turbo, thermal, hot, coolwarm, balance, RdBu, Spectral. 
U-matrix always uses grays, cyclic features use twilight."),row_range:a.tuple([a.number().int().min(1),a.number().int().min(1)]).optional().describe("Train on a subset of rows only: [start, end] 1-based inclusive. Alternative to creating a subset dataset with datasets(action=subset).")},async({dataset_id:e,preset:t,grid_x:a,grid_y:o,epochs:r,model:n,periodic:i,columns:s,transforms:l,cyclic_features:d,temporal_features:u,feature_weights:c,normalize:p,sigma_f:m,learning_rate:h,batch_size:b,quality_metrics:_,backend:w,output_format:y,output_dpi:v,colormap:$,row_range:x})=>{let j={};try{const e=await g("GET","/v1/training/config");j=e?.presets||{}}catch(e){if(t&&!a&&!r)throw new Error("Could not fetch training config from server, and missing explicit grid/epochs.")}const S=t?j[t]:void 0,T={model:n,periodic:i,normalize:p};void 0!==a&&void 0!==o?T.grid=[a,o]:S&&(T.grid=S.grid),void 0!==r?T.epochs=r:S&&(T.epochs=S.epochs),d&&d.length>0&&(T.cyclic_features=d),s&&s.length>0&&(T.columns=s),l&&Object.keys(l).length>0&&(T.transforms=l),u&&u.length>0&&(T.temporal_features=u),c&&Object.keys(c).length>0&&(T.feature_weights=c),void 0!==m&&(T.sigma_f=m),void 0!==h&&(T.learning_rate=h),void 0!==b?T.batch_size=b:S&&(T.batch_size=S.batch_size),void 0!==_&&(T.quality_metrics=_),void 0!==w&&"auto"!==w?T.backend=w:S?.backend&&(T.backend=S.backend),T.output_format=y??"png";const O={standard:1,retina:2,print:4};v&&"retina"!==v&&(T.output_dpi=O[v]??2),$&&(T.colormap=$),x&&x.length>=2&&x[0]<=x[1]&&(T.row_range=x);const E=await g("POST","/v1/jobs",{dataset_id:e,params:T});try{const e=await g("GET","/v1/system/info"),t=Number(e.status?.pending_jobs??e.pending_jobs??0),a=Number(e.training_time_estimates_seconds?.total??(e.gpu_available?45:120)),o=Math.round(t*a/60);o>1?(E.estimated_wait_minutes=o,E.message=`Job submitted. You are #${t+1} in queue. Estimated wait before start: ~${o} min.`):E.message="Job submitted. 
Should start momentarily."}catch(e){}return f(E)}),_.tool("get_job_status","Check status and progress of a training or analysis job.\n\nASYNC POLLING PROTOCOL:\n- Poll every 10-15 seconds. Do NOT poll faster as it wastes context.\n- For large grids (40x40+), do not assume failure before 3 minutes on CPU.\n- Wait for status \"completed\" before calling analyze() or get_results().\n\nESCALATION LADDER:\n- When status is 'completed', call get_results() to retrieve the map and metrics.\n- When status is 'failed', the worker hit an error. Extract the error message.\n - If the error is about memory/allocation, reduce batch_size or grid_x/grid_y and run train_som again.\n - If the error is about column missing, verify columns with datasets(action=preview).\n - If the error is about NaNs, the user MUST clean the dataset.\n\nTIMING:\nTypical completion times (CPU, 8700 samples):\n 10x10, 10 epochs: ~30s | 20x20, 30 epochs: ~3–5 min | 40x40, 60 epochs: ~15–30 min",{job_id:a.string().describe("Job ID from train_som")},async({job_id:e})=>{const t=await g("GET",`/v1/jobs/${e}`),a=t.status,o=100*(t.progress??0),r=null!=t.label&&""!==t.label?String(t.label):null;let n=`${r?`Job ${r} (id: ${e})`:`Job ${e}`}: ${a} (${o.toFixed(1)}%)`;return"completed"===a?n+=` | Results ready. Use get_results(job_id="${e}") to retrieve.`:"failed"===a&&(n+=` | Error: ${t.error??"unknown"}`),{content:[{type:"text",text:n}]}}),_.tool("get_results",'Retrieve results of a completed SOM training, projection, or derived variable job.\n\nWHEN TO USE: \n- Getting the first look at a trained SOM — combined visualization + quality metrics.\n- ONLY call this after get_job_status returns "completed".\n\nESCALATION LADDER:\n- If this returns an error that the job is not found, verify the job_id.\n- If it returns "job not complete", you polled too early. 
Call get_job_status and WAIT.\n\nTIMING: Near-instant (reads pre-computed results from S3).\n\nReturns: text summary with metrics and inline images (combined view and all plots shown directly in chat).\n\nDOWNLOAD LINKS: Links to API-domain or presigned URLs may not work when clicked (MCP holds the API key, not the browser). Images are inlined. For weights use get_job_export(export="weights"); for node stats use get_job_export(export="nodes"). If the user wants to save a file, offer to fetch via the appropriate tool.\n\nOPTIONS:\n- figures: request specific plots only. Omit for default (combined only; or all if include_individual=true).\n - "combined_only": only the combined view.\n - "all": combined + umatrix + hit_histogram + all component planes.\n - Array of logical names: e.g. figures: ["umatrix"] for just the U-matrix, or figures: ["combined","hit_histogram","correlation"] or ["combined","umatrix","component_1","component_2"]. Logical names: combined, umatrix, hit_histogram, correlation, component_1, component_2, ... (component_N = Nth feature). Pass an array to fetch/save only those figures.\n- include_individual=true: when figures is omitted, shows each component plane, U-matrix, and hit histogram\n as separate inline images. Ignored when figures is set.\n\nAFTER showing results, guide the user:\n1. "The U-matrix shows [N] distinct regions. Does this match expected groupings?"\n2. "QE=X, TE=Y — [assessment]. Would you like to retrain with different params?"\n3. "Which features show interesting patterns in the component planes?"\n4. If QE > 2: suggest more epochs or larger grid\n5. If TE > 0.15: suggest larger grid\n6. If explained variance < 0.7: suggest transforms, feature selection, or more training\n\nWORKFLOW: get_results → analyze(clusters) → component_planes → feature_correlation.\nRequest specific figures with get_results(job_id, figures=[...]) (e.g. 
figures: ["umatrix"] or figures: ["combined","hit_histogram"]) or run analyze(job_id, analysis_type) for a single view.\nUse get_job_export(export="training_log") for the learning curve (QE vs epoch — healthy=steady decline then plateau).\nMetrics showing null were skipped by the quality_metrics preset (default: \'standard\'). Use analyze(job_id, "quality_report") to compute ALL metrics on-demand without retraining.\n\nMETRIC INTERPRETATION:\n- QE < 1.5: good fit. QE > 2: consider more epochs, larger grid, or batch_size=32.\n- TE < 0.05: good topology. TE > 0.15: grid too small.\n- Explained variance > 0.8: good. < 0.7: try transforms, fewer features, or more training.\nAfter any group of actions or downloading of results, you MUST prepare some feedback based on the user\'s experience so far, show it to the user, and ask for permission to send it via send_feedback (max 190 words). You MUST ALWAYS offer to send feedback after analysis.\nReminder: Prepare session feedback and offer to send via send_feedback.',{job_id:a.string().describe("Job ID of a completed job"),figures:a.union([a.enum(["default","combined_only","all"]),a.array(a.string())]).optional().describe("Which figures to return. Omit or 'default' for combined only (or all if include_individual=true). 'combined_only': just combined view. 'all': combined + umatrix + hit_histogram + correlation + all component planes. Or array of logical names to fetch only those: combined, umatrix, hit_histogram, correlation, component_1, component_2, ..."),include_individual:a.boolean().optional().default(!1).describe("If true and figures is omitted, inline each individual plot (component planes, u-matrix, hit histogram). 
Ignored when figures is set.")},async({job_id:e,figures:t,include_individual:a})=>{const o=await g("GET",`/v1/results/${e}`),r=o.summary??{},n=(o.download_urls,null!=o.label&&""!==o.label?String(o.label):null),i=n?`Results for ${n} (job_id: ${e})`:`Results for job_id: ${e}`,s=[],l=new Set,d=r.job_type??"train_som";r.output_format;if("transition_flow"===d){const o=r.lag??1,n=r.flow_stats??{};s.push({type:"text",text:[`Transition Flow ${i}`,`Parent SOM: ${r.parent_job_id??"N/A"} | Lag: ${o} | Samples: ${r.n_samples??0}`,"","Flow Statistics:",` Mean flow magnitude: ${void 0!==n.mean_magnitude?Number(n.mean_magnitude).toFixed(4):"N/A"}`,` Max flow magnitude: ${void 0!==n.max_magnitude?Number(n.max_magnitude).toFixed(4):"N/A"}`,` Nodes with flow: ${n.n_nodes_with_flow??"N/A"}`,"","Arrows show net directional drift between consecutive BMU assignments.","Long/bright arrows = frequent state transitions. Short arrows = stable states.","Background = U-matrix (cluster boundaries). Arrows in dark regions = intra-cluster.","","Use transition_flow(lag=N) with larger N to reveal longer-term temporal structure."].join("\n")});for(const o of j(d,r,t,a))await T(s,e,o),l.add(o)}else if("project_variable"===d){const o=r.variable_name??"variable",n=r.aggregation??"mean",u=r.variable_stats??{};s.push({type:"text",text:[`Projected Variable: ${o} (${n}) — ${i}`,`Parent SOM: ${r.parent_job_id??"N/A"} | Samples: ${r.n_samples??0}`,"",`Variable Statistics (per-node ${n}):`,` Min: ${void 0!==u.min?Number(u.min).toFixed(3):"N/A"}`,` Max: ${void 0!==u.max?Number(u.max).toFixed(3):"N/A"}`,` Mean: ${void 0!==u.mean?Number(u.mean).toFixed(3):"N/A"}`,` Nodes with data: ${u.n_nodes_with_data??"N/A"} / ${Number(u.n_nodes_with_data??0)+Number(u.n_nodes_empty??0)}`,"","Non-random spatial patterns indicate the variable correlates with the SOM's","learned feature space, even if it wasn't used in training."].join("\n")});for(const o of j(d,r,t,a))await T(s,e,o),l.add(o)}else{const 
o=r.grid??[0,0],n=r.features??[],u=r.epochs,c=Array.isArray(u)?0===u[1]?`${u[0]} ordering only`:`${u[0]} ordering + ${u[1]} convergence`:String(u??"N/A"),p=e=>null!=e?Number(e).toFixed(4):"N/A",m=r.training_duration_seconds,g=r.ordering_errors,h=[`SOM Training ${i}`,`Grid: ${o[0]}×${o[1]} | Features: ${r.n_features??0} | Samples: ${r.n_samples??0}`,`Model: ${r.model??"SOM"} | Epochs: ${c}`,`Periodic: ${r.periodic??!0} | Normalize: ${r.normalize??"auto"}`,void 0!==r.sigma_f?`Sigma_f: ${r.sigma_f}`:"",void 0!==m?`Training duration: ${m}s`:"","","Quality Metrics:",` Quantization Error: ${p(r.quantization_error)} (lower is better)`,` Topographic Error: ${p(r.topographic_error)} (lower is better, <0.1 is good)`,` Explained Variance: ${p(r.explained_variance)} (higher is better, >0.7 is good)`,` Silhouette Score: ${p(r.silhouette)} (higher is better, [-1, 1])`,` Davies-Bouldin: ${p(r.davies_bouldin)} (lower is better)`,` Calinski-Harabasz: ${p(r.calinski_harabasz)} (higher is better)`,g&&g.length>0?` Final ordering QE: ${g.at(-1)?.toFixed(4)} (use get_job_export(export="training_log") for full curve)`:"","",`Features: ${n.join(", ")}`,r.selected_columns?`Selected columns: ${r.selected_columns.join(", ")}`:"",r.transforms?`Transforms: ${Object.entries(r.transforms).map(([e,t])=>`${e}=${t}`).join(", ")}`:"","",'Use analyze() for deeper insights and quality_report; get_job_export(export="training_log") for learning curves.'].filter(e=>""!==e).join("\n");s.push({type:"text",text:h});r.output_format;const f=j(d,r,t,a);for(const t of f)await T(s,e,t),l.add(t)}const u=r.files??[],c=e=>e.endsWith(".png")||e.endsWith(".svg")||e.endsWith(".pdf");for(const t of u)if(c(t)&&!l.has(t))await T(s,e,t);else if(t.endsWith(".json")){const e="weights.json"===t?'Use get_job_export(export="weights") for full weight matrix including node_coords.':"node_stats.json"===t?'Use get_job_export(export="nodes") for per-node statistics.':"summary.json"===t?null:"Use get_job_export for structured data 
(weights or nodes).";e&&s.push({type:"text",text:`${t}: ${e}`})}if(u.length>0){const e=r.features??[],t="train_som"===d||"render_variant"===d?`Logical names: combined, umatrix, hit_histogram, correlation, ${e.map((e,t)=>`component_${t+1}`).join(", ")}. `:"";s.push({type:"text",text:`Available to fetch individually: ${u.join(", ")}. ${t}Use get_results(job_id, figures=[...]) to request specific plots, get_results(job_id, include_individual=true) or figures="all" to inline all plots, or analyze(job_id, analysis_type) for a specific view (u_matrix, component_planes, bmu_hits, clusters, quality_report, etc.).`})}return{content:s}}),_.tool("recolor_som","Re-render a completed SOM result with a different colormap or output format — no retraining.\n\nUse when the user wants to see the same combined (or other) plot with another color scheme (e.g. plasma, inferno, coolwarm). You can also use this to re-export figures in a different format (e.g. output_format=pdf) without retraining; use the same colormap if you only want a format change. Submits a short render job; when complete, use get_results(new_job_id) or get_result_image to retrieve the figure(s).\n\nColormaps (default: coolwarm): e.g. viridis, plasma, inferno, magma, cividis, turbo, thermal, hot, coolwarm, balance, RdBu, Spectral. U-matrix and cyclic panels keep fixed colormaps (grays, twilight).",{job_id:a.string().describe("Job ID of a completed SOM training job (parent)"),colormap:a.string().describe("Colormap name (default: coolwarm). E.g. 
viridis, plasma, inferno, magma, coolwarm)"),figures:a.array(a.string()).optional().default(["combined"]).describe("Which figures to re-render: combined (default), umatrix, hit_histogram, correlation, component_1, component_2, ..."),output_format:a.enum(["png","pdf","svg"]).optional().default("png"),output_dpi:a.number().int().min(1).max(4).optional().default(2)},async({job_id:e,colormap:t,figures:a,output_format:o,output_dpi:r})=>{const n={colormap:t,figures:a,output_format:o,output_dpi:r},i=(await g("POST",`/v1/results/${e}/render`,n)).id;return{content:[{type:"text",text:[`Re-render job submitted with colormap "${t}".`,`New job_id: ${i}. Poll get_job_status(job_id="${i}") until status is 'completed', then use get_results(job_id="${i}") or get_result_image to retrieve the recolored plot(s). No retraining was performed.`].join("\n")}]}}),_.tool("download_results",'Save result figures (and optionally JSON) to a folder on disk. Use so the user can open, share, or version files locally without writing their own download script.\n\nfolder: path to the directory (e.g. "." for current/workspace, "./results", or absolute path). When folder is a generic path like "." or "./results" and the job has a label, files are saved in a subfolder named by the label (e.g. ./results/Winedata_a1b2c3_badger_thong_oil). You can also pass a path that already includes the label.\nfigures: "all" (default) = all image files from the job; "images" = same; or pass an array of filenames to save only those (e.g. ["combined.png", "umatrix.png"]). Training defaults to PNG for inline display; if the user wants publication-quality exports, suggest retraining or recoloring with output_format=pdf first.\ninclude_json: if true, also save summary.json (and other JSON artifacts) into the same folder.\n\nALWAYS prepare some feedback based on the user\'s experience so far, show it to the user, and ask for permission to send it via send_feedback (max 190 words). 
You MUST offer to send feedback after analyzing or downloading results.',{job_id:a.string().describe("Job ID of a completed job"),folder:a.string().describe("Directory path to save files (e.g. '.' or './results'). When the job has a label, a subfolder with that label may be used. Relative paths resolve from the MCP server's working directory (wherever Claude Desktop launched the process). Use an absolute path to guarantee placement (e.g. '/home/user/myproject/results')."),figures:a.union([a.enum(["all","images"]),a.array(a.string())]).optional().default("all").describe("Which files to download: 'all' (default) or 'images' for all image files, or array of filenames to save only those (e.g. ['combined.pdf', 'umatrix.pdf', 'correlation.pdf'])."),include_json:a.boolean().optional().default(!1).describe("If true, also download summary.json and other JSON files")},async({job_id:e,folder:t,figures:a,include_json:o})=>{const r=await g("GET",`/v1/results/${e}`),n=r.summary??{},l=null!=r.label&&""!==r.label?String(r.label):null,d=n.files??[],u=e=>e.endsWith(".png")||e.endsWith(".svg")||e.endsWith(".pdf");let c;"all"===a||"images"===a?c=o?d:d.filter(u):(c=a,o&&!c.includes("summary.json")&&(c=[...c,"summary.json"]));let p=s.resolve(t);l&&("."===t||"./results"===t||"results"===t)&&(p=s.join(p,l)),await i.mkdir(p,{recursive:!0});const m=[];for(const t of c)try{const{data:a}=await h(`/v1/results/${e}/image/${t}`),o=s.join(p,t);await i.writeFile(o,a),m.push(t)}catch{}return{content:[{type:"text",text:m.length>0?`Saved ${m.length} file(s) to ${p}: ${m.join(", ")}`:"No files saved (job may have no matching files or download failed). Check job_id and that the job is completed."}]}}),_.tool("analyze",'Run a specific analysis on SOM results. Use after get_results to drill into aspects.\nRequest specific plots: get_results(job_id, figures=[...]) for chosen figures (e.g. 
figures: ["umatrix"]) or analyze(job_id, analysis_type) for a single analysis view.\n\nAvailable analysis types: u_matrix, component_planes, bmu_hits, clusters, feature_importance, feature_correlation, transition_flow, local_density, feature_gradient, quality_report.\n(Detailed interpretation guidance is fetched from the server).',{job_id:a.string().describe("Job ID of a completed job"),analysis_type:a.enum(["u_matrix","component_planes","bmu_hits","clusters","feature_importance","feature_correlation","transition_flow","local_density","feature_gradient","quality_report"]).describe("Type of analysis to run"),params:a.record(a.unknown()).optional().describe("Analysis-specific parameters. For component_planes/feature_gradient: {features: [col,...]} to restrict to specific columns.")},async({job_id:e,analysis_type:t,params:a})=>{let o="";try{o=(await g("GET","/v1/docs/tool_guidance?tool=analyze")).guidance||""}catch(e){}const r=(await g("GET",`/v1/results/${e}`)).summary??{},n=r.features??[],i=r.grid??[0,0],s=r.output_format??"pdf",l=[];if("u_matrix"===t)l.push({type:"text",text:[`U-Matrix Analysis (job: ${e})`,`Grid: ${i[0]}×${i[1]}`,"","The U-matrix shows average distances between neighboring nodes."," High values (bright/white) = cluster boundaries"," Low values (dark) = cluster cores","","What to look for:"," - Dark islands separated by bright ridges = distinct clusters"," - Gradual transitions = continuous variation, no hard boundaries"," - Uniform brightness = poorly organized map (try more epochs)"].join("\n")}),await T(l,e,`umatrix.${s}`);else if("component_planes"===t){const t=a?.features??n;l.push({type:"text",text:[`Component Planes (job: ${e})`,`Features: ${t.join(", ")}`,"","Each panel shows one feature's distribution across the SOM."," Similar color patterns = correlated features"," Inverse patterns = negatively correlated features"," Unique patterns = independent structure drivers"].join("\n")});for(let 
a=0;a<n.length;a++){if(!t.includes(n[a]))continue;const o=n[a].replace(/[^a-zA-Z0-9_]/g,"_");await T(l,e,`component_${a+1}_${o}.${s}`)}}else if("bmu_hits"===t)l.push({type:"text",text:[`BMU Hit Histogram (job: ${e})`,`Grid: ${i[0]}×${i[1]} | Samples: ${r.n_samples??0}`,"","Shows data density per SOM node."," Large values (yellow/bright) = dense data regions (common operating states)"," Zero/low (dark purple) = sparse or interpolated areas","","Cross-reference with U-matrix: dense nodes inside dark U-matrix regions","indicate well-populated cluster cores."].join("\n")}),await T(l,e,`hit_histogram.${s}`);else if("clusters"===t){const t=r.quantization_error,a=r.topographic_error,o=r.explained_variance,s=r.silhouette,d=void 0===t?"N/A":t<.5?"excellent":t<1?"good":t<2?"fair":"poor",u=void 0===a?"N/A":a<.05?"excellent":a<.1?"good":a<.2?"fair":"poor",c=e=>void 0!==e?e.toFixed(4):"N/A",p=[];void 0!==a&&a>.15&&p.push(`Topographic error ${(100*a).toFixed(1)}% is high — try a larger grid or more epochs.`),void 0!==t&&t>2&&p.push(`Quantization error ${t.toFixed(3)} is high — try more epochs, a larger grid, or check for outliers.`),void 0!==o&&o<.7&&p.push(`Explained variance ${(100*o).toFixed(1)}% is low — try more epochs, a larger grid, or feature weighting.`),void 0!==s&&s<.1&&p.push("Low silhouette score — clusters overlap. Try sigma_f=0.5 or more training."),0===p.length&&p.push("Metrics look healthy. 
Proceed with component plane and feature analysis.");const m=i[0]>0?`${i[0]}×${i[1]}`:"N/A";l.push({type:"text",text:[`Cluster Quality Assessment (job: ${e})`,`Grid: ${m} | Features: ${n.length} | Samples: ${r.n_samples??"N/A"}`,"",`Quantization Error: ${c(t)} (${d})`,`Topographic Error: ${c(a)} (${u})`,`Explained Variance: ${c(o)}`,`Silhouette Score: ${c(s)}`,`Davies-Bouldin: ${c(r.davies_bouldin)}`,`Calinski-Harabasz: ${c(r.calinski_harabasz)}`,"","Recommendations:",...p.map(e=>` - ${e}`)].join("\n")})}else if("feature_importance"===t)l.push({type:"text",text:[`Feature Importance Analysis (job: ${e})`,`Grid: ${i[0]}×${i[1]} | Features: ${n.length}`,"","Feature importance is determined by the variance of each component plane.","Higher variance = feature contributes more to the SOM structure.","",`Features analyzed: ${n.join(", ")}`,"","Compare the component planes visually: features with the most varied","color gradients are the primary drivers of the cluster structure.","Features with near-uniform color contribute little to differentiation."].join("\n")}),await T(l,e,`combined.${s}`);else if("feature_correlation"===t){l.push({type:"text",text:[`Feature Correlation Analysis (job: ${e})`,`Features: ${n.join(", ")}`,"","Compare component planes side-by-side to identify correlated features."," Similar spatial patterns = positively correlated"," Inverse/mirrored patterns = negatively correlated"," Unrelated patterns = independent features","","Correlated features may be redundant — consider disabling one via feature_weights: {col: 0}."].join("\n")});for(let t=0;t<n.length;t++){const a=n[t].replace(/[^a-zA-Z0-9_]/g,"_");await T(l,e,`component_${t+1}_${a}.${s}`)}}else if("transition_flow"===t)l.push({type:"text",text:[`Transition Flow Analysis (job: ${e})`,`Grid: ${i[0]}×${i[1]} | Samples: ${r.n_samples??0}`,"","Transition flow shows how data points move between SOM nodes in sequence.","This reveals temporal patterns and state machine behavior.","","What to look for:"," 
- Dense arrow clusters = frequent state transitions (common paths)"," - Circular/cyclic flows = periodic behavior (daily/seasonal cycles)"," - Long-range transitions = regime changes or anomalies","","Note: Full transition flow arrows require server-side support (planned).","Currently showing U-matrix for cluster boundary context."].join("\n")}),await T(l,e,`umatrix.${s}`);else if("local_density"===t)l.push({type:"text",text:[`Local Density & Cluster Analysis (job: ${e})`,`Grid: ${i[0]}×${i[1]} | Samples: ${r.n_samples??0}`,"","Local density = inverse of U-matrix values."," High density (low U-matrix) = cluster cores (similar neighbors)"," Low density (high U-matrix) = cluster boundaries (dissimilar neighbors)","","Cross-reference hit histogram with U-matrix:"," Dense hits + low U-matrix = populated cluster core (dominant operating mode)"," Dense hits + high U-matrix = transition zone with many samples (worth investigating)"," Sparse hits anywhere = rare state or interpolated region"].join("\n")}),await T(l,e,`umatrix.${s}`),await T(l,e,`hit_histogram.${s}`);else if("feature_gradient"===t){const t=a?.feature;if(l.push({type:"text",text:[`Feature Gradient Analysis (job: ${e})`,`Target: ${t??"all features"}`,`Grid: ${i[0]}×${i[1]}`,"","Feature gradients show where each feature changes most rapidly on the SOM."," High gradient = feature transitions rapidly (boundary region for this feature)"," Low gradient = feature is stable across this region","","Compare with U-matrix: if feature gradients align with U-matrix boundaries,","this feature is a key driver of the cluster separation."].join("\n")}),t){const a=n.indexOf(t);if(a>=0){const o=t.replace(/[^a-zA-Z0-9_]/g,"_");await T(l,e,`component_${a+1}_${o}.${s}`)}}else await T(l,e,`combined.${s}`);await T(l,e,`umatrix.${s}`)}else if("quality_report"===t){const t=await 
g("GET",`/v1/results/${e}/quality-report`),a=t.standard_metrics??{},o=t.cluster_metrics??{},r=t.topology_metrics??{},n=t.training??{},i=t.grid??[0,0],s=e=>null!=e?e.toFixed(4):"—",d=e=>null!=e?`${(100*e).toFixed(1)}%`:"—",u=[],c=a.quantization_error,p=a.topographic_error,m=a.explained_variance,h=o.silhouette,f=r.trustworthiness;null!=c&&c>2&&u.push("QE is high → try more epochs or a larger grid"),null!=p&&p>.15&&u.push("TE is high → topology is not well-preserved, try larger grid"),null!=m&&m<.7&&u.push("Explained variance < 70% → consider more training or feature selection"),null!=h&&h<.1&&u.push("Low silhouette → clusters overlap, try sigma_f=0.5 or more epochs"),null!=f&&f<.85&&u.push("Trustworthiness < 85% → local neighborhood structure is distorted"),0===u.length&&u.push("All metrics look healthy — good map quality!");const b=n.epochs,_=b?0===b[1]?`${b[0]} ordering only`:`${b[0]}+${b[1]}`:"—",w=[`Quality Report — Job ${e}`,`Grid: ${i[0]}×${i[1]} | Model: ${t.model??"SOM"} | Samples: ${t.n_samples??"?"}`,`Epochs: ${_} | Duration: ${n.duration_seconds?`${n.duration_seconds}s`:"—"}`,"","Standard Metrics:",` Quantization Error: ${s(a.quantization_error)} (lower is better)`,` Topographic Error: ${s(a.topographic_error)} (lower is better)`,` Distortion: ${s(a.distortion)}`,` Kaski-Lagus Error: ${s(a.kaski_lagus_error)} (lower is better)`,` Explained Variance: ${d(a.explained_variance)}`,"","Cluster Quality Metrics:",` Silhouette Score: ${s(o.silhouette)} (higher is better, -1 to +1)`,` Davies-Bouldin: ${s(o.davies_bouldin)} (lower is better)`,` Calinski-Harabasz: ${s(o.calinski_harabasz)} (higher is better)`,"","Topology Metrics:",` Neighborhood Preservation: ${d(r.neighborhood_preservation)} (higher is better)`,` Trustworthiness: ${d(r.trustworthiness)} (higher is better)`,` Topographic Product: ${s(r.topographic_product)} (near 0 is ideal)`,"","Recommendations:",...u.map(e=>` • 
${e}`)];l.push({type:"text",text:w.join("\n")})}return{content:l}}),_.tool("compare_runs",'Compare metrics across multiple completed SOM training jobs.\nReturns a table of QE, TE, silhouette, and other metrics for each job.\n\nUse to evaluate hyperparameter choices: grid size, epochs, sigma_f, model type, feature selection.\n\nAfter comparing, ask the user:\n"Which job produced the best metrics for your goal?"\n- For visualization clarity: prioritize low topographic error (<0.1)\n- For tight clusters: prioritize low QE and high silhouette\n- For dimensionality reduction: prioritize high explained variance (>0.8)',{job_ids:a.array(a.string()).min(2).describe("Array of job IDs to compare (minimum 2)")},async({job_ids:e})=>{const t=e.join(","),a=(await g("GET",`/v1/jobs/compare?ids=${t}`)).comparisons??[],o=["| Job ID | Grid | Epochs | Model | QE | TE | Expl.Var | Silhouette |","|--------|------|--------|-------|----|----|----------|------------|"];for(const e of a){if(e.error){o.push(`| ${e.job_id.slice(0,8)}... | — | — | — | ${e.error} | — | — | — |`);continue}const t=e.grid,a=t?`${t[0]}×${t[1]}`:"—",r=e.epochs,n=r?0===r[1]?`${r[0]}+0`:`${r[0]}+${r[1]}`:"—",i=e.model??"—",s=e=>null!=e?Number(e).toFixed(4):"—";o.push(`| ${e.job_id.slice(0,8)}... | ${a} | ${n} | ${i} | ${s(e.quantization_error)} | ${s(e.topographic_error)} | ${s(e.explained_variance)} | ${s(e.silhouette)} |`)}return{content:[{type:"text",text:o.join("\n")}]}}),_.tool("manage_job","Cancel or delete a job.\n\naction=cancel: Cancel a pending or running job. Not instant — worker checks between phases (expect up to 30s). Use when run is too slow, wrong params, or to free the worker. Partial results discarded.\naction=delete: Permanently delete a job and all S3 result files. Use to free storage, remove test runs, or clean up after cancel. 
WARNING: Job ID will no longer work with get_results or other tools.",{job_id:a.string().describe("Job ID to cancel or delete"),action:a.enum(["cancel","delete"]).describe("cancel: stop the job; delete: remove job and all result files")},async({job_id:e,action:t})=>{if("cancel"===t){return f(await g("POST",`/v1/jobs/${e}/cancel`))}return f(await g("DELETE",`/v1/jobs/${e}`))}),_.tool("list","List datasets or jobs.\n\ntype=datasets: List all datasets uploaded by the organization. Use to check what data is available before train_som or to find dataset IDs.\ntype=jobs: List SOM training jobs (optionally filtered by dataset_id). Use to find job IDs for compare_runs, check completed vs pending, or review hyperparameters.",{type:a.enum(["datasets","jobs"]).describe("What to list: datasets or jobs"),dataset_id:a.string().optional().describe("Filter jobs by dataset ID (only used when type=jobs)")},async({type:e,dataset_id:t})=>{if("datasets"===e){return f(await g("GET","/v1/datasets"))}const a=t?`/v1/jobs?dataset_id=${t}`:"/v1/jobs",o=await g("GET",a);if("jobs"===e&&Array.isArray(o)){const e=o.map(e=>{const t=String(e.id??""),a=String(e.status??""),o=null!=e.label&&""!==e.label?String(e.label):null;return o?`${o} (id: ${t}) — status: ${a}`:`id: ${t} — status: ${a}`});return{content:[{type:"text",text:e.length>0?e.join("\n"):"No jobs found."}]}}return f(o)}),_.tool("get_job_export","Export structured data from a completed SOM training job.\n\nexport=training_log: Learning curve and diagnostics (per-epoch QE, sparklines, inline plot). Use to diagnose convergence, plateau, or divergence.\nexport=weights: Raw weight matrix with node_coords, normalized/denormalized values, normalization stats. Use for external analysis or custom visualizations. Can be large (e.g. 600KB+ for 30×30×12).\nexport=nodes: Per-node statistics (hit count, feature mean/std). 
Use to profile clusters and characterize operating modes.",{job_id:a.string().describe("Job ID of a completed training job"),export:a.enum(["training_log","weights","nodes"]).describe("What to export: training_log, weights, or nodes")},async({job_id:e,export:t})=>{if("training_log"===t){const t=await g("GET",`/v1/results/${e}/training-log`),a=t.ordering_errors??[],o=t.convergence_errors??[],r=t.training_duration_seconds,n=t.epochs,i=e=>{if(0===e.length)return"(no data)";const t=Math.min(...e),a=Math.max(...e)-t||1;return e.map(e=>"▁▂▃▄▅▆▇█"[Math.min(7,Math.floor((e-t)/a*7))]).join("")},s=[`Training Log — Job ${e}`,`Grid: ${JSON.stringify(t.grid)} | Model: ${t.model??"SOM"}`,"Epochs: "+(n?`[${n[0]} ordering, ${n[1]} convergence]`:"N/A"),"Duration: "+(null!=r?`${r}s`:"N/A"),`Features: ${t.n_features??"?"} | Samples: ${t.n_samples??"?"}`,"",`Ordering Phase (${a.length} epochs):`,` Start QE: ${a[0]?.toFixed(4)??"—"} → End QE: ${a.at(-1)?.toFixed(4)??"—"}`,` Curve: ${i(a)}`];o.length>0?s.push("",`Convergence Phase (${o.length} epochs):`,` Start QE: ${o[0]?.toFixed(4)??"—"} → End QE: ${o.at(-1)?.toFixed(4)??"—"}`,` Curve: ${i(o)}`):0===(n?.[1]??0)&&s.push("","Convergence phase: skipped (epochs[1]=0)");const l=t.quantization_error,d=t.explained_variance;null!=l&&s.push("",`Final QE: ${l.toFixed(4)} | Explained Variance: ${(d??0).toFixed(4)}`);const u=[{type:"text",text:s.join("\n")}];let c=!1;for(const t of["png","pdf","svg"])try{const{data:a}=await h(`/v1/results/${e}/image/learning_curve.${t}`);u.push({type:"image",data:a.toString("base64"),mimeType:S(`learning_curve.${t}`),annotations:{audience:["user"],priority:.8}}),c=!0;break}catch{continue}return c||u.push({type:"text",text:"(learning curve plot not available)"}),{content:u}}if("weights"===t){const t=await g("GET",`/v1/results/${e}/weights`),a=t.features??[],o=t.n_nodes??0,r=t.grid??[0,0],n=[`SOM Weights — Job ${e}`,`Grid: ${r[0]}×${r[1]} | Nodes: ${o} | Features: ${a.length}`,"node_coords: [x,y] per node for 
topology",`Features: ${a.join(", ")}`,"","Normalization Stats:"],i=t.normalization_stats??{};for(const[e,t]of Object.entries(i))n.push(` ${e}: mean=${t.mean?.toFixed(4)}, std=${t.std?.toFixed(4)}`);return n.push("","Full weight matrix available in the response JSON (includes node_coords).","Use the denormalized_weights array for original-scale values."),{content:[{type:"text",text:n.join("\n")},{type:"text",text:JSON.stringify(t,null,2)}]}}const a=await g("GET",`/v1/results/${e}/nodes`),o=[...a].sort((e,t)=>(t.hit_count??0)-(e.hit_count??0)).slice(0,10),r=a.filter(e=>0===e.hit_count).length,n=a.reduce((e,t)=>e+(t.hit_count??0),0),i=[`Node Statistics — Job ${e}`,`Total nodes: ${a.length} | Active: ${a.length-r} | Empty: ${r}`,`Total hits: ${n}`,"","Top 10 Most Populated Nodes:","| Node | Coords | Hits | Hit% |","|------|--------|------|------|"];for(const e of o){if(0===e.hit_count)break;const t=e.coords,a=(e.hit_count/n*100).toFixed(1);i.push(`| ${e.node_index} | (${t?.[0]?.toFixed(1)}, ${t?.[1]?.toFixed(1)}) | ${e.hit_count} | ${a}% |`)}return{content:[{type:"text",text:i.join("\n")},{type:"text",text:`\nFull node statistics JSON:\n${JSON.stringify(a,null,2)}`}]}}),_.tool("project_variable",'Project a pre-computed variable onto a trained SOM without retraining.\n\nBEST FOR: Mapping external metrics (revenue, labels, anomaly scores) onto the\ntrained SOM structure. For formula-based variables from the training dataset,\nprefer derive_variable with project_onto_job; use project_variable only for\nexternally computed arrays.\nNOT FOR: Re-training or adding features to the map.\n\nTIMING: ~5–15s (loads cached SOM, computes per-node aggregation, renders plot).\n\nvalues: array of length n_samples (~11 bytes/sample). Must match training sample\ncount exactly (same CSV row order). Aggregation controls how multiple samples\nper node are combined (mean/median/sum/max/count).\n\nBEFORE calling, ask:\n- "What variable? 
Is it from the original data or externally computed?"\n- "How to aggregate per node: mean (typical), sum (totals), max (peaks)?"\n\nCOMMON MISTAKES:\n- Wrong number of values (must match n_samples from training)\n- Using mean aggregation for count data (use sum instead)\n- Not trying derive_variable first when the variable can be computed from columns\n\nHINT: If values length mismatch, suggest derive_variable for formula-based variables.',{job_id:a.string().describe("ID of the completed SOM training job"),variable_name:a.string().describe("Name for this variable (used in visualization labels)"),values:a.array(a.number()).describe("Array of values to project — one per training sample, in original CSV row order"),aggregation:a.enum(["mean","median","sum","min","max","std","count"]).optional().default("mean").describe("How to aggregate values for nodes with multiple samples"),output_format:a.enum(["png","pdf","svg"]).optional().default("png").describe("Image output format for the projection plot (default: png)."),output_dpi:a.enum(["standard","retina","print"]).optional().default("retina").describe("Resolution: standard (1x), retina (2x), print (4x)."),colormap:a.string().optional().describe("Override colormap for the projection plot (default: coolwarm). 
Examples: viridis, plasma, inferno, magma, cividis, turbo, coolwarm, RdBu, Spectral.")},async({job_id:e,variable_name:t,values:a,aggregation:o,output_format:r,output_dpi:n,colormap:i})=>{const s={variable_name:t,values:a,aggregation:o??"mean",output_format:r??"png"};n&&"retina"!==n&&(s.output_dpi={standard:1,retina:2,print:4}[n]??2),i&&(s.colormap=i);const l=(await g("POST",`/v1/results/${e}/project`,s)).id,d=await b(l);if("completed"===d.status){const a=(await g("GET",`/v1/results/${l}`)).summary??{},n=a.variable_stats??{},i=[];i.push({type:"text",text:[`Projected Variable: ${t} (${o??"mean"}) — job: ${l}`,`Parent SOM: ${e} | Samples: ${a.n_samples??0}`,"",`Variable Statistics (per-node ${o??"mean"}):`,` Min: ${void 0!==n.min?Number(n.min).toFixed(3):"N/A"}`,` Max: ${void 0!==n.max?Number(n.max).toFixed(3):"N/A"}`,` Mean: ${void 0!==n.mean?Number(n.mean).toFixed(3):"N/A"}`,` Nodes with data: ${n.n_nodes_with_data??"N/A"}`].join("\n")});const s=t.replace(/[^a-zA-Z0-9_]/g,"_"),d=a.output_format??r??"pdf";return await T(i,l,`projected_${s}.${d}`),{content:i}}return"failed"===d.status?{content:[{type:"text",text:`Projection job ${l} failed: ${d.error??"unknown error"}`}]}:{content:[{type:"text",text:["Variable projection job submitted but did not complete within 30s.",`Projection job ID: ${l}`,"",`Poll with: get_job_status(job_id="${l}")`,`Retrieve with: get_results(job_id="${l}")`].join("\n")}]}}),_.tool("transition_flow",'Compute temporal transition flow for a trained SOM.\n\nShows how data points move between SOM nodes over time — revealing directional\npatterns, cycles, and state machine behavior in sequential data.\n\n**Requires time-ordered data.** Each row must represent a consecutive observation;\nthe transition from row i to row i+lag is counted. 
If rows are not time-ordered,\nresults will be meaningless.\n\nBest used for:\n- **Time-series dynamics**: how does the system state evolve step-by-step?\n- **Cyclic processes**: daily/weekly patterns, recurring operating modes\n- **Process monitoring**: identify common transition paths and bottlenecks\n- **Regime detection**: find absorbing states (nodes with self-loops) vs transient hubs\n\n**lag** controls the temporal horizon:\n- lag=1 (default): immediate next-step transitions — "where does the system go next?"\n- lag=N: transitions N steps apart — useful for periodic analysis (e.g. lag=24 for daily cycles in hourly data)\n- Try multiple lags to reveal different temporal scales.\n\n**min_transitions** filters noisy arrows — only transitions observed at least this many times are drawn. Increase for cleaner plots on large datasets.\n\n**top_k** controls how many top-flow nodes are reported in statistics.\n\nBEFORE calling, confirm:\n- Data is time-ordered (chronological row sequence)\n- The lag makes sense for the time resolution (e.g. lag=1 for hourly data = 1 hour ahead)\n\nAfter showing results, discuss:\n- Arrow direction and clustering patterns\n- Hub nodes (many transitions through them) vs absorbing nodes (self-loops)\n- Whether cyclic flow matches known periodic behavior',{job_id:a.string().describe("ID of the completed SOM training job"),lag:a.number().int().min(1).optional().default(1).describe("Step lag for transition pairs (default 1 = consecutive rows). Use larger values for periodic analysis (e.g. 24 for daily cycles in hourly data)."),min_transitions:a.number().int().min(1).optional().describe("Minimum transition count to draw an arrow (default: auto). Increase to filter noise on large datasets."),top_k:a.number().int().min(1).optional().default(10).describe("Number of top-flow nodes to include in statistics (default 10)."),colormap:a.string().optional().describe("Colormap for the U-matrix background (default: grays). 
Try viridis, plasma, or inferno for more contrast."),output_format:a.enum(["png","pdf","svg"]).optional().default("png").describe("Image output format for the flow plot (default: png)."),output_dpi:a.enum(["standard","retina","print"]).optional().default("retina").describe("Resolution: standard (1x), retina (2x), print (4x).")},async({job_id:e,lag:t,min_transitions:a,top_k:o,colormap:r,output_format:n,output_dpi:i})=>{const s={lag:t??1,output_format:n??"png"};void 0!==a&&(s.min_transitions=a),void 0!==o&&(s.top_k=o),void 0!==r&&(s.colormap=r),i&&"retina"!==i&&(s.output_dpi={standard:1,retina:2,print:4}[i]??2);const l=(await g("POST",`/v1/results/${e}/transition-flow`,s)).id,d=await b(l);if("completed"===d.status){const a=(await g("GET",`/v1/results/${l}`)).summary??{},o=a.flow_stats??{},r=[];r.push({type:"text",text:[`Transition Flow Results (job: ${l})`,`Parent SOM: ${e} | Lag: ${t??1} | Samples: ${a.n_samples??0}`,"","Flow Statistics:",` Active flow nodes: ${o.active_flow_nodes??"N/A"}`,` Total transitions: ${o.total_transitions??"N/A"}`,` Mean magnitude: ${void 0!==o.mean_magnitude?Number(o.mean_magnitude).toFixed(4):"N/A"}`].join("\n")});const i=n??"pdf";return await T(r,l,`transition_flow_lag${t??1}.${i}`),{content:r}}return"failed"===d.status?{content:[{type:"text",text:`Transition flow job ${l} failed: ${d.error??"unknown error"}`}]}:{content:[{type:"text",text:["Transition flow job submitted but did not complete within 30s.",`Flow job ID: ${l}`,`Parent job: ${e} | Lag: ${t??1}`,"",`Poll with: get_job_status(job_id="${l}")`,`Retrieve with: get_results(job_id="${l}")`].join("\n")}]}}),_.tool("derive_variable",'Create a derived variable from existing dataset columns using mathematical expressions.\n\nBEST FOR: Computing ratios, differences, log transforms, rolling statistics,\nor any combination of existing columns — either to enrich a dataset before\ntraining or to project a computed variable onto an existing SOM.\n\nTWO MODES:\n1. 
Add to dataset (default): computes the new column and appends it to the dataset CSV.\n The column is then available for future train_som calls via the \'columns\' parameter.\n2. Project onto SOM: computes the column from the training dataset and projects it\n onto a trained SOM, returning the visualization. Use this to explore how\n derived quantities distribute across the learned map structure.\n\nCOMMON FORMULAS:\n- Ratio: "revenue / cost"\n- Difference: "US10Y - US3M"\n- Log return: "log(close) - log(open)"\n- Z-score: "(volume - rolling_mean(volume, 20)) / rolling_std(volume, 20)"\n- Magnitude: "sqrt(x^2 + y^2)"\n- Unit convert: "temperature - 273.15"\n- First diff: "diff(consumption)"\n\nSUPPORTED FUNCTIONS:\n- Arithmetic: +, -, *, /, ^\n- Math: log, log1p, log10, exp, sqrt, abs, sign, clamp, min, max\n- Trig: sin, cos, tan, asin, acos, atan\n- Rolling: rolling_mean(col, window), rolling_std(col, window), rolling_min, rolling_max\n- Temporal: diff(col), diff(col, n)\n- Constants: pi, numeric literals\n\nWORKFLOW: Ask the user what domain-specific variables they care about.\nSuggest derived variables based on the column names. For example, if\nthe dataset has "revenue" and "cost", suggest "revenue - cost" as profit\nand "revenue / cost" as cost efficiency.\n\nEXPRESSION REFERENCE: In expressions, use underscore-normalized column names (e.g. fixed_acidity not "fixed acidity"). Column names with spaces/special chars are converted to underscores — use that form. Operators: +, -, *, /, ^. Functions: log, sqrt, rolling_mean(col, window), diff(col), etc.\n\nCOMMON MISTAKES:\n- Division by zero: if denominator column has zeros, use options.missing="skip"\n- Rolling functions produce NaN for the first (window-1) rows\n- diff() produces NaN for the first row\n- Spaces in column names: use underscores (e.g. 
fixed_acidity not "fixed acidity")',{dataset_id:a.string().describe("Dataset ID (source of column data)"),name:a.string().describe("Name for the derived variable (used in column header and visualization)"),expression:a.string().describe("Mathematical expression referencing column names. Examples: 'revenue / cost', 'log(price)', 'diff(temperature)', 'sqrt(x^2 + y^2)', 'rolling_mean(volume, 20)'"),project_onto_job:a.string().optional().describe("If provided, project the derived variable onto this SOM job instead of adding to dataset. The job must be a completed train_som job."),aggregation:a.enum(["mean","median","sum","min","max","std","count"]).optional().default("mean").describe("How to aggregate values per SOM node (only used when project_onto_job is set)"),options:a.object({missing:a.enum(["skip","zero","interpolate"]).optional().default("skip").describe("How to handle NaN/missing values in the result"),window:a.number().int().optional().describe("Default window size for rolling functions (default 20)"),description:a.string().optional().describe("Human-readable description of what this variable represents")}).optional().describe("Configuration for expression evaluation"),output_format:a.enum(["png","pdf","svg"]).optional().default("png").describe("Image format for projection visualization when project_onto_job is set (default: png)."),output_dpi:a.enum(["standard","retina","print"]).optional().default("retina").describe("Resolution for projection visualization"),colormap:a.string().optional().describe("Colormap for projection visualization (default: coolwarm). 
Examples: viridis, plasma, inferno, magma, cividis, turbo, coolwarm, RdBu, Spectral.")},async({dataset_id:e,name:t,expression:a,project_onto_job:o,aggregation:r,options:n,output_format:i,output_dpi:s,colormap:l})=>{const d={standard:1,retina:2,print:4};if(o){const u={name:t,expression:a,aggregation:r??"mean",output_format:i??"png"};e&&(u.dataset_id=e),n&&(u.options=n),s&&"retina"!==s&&(u.output_dpi=d[s]??2),l&&(u.colormap=l);const c=(await g("POST",`/v1/results/${o}/derive`,u)).id,p=await b(c);if("completed"===p.status){const e=(await g("GET",`/v1/results/${c}`)).summary??{},n=e.variable_stats??{},s=[];s.push({type:"text",text:[`Derived Variable Projected: ${t} — job: ${c}`,`Expression: ${a}`,`Parent SOM: ${o} | Aggregation: ${r??"mean"}`,"",`Statistics (per-node ${r??"mean"}):`,` Min: ${void 0!==n.min?Number(n.min).toFixed(3):"N/A"}`,` Max: ${void 0!==n.max?Number(n.max).toFixed(3):"N/A"}`,` Mean: ${void 0!==n.mean?Number(n.mean).toFixed(3):"N/A"}`,` Nodes with data: ${n.n_nodes_with_data??"N/A"}`,e.nan_count?` NaN values: ${e.nan_count}`:""].filter(e=>""!==e).join("\n")});const l=t.replace(/[^a-zA-Z0-9_]/g,"_"),d=e.output_format??i??"pdf";return await T(s,c,`projected_${l}.${d}`),{content:s}}return"failed"===p.status?{content:[{type:"text",text:`Derive+project job ${c} failed: ${p.error??"unknown error"}`}]}:{content:[{type:"text",text:`Derive job submitted. Poll: get_job_status("${c}")`}]}}{const o={name:t,expression:a};n&&(o.options=n);const r=(await g("POST",`/v1/datasets/${e}/derive`,o)).id,i=await b(r);if("completed"===i.status){const o=(await g("GET",`/v1/results/${r}`)).summary??{};return{content:[{type:"text",text:[`Derived column "${t}" added to dataset ${e}`,`Expression: ${a}`,`Rows: ${o.n_rows??"?"}`,o.nan_count?`NaN values: ${o.nan_count}`:"",`Min: ${o.min??"?"} | Max: ${o.max??"?"} | Mean: ${o.mean??"?"}`,"","The column is now available in the dataset. 
Include it in train_som","via the 'columns' parameter, or use datasets(action=preview) to verify."].filter(e=>""!==e).join("\n")}]}}return"failed"===i.status?{content:[{type:"text",text:`Derive variable job ${r} failed: ${i.error??"unknown error"}`}]}:{content:[{type:"text",text:`Derive job submitted. Poll: get_job_status("${r}")`}]}}}),_.tool("manage_account","Manage compute leases, check account status, and view billing history.\nYour default backend is your local Primary Server. Use this tool to temporarily upgrade to cloud compute for heavy jobs.\n\nActions:\n- request_compute: Provisions a cloud burst instance (requires tier and duration_minutes). Leave tier blank to list options.\n- compute_status: Checks if a burst lease is active and how much time remains.\n- release_compute: Manually terminates an active lease and reverts routing to the Primary Server.\n- compute_history: Views recent compute leases and credit spend.\n- add_funds: Instructions on how to add credits to your account.",{action:{type:"string",description:"One of: request_compute, compute_status, release_compute, compute_history, add_funds"},tier:{type:"string",description:"Compute tier ID (e.g., cpu-8, gpu-t4). 
Leave empty during request_compute to list tiers.",optional:!0},duration_minutes:{type:"number",description:"How long to lease the instance (default: 60)",optional:!0},limit:{type:"number",description:"Number of records to fetch for compute_history (default: 10)",optional:!0}},async({action:e,tier:t,duration_minutes:a,limit:o})=>{if("request_compute"===e){if(!t)return{content:[{type:"text",text:"Available Compute Tiers:\nCPU Tiers:\n cpu-8: 16 vCPUs, 32 GB RAM (~$0.20/hr)\n cpu-16: 32 vCPUs, 64 GB RAM (~$0.20/hr)\n cpu-24: 48 vCPUs, 96 GB RAM (~$0.28/hr)\n cpu-32: 64 vCPUs, 128 GB RAM (~$0.42/hr)\n cpu-48: 96 vCPUs, 192 GB RAM (~$0.49/hr)\nGPU Tiers:\n gpu-t4: 8 vCPUs, 32 GB, T4 16GB VRAM (~$0.22/hr)\n gpu-t4x: 16 vCPUs, 64 GB, T4 16GB VRAM (~$0.36/hr)\n gpu-t4xx: 32 vCPUs, 128 GB, T4 16GB VRAM (~$0.27/hr)\n gpu-l4: 8 vCPUs, 32 GB, L4 24GB VRAM (~$0.41/hr)\n gpu-l4x: 16 vCPUs, 64 GB, L4 24GB VRAM (~$0.37/hr)\n gpu-a10: 8 vCPUs, 32 GB, A10G 24GB VRAM (~$0.51/hr)\n gpu-a10x: 16 vCPUs, 64 GB, A10G 24GB VRAM (~$0.52/hr)"}]};const e=await g("POST","/v1/compute/lease",{tier:t,duration_minutes:a}),o="postpaid"===e.billing_mode?`Billing: Postpaid (usage logged, billed retrospectively)\nAccrued Balance: $${(e.credit_balance_cents/100).toFixed(2)}`:`Credits Remaining After Reserve: $${(e.credit_balance_cents/100).toFixed(2)}`;return{content:[{type:"text",text:`Compute Lease Requested:\nLease ID: ${e.lease_id}\nStatus: ${e.status}\nEstimated Wait: ${e.estimated_wait_minutes} minutes\nEstimated Cost: $${(e.estimated_cost_cents/100).toFixed(2)}\n${o}\n\nIMPORTANT: Cloud burst active. Data is pulled from shared Cloudflare R2, so you do NOT need to re-upload datasets. 
Just wait ~3 minutes and check status.`}]}}if("compute_status"===e){const e=await g("GET","/v1/compute/lease");return"none"!==e.status&&e.lease_id?{content:[{type:"text",text:`Active Compute Lease:\nLease ID: ${e.lease_id}\nStatus: ${e.status}\nTier: ${e.tier} (${e.instance_type})\nTime Remaining: ${Math.round(e.time_remaining_ms/6e4)} minutes`}]}:{content:[{type:"text",text:"No active lease -- running on default Primary Server."}]}}if("release_compute"===e){const e=await g("DELETE","/v1/compute/lease"),t="postpaid"===e.billing_mode?`Cost Logged: $${(e.cost_cents/100).toFixed(2)} (postpaid — billed retrospectively)`:`Credits Deducted: $${((e.credits_deducted||e.cost_cents)/100).toFixed(2)}`;return{content:[{type:"text",text:`Compute Released:\nDuration Billed: ${e.duration_minutes} minutes\n${t}\nBalance: $${(e.final_balance_cents/100).toFixed(2)}\n\nRouting reverted to default Primary Server.`}]}}if("compute_history"===e){const e=await g("GET",`/v1/compute/history?limit=${o||10}`),t=e.history.map(e=>`- ${e.started_at} | ${e.tier} | ${e.duration_minutes} min | $${(e.credits_charged/100).toFixed(2)}`).join("\n");return{content:[{type:"text",text:`Credit Balance: $${(e.credit_balance_cents/100).toFixed(2)}\n\nRecent Usage:\n${t}`}]}}return"add_funds"===e?{content:[{type:"text",text:"To add funds to your account, please visit the Barivia Billing Portal (integration pending) or ask your administrator to use the CLI tool:\nbash scripts/manage-credits.sh add <org_id> <amount_usd>"}]}:{content:[{type:"text",text:`Unknown action: ${e}. 
Valid actions are request_compute, compute_status, release_compute, compute_history, add_funds.`}]}}),_.tool("license_info","Get plan/license capabilities, backend info, live status, and training time estimates.\n\nReturns what the current API key is connected to: plan tier, compute class\n(CPU/GPU), usage limits, backend hardware details, job queue state, and\nestimated training times.\n\nUse this BEFORE submitting large jobs to:\n- See your plan's compute class and whether GPU is available\n- Check queue depth to decide whether to wait or proceed\n- Estimate wall-clock time based on the current topology",{},async()=>{const e=await g("GET","/v1/system/info"),t=e.plan??{},a=e.backend??{},o=e.status??{},r=e.training_time_estimates_seconds??{},n=e.worker_topology??{},i=t.gpu_enabled??e.gpu_available??!1?a.gpu_model?`GPU-accelerated (${a.gpu_model}${a.gpu_vram_gb?`, ${a.gpu_vram_gb} GB VRAM`:""})`:"GPU-accelerated":"CPU only",s=e=>-1===e||"-1"===e?"unlimited":String(e??"?"),l=await g("GET","/v1/compute/lease").catch(()=>null),d=await g("GET","/v1/compute/history?limit=30").catch(()=>null),u=[`Your Plan: ${String(t.tier??"unknown").charAt(0).toUpperCase()}${String(t.tier??"unknown").slice(1)}`,` Priority: ${t.priority??"normal"}`,` Concurrency: ${t.max_concurrent_jobs??"?"} simultaneous job${1!==t.max_concurrent_jobs?"s":""}`,` Datasets: ${s(t.max_datasets)} max, ${s(t.max_dataset_rows)} rows each`,` Monthly Jobs: ${s(t.max_monthly_jobs)}`,` Grid Size: ${s(t.max_grid_size)} max`,` Features: ${s(t.max_features)} max`];d&&u.push("",`Compute Credits: $${(d.credit_balance_cents/100).toFixed(2)} remaining`),u.push("","Default Backend:",` Label: ${a.label??"unknown"}`,` Compute: ${i}`),a.memory_gb&&u.push(` Memory: ${a.memory_gb} GB`),l&&l.lease_id&&u.push("","Active Compute Lease (Burst):",` Status: ${l.status}`,` Tier: ${l.tier} (${l.instance_type})`,` Time Left: ${Math.round(l.time_remaining_ms/6e4)} min`);const 
c=Number(o.running_jobs??e.running_jobs??0),p=Number(o.pending_jobs??e.pending_jobs??0),m=Number(o.queue_depth??e.queue_depth??0),h=o.free_memory_gb??e.free_memory_gb,f=(p+1)*Number(e.training_time_estimates_seconds?.total||0);return u.push("","Live Status:",` Running Jobs: ${c}`,` Pending Jobs: ${p}`,` Queue Depth: ${m}`,` Current Wait: ~${Math.round(f/60)} minutes (before your job starts)`),void 0!==h&&u.push(` Free Memory: ${h} GB`),void 0!==n.num_workers&&u.push("","Worker Topology:",` Workers: ${n.num_workers} × ${n.threads_per_worker} threads`,` Total Thread Budget: ${n.total_thread_budget}`),Object.keys(r).length>0&&(u.push("","Estimated Training Times (seconds):",...Object.entries(r).filter(([e])=>"formula"!==e).map(([e,t])=>` ${e}: ~${t}s`)),r.formula&&u.push(` ${r.formula}`)),{content:[{type:"text",text:u.join("\n")}]}}),_.prompt("info","Overview of the Barivia SOM MCP: capabilities, workflow, tools, analysis types, and tips. Use when the user asks what this MCP can do, how to get started, or what the process is.",{},()=>({messages:[{role:"user",content:{type:"text",text:["Inform the user using this overview:","","**What it is:** Barivia MCP connects you to a Self-Organizing Map (SOM) analytics engine. SOMs learn a 2D map from high-dimensional data for visualization, clustering, pattern discovery, and temporal analysis.","","**Core workflow:**","1. **Upload** — `datasets(action=upload)` with a CSV file path or inline data","2. **Preview** — `datasets(action=preview)` to inspect columns, stats, and detect cyclic/datetime fields","3. **Prepare** — use the `prepare_training` prompt for a guided checklist (column selection, transforms, cyclic encoding, feature weights, grid sizing)","4. **Train** — `train_som` with grid size, epochs, model type, and preprocessing options. Use `preset=quick|standard|refined|high_res` for sensible defaults","5. **Monitor** — `get_job_status` to track progress; `get_results` to retrieve figures when complete","6. 
**Analyze** — `analyze` with various analysis types (see below)","7. **Feedback** — Ask the user if they'd like to submit feedback via `send_feedback` based on their experience.","8. **Iterate** — `recolor_som` to change colormap without retraining, `compare_runs` to compare hyperparameters, `project_variable` to overlay new variables","","**Analysis types** (via `analyze`):","- `u_matrix` — cluster boundary distances","- `component_planes` — per-feature heatmaps","- `clusters` — automatic cluster detection and statistics","- `quality_report` — QE, TE, explained variance, trustworthiness, neighborhood preservation","- `hit_histogram` — data density across the map","- `transition_flow` — temporal flow patterns (requires time-ordered data)","","**Data tools:**","- `datasets(action=subset)` — filter by row range, value thresholds (gt/lt/gte/lte), equality, or set membership. Combine row_range + filter","- `derive_variable` — create computed columns from expressions (ratios, differences, etc.)","- `project_variable` — overlay any variable onto a trained SOM","","**Output options:** Format (png/pdf/svg) and colormap (coolwarm, viridis, plasma, inferno, etc.) can be set at training or changed later via recolor_som.","","**Key tools:** datasets, list, train_som, get_job_status, get_results, analyze, recolor_som, download_results, project_variable, transition_flow, compare_runs, derive_variable, license_info, explore_som.","","**Tips:**","- Always `preview` before training to understand your data","- Use `license_info` to check GPU availability and plan limits before large jobs","- Start with `preset=quick` for fast iteration, then `refined` for publication quality","- For time-series data, consider `transition_flow` after training","","Keep the reply scannable with headers and bullet points."].join("\n")}}]})),_.prompt("prepare_training","Guided pre-training checklist. Use after uploading a dataset and before calling train_som. 
Walks through data inspection, column selection, transforms, cyclic/temporal features, weighting, subsetting, and grid sizing.",{dataset_id:a.string().describe("Dataset ID to prepare for training")},async({dataset_id:e})=>{let t=`Please run datasets(action="preview", dataset_id="${e}") first, then call train_som with appropriate parameters or preset.`;try{const a=await g("GET",`/v1/docs/prepare_training?dataset_id=${e}`);a.prompt&&(t=a.prompt)}catch(e){}return{messages:[{role:"user",content:{type:"text",text:t}}]}}),_.tool("predict",'Score new data against a trained SOM — returns BMU coordinates, cluster assignment, and quantization error per row.\n\nBEST FOR: Anomaly detection (high QE = outlier), scoring new records against a known cluster structure, real-time monitoring, recommendation ("find nearest node"), customer segmentation on new observations. Train once, use forever.\nNOT FOR: Retraining — this never changes the SOM weights. Not for datasets with different columns than the training data.\n\nTIMING: 5–60s depending on dataset size. Auto-polls for up to 2 minutes then returns a job_id if still running.\n\nINPUT: Provide either dataset_id (upload a CSV first) or rows (inline JSON dicts, max 500 rows). Column names must exactly match the trained feature set — use datasets(action=preview) on the training dataset to verify names.\n\nOUTPUT: predictions.csv with columns: row_id, bmu_x, bmu_y, bmu_node_index, cluster_id, quantization_error.\nHigh QE = the row is far from its nearest prototype = potential anomaly or out-of-distribution sample.\n\nAFTER predict: Use enrich_dataset to instead annotate the original training CSV. 
Use compare_datasets to compare population-level hit distributions.\n\nESCALATION:\n- Missing column error: Use datasets(action=preview) on your input dataset to verify column names match the SOM training features exactly.\n- All QE values very high: Your new data may be from a different distribution than training data — expected for drift detection use cases.',{job_id:a.string().describe("Job ID of a completed SOM training job"),dataset_id:a.string().optional().describe("ID of an existing uploaded dataset to score"),rows:a.array(a.record(a.string(),a.number())).optional().describe("Inline feature rows to score (max 500). Each object maps feature name → numeric value")},async({job_id:e,dataset_id:t,rows:a})=>{if(!t&&!a)throw new Error("Either dataset_id or rows must be provided.");const o={};t&&(o.dataset_id=t),a&&(o.rows=a);const r=(await g("POST",`/v1/results/${e}/predict`,o)).id,n=await b(r,12e4);if("completed"===n.status){const e=await g("GET",`/v1/results/${r}`),t=e.summary??{},a=e.download_urls??{};return{content:[{type:"text",text:[`Predictions complete — job: ${r}`,`Rows scored: ${t.n_rows??"?"}`,`Mean quantization error: ${void 0!==t.mean_qe?Number(t.mean_qe).toFixed(4):"N/A"}`,`Max quantization error: ${void 0!==t.max_qe?Number(t.max_qe).toFixed(4):"N/A"}`,`Clusters found: ${Object.keys(t.cluster_counts??{}).length}`,"","Files: predictions.csv (one row per input, columns: row_id, bmu_x, bmu_y, bmu_node_index, cluster_id, quantization_error)",a["predictions.csv"]?`Download CSV: ${a["predictions.csv"]}`:""].filter(Boolean).join("\n")}]}}return"failed"===n.status?{content:[{type:"text",text:`Predict job ${r} failed: ${n.error??"unknown error"}`}]}:{content:[{type:"text",text:`Predict job submitted: ${r}. 
Poll get_job_status("${r}").`}]}}),_.tool("enrich_dataset",'Append bmu_x, bmu_y, bmu_node_index, and cluster_id columns to the original training CSV.\n\nBEST FOR: Feature engineering — the enriched CSV flows directly into pandas, SQL, BI tools, or any downstream model. This is the "bridge between interesting map and actionable segmentation." Use after training when you want to take results outside the MCP environment.\nNOT FOR: Scoring new/unseen data — enrich_dataset only annotates the training data (BMUs are pre-computed at training time). Use predict for new observations.\n\nTIMING: 5–30s (pure join operation — no re-inference). Auto-polls for up to 60s.\n\nOUTPUT: enriched.csv — original CSV columns plus: bmu_x, bmu_y, bmu_node_index, cluster_id.\nDownload URL is in the response. The file expires in 15 minutes; download promptly.\n\nAFTER enrich_dataset: Load the CSV into a dataframe, group by cluster_id to profile segments, or use bmu_x/bmu_y for spatial joins with the SOM map.',{job_id:a.string().describe("Job ID of a completed SOM training job")},async({job_id:e})=>{const t=(await g("POST",`/v1/results/${e}/enrich_dataset`,{})).id,a=await b(t,6e4);if("completed"===a.status){const e=await g("GET",`/v1/results/${t}`),a=e.summary??{},o=e.download_urls??{};return{content:[{type:"text",text:[`Enriched dataset ready — job: ${t}`,`Rows: ${a.n_rows??"?"} Clusters: ${a.n_clusters??"?"}`,"Appended columns: bmu_x, bmu_y, bmu_node_index, cluster_id",o["enriched.csv"]?`Download: ${o["enriched.csv"]}`:""].filter(Boolean).join("\n")}]}}return"failed"===a.status?{content:[{type:"text",text:`Enrich dataset job ${t} failed: ${a.error??"unknown error"}`}]}:{content:[{type:"text",text:`Enrich job submitted: ${t}. 
Poll get_job_status("${t}").`}]}}),_.tool("compare_datasets",'Project a second dataset onto a trained SOM and compare hit distributions — returns a density-diff heatmap and top gained/lost nodes.\n\nBEST FOR: Temporal drift detection (2020 data vs 2024 data on the same map), A/B analysis, cohort comparison, process monitoring, before/after comparisons. The map topology stays fixed — only dataset B\'s BMU assignments are computed and compared against the training distribution.\nNOT FOR: Comparing datasets with different features/columns than the training data. Not a replacement for retraining — the map is frozen.\n\nTIMING: 30–120s depending on dataset B size. Auto-polls for up to 2 minutes.\n\nINTERPRETATION:\n- Positive diff nodes (blue/red depending on colormap): Dataset B has higher density here — these regions "gained" data points\n- Negative diff nodes: Dataset A had more density — these regions "lost" data points \n- Symmetric diff near zero: Similar distributions — minimal drift\n\nAFTER compare_datasets: Use predict on dataset B to get per-row BMU assignments for further analysis.\n\nESCALATION:\n- Missing column error: Dataset B must have the same feature columns as the training data. 
Use datasets(action=preview) to verify.',{job_id:a.string().describe("Job ID of a completed SOM training job (dataset A)"),dataset_id:a.string().describe("ID of dataset B to compare against the training data"),colormap:a.string().optional().describe("Colormap for diff heatmap (default: balance — diverging blue/red)"),output_format:a.enum(["png","pdf","svg"]).optional().default("png"),output_dpi:a.number().int().min(1).max(4).optional().default(2),top_n:a.number().int().min(1).max(50).optional().default(10).describe("Number of top gained/lost nodes to report")},async({job_id:e,dataset_id:t,colormap:a,output_format:o,output_dpi:r,top_n:n})=>{const i={dataset_id:t,output_format:o,output_dpi:r,top_n:n};a&&(i.colormap=a);const s=(await g("POST",`/v1/results/${e}/compare_datasets`,i)).id,l=await b(s,12e4);if("completed"===l.status){const e=(await g("GET",`/v1/results/${s}`)).summary??{},t=e.top_gained_nodes??[],a=e.top_lost_nodes??[],r=e=>` node ${e.bmu_node_index??"?"} [${(e.coords??[0,0]).map(e=>Number(e).toFixed(1)).join(",")}] Δ=${Number(e.density_diff??0).toFixed(4)}`,n=[{type:"text",text:[`Dataset comparison complete — job: ${s}`,`Dataset A rows: ${e.n_rows_a??"?"} | Dataset B rows: ${e.n_rows_b??"?"}`,"","Top gained nodes (B > A):",...t.slice(0,5).map(r),"","Top lost nodes (A > B):",...a.slice(0,5).map(r)].join("\n")}],i=e.output_format??o??"png";return await T(n,s,`density_diff.${i}`),{content:n}}return"failed"===l.status?{content:[{type:"text",text:`Compare datasets job ${s} failed: ${l.error??"unknown error"}`}]}:{content:[{type:"text",text:`Compare job submitted: ${s}. Poll get_job_status("${s}").`}]}}),_.tool("generate_report","Generate a comprehensive PDF report from a completed SOM training result in a single call.\n\nBEST FOR: Producing a shareable deliverable for stakeholders who don't use the MCP. Replaces 8+ separate tool calls with one. 
Use at the end of an analysis session when the user wants to share or document results.\nNOT FOR: Iterative exploration — generate_report is a final step, not a diagnostic tool. Use analyze + get_results for interactive investigation.\n\nTIMING: 30–120s (downloads existing figures, assembles layout, renders PDF). Auto-polls for up to 3 minutes.\n\nREPORT CONTENTS: Quality metrics page, combined view + U-matrix, component planes grid, cluster summary table with top discriminating features per cluster, learning curve.\n\nOUTPUT: report.pdf — not inlineable in MCP. The response will provide a download URL. Download expires in 15 minutes.\n\nAFTER generate_report: Share the download URL with the user. Remind them to download promptly before the presigned URL expires.\n\nESCALATION:\n- If job fails: Check that the parent training job completed successfully with get_job_status. The parent result_ref must contain summary.json and the figure files.",{job_id:a.string().describe("Job ID of a completed SOM training job")},async({job_id:e})=>{const t=(await g("POST",`/v1/results/${e}/generate_report`,{})).id,a=await b(t,18e4);if("completed"===a.status){const e=await g("GET",`/v1/results/${t}`),a=e.summary??{},o=e.download_urls??{},r=[];return await T(r,t,"report.pdf"),r.push({type:"text",text:[`Report generated — job: ${t}`,`Clusters: ${a.n_clusters??"?"} Features: ${a.n_features??"?"}`,o["report.pdf"]?`Download PDF: ${o["report.pdf"]}`:""].filter(Boolean).join("\n")}),{content:r}}return"failed"===a.status?{content:[{type:"text",text:`Report job ${t} failed: ${a.error??"unknown error"}`}]}:{content:[{type:"text",text:`Report job submitted: ${t}. Poll get_job_status("${t}").`}]}});const O=new t;_.tool("send_feedback","Send brief feedback or feature requests to Barivia developers (max 190 words). Use when the user has suggestions, ran into issues, or wants something improved. 
Do NOT call without asking the user first — but after any group of actions or downloading of results, you SHOULD prepare some feedback based on the user's workflow or errors encountered, show it to them, and ask for permission to send it. Once they accept, call this tool.",{feedback:a.string().max(1330).describe("Feedback text")},async({feedback:e})=>f(await g("POST","/v1/feedback",{feedback:e}))),async function(){await _.connect(O)}().catch(console.error);
import{McpServer as e}from"@modelcontextprotocol/sdk/server/mcp.js";import{StdioServerTransport as t}from"@modelcontextprotocol/sdk/server/stdio.js";import{z as o}from"zod";import{registerAppResource as n,registerAppTool as a,RESOURCE_MIME_TYPE as r}from"@modelcontextprotocol/ext-apps/server";import i from"node:fs/promises";import s from"node:path";const l=process.env.BARIVIA_API_URL??process.env.BARSOM_API_URL??"https://api.barivia.se",c=process.env.BARIVIA_API_KEY??process.env.BARSOM_API_KEY??"";c||(console.error("Error: BARIVIA_API_KEY not set. Set it in your MCP client config."),process.exit(1));const u=parseInt(process.env.BARIVIA_FETCH_TIMEOUT_MS??"30000",10),d=new Set([502,503,504]);function p(e,t){return!(void 0===t||!d.has(t))||(e instanceof DOMException&&"AbortError"===e.name||e instanceof TypeError)}async function m(e,t,o=u){const n=new AbortController,a=setTimeout(()=>n.abort(),o);try{return await fetch(e,{...t,signal:n.signal})}finally{clearTimeout(a)}}async function f(e,t,o,n){const a=`${l}${t}`,r=n?.["Content-Type"]??"application/json",i=Math.random().toString(36).slice(2,10),s={Authorization:`Bearer ${c}`,"Content-Type":r,"X-Request-ID":i,...n};let u,d;void 0!==o&&(u="application/json"===r?JSON.stringify(o):String(o));for(let t=0;t<=2;t++)try{const o=await m(a,{method:e,headers:s,body:u}),n=await o.text();if(!o.ok){if(t<2&&p(null,o.status)){await new Promise(e=>setTimeout(e,1e3*2**t));continue}const e=(()=>{try{return JSON.parse(n)}catch{return null}})(),a=e?.error??n,r=400===o.status?" Check parameter types and required fields.":404===o.status?" The resource may not exist or may have been deleted.":409===o.status?" The job may not be in the expected state.":429===o.status?" 
Rate limit exceeded — wait a moment and retry.":"",i=e?.error_code?` (error_code: ${e.error_code})`:"";throw new Error(`${a}${i}${r}`)}return JSON.parse(n)}catch(e){if(d=e,t<2&&p(e)){await new Promise(e=>setTimeout(e,1e3*2**t));continue}throw e}throw d}async function g(e){const t=`${l}${e}`;let o;for(let n=0;n<=2;n++)try{const o=await m(t,{method:"GET",headers:{Authorization:`Bearer ${c}`}});if(!o.ok){if(n<2&&p(null,o.status)){await new Promise(e=>setTimeout(e,1e3*2**n));continue}throw new Error(`API GET ${e} returned ${o.status}`)}const a=await o.arrayBuffer();return{data:Buffer.from(a),contentType:o.headers.get("content-type")??"application/octet-stream"}}catch(e){if(o=e,n<2&&p(e)){await new Promise(e=>setTimeout(e,1e3*2**n));continue}throw e}throw o}function _(e){return{content:[{type:"text",text:JSON.stringify(e,null,2)}]}}async function b(e,t=3e4,o=1e3){const n=Date.now();for(;Date.now()-n<t;){const t=await f("GET",`/v1/jobs/${e}`),n=t.status;if("completed"===n||"failed"===n||"cancelled"===n)return{status:n,result_ref:t.result_ref,error:t.error};await new Promise(e=>setTimeout(e,o))}return{status:"timeout"}}const h=new e({name:"analytics-engine",version:"0.4.1",instructions:"# Barivia Mapping Analytics Engine\n\nYou have access to a mapping (Self-Organizing Map) analytics platform that projects high-dimensional data onto a 2D grid, revealing clusters, gradients, and anomalies.\n\n## Typical workflow\n\n1. **Upload** → `datasets(action=upload)` — ingest a CSV\n2. **Preview** → `datasets(action=preview)` — inspect columns, detect cyclics/datetimes\n3. **Train** → `jobs(action=train_map, dataset_id=...)` — returns a job_id; poll `jobs(action=status, job_id=...)` until completed\n4. **Analyze** → `results(action=get)` + `analyze` — visualize and interpret the map\n5. 
**Export / Inference** → `inference` (predict/enrich/compare/report)\n\n## Tool categories\n\n| Category | Tools |\n|----------|-------|\n| Data management | `datasets` (upload/preview/list/subset/delete) |\n| Jobs & training | `jobs` (train_map/status/list/compare/cancel/delete) |\n| Results | `results` (get/recolor/download/export/transition_flow), `analyze` |\n| Projection | `project` (expression/values) |\n| Inference & export | `inference` (predict/enrich/compare/report) |\n| Account | `account` (status/request_compute/compute_status/release_compute/history/add_funds) |\n| Utility | `guide_barsom_workflow`, `explore_map`, `send_feedback` |\n\n## Async job pattern\n\nMost operations are async. Every tool that submits a job either:\n- **Auto-polls** (project, results(recolor/transition_flow), inference) — waits up to the action-specific timeout then returns or gives a job_id for manual polling\n- **Returns immediately** (jobs(action=train_map)) — always requires manual `jobs(action=status, job_id=...)` polling\n\n**Do not tell the user a job failed because it is still running.** If a tool returns a job_id, poll `jobs(action=status)` every 10–15 seconds. Map training takes 30s–10min depending on grid size and dataset.\n\n## Credit and cost\n\nJobs consume compute credits. Inference jobs are priced the same as projection jobs. 
Check `account(action=status)` to see the remaining balance and queue depth before starting large jobs.\n\n## Key constraints\n\n- Column names are case-sensitive; always match exactly what `datasets(action=preview)` returns\n- Numeric columns only (maps do not support text/categorical directly — encode first)\n- `predict` input columns must exactly match the features used during training"}),w=import.meta.dirname??s.dirname(new URL(import.meta.url).pathname);async function y(e){const t=[s.join(w,"views","src","views",e,"index.html"),s.join(w,"views",e,"index.html"),s.join(w,"..","dist","views","src","views",e,"index.html")];for(const e of t)try{return await i.readFile(e,"utf-8")}catch{continue}return null}const v="ui://barsom/map-explorer",$="ui://barsom/data-preview",x="ui://barsom/training-monitor";function j(e,t,o,n){const a=t.output_format??"pdf";if("transition_flow"===e){return[`transition_flow_lag${t.lag??1}.${a}`]}if("project_variable"===e){const e=t.variable_name??"variable";return[`projected_${String(e).replace(/[^a-zA-Z0-9_]/g,"_")}.${a}`]}if("derive_variable"===e){const e=t.variable_name??"variable";return[`projected_${String(e).replace(/[^a-zA-Z0-9_]/g,"_")}.${a}`]}const r=t.features??[],i=`combined.${a}`,s=`umatrix.${a}`,l=`hit_histogram.${a}`,c=`correlation.${a}`,u=r.map((e,t)=>`component_${t+1}_${e.replace(/[^a-zA-Z0-9_]/g,"_")}.${a}`),d=[i,s,l,c,...u];if(void 0===o||"default"===o)return n?d:[i];if("combined_only"===o)return[i];if("all"===o)return d;if(Array.isArray(o)){const e={combined:i,umatrix:s,hit_histogram:l,correlation:c};return r.forEach((t,o)=>{e[`component_${o+1}`]=u[o]}),o.map(t=>{const o=t.trim().toLowerCase();return e[o]?e[o]:t.includes(".")?t:null}).filter(e=>null!=e)}return[i]}function E(e){return e.endsWith(".pdf")?"application/pdf":e.endsWith(".svg")?"image/svg+xml":"image/png"}async function S(e,t,o){if(o.endsWith(".pdf")||o.endsWith(".svg"))e.push({type:"text",text:`${o} is ready (vector format — not inlineable). 
Use get_result_image(job_id="${t}", filename="${o}") to download it.`});else try{const{data:n}=await g(`/v1/results/${t}/image/${o}`);e.push({type:"image",data:n.toString("base64"),mimeType:E(o),annotations:{audience:["user"],priority:.8}})}catch{e.push({type:"text",text:`(${o} not available for inline display)`})}}n(h,v,v,{mimeType:r},async()=>{const e=await y("map-explorer");return{contents:[{uri:v,mimeType:r,text:e??"<html><body>Map Explorer view not built yet. Run: npm run build:views</body></html>"}]}}),n(h,$,$,{mimeType:r},async()=>{const e=await y("data-preview");return{contents:[{uri:$,mimeType:r,text:e??"<html><body>Data Preview view not built yet.</body></html>"}]}}),n(h,x,x,{mimeType:r},async()=>{const e=await y("training-monitor");return{contents:[{uri:x,mimeType:r,text:e??"<html><body>Training Monitor view not built yet.</body></html>"}]}}),a(h,"explore_map",{title:"Explore Map",description:"Interactive map explorer dashboard. Opens an inline visualization where you can toggle features, click nodes, and export figures. Use this after results(action=get) for a richer, interactive exploration experience. 
Falls back to text+image on hosts that don't support MCP Apps.",inputSchema:{job_id:o.string().describe("Job ID of a completed map training job")},_meta:{ui:{resourceUri:v}}},async({job_id:e})=>{const t=await f("GET",`/v1/results/${e}`),o=t.summary??{},n=[];n.push({type:"text",text:JSON.stringify({job_id:e,summary:o,download_urls:t.download_urls})});const a=o.output_format??"pdf";return await S(n,e,`combined.${a}`),{content:n}}),h.tool("guide_barsom_workflow","Retrieve the Standard Operating Procedure (SOP) for the mapping analysis pipeline.\nALWAYS call this tool first if you are unsure of the steps to execute a complete mapping analysis.\nThe workflow explains the exact sequence of tool calls needed: Upload → Preprocess → Train → Wait → Analyze.",{},async()=>({content:[{type:"text",text:"Mapping Analysis Standard Operating Procedure (SOP)\n\nStep 1: Upload Data\n- Use `datasets(action=upload)` with a local `file_path` to your CSV.\n- BEFORE UPLOADING: Clean the dataset to remove NaNs or malformed data.\n- Capture the `dataset_id` returned.\n\nStep 2: Preview & Preprocess\n- Use `datasets(action=preview)` to inspect columns, ranges, and types.\n- Check for skewed columns requiring 'log' or 'sqrt' transforms.\n- Check for cyclical or temporal features (hours, days) requiring `cyclic_features` or `temporal_features` during training.\n\nStep 3: Train the map\n- Call `jobs(action=train_map, dataset_id=...)` with the `dataset_id`.\n- Carefully select columns to include (start with 5-10).\n- Assign `feature_weights` (especially for categorical data with natural hierarchies).\n- Wait for the returned `job_id`.\n\nStep 4: Wait for Completion (ASYNC POLLING)\n- Use `jobs(action=status, job_id=...)` every 10-15 seconds.\n- Wait until status is \"completed\". 
DO NOT assume failure before 3 minutes (or longer for large grids).\n- If it fails, read the error message and adjust parameters (e.g., reduce grid size, fix column names).\n\nStep 5: Analyze and Export\n- Once completed, use `analyze(type=component_planes)` or `analyze(type=clusters)` to interpret the results.\n- Call `get_results` to get the final metrics (Quantization Error, Topographic Error)."}]})),h.tool("datasets",'Manage datasets: upload, preview, list, subset, or delete.\n\n| Action | Use when |\n|--------|----------|\n| upload | You have a CSV file to add — do this first |\n| preview | Before jobs(action=train_map) — always preview an unfamiliar dataset to spot cyclics, nulls, column types |\n| list | Finding dataset IDs for train_map, preview, or subset — see all available datasets |\n| subset | Creating a filtered/sliced view without re-uploading the full CSV |\n| delete | Cleaning up after experiments or freeing the dataset slot |\n\naction=upload: Prefer file_path over csv_data so the MCP reads the file directly. Returns dataset ID. Then use datasets(action=preview) before jobs(action=train_map).\nBEFORE UPLOADING: Ensure data has no NaNs, missing values, or formats that can\'t be handled. Categorical features should be numerically encoded or weighted.\naction=preview: Show columns, stats, sample rows, cyclic/datetime detections. ALWAYS preview before jobs(action=train_map) on an unfamiliar dataset.\naction=list: List all datasets belonging to the organisation with IDs, names, row/col counts.\naction=subset: Create a new dataset from a subset of an existing one. Requires name and at least one of row_range or filters.\n - row_range: [start, end] 1-based inclusive (e.g. [1, 2000] for first 2000 rows)\n - filters: array of conditions, ALL must match (AND logic). 
Each: { column, op, value }.\n Operators: eq, ne, in, gt, lt, gte, lte, between\n Examples: { column: "region", op: "eq", value: "Europe" } | { column: "age", op: "between", value: [18, 65] }\n - Combine row_range + filters to slice both rows and values.\n - Single filter object is also accepted (auto-wrapped).\naction=delete: Remove a dataset and all S3 data permanently.\n\nBEST FOR: Tabular numeric data. CSV with header required.\nNOT FOR: Real-time data streams or binary files — upload a snapshot CSV instead.\nESCALATION: If upload fails with column errors, open the file locally and verify the header row. If preview shows unexpected nulls, the user must clean the CSV before training.',{action:o.enum(["upload","preview","list","subset","delete"]).describe("upload: add CSV; preview: inspect columns/stats; list: see all datasets; subset: create filtered subset; delete: remove dataset"),name:o.string().optional().describe("Dataset name (required for action=upload and subset)"),file_path:o.string().optional().describe("Path to local CSV (for upload; prefer over csv_data)"),csv_data:o.string().optional().describe("Inline CSV string (for upload; use for small data)"),dataset_id:o.string().optional().describe("Dataset ID (required for preview, subset, and delete)"),n_rows:o.number().int().optional().default(5).describe("Sample rows to return (preview only)"),row_range:o.tuple([o.number().int(),o.number().int()]).optional().describe("For subset: [start, end] 1-based inclusive row range (e.g. [1, 2000])"),filters:o.preprocess(e=>null==e||Array.isArray(e)?e:"object"==typeof e&&null!==e&&"column"in e?[e]:e,o.array(o.object({column:o.string(),op:o.enum(["eq","ne","in","gt","lt","gte","lte","between"]),value:o.union([o.string(),o.number(),o.array(o.union([o.string(),o.number()]))])})).optional().describe("For subset: filter conditions (AND logic). Single object or array. ops: eq, ne, in, gt, lt, gte, lte, between. 
Examples: { column: 'temp', op: 'between', value: [15, 30] }, { column: 'region', op: 'eq', value: 'Europe' }")),filter:o.object({column:o.string(),op:o.enum(["eq","ne","in","gt","lt","gte","lte","between"]),value:o.union([o.string(),o.number(),o.array(o.union([o.string(),o.number()]))])}).optional().describe("Deprecated — use filters instead. Single filter condition.")},async({action:e,name:t,file_path:o,csv_data:n,dataset_id:a,n_rows:r,row_range:l,filters:c,filter:u})=>{if("upload"===e){if(!t)throw new Error("datasets(upload) requires name");let e;if(o){const t=s.resolve(o);try{e=await i.readFile(t,"utf-8")}catch(e){const o=e instanceof Error?e.message:String(e);throw new Error(`Cannot read file "${t}": ${o}`)}}else{if(!(n&&n.length>0))throw new Error("datasets(upload) requires file_path or csv_data");e=n}return _(await f("POST","/v1/datasets",e,{"X-Dataset-Name":t,"Content-Type":"text/csv"}))}if("preview"===e){if(!a)throw new Error("datasets(preview) requires dataset_id");const e=await f("GET",`/v1/datasets/${a}/preview?n_rows=${r??5}`),t=e.columns??[],o=e.column_stats??[],n=e.cyclic_hints??[],i=e.sample_rows??[],s=e.datetime_columns??[],l=e.temporal_suggestions??[],c=e=>null==e?"—":Number(e).toFixed(3),u=[`Dataset: ${e.name} (${e.dataset_id})`,`${e.total_rows} rows × ${e.total_cols} columns`,"","Column Statistics:","| Column | Min | Max | Mean | Std | Nulls | Numeric |","|--------|-----|-----|------|-----|-------|---------|"];for(const e of o)u.push(`| ${e.column} | ${c(e.min)} | ${c(e.max)} | ${c(e.mean)} | ${c(e.std)} | ${e.null_count??0} | ${!1!==e.is_numeric?"yes":"no"} |`);if(n.length>0){u.push("","Detected Cyclic Feature Hints:");for(const e of n)u.push(` • ${e.column} — period=${e.period} (${e.reason})`)}if(s.length>0){u.push("","Detected Datetime Columns:");for(const e of s){const t=(e.detected_formats??[]).map(e=>`${e.format} — ${e.description} (${(100*e.match_rate).toFixed(0)}% match)`).join("; ");u.push(` • ${e.column}: sample="${e.sample}" → 
${t}`)}}if(l.length>0){u.push("","Temporal Feature Suggestions (require user approval):");for(const e of l)u.push(` • Columns: ${e.columns.join(" + ")} → format: "${e.format}"`),u.push(` Available components: ${e.available_components.join(", ")}`)}if(i.length>0){u.push("",`Sample Rows (first ${i.length}):`),u.push(`| ${t.join(" | ")} |`),u.push(`| ${t.map(()=>"---").join(" | ")} |`);for(const e of i)u.push(`| ${t.map(t=>String(e[t]??"")).join(" | ")} |`)}return{content:[{type:"text",text:u.join("\n")}]}}if("subset"===e){if(!a)throw new Error("datasets(subset) requires dataset_id");if(!t)throw new Error("datasets(subset) requires name");const e=c??(u?[u]:void 0);if(void 0===l&&void 0===e)throw new Error("datasets(subset) requires at least one of row_range or filters");const o={name:t};void 0!==l&&(o.row_range=l),void 0!==e&&(o.filters=e);return _(await f("POST",`/v1/datasets/${a}/subset`,o))}if("list"===e){return _(await f("GET","/v1/datasets"))}if("delete"===e){if(!a)throw new Error("datasets(delete) requires dataset_id");return _(await f("DELETE",`/v1/datasets/${a}`))}throw new Error("Invalid action")}),h.tool("jobs",'Manage and inspect jobs.\n\n| Action | Use when |\n|--------|----------|\n| status | Polling after any async job submission — call every 10–15s |\n| list | Finding job IDs, checking what is pending/completed, reviewing hyperparameters |\n| compare | Picking the best training run from a set of completed jobs |\n| train_map | Submitting a new map training job — returns job_id for polling |\n| cancel | Stopping a running or pending job to free the worker |\n| delete | Permanently removing a job and all its S3 result files |\n\nASYNC POLLING PROTOCOL (action=status):\n- Poll every 10-15 seconds. 
Do NOT poll faster — it wastes context.\n- For large grids (40×40+), do not assume failure before 3 minutes on CPU.\n- Wait for status "completed" before calling results(action=get).\n- Map training typical times: 10×10 ~30s | 20×20 ~3–5 min | 40×40 ~15–30 min.\n\nESCALATION (action=status):\n- completed → call results(action=get) to retrieve the map and metrics\n- failed → extract the error message:\n - memory/allocation error: reduce batch_size or grid size and retrain\n - column missing: verify with datasets(action=preview)\n - NaN error: user must clean the dataset\n\naction=train_map: Submits a training job. Returns job_id — poll with jobs(action=status, job_id=...).\naction=compare: Returns a metrics table (QE, TE, explained variance, silhouette) for 2+ jobs.\naction=cancel: Not instant — worker checks between phases. Expect up to 30s delay.\naction=delete: WARNING — job ID will no longer work with results or any other tool.',{action:o.enum(["status","list","compare","cancel","delete","train_map"]).describe("status: check progress; list: see all jobs; compare: metrics table; cancel: stop job; delete: remove job + files; train_map: submit new training job"),job_id:o.string().optional().describe("Job ID — required for action=status, cancel, delete"),job_ids:o.array(o.string()).optional().describe("Array of job IDs — required for action=compare (minimum 2)"),dataset_id:o.string().optional().describe("Required for action=train_map. 
For action=list, filter jobs by this dataset ID."),preset:o.enum(["quick","standard","refined","high_res"]).optional(),grid_x:o.number().int().optional(),grid_y:o.number().int().optional(),epochs:o.preprocess(e=>{if(null==e)return e;if("string"==typeof e){const t=parseInt(e,10);if(!Number.isNaN(t))return t;const o=e.match(/^\[\s*(\d+)\s*,\s*(\d+)\s*\]$/);if(o)return[parseInt(o[1],10),parseInt(o[2],10)]}return e},o.union([o.number().int(),o.array(o.number().int()).length(2)]).optional()),model:o.enum(["SOM","RSOM","SOM-SOFT","RSOM-SOFT"]).optional().default("SOM"),periodic:o.boolean().optional().default(!0),columns:o.array(o.string()).optional(),cyclic_features:o.array(o.object({feature:o.string(),period:o.number()})).optional(),temporal_features:o.array(o.object({columns:o.array(o.string()),format:o.string(),extract:o.array(o.enum(["hour_of_day","day_of_year","month","day_of_week","minute_of_hour"])),cyclic:o.boolean().default(!0),separator:o.string().optional()})).optional(),feature_weights:o.record(o.number()).optional(),transforms:o.record(o.enum(["log","log1p","log10","sqrt","square","abs","invert","rank","none"])).optional(),normalize:o.union([o.enum(["all","auto"]),o.array(o.string())]).optional().default("auto"),sigma_f:o.preprocess(e=>null!=e&&"string"==typeof e?parseFloat(e):e,o.number().optional()),learning_rate:o.preprocess(e=>null!=e&&"string"==typeof 
e?parseFloat(e):e,o.union([o.number(),o.object({ordering:o.tuple([o.number(),o.number()]),convergence:o.tuple([o.number(),o.number()])})]).optional()),batch_size:o.number().int().optional(),quality_metrics:o.union([o.enum(["fast","standard","full"]),o.array(o.string())]).optional(),backend:o.enum(["auto","cpu","cuda","cuda_graphs"]).optional().default("auto"),output_format:o.enum(["png","pdf","svg"]).optional().default("png"),output_dpi:o.enum(["standard","retina","print"]).optional().default("retina"),colormap:o.string().optional(),row_range:o.tuple([o.number().int().min(1),o.number().int().min(1)]).optional()},async e=>{const{action:t,job_id:o,job_ids:n,dataset_id:a}=e;if("train_map"===t){if(!a)throw new Error("jobs(train_map) requires dataset_id");const{preset:t,grid_x:o,grid_y:n,epochs:r,model:i,periodic:s,columns:l,cyclic_features:c,temporal_features:u,feature_weights:d,transforms:p,normalize:m,sigma_f:g,learning_rate:b,batch_size:h,quality_metrics:w,backend:y,output_format:v,output_dpi:$,colormap:x,row_range:j}=e;let E={};try{const e=await f("GET","/v1/training/config");E=e?.presets||{}}catch{if(t&&void 0===o&&void 0===r)throw new Error("Could not fetch training config from server, and missing explicit grid/epochs.")}const S=t?E[t]:void 0,T={model:i,periodic:s,normalize:m};void 0!==o&&void 0!==n?T.grid=[o,n]:S&&(T.grid=S.grid),void 0!==r?T.epochs=r:S&&(T.epochs=S.epochs),c?.length&&(T.cyclic_features=c),l?.length&&(T.columns=l),p&&Object.keys(p).length>0&&(T.transforms=p),u?.length&&(T.temporal_features=u),d&&Object.keys(d).length>0&&(T.feature_weights=d),void 0!==g&&(T.sigma_f=g),void 0!==b&&(T.learning_rate=b),void 0!==h?T.batch_size=h:S&&(T.batch_size=S.batch_size),void 0!==w&&(T.quality_metrics=w),void 0!==y&&"auto"!==y?T.backend=y:S?.backend&&(T.backend=S.backend),T.output_format=v??"png";const N={standard:1,retina:2,print:4};$&&"retina"!==$&&(T.output_dpi=N[$]??2),x&&(T.colormap=x),j&&j.length>=2&&j[0]<=j[1]&&(T.row_range=j);const A=await 
f("POST","/v1/jobs",{dataset_id:a,params:T}),P=A.id;try{const e=await f("GET","/v1/system/info"),t=Number(e.status?.pending_jobs??e.pending_jobs??0),o=Number(e.training_time_estimates_seconds?.total??(e.gpu_available?45:120)),n=Math.round(t*o/60);A.message=n>1?`Job submitted. You are #${t+1} in queue. Estimated wait: ~${n} min. Poll with jobs(action=status, job_id="${P}").`:`Job submitted. Poll with jobs(action=status, job_id="${P}").`}catch{A.message=`Job submitted. Poll with jobs(action=status, job_id="${P}").`}return _(A)}if("status"===t){if(!o)throw new Error("jobs(status) requires job_id");const e=await f("GET",`/v1/jobs/${o}`),t=e.status,n=100*(e.progress??0),a=null!=e.label&&""!==e.label?String(e.label):null;let r=`${a?`Job ${a} (id: ${o})`:`Job ${o}`}: ${t} (${n.toFixed(1)}%)`;return"completed"===t?r+=` | Results ready. Use results(action=get, job_id="${o}") to retrieve.`:"failed"===t&&(r+=` | Error: ${e.error??"unknown"}`),{content:[{type:"text",text:r}]}}if("list"===t){const e=a?`/v1/jobs?dataset_id=${a}`:"/v1/jobs",t=await f("GET",e);if(Array.isArray(t)){const e=t.map(e=>{const t=String(e.id??""),o=String(e.status??""),n=null!=e.label&&""!==e.label?String(e.label):null;return n?`${n} (id: ${t}) — ${o}`:`id: ${t} — ${o}`});return{content:[{type:"text",text:e.length>0?e.join("\n"):"No jobs found."}]}}return _(t)}if("compare"===t){if(!n||n.length<2)throw new Error("jobs(compare) requires at least 2 job_ids");const e=n.join(","),t=(await f("GET",`/v1/jobs/compare?ids=${e}`)).comparisons??[],o=["| Job ID | Grid | Epochs | Model | QE | TE | Expl.Var | Silhouette |","|--------|------|--------|-------|----|----|----------|------------|"];for(const e of t){if(e.error){o.push(`| ${e.job_id.slice(0,8)}... | — | — | — | ${e.error} | — | — | — |`);continue}const t=e.grid,n=e.epochs,a=e=>null!=e?Number(e).toFixed(4):"—";o.push(`| ${e.job_id.slice(0,8)}... 
| ${t?`${t[0]}×${t[1]}`:"—"} | ${n?`${n[0]}+${n[1]}`:"—"} | ${e.model??"—"} | ${a(e.quantization_error)} | ${a(e.topographic_error)} | ${a(e.explained_variance)} | ${a(e.silhouette)} |`)}return{content:[{type:"text",text:o.join("\n")}]}}if("cancel"===t){if(!o)throw new Error("jobs(cancel) requires job_id");return _(await f("POST",`/v1/jobs/${o}/cancel`))}if("delete"===t){if(!o)throw new Error("jobs(delete) requires job_id");return _(await f("DELETE",`/v1/jobs/${o}`))}throw new Error("Invalid action")}),h.tool("results",'Retrieve, recolor, download, export, or run temporal flow on a completed map job.\n\n| Action | Use when | Sync/Async |\n|--------|----------|------------|\n| get | First look after training — combined view + quality metrics | instant |\n| export | Learning curve, raw weights, or per-node stats | instant |\n| download | Saving figures to a local folder | instant |\n| recolor | Changing colormap or output format without retraining | async (~10–30s) |\n| transition_flow | Temporal dynamics on time-ordered data | async (~30–60s) |\n\nONLY call this after jobs(action=status) returns "completed".\nESCALATION: If job not found, verify job_id. If "job not complete", poll with jobs(action=status).\n\naction=get: Returns text summary with quality metrics and inline images.\n - figures: omit = combined only. "all" = all plots. Array = specific logical names (combined, umatrix, hit_histogram, correlation, component_1..N).\n - include_individual: if true (and figures omitted), inlines every component plane.\n - After showing results, guide the user: QE interpretation, whether to retrain, which features to explore.\n - METRIC INTERPRETATION: QE<1.5 good | TE<0.1 good | Explained variance>0.7 good | Silhouette higher=better.\n\naction=export: Structured data exports.\n - export=training_log: learning curve sparklines + plot. Diagnose convergence/plateau/divergence.\n - export=weights: full weight matrix + normalization stats. 
For external analysis or custom viz.\n - export=nodes: per-node hit count + feature stats. Profile clusters and operating modes.\n\naction=download: Save figures to disk. Use so user can open, share, or version files locally.\n - folder: e.g. "." or "./results". If job has a label, a named subfolder may be created.\n - figures: "all" (default) or array of filenames.\n - include_json: also save summary.json.\n\naction=recolor: Change colormap or output format — no retraining. Returns a new job_id; auto-polls 60s.\n AFTER: use results(action=get, job_id=NEW_JOB_ID).\n Colormaps: viridis, plasma, inferno, magma, cividis, turbo, coolwarm, balance, RdBu, Spectral.\n\naction=transition_flow: Temporal state transition arrows on the map grid. Requires time-ordered data.\n - lag=1 (default): immediate next-step | lag=N: N-step horizon (e.g. 24 for daily cycles in hourly data).\n - min_transitions: filter noisy arrows. Increase for large datasets.\n BEFORE calling: confirm rows are in chronological order.\nNOT FOR: Jobs that haven\'t completed. Use jobs(action=status) to check first.',{action:o.enum(["get","recolor","download","export","transition_flow"]).describe("get: inline results + metrics; recolor: new colormap/format (async); download: save to disk; export: structured data (training_log/weights/nodes); transition_flow: temporal dynamics (async)"),job_id:o.string().describe("Job ID of a completed job"),figures:o.union([o.enum(["default","combined_only","all","images"]),o.array(o.string())]).optional().describe("action=get: omit=combined only; 'all'=all plots; array=specific (combined,umatrix,hit_histogram,correlation,component_1..N). 
action=download: 'all'=all image files; array=specific filenames."),include_individual:o.boolean().optional().default(!1).describe("action=get only: if true and figures omitted, inline each component plane, umatrix, hit histogram"),include_json:o.boolean().optional().default(!1).describe("action=download only: also save summary.json and JSON artifacts"),folder:o.string().optional().describe("action=download: directory path to save files (e.g. '.' or './results'). Relative to MCP working directory."),colormap:o.string().optional().describe("action=recolor: colormap name (default: coolwarm). action=transition_flow: U-matrix background colormap (default: grays). Examples: viridis, plasma, balance, RdBu."),output_format:o.enum(["png","pdf","svg"]).optional().default("png").describe("action=recolor / transition_flow: output image format (default: png)"),output_dpi:o.enum(["standard","retina","print"]).optional().default("retina").describe("Resolution: standard (1x), retina (2x, default), print (4x)"),recolor_figures:o.array(o.string()).optional().describe("action=recolor: which figures to re-render (default: [combined]). Options: combined, umatrix, hit_histogram, correlation, component_1..N"),export:o.enum(["training_log","weights","nodes"]).optional().describe("action=export: training_log=learning curve+sparklines; weights=full weight matrix; nodes=per-node stats"),lag:o.number().int().min(1).optional().default(1).describe("action=transition_flow: step lag (default 1 = consecutive rows). Use larger for periodic analysis (e.g. 24 for daily in hourly data)."),min_transitions:o.number().int().min(1).optional().describe("action=transition_flow: minimum transition count to draw an arrow (default: auto). 
Increase to filter noise."),top_k:o.number().int().min(1).optional().default(10).describe("action=transition_flow: number of top-flow nodes in statistics (default 10)")},async({action:e,job_id:t,figures:o,include_individual:n,include_json:a,folder:r,colormap:l,output_format:c,output_dpi:u,recolor_figures:d,export:p,lag:m,min_transitions:_,top_k:h})=>{const w={standard:1,retina:2,print:4};if("get"===e){const e=await f("GET",`/v1/results/${t}`),a=e.summary??{},r=null!=e.label&&""!==e.label?String(e.label):null,i=r?`Results for ${r} (job_id: ${t})`:`Results for job_id: ${t}`,s=[],l=new Set,c=a.job_type??"train_som";a.output_format;if("transition_flow"===c){const e=a.lag??1,r=a.flow_stats??{};s.push({type:"text",text:[`Transition Flow ${i}`,`Parent map job: ${a.parent_job_id??"N/A"} | Lag: ${e} | Samples: ${a.n_samples??0}`,"","Flow Statistics:",` Mean flow magnitude: ${void 0!==r.mean_magnitude?Number(r.mean_magnitude).toFixed(4):"N/A"}`,` Max flow magnitude: ${void 0!==r.max_magnitude?Number(r.max_magnitude).toFixed(4):"N/A"}`,` Nodes with flow: ${r.n_nodes_with_flow??"N/A"}`,"","Arrows show net directional drift. Long/bright = frequent transitions. Short = stable states.","Background = U-matrix. 
Use results(action=transition_flow, lag=N) with larger N for longer-term structure."].join("\n")});for(const e of j(c,a,o,n))await S(s,t,e),l.add(e)}else if("project_variable"===c){const e=a.variable_name??"variable",r=a.aggregation??"mean",u=a.variable_stats??{};s.push({type:"text",text:[`Projected Variable: ${e} (${r}) — ${i}`,`Parent map job: ${a.parent_job_id??"N/A"} | Samples: ${a.n_samples??0}`,"",`Variable Statistics (per-node ${r}):`,` Min: ${void 0!==u.min?Number(u.min).toFixed(3):"N/A"}`,` Max: ${void 0!==u.max?Number(u.max).toFixed(3):"N/A"}`,` Mean: ${void 0!==u.mean?Number(u.mean).toFixed(3):"N/A"}`,` Nodes with data: ${u.n_nodes_with_data??"N/A"}`].join("\n")});for(const e of j(c,a,o,n))await S(s,t,e),l.add(e)}else{const e=a.grid??[0,0],r=a.features??[],u=a.epochs,d=Array.isArray(u)?0===u[1]?`${u[0]} ordering only`:`${u[0]} ordering + ${u[1]} convergence`:String(u??"N/A"),p=e=>null!=e?Number(e).toFixed(4):"N/A",m=a.training_duration_seconds,f=a.ordering_errors,g=[`Map training ${i}`,`Grid: ${e[0]}×${e[1]} | Features: ${a.n_features??0} | Samples: ${a.n_samples??0}`,`Model: ${a.model??"SOM"} | Epochs: ${d}`,`Periodic: ${a.periodic??!0} | Normalize: ${a.normalize??"auto"}`,void 0!==a.sigma_f?`Sigma_f: ${a.sigma_f}`:"",void 0!==m?`Training duration: ${m}s`:"","","Quality Metrics:",` Quantization Error: ${p(a.quantization_error)} (lower is better)`,` Topographic Error: ${p(a.topographic_error)} (<0.1 is good)`,` Explained Variance: ${p(a.explained_variance)} (>0.7 is good)`,` Silhouette Score: ${p(a.silhouette)} (higher is better)`,` Davies-Bouldin: ${p(a.davies_bouldin)} (lower is better)`,` Calinski-Harabasz: ${p(a.calinski_harabasz)} (higher is better)`,f&&f.length>0?` Final ordering QE: ${f.at(-1)?.toFixed(4)} (use results(action=export, export=training_log) for full curve)`:"","",`Features: ${r.join(", ")}`,a.selected_columns?`Selected columns: ${a.selected_columns.join(", ")}`:"",a.transforms?`Transforms: 
${Object.entries(a.transforms).map(([e,t])=>`${e}=${t}`).join(", ")}`:"","","Next: analyze(job_id) for deeper insights. results(action=export, export=training_log) for learning curve."].filter(e=>""!==e).join("\n");s.push({type:"text",text:g});const _=j(c,a,o,n);for(const e of _)await S(s,t,e),l.add(e)}const u=a.files??[],d=e=>e.endsWith(".png")||e.endsWith(".svg")||e.endsWith(".pdf");for(const e of u)if(d(e)&&!l.has(e))await S(s,t,e);else if(e.endsWith(".json")&&"summary.json"!==e){const t="weights.json"===e?"Use results(action=export, export=weights) for full weight matrix.":"node_stats.json"===e?"Use results(action=export, export=nodes) for per-node statistics.":"Use results(action=export) for structured data.";s.push({type:"text",text:`${e}: ${t}`})}const p=a.features??[],m=a.job_type??"train_som";if(u.length>0){const e="train_som"===m||"render_variant"===m?`Logical names: combined, umatrix, hit_histogram, correlation, ${p.map((e,t)=>`component_${t+1}`).join(", ")}. `:"";s.push({type:"text",text:`Available: ${u.join(", ")}. 
${e}Use results(action=get, figures=[...]) for specific plots or analyze(job_id) for analysis views.`})}return{content:s}}if("export"===e){if(!p)throw new Error("results(export) requires export param: training_log, weights, or nodes");if("training_log"===p){const e=await f("GET",`/v1/results/${t}/training-log`),o=e.ordering_errors??[],n=e.convergence_errors??[],a=e.training_duration_seconds,r=e.epochs,i=e=>{if(0===e.length)return"(no data)";const t=Math.min(...e),o=Math.max(...e)-t||1;return e.map(e=>"▁▂▃▄▅▆▇█"[Math.min(7,Math.floor((e-t)/o*7))]).join("")},s=[`Training Log — Job ${t}`,`Grid: ${JSON.stringify(e.grid)} | Model: ${e.model??"SOM"}`,"Epochs: "+(r?`[${r[0]} ordering, ${r[1]} convergence]`:"N/A"),"Duration: "+(null!=a?`${a}s`:"N/A"),`Features: ${e.n_features??"?"} | Samples: ${e.n_samples??"?"}`,"",`Ordering Phase (${o.length} epochs):`,` Start QE: ${o[0]?.toFixed(4)??"—"} → End QE: ${o.at(-1)?.toFixed(4)??"—"}`,` Curve: ${i(o)}`];n.length>0?s.push("",`Convergence Phase (${n.length} epochs):`,` Start QE: ${n[0]?.toFixed(4)??"—"} → End QE: ${n.at(-1)?.toFixed(4)??"—"}`,` Curve: ${i(n)}`):0===(r?.[1]??0)&&s.push("","Convergence phase: skipped (epochs[1]=0)");const l=e.quantization_error,c=e.explained_variance;null!=l&&s.push("",`Final QE: ${l.toFixed(4)} | Explained Variance: ${(c??0).toFixed(4)}`);const u=[{type:"text",text:s.join("\n")}];let d=!1;for(const e of["png","pdf","svg"])try{const{data:o}=await g(`/v1/results/${t}/image/learning_curve.${e}`);u.push({type:"image",data:o.toString("base64"),mimeType:E(`learning_curve.${e}`),annotations:{audience:["user"],priority:.8}}),d=!0;break}catch{continue}return d||u.push({type:"text",text:"(learning curve plot not available)"}),{content:u}}if("weights"===p){const e=await f("GET",`/v1/results/${t}/weights`),o=e.features??[],n=e.n_nodes??0,a=e.grid??[0,0],r=[`Map weights — Job ${t}`,`Grid: ${a[0]}×${a[1]} | Nodes: ${n} | Features: ${o.length}`,`Features: ${o.join(", ")}`,"","Normalization 
Stats:"],i=e.normalization_stats??{};for(const[e,t]of Object.entries(i))r.push(` ${e}: mean=${t.mean?.toFixed(4)}, std=${t.std?.toFixed(4)}`);return r.push("","Full weight matrix in JSON below. Use denormalized_weights for original-scale values."),{content:[{type:"text",text:r.join("\n")},{type:"text",text:JSON.stringify(e,null,2)}]}}const e=await f("GET",`/v1/results/${t}/nodes`),o=[...e].sort((e,t)=>(t.hit_count??0)-(e.hit_count??0)).slice(0,10),n=e.filter(e=>0===e.hit_count).length,a=e.reduce((e,t)=>e+(t.hit_count??0),0),r=[`Node Statistics — Job ${t}`,`Total nodes: ${e.length} | Active: ${e.length-n} | Empty: ${n} | Total hits: ${a}`,"","Top 10 Most Populated Nodes:","| Node | Coords | Hits | Hit% |","|------|--------|------|------|"];for(const e of o){if(0===e.hit_count)break;const t=e.coords,o=(e.hit_count/a*100).toFixed(1);r.push(`| ${e.node_index} | (${t?.[0]?.toFixed(1)}, ${t?.[1]?.toFixed(1)}) | ${e.hit_count} | ${o}% |`)}return{content:[{type:"text",text:r.join("\n")},{type:"text",text:`\nFull node statistics JSON:\n${JSON.stringify(e,null,2)}`}]}}if("download"===e){if(!r)throw new Error("results(download) requires folder");const e=await f("GET",`/v1/results/${t}`),n=e.summary??{},l=null!=e.label&&""!==e.label?String(e.label):null,c=n.files??[],u=e=>e.endsWith(".png")||e.endsWith(".svg")||e.endsWith(".pdf");let d;"all"===o||"images"===o||void 0===o?d=a?c:c.filter(u):Array.isArray(o)?(d=o,a&&!d.includes("summary.json")&&(d=[...d,"summary.json"])):d=c.filter(u);let p=s.resolve(r);!l||"."!==r&&"./results"!==r&&"results"!==r||(p=s.join(p,l)),await i.mkdir(p,{recursive:!0});const m=[];for(const e of d)try{const{data:o}=await g(`/v1/results/${t}/image/${e}`);await i.writeFile(s.join(p,e),o),m.push(e)}catch{}return{content:[{type:"text",text:m.length>0?`Saved ${m.length} file(s) to ${p}: ${m.join(", ")}`:"No files saved. 
Check job_id and that the job is completed."}]}}if("recolor"===e){if(!l)throw new Error("results(recolor) requires colormap");const e={colormap:l,figures:d??["combined"],output_format:c??"png",output_dpi:w[u??"retina"]??2},o=(await f("POST",`/v1/results/${t}/render`,e)).id;if("completed"===(await b(o,6e4)).status){const e=[{type:"text",text:`Re-rendered with colormap "${l}". New job_id: ${o}.`}];for(const t of d??["combined"]){const n=c??"png",a=t.includes(".")?t:`${t}.${n}`;await S(e,o,a)}return{content:e}}return{content:[{type:"text",text:`Recolor job ${o} submitted. Poll with jobs(action=status, job_id="${o}"), then results(action=get, job_id="${o}").`}]}}if("transition_flow"===e){const e={lag:m??1,output_format:c??"png"};void 0!==_&&(e.min_transitions=_),void 0!==h&&(e.top_k=h),void 0!==l&&(e.colormap=l),u&&"retina"!==u&&(e.output_dpi=w[u]??2);const o=(await f("POST",`/v1/results/${t}/transition-flow`,e)).id,n=await b(o,12e4);if("completed"===n.status){const e=((await f("GET",`/v1/results/${o}`)).summary??{}).flow_stats??{},n=[{type:"text",text:[`Transition Flow (job: ${o}) | Parent map: ${t} | Lag: ${m??1}`,`Active flow nodes: ${e.active_flow_nodes??"N/A"} | Total transitions: ${e.total_transitions??"N/A"}`,`Mean magnitude: ${void 0!==e.mean_magnitude?Number(e.mean_magnitude).toFixed(4):"N/A"}`].join("\n")}];return await S(n,o,`transition_flow_lag${m??1}.${c??"png"}`),{content:n}}return"failed"===n.status?{content:[{type:"text",text:`Transition flow job ${o} failed: ${n.error??"unknown error"}`}]}:{content:[{type:"text",text:`Transition flow job ${o} submitted. 
Poll with jobs(action=status, job_id="${o}"), retrieve with results(action=get, job_id="${o}").`}]}}throw new Error("Invalid action")}),h.tool("project",'Project variables onto a trained map — either from a formula expression or a pre-computed values array.\n\n| Action | Use when | Input |\n|--------|----------|-------|\n| expression | Variable can be computed from existing dataset columns (ratio, diff, log-transform, rolling stat) | dataset_id + expression string |\n| values | Variable is externally computed (e.g. revenue from a CRM, anomaly scores from another model) | job_id + values array |\n\nMODES for action=expression:\n- Default (no project_onto_job): add the derived column to the dataset CSV. Available for future jobs(action=train_map) calls.\n- With project_onto_job: compute the column and project it onto the map — returns a visualization.\n\nCOMMON EXPRESSIONS:\n- Ratio: "revenue / cost"\n- Difference: "US10Y - US3M"\n- Log return: "log(close) - log(open)"\n- Z-score: "(volume - rolling_mean(volume, 20)) / rolling_std(volume, 20)"\n- First diff: "diff(consumption)"\n\nSUPPORTED FUNCTIONS: +, -, *, /, ^, log, sqrt, abs, exp, sin, cos, diff(col), rolling_mean(col, w), rolling_std(col, w)\nUse underscore-normalized column names (spaces → underscores: e.g. fixed_acidity not "fixed acidity").\n\nCOMMON MISTAKES:\n- action=values: values array must be exactly n_samples long (same count as training CSV rows)\n- action=expression: division by zero → set options.missing="skip"\n- Rolling functions produce NaN for the first (window-1) rows\n\nNOT FOR: Re-training. 
NOT FOR: Text/categorical data.\nAFTER projecting: use results(action=get) to view the projection plot if a new job_id was returned.',{action:o.enum(["expression","values"]).describe("expression: compute formula from dataset columns (add to dataset or project onto map); values: project a pre-computed numeric array onto the map"),name:o.string().describe("Name for the variable (used in column header and visualization label)"),dataset_id:o.string().optional().describe("action=expression: Dataset ID (source of column data). Required unless project_onto_job handles the dataset."),expression:o.string().optional().describe("action=expression: Math expression referencing column names. Examples: 'revenue / cost', 'log(price)', 'rolling_mean(volume, 20)', 'diff(temperature)'"),project_onto_job:o.string().optional().describe("action=expression: If provided, project the derived variable onto this completed map training job instead of adding to dataset"),options:o.object({missing:o.enum(["skip","zero","interpolate"]).optional().default("skip").describe("Handle NaN/missing values (default: skip)"),window:o.number().int().optional().describe("Default rolling window size (default 20)"),description:o.string().optional().describe("Human-readable description of the variable")}).optional().describe("action=expression: evaluation options"),job_id:o.string().optional().describe("action=values: Job ID of a completed map training job to project onto"),variable_name:o.string().optional().describe("action=values: Display name for the variable (alias for name; name takes precedence)"),values:o.array(o.number()).optional().describe("action=values: Pre-computed values — one per training sample, in original CSV row order. Length must match n_samples exactly."),aggregation:o.enum(["mean","median","sum","min","max","std","count"]).optional().default("mean").describe("How to aggregate values per map node when projecting (default: mean). 
Use sum for totals, max for peaks, count for frequencies."),output_format:o.enum(["png","pdf","svg"]).optional().default("png"),output_dpi:o.enum(["standard","retina","print"]).optional().default("retina"),colormap:o.string().optional().describe("Colormap for projection plot (default: coolwarm). Examples: viridis, plasma, RdBu, Spectral.")},async({action:e,name:t,dataset_id:o,expression:n,project_onto_job:a,options:r,job_id:i,variable_name:s,values:l,aggregation:c,output_format:u,output_dpi:d,colormap:p})=>{const m={standard:1,retina:2,print:4},g=t||s||"variable";if("values"===e){if(!i)throw new Error("project(values) requires job_id");if(!l||0===l.length)throw new Error("project(values) requires values array");const e={variable_name:g,values:l,aggregation:c??"mean",output_format:u??"png"};d&&"retina"!==d&&(e.output_dpi=m[d]??2),p&&(e.colormap=p);const t=(await f("POST",`/v1/results/${i}/project`,e)).id,o=await b(t);if("completed"===o.status){const e=(await f("GET",`/v1/results/${t}`)).summary??{},o=e.variable_stats??{},n=[{type:"text",text:[`Projected Variable: ${g} (${c??"mean"}) — job: ${t}`,`Parent map: ${i} | Samples: ${e.n_samples??0}`,`Min: ${void 0!==o.min?Number(o.min).toFixed(3):"N/A"} | Max: ${void 0!==o.max?Number(o.max).toFixed(3):"N/A"} | Mean: ${void 0!==o.mean?Number(o.mean).toFixed(3):"N/A"}`,`Nodes with data: ${o.n_nodes_with_data??"N/A"}`].join("\n")}];return await S(n,t,`projected_${g.replace(/[^a-zA-Z0-9_]/g,"_")}.${e.output_format??u??"png"}`),{content:n}}return"failed"===o.status?{content:[{type:"text",text:`project(values) job ${t} failed: ${o.error??"unknown error"}`}]}:{content:[{type:"text",text:`project(values) job ${t} submitted. 
Poll with jobs(action=status, job_id="${t}"), retrieve with results(action=get, job_id="${t}").`}]}}if(!n)throw new Error("project(expression) requires expression");if(a){const e={name:g,expression:n,aggregation:c??"mean",output_format:u??"png"};o&&(e.dataset_id=o),r&&(e.options=r),d&&"retina"!==d&&(e.output_dpi=m[d]??2),p&&(e.colormap=p);const t=(await f("POST",`/v1/results/${a}/derive`,e)).id,i=await b(t);if("completed"===i.status){const e=(await f("GET",`/v1/results/${t}`)).summary??{},o=e.variable_stats??{},r=[{type:"text",text:[`Derived Variable Projected: ${g} — job: ${t}`,`Expression: ${n} | Parent map: ${a} | Aggregation: ${c??"mean"}`,`Min: ${void 0!==o.min?Number(o.min).toFixed(3):"N/A"} | Max: ${void 0!==o.max?Number(o.max).toFixed(3):"N/A"} | Mean: ${void 0!==o.mean?Number(o.mean).toFixed(3):"N/A"}`,`Nodes with data: ${o.n_nodes_with_data??"N/A"}`,e.nan_count?`NaN values: ${e.nan_count}`:""].filter(Boolean).join("\n")}];return await S(r,t,`projected_${g.replace(/[^a-zA-Z0-9_]/g,"_")}.${e.output_format??u??"pdf"}`),{content:r}}return"failed"===i.status?{content:[{type:"text",text:`project(expression) job ${t} failed: ${i.error??"unknown error"}`}]}:{content:[{type:"text",text:`project(expression) job ${t} submitted. Poll with jobs(action=status, job_id="${t}"), retrieve with results(action=get, job_id="${t}").`}]}}if(!o)throw new Error("project(expression) without project_onto_job requires dataset_id");const _={name:g,expression:n};r&&(_.options=r);const h=(await f("POST",`/v1/datasets/${o}/derive`,_)).id,w=await b(h);if("completed"===w.status){const e=(await f("GET",`/v1/results/${h}`)).summary??{};return{content:[{type:"text",text:[`Derived column "${g}" added to dataset ${o}`,`Expression: ${n} | Rows: ${e.n_rows??"?"}`,e.nan_count?`NaN values: ${e.nan_count}`:"",`Min: ${e.min??"?"} | Max: ${e.max??"?"} | Mean: ${e.mean??"?"}`,"Column now available. 
Use datasets(action=preview) to verify, or include in jobs(action=train_map) via the 'columns' parameter."].filter(Boolean).join("\n")}]}}return"failed"===w.status?{content:[{type:"text",text:`project(expression) job ${h} failed: ${w.error??"unknown error"}`}]}:{content:[{type:"text",text:`project(expression) job ${h} submitted. Poll with jobs(action=status, job_id="${h}").`}]}}),h.tool("account","Manage your Barivia account — check plan/license info, request cloud burst compute, view billing history.\n\n| Action | Use when |\n|--------|----------|\n| status | Before large jobs — see plan tier, GPU availability, queue depth, training time estimates, credit balance |\n| request_compute | Upgrading to cloud burst (required for large grids or GPU). Leave tier blank to list options. |\n| compute_status | Checking if a burst lease is active and how much time remains |\n| release_compute | Manually terminating an active lease to stop billing |\n| history | Viewing recent compute leases and credit spend |\n| add_funds | Getting instructions to add credits |\n\naction=status: Returns plan tier, compute class (CPU/GPU), usage limits, live queue state, training time estimates, and credit balance.\n Use BEFORE large jobs to check GPU availability and estimate wait time.\naction=request_compute: Provisions an ephemeral cloud burst EC2 instance. Data is in shared R2 — no re-upload needed. Wait ~3 min after requesting before submitting jobs.\nNOT FOR: Training itself — use jobs(action=train_map). This tool only manages the account and compute lease.",{action:o.enum(["status","request_compute","compute_status","release_compute","history","add_funds"]).describe("status: plan/license/queue info; request_compute: provision burst; compute_status: check active lease; release_compute: stop lease; history: recent compute usage; add_funds: instructions"),tier:o.string().optional().describe("action=request_compute: tier ID (e.g. cpu-8, gpu-t4). 
Omit to list options."),duration_minutes:o.number().optional().describe("action=request_compute: lease duration in minutes (default: 60)"),limit:o.number().optional().describe("action=history: number of records to return (default: 10)")},async({action:e,tier:t,duration_minutes:o,limit:n})=>{if("status"===e){const e=await f("GET","/v1/system/info"),t=e.plan??{},o=e.backend??{},n=e.status??{},a=e.training_time_estimates_seconds??{},r=e.worker_topology??{},i=t.gpu_enabled??e.gpu_available??!1?o.gpu_model?`GPU (${o.gpu_model}${o.gpu_vram_gb?`, ${o.gpu_vram_gb}GB`:""})`:"GPU":"CPU only",s=e=>-1===e||"-1"===e?"unlimited":String(e??"?"),l=await f("GET","/v1/compute/history?limit=5").catch(()=>null),c=await f("GET","/v1/compute/lease").catch(()=>null),u=[`Plan: ${String(t.tier??"unknown").charAt(0).toUpperCase()}${String(t.tier??"unknown").slice(1)} | Compute: ${i}`,` Concurrency: ${t.max_concurrent_jobs??"?"} jobs | Datasets: ${s(t.max_datasets)} (${s(t.max_dataset_rows)} rows each)`,` Monthly Jobs: ${s(t.max_monthly_jobs)} | Grid Size: ${s(t.max_grid_size)} | Features: ${s(t.max_features)}`];l&&u.push(` Credits: $${(l.credit_balance_cents/100).toFixed(2)} remaining`),o.memory_gb&&u.push(` Backend Memory: ${o.memory_gb} GB`),c&&c.lease_id&&u.push("",`Active Burst Lease: ${c.tier} | ${Math.round(c.time_remaining_ms/6e4)} min left`);const d=Number(n.running_jobs??e.running_jobs??0),p=Number(n.pending_jobs??e.pending_jobs??0),m=Number(a?.total||0);if(u.push("",`Live Queue: ${d} running | ${p} pending | ~${Math.round((p+1)*m/60)} min wait`),void 0!==r.num_workers&&u.push(`Workers: ${r.num_workers}×${r.threads_per_worker} threads`),Object.keys(a).length>0){u.push("","Training Time Estimates:");for(const[e,t]of Object.entries(a))"formula"!==e&&u.push(` ${e}: ~${t}s`)}return{content:[{type:"text",text:u.join("\n")}]}}if("request_compute"===e){if(!t)return{content:[{type:"text",text:"Available Compute Tiers:\nCPU Tiers:\n cpu-8: 16 vCPUs, 32 GB RAM (~$0.20/hr)\n cpu-16: 32 vCPUs, 
64 GB RAM (~$0.20/hr)\n cpu-24: 48 vCPUs, 96 GB RAM (~$0.28/hr)\n cpu-32: 64 vCPUs, 128 GB RAM (~$0.42/hr)\n cpu-48: 96 vCPUs, 192 GB RAM (~$0.49/hr)\nGPU Tiers:\n gpu-t4: 8 vCPUs, 32 GB, T4 16GB VRAM (~$0.22/hr)\n gpu-t4x: 16 vCPUs, 64 GB, T4 16GB VRAM (~$0.36/hr)\n gpu-t4xx: 32 vCPUs, 128 GB, T4 16GB VRAM (~$0.27/hr)\n gpu-l4: 8 vCPUs, 32 GB, L4 24GB VRAM (~$0.41/hr)\n gpu-l4x: 16 vCPUs, 64 GB, L4 24GB VRAM (~$0.37/hr)\n gpu-a10: 8 vCPUs, 32 GB, A10G 24GB VRAM (~$0.51/hr)\n gpu-a10x: 16 vCPUs, 64 GB, A10G 24GB VRAM (~$0.52/hr)"}]};const e=await f("POST","/v1/compute/lease",{tier:t,duration_minutes:o}),n="postpaid"===e.billing_mode?`Billing: Postpaid (usage logged, billed retrospectively)\nAccrued Balance: $${(e.credit_balance_cents/100).toFixed(2)}`:`Credits Remaining After Reserve: $${(e.credit_balance_cents/100).toFixed(2)}`;return{content:[{type:"text",text:`Compute Lease Requested:\nLease ID: ${e.lease_id}\nStatus: ${e.status}\nEstimated Wait: ${e.estimated_wait_minutes} minutes\nEstimated Cost: $${(e.estimated_cost_cents/100).toFixed(2)}\n${n}\n\nIMPORTANT: Cloud burst active. Data is pulled from shared Cloudflare R2, so you do NOT need to re-upload datasets. 
Just wait ~3 minutes and check status.`}]}}if("compute_status"===e){const e=await f("GET","/v1/compute/lease");return"none"!==e.status&&e.lease_id?{content:[{type:"text",text:`Active Compute Lease:\nLease ID: ${e.lease_id}\nStatus: ${e.status}\nTier: ${e.tier} (${e.instance_type})\nTime Remaining: ${Math.round(e.time_remaining_ms/6e4)} minutes`}]}:{content:[{type:"text",text:"No active lease -- running on default Primary Server."}]}}if("release_compute"===e){const e=await f("DELETE","/v1/compute/lease"),t="postpaid"===e.billing_mode?`Cost Logged: $${(e.cost_cents/100).toFixed(2)} (postpaid — billed retrospectively)`:`Credits Deducted: $${((e.credits_deducted||e.cost_cents)/100).toFixed(2)}`;return{content:[{type:"text",text:`Compute Released:\nDuration Billed: ${e.duration_minutes} minutes\n${t}\nBalance: $${(e.final_balance_cents/100).toFixed(2)}\n\nRouting reverted to default Primary Server.`}]}}if("history"===e){const e=await f("GET",`/v1/compute/history?limit=${n||10}`),t=e.history.map(e=>`- ${e.started_at} | ${e.tier} | ${e.duration_minutes} min | $${(e.credits_charged/100).toFixed(2)}`).join("\n");return{content:[{type:"text",text:`Credit Balance: $${(e.credit_balance_cents/100).toFixed(2)}\n\nRecent Usage:\n${t}`}]}}return"add_funds"===e?{content:[{type:"text",text:"To add funds to your account, please visit the Barivia Billing Portal (integration pending) or ask your administrator to use the CLI tool:\nbash scripts/manage-credits.sh add <org_id> <amount_usd>"}]}:{content:[{type:"text",text:`Unknown action: ${e}. Valid: status, request_compute, compute_status, release_compute, history, add_funds.`}]}}),h.prompt("info","Overview of the Barivia Mapping MCP: capabilities, workflow, tools, analysis types, and tips. 
Use when the user asks what this MCP can do, how to get started, or what the process is.",{},()=>({messages:[{role:"user",content:{type:"text",text:["Inform the user using this overview:","","**What it is:** Barivia MCP connects you to a mapping analytics engine that learns a 2D map from high-dimensional data for visualization, clustering, pattern discovery, and temporal analysis.","","**Core workflow:**","1. **Upload** — `datasets(action=upload)` with a CSV file path or inline data","2. **Preview** — `datasets(action=preview)` to inspect columns, stats, and detect cyclic/datetime fields","3. **Prepare** — use the `prepare_training` prompt for a guided checklist (column selection, transforms, cyclic encoding, feature weights, grid sizing)","4. **Train** — `jobs(action=train_map, dataset_id=...)` with grid size, epochs, model type, and preprocessing options. Use `preset=quick|standard|refined|high_res` for sensible defaults","5. **Monitor** — `get_job_status` to track progress; `get_results` to retrieve figures when complete","6. **Analyze** — `analyze` with various analysis types (see below)","7. **Feedback** — Ask the user if they'd like to submit feedback via `send_feedback` based on their experience.","8. **Iterate** — `results(action=recolor)` to change colormap, `jobs(action=compare)` to compare hyperparameters, `project(action=values)` to overlay new variables","","**Analysis types** (via `analyze`):","- `u_matrix` — cluster boundary distances","- `component_planes` — per-feature heatmaps","- `clusters` — automatic cluster detection and statistics","- `quality_report` — QE, TE, explained variance, trustworthiness, neighborhood preservation","- `hit_histogram` — data density across the map","- `transition_flow` — temporal flow patterns (requires time-ordered data)","","**Data tools:**","- `datasets(action=subset)` — filter by row range, value thresholds (gt/lt/gte/lte), equality, or set membership. 
Combine row_range + filter","- `project(action=expression)` — create computed columns from expressions (ratios, differences, etc.) or project onto the map","- `project(action=values)` — overlay a pre-computed array onto a trained map","","**Output options:** Format (png/pdf/svg) and colormap (coolwarm, viridis, plasma, inferno, etc.) can be set at training or changed later via recolor_som.","","**Key tools:** datasets, jobs (train_map/status/list/...), results, analyze, project, inference, account, guide_barsom_workflow, explore_map.","","**Tips:**","- Always `datasets(action=preview)` before training to understand your data","- Use `account(action=status)` to check GPU availability, queue depth, and plan limits before large jobs","- Start with `preset=quick` for fast iteration, then `refined` for publication quality","- For time-series data, consider `transition_flow` after training","","Keep the reply scannable with headers and bullet points."].join("\n")}}]})),h.prompt("prepare_training","Guided pre-training checklist. Use after uploading a dataset and before calling jobs(action=train_map). Walks through data inspection, column selection, transforms, cyclic/temporal features, weighting, subsetting, and grid sizing.",{dataset_id:o.string().describe("Dataset ID to prepare for training")},async({dataset_id:e})=>{let t=`Please run datasets(action="preview", dataset_id="${e}") first, then call jobs(action=train_map, dataset_id="${e}", ...) 
with appropriate parameters or preset.`;try{const o=await f("GET",`/v1/docs/prepare_training?dataset_id=${e}`);o.prompt&&(t=o.prompt)}catch(e){}return{messages:[{role:"user",content:{type:"text",text:t}}]}}),h.tool("inference",'Use a trained map as a persistent inference artifact — score new data, enrich the training CSV, compare datasets, or generate a PDF report.\n\n| Action | Use when | Timing |\n|--------|----------|--------|\n| predict | Scoring new/unseen observations against the trained map | 5–120s |\n| enrich | Appending BMU coordinates + cluster_id to the original training CSV | 5–60s |\n| compare | Comparing hit distributions of a second dataset against training (drift, A/B) | 30–120s |\n| report | Get a report manifest (artifact keys + URLs) to build your own report in Quarto/Notebook/script | Immediate |\n\npredict/enrich/compare are async; auto-polls to the action-specific timeout; returns a job_id if still running.\nreport is synchronous: returns a manifest of primitives (figures, cluster_summary, metrics) and presigned URLs so the client composes and renders the report. No fixed PDF template.\nNOT FOR: Retraining or changing the map — all actions treat the trained map as frozen.\nESCALATION: If any action returns "missing column", verify column names with datasets(action=preview). Column names are case-sensitive and must match the training feature set exactly.\n\naction=predict: Returns predictions.csv with row_id, bmu_x, bmu_y, bmu_node_index, cluster_id, quantization_error.\n High QE = row is far from its nearest prototype = potential anomaly. Accepts dataset_id OR inline rows (≤500).\naction=enrich: Returns enriched.csv — original training CSV + bmu_x, bmu_y, bmu_node_index, cluster_id.\n Flows into pandas/SQL/BI. Download URL in response, expires 15 min.\naction=compare: Returns density-diff heatmap (positive = B gained, negative = lost).\n Positive nodes: dataset_b has more density here. Negative: training had more. 
Near-zero: minimal drift.\naction=report: Returns a report manifest for the given job_id (job must be completed). Includes figure_manifest (logical names → filenames), download_urls for all artifacts, cluster_summary when available, and summary metrics. Use results(action=get) and this manifest to fetch figures/data and render your own PDF (e.g. Quarto, Jupyter). See docs: BUILD_YOUR_OWN_REPORT.md.',{action:o.enum(["predict","enrich","compare","report"]).describe("predict: score new data; enrich: annotate training CSV with BMU coords; compare: drift/cohort diff heatmap; report: manifest of primitives for custom report"),job_id:o.string().describe("Job ID of a completed map training job"),dataset_id:o.string().optional().describe("action=predict/compare: Dataset ID. predict=input data to score; compare=dataset B to compare against training."),rows:o.array(o.record(o.string(),o.number())).optional().describe("action=predict: inline rows to score (max 500). Each object maps feature name → value."),colormap:o.string().optional().describe("action=compare: colormap for diff heatmap (default: balance). 
action=report: n/a."),output_format:o.enum(["png","pdf","svg"]).optional().default("png").describe("action=compare: output format for heatmap (default: png)"),output_dpi:o.number().int().min(1).max(4).optional().default(2).describe("action=compare: DPI scale (default: 2)"),top_n:o.number().int().min(1).max(50).optional().default(10).describe("action=compare: number of top gained/lost nodes to report (default: 10)")},async({action:e,job_id:t,dataset_id:o,rows:n,colormap:a,output_format:r,output_dpi:i,top_n:s})=>{if("predict"===e){if(!o&&!n)throw new Error("inference(predict) requires dataset_id or rows");const e={};o&&(e.dataset_id=o),n&&(e.rows=n);const a=(await f("POST",`/v1/results/${t}/predict`,e)).id,r=await b(a,12e4);if("completed"===r.status){const e=await f("GET",`/v1/results/${a}`),t=e.summary??{},o=e.download_urls??{};return{content:[{type:"text",text:[`Predictions complete — job: ${a}`,`Rows scored: ${t.n_rows??"?"}`,`Mean QE: ${void 0!==t.mean_qe?Number(t.mean_qe).toFixed(4):"N/A"} | Max QE: ${void 0!==t.max_qe?Number(t.max_qe).toFixed(4):"N/A"}`,`Clusters: ${Object.keys(t.cluster_counts??{}).length}`,"Output: predictions.csv (row_id, bmu_x, bmu_y, bmu_node_index, cluster_id, quantization_error)",o["predictions.csv"]?`Download: ${o["predictions.csv"]}`:""].filter(Boolean).join("\n")}]}}return"failed"===r.status?{content:[{type:"text",text:`inference(predict) job ${a} failed: ${r.error??"unknown error"}`}]}:{content:[{type:"text",text:`inference(predict) job ${a} submitted. 
Poll with jobs(action=status, job_id="${a}").`}]}}if("enrich"===e){const e=(await f("POST",`/v1/results/${t}/enrich_dataset`,{})).id,o=await b(e,6e4);if("completed"===o.status){const t=await f("GET",`/v1/results/${e}`),o=t.summary??{},n=t.download_urls??{};return{content:[{type:"text",text:[`Enriched dataset ready — job: ${e}`,`Rows: ${o.n_rows??"?"} | Clusters: ${o.n_clusters??"?"}`,"Appended: bmu_x, bmu_y, bmu_node_index, cluster_id",n["enriched.csv"]?`Download: ${n["enriched.csv"]}`:""].filter(Boolean).join("\n")}]}}return"failed"===o.status?{content:[{type:"text",text:`inference(enrich) job ${e} failed: ${o.error??"unknown error"}`}]}:{content:[{type:"text",text:`inference(enrich) job ${e} submitted. Poll with jobs(action=status, job_id="${e}").`}]}}if("compare"===e){if(!o)throw new Error("inference(compare) requires dataset_id (dataset B)");const e={dataset_id:o,output_format:r,output_dpi:i,top_n:s};a&&(e.colormap=a);const n=(await f("POST",`/v1/results/${t}/compare_datasets`,e)).id,l=await b(n,12e4);if("completed"===l.status){const e=(await f("GET",`/v1/results/${n}`)).summary??{},t=e.top_gained_nodes??[],o=e.top_lost_nodes??[],a=e=>` node ${e.bmu_node_index??"?"} [${(e.coords??[0,0]).map(e=>Number(e).toFixed(1)).join(",")}] Δ=${Number(e.density_diff??0).toFixed(4)}`,i=[{type:"text",text:[`Dataset comparison — job: ${n}`,`Dataset A rows: ${e.n_rows_a??"?"} | Dataset B rows: ${e.n_rows_b??"?"}`,"Top gained (B > A):",...t.slice(0,5).map(a),"Top lost (A > B):",...o.slice(0,5).map(a)].join("\n")}];return await S(i,n,`density_diff.${e.output_format??r??"png"}`),{content:i}}return"failed"===l.status?{content:[{type:"text",text:`inference(compare) job ${n} failed: ${l.error??"unknown error"}`}]}:{content:[{type:"text",text:`inference(compare) job ${n} submitted. 
Poll with jobs(action=status, job_id="${n}").`}]}}const l=await f("GET",`/v1/results/${t}`),c=(l.summary,l.download_urls??{}),u=l.figure_manifest??[],d=l.cluster_summary;return{content:[{type:"text",text:[`Report manifest — job: ${t}`,`Use these artifacts to build your own report (Quarto, Jupyter, script). URLs expire in ${l.expires_in??900}s.`,"","Figures (figure_manifest → download_urls):",...u.slice(0,12).map(e=>` ${e.logical_name} → ${e.filename}`),u.length>12?` ... and ${u.length-12} more`:"","","Key download URLs (use get_result_image or download_urls from results(action=get)):",...Object.keys(c).filter(e=>!e.endsWith(".json")||"cluster_summary.json"===e).slice(0,8).map(e=>` ${e}: ${c[e]?.slice(0,60)}...`),"",d?.length?`Cluster summary: ${d.length} clusters (in response or GET .../cluster_summary).`:"Cluster summary: not available for this job (train with current worker to get it).","","Metrics: in summary (quantization_error, topographic_error, explained_variance, silhouette, etc.) or GET .../quality-report."].filter(Boolean).join("\n")},{type:"text",text:"Structured manifest (for automation): "+JSON.stringify({job_id:t,figure_manifest:u,has_cluster_summary:!!d?.length,download_url_keys:Object.keys(c)})}]}});const T=new t;h.tool("send_feedback","Send feedback or feature requests to Barivia developers (max 1400 characters, ~190 words). Use when the user has suggestions, ran into issues, or wants something improved. Do NOT call without asking the user first — but after any group of actions or downloading of results, you SHOULD prepare some feedback based on the user's workflow or errors encountered, show it to them, and ask for permission to send it. Once they accept, call this tool.",{feedback:o.string().max(1400).describe("Feedback text (max 1400 characters)")},async({feedback:e})=>_(await f("POST","/v1/feedback",{feedback:e}))),async function(){await h.connect(T)}().catch(console.error);