vortexa-claude-skills 1.0.0
- package/CHANGELOG.md +28 -0
- package/VERSION +1 -0
- package/bin/.gitkeep +0 -0
- package/bin/setup.js +302 -0
- package/commands/vortexa/_check-setup.md +9 -0
- package/commands/vortexa/_skill-template.md +100 -0
- package/commands/vortexa/breakdown.md +294 -0
- package/commands/vortexa/cargo-flows.md +247 -0
- package/commands/vortexa/compare.md +315 -0
- package/commands/vortexa/custom.md +214 -0
- package/commands/vortexa/explain.md +124 -0
- package/commands/vortexa/init.md +133 -0
- package/commands/vortexa/oow.md +189 -0
- package/commands/vortexa/seasonal.md +185 -0
- package/commands/vortexa/voyages.md +285 -0
- package/context/.gitkeep +0 -0
- package/context/cargo-movements.md +738 -0
- package/context/date-units.md +188 -0
- package/context/endpoint-template.md +176 -0
- package/context/entity-resolution.md +217 -0
- package/context/guardrails.md +161 -0
- package/context/reference-endpoints.md +651 -0
- package/context/voyages.md +636 -0
- package/lib/__init__.py +4 -0
- package/lib/aliases.json +52 -0
- package/lib/api.py +20 -0
- package/lib/entities.py +254 -0
- package/lib/inventory.py +140 -0
- package/lib/movements.py +242 -0
- package/lib/requirements.txt +6 -0
- package/lib/seasonal.py +200 -0
- package/lib/timeseries.py +271 -0
- package/lib/utils.py +120 -0
- package/lib/vessels.py +192 -0
- package/lib/visualization.py +164 -0
- package/lib/voyages.py +236 -0
- package/package.json +28 -0
- package/templates/.env.template +3 -0
|
@@ -0,0 +1,294 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: vortexa:breakdown
|
|
3
|
+
description: "Break down cargo flows by origin, destination, product, vessel class, or other dimensions"
|
|
4
|
+
argument-hint: "crude exports by destination country, monthly, last year"
|
|
5
|
+
allowed-tools:
|
|
6
|
+
- Read
|
|
7
|
+
- Edit
|
|
8
|
+
- Write
|
|
9
|
+
- Bash
|
|
10
|
+
- Glob
|
|
11
|
+
- Grep
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
<objective>
|
|
15
|
+
Break down cargo flow timeseries by any supported dimension (geography, product, vessel, commercial) using CargoTimeSeries split queries. Produce wide-format DataFrames with Top-N bucketing for clean, readable breakdown results.
|
|
16
|
+
</objective>
|
|
17
|
+
|
|
18
|
+
<execution_context>
|
|
19
|
+
## Pre-loaded Context (via CLAUDE.md @imports)
|
|
20
|
+
The following are automatically available -- do NOT re-read them:
|
|
21
|
+
- context/guardrails.md -- NEVER/ALWAYS rules for all Vortexa API calls
|
|
22
|
+
- context/entity-resolution.md -- How to resolve entity names to hex IDs
|
|
23
|
+
- context/date-units.md -- Date parsing rules and unit defaults
|
|
24
|
+
|
|
25
|
+
## Required Context (Read on demand)
|
|
26
|
+
Read before processing the user's query:
|
|
27
|
+
- context/cargo-movements.md -- Full timeseries_property values, CTS parameters, activity filters
|
|
28
|
+
</execution_context>
|
|
29
|
+
|
|
30
|
+
## Setup Check
|
|
31
|
+
@commands/vortexa/_check-setup.md
|
|
32
|
+
|
|
33
|
+
<process>
|
|
34
|
+
|
|
35
|
+
## Step 1: Read Endpoint Context
|
|
36
|
+
|
|
37
|
+
Read `context/cargo-movements.md` for the full list of `timeseries_property` values, activity filter rules, and CTS parameters.
|
|
38
|
+
|
|
39
|
+
## Step 2: Parse the Query
|
|
40
|
+
|
|
41
|
+
Analyze $ARGUMENTS and extract:
|
|
42
|
+
|
|
43
|
+
- **Dimension**: The "by X" phrase. Map to `timeseries_property` using the table below.
|
|
44
|
+
- **Product**: What commodity? (crude, LNG, clean products, etc.)
|
|
45
|
+
- **Geography**: Origin and/or destination filters (country, port, region, shipping_region_v2)
|
|
46
|
+
- **Activity**: What movement stage?
|
|
47
|
+
- exports / shipped out / departures = `loading_end`
|
|
48
|
+
- imports / arrivals / received = `unloading_start`
|
|
49
|
+
- on the water / afloat / in transit = `cargo_on_water_state`
|
|
50
|
+
- floating storage = `storing_state`
|
|
51
|
+
- loadings / loading at port = `loading_state`
|
|
52
|
+
- **Time range**: Apply date-units.md calendar/trailing convention
|
|
53
|
+
- **Unit**: Apply date-units.md commodity defaults (oil=bpd, LNG=t, LPG=t)
|
|
54
|
+
- **Frequency**: daily/weekly/monthly -- note if user specified or not
|
|
55
|
+
- **Top-N**: Look for "top 5", "top 20", "show top 3", etc. Default 10 if not specified.
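The Top-N rule above can be sketched as a small parser. This is an illustration only: `parse_top_n` and its regular expression are assumptions and are not part of the package.

```python
import re

DEFAULT_TOP_N = 10  # default stated in the Top-N bullet above


def parse_top_n(query: str, default: int = DEFAULT_TOP_N) -> int:
    """Return N from phrases like 'top 5' or 'show top 3'; otherwise the default."""
    match = re.search(r"\btop\s+(\d+)\b", query, flags=re.IGNORECASE)
    return int(match.group(1)) if match else default
```

A query without a "top N" phrase falls through to the default of 10.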
|
|
56
|
+
|
|
57
|
+
### Dimension Mapping Table
|
|
58
|
+
|
|
59
|
+
**Geographic dimensions:**
|
|
60
|
+
|
|
61
|
+
| User Says | timeseries_property |
|---|---|
| "by origin country", "by source country" | `origin_country` |
| "by destination country", "by receiving country" | `destination_country` |
| "by origin port", "by loading port" | `origin_port` |
| "by destination port", "by discharge port" | `destination_port` |
| "by origin region" | `origin_region` |
| "by destination region" | `destination_region` |
| "by origin shipping region" | `origin_shipping_region_v2` |
| "by destination shipping region" | `destination_shipping_region_v2` |
| "by origin terminal" | `origin_terminal` |
| "by destination terminal" | `destination_terminal` |
| "by storage country" (floating storage) | `storage_location_country` |
| "by storage region" (floating storage) | `storage_location_region` |
| "by storage shipping region" (floating storage) | `storage_location_shipping_region_v2` |
|
|
76
|
+
|
|
77
|
+
**Product dimensions:**
|
|
78
|
+
|
|
79
|
+
| User Says | timeseries_property |
|---|---|
| "by product group", "by product type" | `product_group` |
| "by product sub-group" | `product_group_product` |
| "by product category" | `product_category` |
| "by product grade", "by grade", "by crude grade" | `product_grade` |
|
|
85
|
+
|
|
86
|
+
**Vessel dimensions:**
|
|
87
|
+
|
|
88
|
+
| User Says | timeseries_property |
|---|---|
| "by vessel type" | `vessel_class_group` |
| "by vessel class" | `vessel_class_granular` |
| "by vessel class (coarse)" | `vessel_class_coarse` |
| "by flag", "by flag state" | `vessel_flag` |
|
|
94
|
+
|
|
95
|
+
**Commercial dimensions:**
|
|
96
|
+
|
|
97
|
+
| User Says | timeseries_property |
|---|---|
| "by shipper" | `shipper` |
| "by consignee" | `consignee` |
| "by buyer" | `load_buyer` |
| "by seller" | `load_seller` |
| "by contract type" (spot vs term) | `contract_type` |
| "by delivery method" (FOB, DES, etc.) | `delivery_method` |
|
|
105
|
+
|
|
106
|
+
**Ambiguity defaults:**
|
|
107
|
+
- "by destination" without specifying level -> default to `destination_country`. Mention: "I'll break down by destination country. If you want port/region/shipping region level, let me know."
|
|
108
|
+
- "by origin" without specifying level -> default to `origin_country`. Same note.
|
|
109
|
+
- "by product" without specifying level -> default to `product_group`.
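One way to picture the mapping tables and ambiguity defaults together is a phrase lookup that tries the most specific phrase first. The sketch below covers only a subset of the tables; `DIMENSION_PHRASES` and `map_dimension` are hypothetical names, and plain substring matching is an assumption rather than the skill's actual parsing.

```python
from typing import Optional

# Subset of the mapping tables above, plus the three ambiguity defaults.
DIMENSION_PHRASES = {
    "by destination country": "destination_country",
    "by origin country": "origin_country",
    "by destination port": "destination_port",
    "by origin port": "origin_port",
    "by product group": "product_group",
    "by product grade": "product_grade",
    "by grade": "product_grade",
    "by vessel class": "vessel_class_granular",
    "by flag": "vessel_flag",
    # Ambiguity defaults
    "by destination": "destination_country",
    "by origin": "origin_country",
    "by product": "product_group",
}


def map_dimension(query: str) -> Optional[str]:
    """Return the timeseries_property for the longest matching phrase, else None."""
    text = query.lower()
    for phrase in sorted(DIMENSION_PHRASES, key=len, reverse=True):
        if phrase in text:
            return DIMENSION_PHRASES[phrase]
    return None
```

Checking longer phrases first means "by destination port" is never shadowed by the bare "by destination" default.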
|
|
110
|
+
|
|
111
|
+
**Not a CTS split property:**
|
|
112
|
+
- "by effective controller", "by charterer", "by time charterer" -- these are NOT available as CTS `timeseries_property` values. If the user requests one, explain: "Corporate breakdowns (effective controller, charterer) are not available as CTS split dimensions. You can use `/vortexa:custom` to query CargoMovements search with a groupby on the corporate entity column instead."
|
|
113
|
+
|
|
114
|
+
## Step 3: Check for Missing Parameters (ASK before confirming)
|
|
115
|
+
|
|
116
|
+
Ask targeted questions BEFORE the confirmation step. Never default silently.
|
|
117
|
+
|
|
118
|
+
**Required (ask if missing):**
|
|
119
|
+
- **Dimension** -- MUST be specified. If user says "breakdown" without "by X", ask: "What dimension should I break down by? e.g., destination country, product group, vessel class"
|
|
120
|
+
- **Product** -- MUST be specified. Ask if missing.
|
|
121
|
+
- **Activity** -- MUST be specified. If direction ambiguous, ask: "exports from origin or imports to destination?"
|
|
122
|
+
- **Time range** -- MUST be specified. Ask if missing. Do NOT default to "last 3 months."
|
|
123
|
+
|
|
124
|
+
**Ask if not specified:**
|
|
125
|
+
- **Frequency** -- Ask if not determinable from query context.
|
|
126
|
+
- **Unit** -- Ask if not obvious from commodity (oil=bpd, LNG=t, LPG=t).
|
|
127
|
+
|
|
128
|
+
Ask one parameter at a time. Do NOT ask all missing params in a bulk dump.
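The one-question-at-a-time rule could look like the following sketch. The parameter order follows the lists above; `next_question`, the `params` dictionary shape, and the generic fallback wording are assumptions made for illustration.

```python
from typing import Optional

REQUIRED = ["dimension", "product", "activity", "time_range"]
ASK_IF_UNSPECIFIED = ["frequency", "unit"]

# Wording taken from Step 3 where the text provides it.
QUESTIONS = {
    "dimension": "What dimension should I break down by? "
                 "e.g., destination country, product group, vessel class",
    "activity": "exports from origin or imports to destination?",
}


def next_question(params: dict) -> Optional[str]:
    """Return the question for the first missing parameter, or None if complete."""
    for name in REQUIRED + ASK_IF_UNSPECIFIED:
        if params.get(name) is None:
            # Fallback wording is hypothetical; the skill text does not specify it.
            return QUESTIONS.get(name, f"Which {name.replace('_', ' ')} should I use?")
    return None
```

Calling it again after each answer yields the next gap, so the user never sees a bulk list of questions.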
|
|
129
|
+
|
|
130
|
+
## Step 4: Resolve Entity IDs
|
|
131
|
+
|
|
132
|
+
For each entity mentioned (origin, destination, product):
|
|
133
|
+
1. Check `lib/aliases.json` for common shorthands (ME, AG, USG, ARA, etc.)
|
|
134
|
+
2. Call `lib/entities.py` resolve functions with the correct entity type and layer:
|
|
135
|
+
- `resolve_geography(term, layer)` for origins/destinations
|
|
136
|
+
- `resolve_product(term, layer)` for products
|
|
137
|
+
- `resolve(term, entity_type, layer, cache)` for the unified resolver with caching
|
|
138
|
+
3. If multiple matches: present top 3 candidates. Ask user to pick. NEVER auto-correct.
|
|
139
|
+
4. If zero matches: tell user the term wasn't found, suggest checking spelling or trying a broader term.
|
|
140
|
+
|
|
141
|
+
Use `EntityCache()` from `lib/entities.py` for the session to avoid re-resolving the same entities.
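The match-handling rules in items 3 and 4 can be sketched independently of the resolver itself. Here `candidates` stands in for whatever the `lib/entities.py` functions return; its shape (a list ordered by relevance) and the name `resolution_outcome` are assumptions.

```python
def resolution_outcome(term, candidates):
    """Classify resolver output; never choose among several matches automatically."""
    if not candidates:
        return ("not_found",
                f"The term '{term}' returned no matches. "
                "Check spelling or try a broader term.")
    if len(candidates) > 1:
        # Present at most the top 3 and let the user pick.
        return ("ambiguous", candidates[:3])
    return ("resolved", candidates[0])
```

Only the "resolved" outcome proceeds to Step 5; the other two return control to the user.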
|
|
142
|
+
|
|
143
|
+
## Step 5: Confirm Before Executing (MANDATORY)
|
|
144
|
+
|
|
145
|
+
Present the complete parameter set including the split dimension:
|
|
146
|
+
|
|
147
|
+
```
Query: [restate the user's question]
Endpoint: CargoTimeSeries (split)
Activity: [filter_activity value] ([user's term])
Split by: [timeseries_property value] ([user's term])
Top N: [N] categories + "Other"
Origin: [name (layer)] or "not specified"
Destination: [name (layer)] or "not specified"
Product: [name (layer)]
Time: [start datetime] -> [end datetime] ([interpretation])
Unit: [unit] ([reason])
Frequency: [value]
Intra: [exclude_intra_country for exports/imports, all otherwise]

Confirm or adjust?
```
|
|
163
|
+
|
|
164
|
+
NEVER execute without confirmation. Wait for user response.
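As a rough sketch, the confirmation block and its Intra rule could be rendered from a parameter dictionary. The dictionary keys and both function names are hypothetical; only the field labels and the exports/imports rule come from the template above.

```python
EXPORT_IMPORT_ACTIVITIES = {"loading_end", "unloading_start"}


def intra_setting(activity: str) -> str:
    """exclude_intra_country for exports/imports, 'all' for every other activity."""
    return "exclude_intra_country" if activity in EXPORT_IMPORT_ACTIVITIES else "all"


def confirmation_block(p: dict) -> str:
    """Render the Step 5 confirmation text from confirmed parameters."""
    rows = [
        ("Query", p["query"]),
        ("Endpoint", "CargoTimeSeries (split)"),
        ("Activity", p["activity"]),
        ("Split by", p["split_property"]),
        ("Top N", f'{p.get("top_n", 10)} categories + "Other"'),
        ("Origin", p.get("origin") or "not specified"),
        ("Destination", p.get("destination") or "not specified"),
        ("Product", p["product"]),
        ("Time", f'{p["time_min"]} -> {p["time_max"]}'),
        ("Unit", p["unit"]),
        ("Frequency", p["frequency"]),
        ("Intra", intra_setting(p["activity"])),
    ]
    body = "\n".join(f"{label}: {value}" for label, value in rows)
    return body + "\n\nConfirm or adjust?"
```

The block is only displayed; execution still waits for the user's explicit reply.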
|
|
165
|
+
|
|
166
|
+
## Step 6: Generate & Execute Code Artifact
|
|
167
|
+
|
|
168
|
+
**Primary path** -- use `flows_time_series_split()` from `lib/timeseries.py`:
|
|
169
|
+
|
|
170
|
+
```python
"""Vortexa Breakdown: {description}
Generated by /vortexa:breakdown
Date: {date}
"""
import os
import sys
from datetime import datetime

sys.path.insert(0, "lib")
from entities import resolve_geography, resolve_product, EntityCache
from timeseries import flows_time_series_split
from visualization import flow_split_bars

# Entity Resolution
cache = EntityCache()
{entity_resolution_code}

# Query
full_df, plot_df = flows_time_series_split(
    time_min=..., time_max=...,
    activity=...,
    split_property=...,
    products=[...],
    origins=[...],
    destinations=[...],
    unit=...,
    frequency=...,
    top_n=N,
)

print(plot_df)

# Chart
fig = flow_split_bars(plot_df.reset_index(), title="{title}", y_label="{unit_label}")

os.makedirs("output", exist_ok=True)
filepath = f"output/{slug}_breakdown_{datetime.now().strftime('%Y-%m-%d')}.html"
fig.write_html(filepath, auto_open=True)
print(f"Chart saved: {filepath}")

# Export (uncomment to save)
# plot_df.to_csv("output/{description}_{date}.csv")
```
|
|
213
|
+
|
|
214
|
+
`flows_time_series_split()` returns two DataFrames: `full_df` (all categories, daily resolution) and `plot_df` (Top-N + Other, resampled to requested frequency). Use `plot_df` for display.
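To illustrate what "Top-N + Other" means, here is a dependency-free sketch on a plain dictionary of columns. It is not the package's `top_n_with_other()`; ranking categories by total volume over the period is an assumption, and the function name is deliberately different.

```python
def top_n_with_other_sketch(wide: dict, n: int) -> dict:
    """wide maps category -> list of per-period values.

    Keep the n categories with the largest totals and sum the rest,
    period by period, into a single 'Other' column.
    """
    ranked = sorted(wide, key=lambda c: sum(wide[c]), reverse=True)
    keep, rest = ranked[:n], ranked[n:]
    result = {c: wide[c] for c in keep}
    if rest:
        result["Other"] = [sum(vals) for vals in zip(*(wide[c] for c in rest))]
    return result
```

With four categories and `n=2`, the two smallest are folded into "Other" so the chart stays readable while totals are preserved.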
|
|
215
|
+
|
|
216
|
+
**Fallback path** -- if `flows_time_series_split` doesn't support a specific parameter combination, use `cargo_timeseries_split()` + `top_n_with_other()`:
|
|
217
|
+
|
|
218
|
+
```python
from timeseries import cargo_timeseries_split
from utils import top_n_with_other

df = cargo_timeseries_split(
    time_min=..., time_max=...,
    activity=..., split_property=...,
    products=[...], origins=[...], destinations=[...],
    unit=..., frequency=...,
)
df = top_n_with_other(df, n=N)
```
|
|
230
|
+
|
|
231
|
+
After generating:
|
|
232
|
+
- Ask user: "Save as new file, or add to an existing notebook/script?" Default: new file.
|
|
233
|
+
- If new file: create `vortexa_breakdown_{slug}.py` in project directory.
|
|
234
|
+
- Run the generated code to get results.
|
|
235
|
+
|
|
236
|
+
## Step 7: Present Results
|
|
237
|
+
|
|
238
|
+
- Show chart path: "Stacked bar chart saved to output/{slug}_breakdown_{date}.html and opened in browser"
|
|
239
|
+
- Show a preview of the wide-format DataFrame with row cap by frequency:
|
|
240
|
+
- daily: 30 rows
|
|
241
|
+
- weekly: 52 rows
|
|
242
|
+
- monthly: 24 rows
|
|
243
|
+
- yearly: all rows
|
|
244
|
+
- If capped: "Showing first N of M rows"
|
|
245
|
+
- List the categories shown: "Columns: Saudi Arabia, China, India, ... Other (aggregating 15 smaller categories)"
|
|
246
|
+
- Show row count and date range
|
|
247
|
+
- File path to generated code artifact
|
|
248
|
+
- Offer: "Export to CSV?" and "Want a summary of these results?"
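The preview caps above amount to a lookup plus a slice. A minimal sketch, assuming rows are held in a list and that an unrecognized frequency is shown uncapped (`preview_rows` is a hypothetical helper):

```python
PREVIEW_ROW_CAPS = {"daily": 30, "weekly": 52, "monthly": 24, "yearly": None}


def preview_rows(rows: list, frequency: str):
    """Return (rows_to_show, note); note is None when nothing was cut."""
    cap = PREVIEW_ROW_CAPS.get(frequency)  # None means no cap
    if cap is None or len(rows) <= cap:
        return rows, None
    return rows[:cap], f"Showing first {cap} of {len(rows)} rows"
```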
|
|
249
|
+
|
|
250
|
+
After the DataFrame preview and before the export offer, show a one-line methodology footnote using the CONFIRMED parameters from Step 5:
|
|
251
|
+
```
|
|
252
|
+
Methodology: {Endpoint} (split) | {activity} | timeseries_property={dimension} | {key filters} | {frequency} | {unit} | Top {N}
|
|
253
|
+
```
|
|
254
|
+
Example: `Methodology: CargoTimeSeries (split) | loading_end | timeseries_property=destination_country | filter_products=[Crude & Condensates] | monthly | bpd | Top 10`
|
|
255
|
+
|
|
256
|
+
Use human-readable names for entity filters (not hex IDs).
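The footnote format can be assembled mechanically from the confirmed parameters. In this sketch `methodology_line` and its argument names are hypothetical; the output layout follows the template and example above.

```python
def methodology_line(endpoint, activity, dimension, filters, frequency, unit, top_n):
    """Build the one-line methodology footnote; filters maps name -> readable values."""
    parts = [f"{endpoint} (split)", activity, f"timeseries_property={dimension}"]
    parts += [f"{name}=[{', '.join(values)}]" for name, values in filters.items()]
    parts += [frequency, unit, f"Top {top_n}"]
    return "Methodology: " + " | ".join(parts)


line = methodology_line(
    "CargoTimeSeries", "loading_end", "destination_country",
    {"filter_products": ["Crude & Condensates"]}, "monthly", "bpd", 10,
)
print(line)
```

Passing entity names rather than hex IDs in `filters` keeps the footnote human-readable, as required above.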
|
|
257
|
+
|
|
258
|
+
If user says yes to CSV: ask where to save, default `./output/{description}_{date}.csv`, create `output/` directory if needed.
|
|
259
|
+
|
|
260
|
+
## Step 8: Summary (if triggered)
|
|
261
|
+
|
|
262
|
+
**Smart trigger:** If the user's query contains "analyze", "summarize", "explain results", "what does this show", or "key findings" -- generate the summary automatically.
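A minimal sketch of the smart trigger, assuming case-insensitive substring matching (the function name `wants_summary` is hypothetical):

```python
SUMMARY_TRIGGERS = (
    "analyze", "summarize", "explain results",
    "what does this show", "key findings",
)


def wants_summary(query: str) -> bool:
    """True when the query contains any of the trigger phrases."""
    text = query.lower()
    return any(trigger in text for trigger in SUMMARY_TRIGGERS)
```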
|
|
263
|
+
|
|
264
|
+
**Post-query offer:** After showing results, always offer: "Want a summary of these results?"
|
|
265
|
+
|
|
266
|
+
**Brief summary (default):**
|
|
267
|
+
Analyze the DataFrame and produce 3-5 bullet points:
|
|
268
|
+
- Top 3-5 contributors by volume with percentages of total
|
|
269
|
+
- Period-over-period change if comparable time periods exist
|
|
270
|
+
- Notable outliers: spikes, drops, record values, trend reversals
|
|
271
|
+
- Each bullet includes specific numbers, not vague descriptions
|
|
272
|
+
|
|
273
|
+
**Extended summary (if user requests "detailed summary" or "full analysis"):**
|
|
274
|
+
- 100-200 word narrative paragraph in market commentary style
|
|
275
|
+
- Includes everything from brief plus trend descriptions and broader context
|
|
276
|
+
|
|
277
|
+
After showing the summary, offer: "Save this summary to a .md file?"
|
|
278
|
+
If yes: create `output/{description}_summary_{date}.md`
|
|
279
|
+
|
|
280
|
+
## Error Handling
|
|
281
|
+
|
|
282
|
+
Report immediately in plain English, no auto-retry:
|
|
283
|
+
|
|
284
|
+
- **401 Unauthorized**: "API key is invalid or expired. Run /vortexa:init to check your setup."
|
|
285
|
+
- **500 Server Error**: "Vortexa API returned a server error. Try again in a few minutes."
|
|
286
|
+
- **Timeout**: "Query timed out. Try narrowing the date range or reducing filters."
|
|
287
|
+
- **Empty results**: "No data found for this query. Check that the entity names, date range, and filters are correct."
|
|
288
|
+
- **Entity not found**: "The term '{term}' returned no matches. Check spelling or try a broader term."
|
|
289
|
+
- **Multiple entity matches**: "{n} matches found for '{term}': [list]. Which did you mean?"
|
|
290
|
+
- **Unsupported dimension**: "'{dimension}' is not available as a CTS split dimension. [suggest alternative approach]"
|
|
291
|
+
|
|
292
|
+
Never auto-retry. Never self-correct. Report the error and let the user decide.
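The report-and-stop policy could be sketched as a pure lookup with no retry loop. How failures actually surface from the SDK is not shown in this file, so the `status_code`, `timed_out`, and `row_count` inputs are assumptions; the messages are the ones listed above.

```python
HTTP_MESSAGES = {
    401: "API key is invalid or expired. Run /vortexa:init to check your setup.",
    500: "Vortexa API returned a server error. Try again in a few minutes.",
}
TIMEOUT_MESSAGE = "Query timed out. Try narrowing the date range or reducing filters."
EMPTY_MESSAGE = ("No data found for this query. Check that the entity names, "
                 "date range, and filters are correct.")


def describe_failure(status_code=None, timed_out=False, row_count=None):
    """Return the plain-English message to show, or None if nothing went wrong."""
    if timed_out:
        return TIMEOUT_MESSAGE
    if status_code in HTTP_MESSAGES:
        return HTTP_MESSAGES[status_code]
    if row_count == 0:
        return EMPTY_MESSAGE
    return None
```

The function only describes the failure; deciding what to do next stays with the user.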
|
|
293
|
+
|
|
294
|
+
</process>
|
|
@@ -0,0 +1,247 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: vortexa:cargo-flows
|
|
3
|
+
description: "Query cargo flows, exports, imports, and timeseries from the Vortexa API"
|
|
4
|
+
argument-hint: "How much crude oil moved from Middle East to China last quarter?"
|
|
5
|
+
allowed-tools:
|
|
6
|
+
- Read
|
|
7
|
+
- Edit
|
|
8
|
+
- Write
|
|
9
|
+
- Bash
|
|
10
|
+
- Glob
|
|
11
|
+
- Grep
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
<objective>
|
|
15
|
+
Translate natural language cargo flow questions into correct Vortexa API calls using CargoTimeSeries (aggregate volumes) or CargoMovements search (individual cargoes). Generate re-runnable Python code artifacts with transparent filter confirmation.
|
|
16
|
+
</objective>
|
|
17
|
+
|
|
18
|
+
<execution_context>
|
|
19
|
+
## Pre-loaded Context (via CLAUDE.md @imports)
|
|
20
|
+
The following are automatically available -- do NOT re-read them:
|
|
21
|
+
- context/guardrails.md -- NEVER/ALWAYS rules for all Vortexa API calls
|
|
22
|
+
- context/entity-resolution.md -- How to resolve entity names to hex IDs
|
|
23
|
+
- context/date-units.md -- Date parsing rules and unit defaults
|
|
24
|
+
|
|
25
|
+
## Required Context (Read on demand)
|
|
26
|
+
Read before processing the user's query:
|
|
27
|
+
- context/cargo-movements.md -- Full parameter reference, column names, filter options, code examples
|
|
28
|
+
</execution_context>
|
|
29
|
+
|
|
30
|
+
## Setup Check
|
|
31
|
+
@commands/vortexa/_check-setup.md
|
|
32
|
+
|
|
33
|
+
<process>
|
|
34
|
+
|
|
35
|
+
## Step 1: Read Endpoint Context
|
|
36
|
+
|
|
37
|
+
Read `context/cargo-movements.md` for the full parameter reference, activity filter rules, column naming convention, breakdown properties, and worked examples.
|
|
38
|
+
|
|
39
|
+
## Step 2: Parse the Query
|
|
40
|
+
|
|
41
|
+
Analyze $ARGUMENTS and extract:
|
|
42
|
+
- **Product**: What commodity? (crude, LNG, clean products, etc.)
|
|
43
|
+
- **Geography**: Origin and/or destination? What layer? (country, port, region, shipping_region_v2)
|
|
44
|
+
- **Activity**: What movement stage?
|
|
45
|
+
- exports / shipped out / departures = `loading_end`
|
|
46
|
+
- imports / arrivals / received = `unloading_start`
|
|
47
|
+
- on the water / afloat / in transit = `cargo_on_water_state`
|
|
48
|
+
- floating storage = `storing_state`
|
|
49
|
+
- loadings / loading at port = `loading_state`
|
|
50
|
+
- **Time range**: Apply date-units.md calendar/trailing convention
|
|
51
|
+
- **Unit**: Apply date-units.md commodity defaults (oil=bpd, LNG=t, LPG=t)
|
|
52
|
+
- **Frequency**: daily/weekly/monthly -- note if user specified or not
|
|
53
|
+
- **Additional filters**: vessel class, intra_movements, waypoints, corporate entities, etc.
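The activity mapping in the list above can be sketched as a keyword table. `detect_activities` is a hypothetical helper; it returns every match so that zero or several hits can be treated as "ask the user" rather than guessed.

```python
ACTIVITY_TERMS = {
    "loading_end": ("exports", "shipped out", "departures"),
    "unloading_start": ("imports", "arrivals", "received"),
    "cargo_on_water_state": ("on the water", "afloat", "in transit"),
    "storing_state": ("floating storage",),
    "loading_state": ("loadings", "loading at port"),
}


def detect_activities(query: str) -> list:
    """Return all activity values whose terms appear in the query."""
    text = query.lower()
    return [
        activity
        for activity, terms in ACTIVITY_TERMS.items()
        if any(term in text for term in terms)
    ]
```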
|
|
54
|
+
|
|
55
|
+
## Step 3: Check for Breakdown Redirect
|
|
56
|
+
|
|
57
|
+
If the query contains a dimension phrase -- "by origin", "by destination", "by product", "by vessel class", "by grade", "by flag", "split by", "breakdown by", "broken down by" -- this is a breakdown query.
|
|
58
|
+
|
|
59
|
+
**Exclude frequency phrases:** "by month", "by week", "by day", "by quarter", "by year", "monthly", "weekly", "daily" are NOT dimension phrases. These specify frequency, not a split dimension.
|
|
60
|
+
|
|
61
|
+
**Trigger rule:** "by" + a DIMENSION word (country, port, region, product, vessel, class, grade, flag, origin, destination, shipper, consignee, type, terminal). NOT "by" + a TIME word (month, week, day, year, quarter).
|
|
62
|
+
|
|
63
|
+
If a dimension phrase is detected, suggest:
|
|
64
|
+
"This looks like a breakdown query. Try: `/vortexa:breakdown {restate the full query}` for split analysis with Top-N bucketing and wide-format DataFrames."
|
|
65
|
+
|
|
66
|
+
If the user explicitly wants the basic aggregate query without breakdown, continue to Step 4.
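The trigger rule in this step can be approximated with a regular expression. This is a heuristic sketch, not the skill's implementation: it inspects at most two words after "by", and `is_breakdown_query` is a hypothetical name.

```python
import re

DIMENSION_WORDS = {
    "country", "port", "region", "product", "vessel", "class", "grade",
    "flag", "origin", "destination", "shipper", "consignee", "type", "terminal",
}
TIME_WORDS = {"month", "week", "day", "year", "quarter"}

# "by" followed by one or two plain words (punctuation ends the phrase).
BY_PHRASE = re.compile(r"\bby\s+([a-z_]+(?:\s+[a-z_]+)?)")


def is_breakdown_query(query: str) -> bool:
    """True for 'by' + dimension word, False for 'by' + time word."""
    for match in BY_PHRASE.finditer(query.lower()):
        words = match.group(1).split()
        if words[0] in TIME_WORDS:
            continue  # frequency phrase, not a split dimension
        if any(word in DIMENSION_WORDS for word in words):
            return True
    return False
```

Phrases such as "split by" and "broken down by" are covered because they end in the same "by" + dimension pattern.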
|
|
67
|
+
|
|
68
|
+
## Step 4: Route -- CargoTimeSeries or CargoMovements.search?
|
|
69
|
+
|
|
70
|
+
Use signal words to determine the endpoint:
|
|
71
|
+
|
|
72
|
+
**CargoTimeSeries** (aggregate volumes over time) -- DEFAULT if ambiguous:
|
|
73
|
+
- Signal words: "how much", "total", "trend", "monthly", "weekly", "bpd", "chart", "exports", "imports", "timeseries", "volumes", "flow", "by origin", "by destination", "breakdown"
|
|
74
|
+
|
|
75
|
+
**CargoMovements.search** (individual cargo records):
|
|
76
|
+
- Signal words: "list cargoes", "which shipments", "specific vessel", "cargo details", "show me the movements", "individual", "cargo records"
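One possible reading of the routing rule, as a sketch: CargoTimeSeries is the default when no record-level signal appears, and a query that carries both kinds of signal is sent back to the user. Treating mixed signals as "ask" is an interpretation, and `route_endpoint` is a hypothetical name.

```python
CTS_SIGNALS = (
    "how much", "total", "trend", "monthly", "weekly", "bpd", "chart",
    "exports", "imports", "timeseries", "volumes", "flow",
    "by origin", "by destination", "breakdown",
)
CM_SIGNALS = (
    "list cargoes", "which shipments", "specific vessel", "cargo details",
    "show me the movements", "individual", "cargo records",
)


def route_endpoint(query: str) -> str:
    """Return 'CargoTimeSeries', 'CargoMovements.search', or 'ask'."""
    text = query.lower()
    cts_hit = any(signal in text for signal in CTS_SIGNALS)
    cm_hit = any(signal in text for signal in CM_SIGNALS)
    if cm_hit and cts_hit:
        return "ask"  # mixed signals: clarify with the user
    if cm_hit:
        return "CargoMovements.search"
    return "CargoTimeSeries"  # default, including when no signal is present
```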
|
|
77
|
+
|
|
78
|
+
If still ambiguous after checking signal words, ask:
|
|
79
|
+
"Are you looking for aggregate volume trends or a list of individual cargo records?"
|
|
80
|
+
|
|
81
|
+
## Step 5: Check for Missing Parameters
|
|
82
|
+
|
|
83
|
+
Ask targeted questions BEFORE the confirmation step. Never default silently.
|
|
84
|
+
|
|
85
|
+
**For CargoTimeSeries queries, ALL of these must be decided:**
|
|
86
|
+
- Product -- MUST be specified. Ask if missing.
|
|
87
|
+
- Activity -- MUST be specified. If direction ambiguous, ask: "exports from origin or imports to destination?"
|
|
88
|
+
- Time range -- MUST be specified. Ask if missing. Do NOT default to "last 3 months."
|
|
89
|
+
- Frequency -- Ask if not specified and not determinable from query context.
|
|
90
|
+
- Unit -- Ask if not obvious from commodity (oil=bpd, LNG=t, LPG=t).
|
|
91
|
+
- Geography -- optional but commonly specified.
|
|
92
|
+
|
|
93
|
+
**For CargoMovements.search queries:**
|
|
94
|
+
- Time range -- MUST be specified.
|
|
95
|
+
- At least one filter: product, origin, destination, or vessel.
|
|
96
|
+
|
|
97
|
+
Ask for one missing parameter at a time. Do NOT list every missing parameter in a single message. Do NOT propose a concrete interpretation and ask "is this right?"
|
|
98
|
+
|
|
99
|
+
## Step 6: Resolve Entity IDs
|
|
100
|
+
|
|
101
|
+
For each entity mentioned (origin, destination, product):
|
|
102
|
+
1. Check `lib/aliases.json` for common shorthands (ME, AG, USG, ARA, etc.)
|
|
103
|
+
2. Call `lib/entities.py` resolve functions with the correct entity type and layer:
|
|
104
|
+
- `resolve_geography(term, layer)` for origins/destinations
|
|
105
|
+
- `resolve_product(term, layer)` for products
|
|
106
|
+
- `resolve(term, entity_type, layer, cache)` for the unified resolver with caching
|
|
107
|
+
3. If multiple matches: present top 3 candidates with name, layer, and parent context. Ask user to pick. NEVER auto-correct, even for single matches that look obvious.
|
|
108
|
+
4. If zero matches: tell user the term wasn't found, suggest checking spelling or trying a broader term.
|
|
109
|
+
|
|
110
|
+
Use `EntityCache()` from `lib/entities.py` for the session to avoid re-resolving the same entities.
|
|
111
|
+
|
|
112
|
+
## Step 7: Confirm Before Executing (MANDATORY)
|
|
113
|
+
|
|
114
|
+
Present the complete parameter set:
|
|
115
|
+
|
|
116
|
+
```
Query: [restate the user's question]
Endpoint: CargoTimeSeries / CargoMovements.search
Activity: [filter_activity value] ([user's term])
Origin: [name (layer)] or "not specified"
Destination: [name (layer)] or "not specified"
Product: [name (layer)]
Time: [start datetime] -> [end datetime] ([interpretation])
Unit: [unit] ([reason])
Frequency: [value] (CTS only)
Intra: [exclude_intra_country for exports/imports, all otherwise]
Additional: [any other filters]

Confirm or adjust?
```
|
|
131
|
+
|
|
132
|
+
NEVER execute without confirmation. Wait for user response.
|
|
133
|
+
|
|
134
|
+
## Step 8: Determine Execution Mode
|
|
135
|
+
|
|
136
|
+
Choose the code generation approach:
|
|
137
|
+
|
|
138
|
+
**Mode 1 -- Pre-built function** (when query matches a lib/ pattern):
|
|
139
|
+
- Aggregate CTS query -> `cargo_timeseries_split()` from `lib/timeseries.py`
|
|
140
|
+
- CTS with split/breakdown -> `flows_time_series_split()` from `lib/timeseries.py`
|
|
141
|
+
- These functions handle parsing, pivoting, and Top-N automatically.
|
|
142
|
+
|
|
143
|
+
**Mode 2 -- Custom SDK code** (when no pre-built function fits):
|
|
144
|
+
- CM.search always uses Mode 2 (no pre-built wrapper exists).
|
|
145
|
+
- Write raw SDK code using `context/cargo-movements.md` as reference.
|
|
146
|
+
- Add comment: `# Custom query -- built from Vortexa API documentation`
|
|
147
|
+
|
|
148
|
+
**Mode 3 -- Code-aware** (when user references an existing file):
|
|
149
|
+
- If user says "add this to my analysis.py" or similar, read the file first.
|
|
150
|
+
- Extend or fix the existing code using context docs.
|
|
151
|
+
- Preserve the user's code style and structure.
|
|
152
|
+
|
|
153
|
+
## Step 9: Generate & Execute Code Artifact
|
|
154
|
+
|
|
155
|
+
Write a self-contained `.py` file:
|
|
156
|
+
|
|
157
|
+
```python
"""Vortexa Query: {description}
Generated by /vortexa:cargo-flows
Date: {date}
"""
from datetime import datetime
import sys; sys.path.insert(0, "lib")
from entities import resolve_geography, resolve_product, EntityCache
# ... imports based on execution mode

# Entity Resolution
cache = EntityCache()
{entity_resolution_code}

# Query Parameters (confirmed by user)
{parameter_setup}

# Execute Query
{api_call_code}

# Format Results
{column_renaming_using_rename_cm_columns_or_inline}
{display_code}

# Export (uncomment to save)
# df.to_csv("output/{description}_{date}.csv", index=False)
```
|
|
184
|
+
|
|
185
|
+
After generating:
|
|
186
|
+
- Ask user: "Save as new file, or add to an existing notebook/script?" Default: new file.
|
|
187
|
+
- If new file: create `vortexa_query_{slug}.py` in project directory.
|
|
188
|
+
- Run the generated code to get results.
|
|
189
|
+
|
|
190
|
+
## Step 10: Present Results
|
|
191
|
+
|
|
192
|
+
Terminal summary:
|
|
193
|
+
- Brief natural language description of what was queried
|
|
194
|
+
- Row count and date range covered
|
|
195
|
+
- DataFrame preview with row cap by frequency:
|
|
196
|
+
- daily: 30 rows
|
|
197
|
+
- weekly: 52 rows
|
|
198
|
+
- monthly: 24 rows
|
|
199
|
+
- yearly: all rows
|
|
200
|
+
- If capped: "Showing first N of M rows"
|
|
201
|
+
- File path to generated code artifact
|
|
202
|
+
- Offer: "Export to CSV?"
|
|
203
|
+
|
|
204
|
+
After the DataFrame preview and before the export offer, show a one-line methodology footnote using the CONFIRMED parameters from Step 7:
|
|
205
|
+
```
|
|
206
|
+
Methodology: {Endpoint} | {activity} | {key filters} | {frequency} | {unit} | {other relevant params}
|
|
207
|
+
```
|
|
208
|
+
Example: `Methodology: CargoTimeSeries | loading_end | filter_origins=[Saudi Arabia] | filter_products=[Crude & Condensates] | monthly | bpd | exclude_intra_country`
|
|
209
|
+
|
|
210
|
+
Use human-readable names for entity filters (not hex IDs).
|
|
211
|
+
|
|
212
|
+
If user says yes to CSV: ask where to save, default `./output/{description}_{date}.csv`, create `output/` directory if needed.
|
|
213
|
+
|
|
214
|
+
## Step 11: Summary (if triggered)
|
|
215
|
+
|
|
216
|
+
**Smart trigger:** If the user's query contains "analyze", "summarize", "explain results", "what does this show", or "key findings" -- generate the summary automatically.
|
|
217
|
+
|
|
218
|
+
**Post-query offer:** After showing results, always offer: "Want a summary of these results?"
|
|
219
|
+
|
|
220
|
+
**Brief summary (default):**
|
|
221
|
+
Analyze the DataFrame and produce 3-5 bullet points:
|
|
222
|
+
- Top 3-5 contributors by volume with percentages of total
|
|
223
|
+
- Period-over-period change if comparable time periods exist
|
|
224
|
+
- Notable outliers: spikes, drops, record values, trend reversals
|
|
225
|
+
- Each bullet includes specific numbers, not vague descriptions
|
|
226
|
+
|
|
227
|
+
**Extended summary (if user requests "detailed summary" or "full analysis"):**
|
|
228
|
+
- 100-200 word narrative paragraph in market commentary style
|
|
229
|
+
- Includes everything from brief plus trend descriptions and broader context
|
|
230
|
+
|
|
231
|
+
After showing the summary, offer: "Save this summary to a .md file?"
|
|
232
|
+
If yes: create `output/{description}_summary_{date}.md`
|
|
233
|
+
|
|
234
|
+
## Error Handling
|
|
235
|
+
|
|
236
|
+
Wrap execution with these patterns -- report immediately in plain English, no auto-retry:
|
|
237
|
+
|
|
238
|
+
- **401 Unauthorized**: "API key is invalid or expired. Run /vortexa:init to check your setup."
|
|
239
|
+
- **500 Server Error**: "Vortexa API returned a server error. Try again in a few minutes."
|
|
240
|
+
- **Timeout**: "Query timed out. Try narrowing the date range or reducing filters."
|
|
241
|
+
- **Empty results**: "No data found for this query. Check that the entity names, date range, and filters are correct."
|
|
242
|
+
- **Entity not found**: "The term '{term}' returned no matches. Check spelling or try a broader term."
|
|
243
|
+
- **Multiple entity matches**: "{n} matches found for '{term}': [list]. Which did you mean?"
|
|
244
|
+
|
|
245
|
+
Never auto-retry. Never self-correct. Report the error and let the user decide.
|
|
246
|
+
|
|
247
|
+
</process>
|