@prefixcheck/edi-mcp 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 PrefixCheck
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,115 @@
1
+ # @prefixcheck/edi-mcp
2
+
3
+ MCP server exposing operator-grade EDIFACT **CODECO** + **COPRAR** tooling to any MCP client (Claude Desktop, Cursor, Cline, Continue, Claude Code).
4
+
5
+ ```bash
6
+ npx -y @prefixcheck/edi-mcp
7
+ ```
8
+
9
+ ---
10
+
11
+ ## What it does
12
+
13
+ Drops EDI parsing, SMDG validation, ISO 6346 check-digit verification, UN/LOCODE extraction, and COPRAR ↔ CODECO reconciliation directly into your AI workflow. Now you can paste a broken EDIFACT message into Claude/Cursor and ask "what's wrong with this?" — and get a real, operator-grade answer.
14
+
15
+ ---
16
+
17
+ ## Quick install · Claude Desktop
18
+
19
+ Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):
20
+
21
+ ```json
22
+ {
23
+ "mcpServers": {
24
+ "prefixcheck-edi": {
25
+ "command": "npx",
26
+ "args": ["-y", "@prefixcheck/edi-mcp"]
27
+ }
28
+ }
29
+ }
30
+ ```
31
+
32
+ Restart Claude Desktop. The 9 EDI tools become available.
33
+
34
+ ## Quick install · Cursor
35
+
36
+ Add to `.cursor/mcp.json` (per-project) or `~/.cursor/mcp.json` (global):
37
+
38
+ ```json
39
+ {
40
+ "mcpServers": {
41
+ "prefixcheck-edi": {
42
+ "command": "npx",
43
+ "args": ["-y", "@prefixcheck/edi-mcp"]
44
+ }
45
+ }
46
+ }
47
+ ```
48
+
49
+ ## Quick install · Cline / Continue
50
+
51
+ Same `mcpServers` shape — both clients use the standard MCP configuration format.
52
+
53
+ ---
54
+
55
+ ## Tools (9)
56
+
57
+ | Tool | Returns |
58
+ | --------------------------- | ---------------------------------------------------------- |
59
+ | `parse_message` | Full ParsedMessage structure for a CODECO/COPRAR text |
60
+ | `diagnose_message` | All 11 SMDG-grade diagnostic findings |
61
+ | `reconcile_messages` | COPRAR ↔ CODECO field-level diff report (8 fields) |
62
+ | `validate_container_number` | ISO 6346 check digit (true/false + computed value) |
63
+ | `decode_size_type` | 4-character ISO size-type → operator-readable English |
64
+ | `lookup_code` | Any of 21 code-list values → English (DTM/LOC/EQD/NAD/...) |
65
+ | `segment_info` | Operator-grade name + brief for any 3-letter segment tag |
66
+ | `extract_containers` | All ISO 6346 container numbers from a message |
67
+ | `extract_locodes` | All UN/LOCODE values from LOC segments |
68
+
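The `validate_container_number` row above refers to the ISO 6346 check digit. The package's own `validateCheckDigit` lives in `schemas.js`, which is not part of this diff, so the following is only a minimal sketch of the standard calculation; the names `letterValue`, `computeCheckDigit`, and `isValidContainerNumber` are hypothetical.

```javascript
// Sketch of the standard ISO 6346 check-digit calculation.
// NOT the package's implementation (schemas.js is not shown in this diff).

// Letter values start at A=10 and skip every multiple of 11 (11, 22, 33).
function letterValue(ch) {
  let value = 10;
  for (let code = "A".charCodeAt(0); code < ch.charCodeAt(0); code++) {
    value++;
    if (value % 11 === 0) value++;
  }
  return value;
}

// Weighted sum over the first 10 characters: position i has weight 2^i.
// The check digit is the sum modulo 11, with a remainder of 10 mapped to 0.
function computeCheckDigit(first10) {
  let sum = 0;
  for (let i = 0; i < 10; i++) {
    const ch = first10[i];
    const value = /[A-Z]/.test(ch) ? letterValue(ch) : Number(ch);
    sum += value * 2 ** i;
  }
  return (sum % 11) % 10;
}

// Shape check (4 letters + 7 digits), then compare the 11th character.
function isValidContainerNumber(code) {
  if (!/^[A-Z]{4}\d{7}$/.test(code)) return false;
  return computeCheckDigit(code.slice(0, 10)) === Number(code[10]);
}

console.log(isValidContainerNumber("MSCU1234566"));
```

The sample number is the one used in this package's tool description and sample messages.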
69
+ ## Resources (6)
70
+
71
+ | URI | Type | Content |
72
+ | --------------------- | ---- | --------------------------------------------------------------------- |
73
+ | `edi://schema/codeco` | json | CODECO message metadata (purpose, BGM codes, required segments) |
74
+ | `edi://schema/coprar` | json | COPRAR message metadata |
75
+ | `edi://sample/codeco` | text | Real-shape SMDG D.00B CODECO sample message |
76
+ | `edi://sample/coprar` | text | Real-shape SMDG D.00B COPRAR sample (matched pair with CODECO sample) |
77
+ | `edi://segments` | json | Full 32-segment dictionary |
78
+ | `edi://codes` | json | All 21 code lists with codes + English decodes |
79
+
80
+ ---
81
+
82
+ ## What you can do with it
83
+
84
+ **Depot dispatcher**: paste a CODECO into Claude, ask "what's wrong?" → tool runs `diagnose_message`, returns the failing rule (bad check digit, wrong DTM format, missing NAD+CF, etc.) with the exact segment that triggered it.
85
+
86
+ **Developer debugging**: paste a COPRAR your partner rejected → tool surfaces every SMDG validation failure with the rule that caught it.
87
+
88
+ **Reconciliation**: "here's the COPRAR I sent and the CODECO I got back — do they match?" → tool runs `reconcile_messages`, returns container-by-container field diffs (size-type, full/empty, POL, POD, booking, gross weight ±2%, VGM ±5%, reefer temp ±1°C).
89
+
90
+ **Reference**: "what does EQD position 5 mean?" → tool reads `edi://segments` + `edi://codes` resources.
91
+
92
+ **Training**: junior operator pastes a message → AI walks through each segment using `segment_info` + `lookup_code`.
93
+
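The reconciliation tolerances listed above (gross weight ±2%, VGM ±5%, reefer temperature ±1°C) can be illustrated with a small comparison helper. This is a sketch under assumptions: the package's `reconcile` is in `schemas.js`, which this diff does not include, so the `TOLERANCES` table, the `withinTolerance` name, and the choice of the COPRAR (planned) value as the percentage base are all hypothetical.

```javascript
// Hypothetical tolerance table mirroring the figures quoted in the README.
const TOLERANCES = {
  grossWeight: { kind: "percent", limit: 2 },
  vgm: { kind: "percent", limit: 5 },
  reeferTemp: { kind: "absolute", limit: 1 },
};

// Returns true when the CODECO (reported) value is close enough to the
// COPRAR (planned) value. ASSUMPTION: percentages are taken relative to the
// planned value; the real library may use a different base.
function withinTolerance(field, planned, reported) {
  const rule = TOLERANCES[field];
  if (!rule) return planned === reported; // untracked fields: exact match
  const delta = Math.abs(reported - planned);
  if (rule.kind === "absolute") return delta <= rule.limit;
  if (planned === 0) return delta === 0; // avoid dividing by zero
  return (delta / Math.abs(planned)) * 100 <= rule.limit;
}

console.log(withinTolerance("grossWeight", 24000, 24400));
console.log(withinTolerance("reeferTemp", -18, -20));
```

Fields without a tolerance entry (size-type, full/empty, POL, POD, booking) fall through to an exact comparison in this sketch.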
94
+ ---
95
+
96
+ ## Built on
97
+
98
+ - [`@prefixcheck/edi`](https://www.npmjs.com/package/@prefixcheck/edi) — the underlying TS library
99
+ - [UN/EDIFACT D.00B](https://service.unece.org/trade/untdid/d00b/) — directory
100
+ - [SMDG](https://smdg.org/) — 2.1.3 ST VGM CODECO + COPRAR Implementation Guides
101
+ - Operator guides from DAKOSY (Hamburg), Valenciaport PCS, Transnet, EPB Bilbao
102
+
103
+ Companion surfaces:
104
+
105
+ - **In-browser tool**: [prefixcheck.com/container-edi/](https://prefixcheck.com/container-edi/)
106
+ - **Public HTTP API**: `POST /api/edi/decode` + `POST /api/edi/reconcile` at prefixcheck.com
107
+ - **Embeddable widget**: `<iframe src="https://prefixcheck.com/embed/edi/">`
108
+ - **npm library**: `npm install @prefixcheck/edi`
109
+ - **MCP server**: this package
110
+
111
+ ---
112
+
113
+ ## License
114
+
115
+ MIT
package/dist/index.js ADDED
@@ -0,0 +1,303 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * @prefixcheck/edi-mcp
4
+ *
5
+ * MCP server exposing operator-grade EDIFACT CODECO + COPRAR
6
+ * tooling to any MCP client (Claude Desktop, Cursor, Cline,
7
+ * Continue, Claude Code, etc.).
8
+ *
9
+ * Wraps the same parser + schemas that power:
10
+ * - https://prefixcheck.com/container-edi/ (in-browser tool)
11
+ * - https://prefixcheck.com/api/edi/decode (HTTP API)
12
+ * - @prefixcheck/edi (npm library)
13
+ *
14
+ * Nine tools + six resources. Pure stdio MCP — no HTTP server,
15
+ * no auth, no state.
16
+ */
17
+ import { Server } from "@modelcontextprotocol/sdk/server/index.js";
18
+ import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
19
+ import { CallToolRequestSchema, ListResourcesRequestSchema, ListToolsRequestSchema, ReadResourceRequestSchema, } from "@modelcontextprotocol/sdk/types.js";
20
+ import { parse, extractContainerNumbers, extractUNLocodes } from "./parser.js";
21
+ import { CODECO, COPRAR, CODE_LISTS, SEGMENTS, decodeISOSizeType, detectMessageType, diagnoseSingle, lookup, reconcile, segmentInfo, validateCheckDigit, } from "./schemas.js";
22
+ import { SAMPLE_CODECO, SAMPLE_COPRAR } from "./samples.js";
23
+ // -------------------------------------------------------------
24
+ // Server setup
25
+ // -------------------------------------------------------------
26
+ const SERVER_NAME = "prefixcheck-edi-mcp";
27
+ const SERVER_VERSION = "0.1.0";
28
+ const server = new Server({ name: SERVER_NAME, version: SERVER_VERSION }, { capabilities: { tools: {}, resources: {} } });
29
+ const TOOLS = [
30
+ {
31
+ name: "parse_message",
32
+ description: "Tokenize a raw EDIFACT CODECO or COPRAR message into structured segments + envelope metadata. Handles UNA delimiter overrides, UNB/UNZ + UNH/UNT envelopes, and release-character escapes. Returns the full ParsedMessage structure.",
33
+ inputSchema: {
34
+ type: "object",
35
+ properties: {
36
+ text: { type: "string", description: "Raw EDIFACT message text." },
37
+ },
38
+ required: ["text"],
39
+ },
40
+ },
41
+ {
42
+ name: "diagnose_message",
43
+ description: "Parse a CODECO or COPRAR message and run all 11 SMDG-grade diagnostic rules against it. Returns the list of findings (errors + warnings + info). Empty list = clean message.",
44
+ inputSchema: {
45
+ type: "object",
46
+ properties: {
47
+ text: { type: "string", description: "Raw EDIFACT message text." },
48
+ },
49
+ required: ["text"],
50
+ },
51
+ },
52
+ {
53
+ name: "reconcile_messages",
54
+ description: "Cross-message reconciliation between a COPRAR (carrier → terminal load list) and its matching CODECO (terminal → carrier gate report). Returns container-by-container field-level diff report. Tolerances: gross weight ±2%, VGM ±5%, reefer temp ±1°C.",
55
+ inputSchema: {
56
+ type: "object",
57
+ properties: {
58
+ coprar: { type: "string", description: "Raw EDIFACT COPRAR text." },
59
+ codeco: { type: "string", description: "Raw EDIFACT CODECO text." },
60
+ },
61
+ required: ["coprar", "codeco"],
62
+ },
63
+ },
64
+ {
65
+ name: "validate_container_number",
66
+ description: "Validate an ISO 6346 container number's check digit (mod-11 weighted-letter algorithm). Returns { code, valid }.",
67
+ inputSchema: {
68
+ type: "object",
69
+ properties: {
70
+ code: {
71
+ type: "string",
72
+ description: "11-character container number (e.g. 'MSCU1234566').",
73
+ },
74
+ },
75
+ required: ["code"],
76
+ },
77
+ },
78
+ {
79
+ name: "decode_size_type",
80
+ description: "Decode a 4-character ISO 6346 size-type code (e.g. '45R1') into operator-readable parts: size, type, height/variant, variant digit.",
81
+ inputSchema: {
82
+ type: "object",
83
+ properties: {
84
+ code: { type: "string", description: "4-character ISO 6346 size-type code." },
85
+ },
86
+ required: ["code"],
87
+ },
88
+ },
89
+ {
90
+ name: "lookup_code",
91
+ description: "Decode any code-list value to plain English. Lists available: BGM.docname, BGM.function, DTM.qualifier, DTM.format, LOC.qualifier, EQD.type, EQD.supplier, EQD.fullEmpty, STS.code, RFF.qualifier, NAD.party, MEA.qualifier, MEA.unit, VGM.method, HAN.code, SEL.party, FTX.qualifier, TDT.mode, TDT.idCodeList, CNT.qualifier, UNB.syntax.",
92
+ inputSchema: {
93
+ type: "object",
94
+ properties: {
95
+ list_name: { type: "string", description: "Code list name (e.g. 'DTM.qualifier')." },
96
+ code: { type: "string", description: "Code value (e.g. '137')." },
97
+ },
98
+ required: ["list_name", "code"],
99
+ },
100
+ },
101
+ {
102
+ name: "segment_info",
103
+ description: "Get the operator-grade English name + brief explanation for a 3-letter EDIFACT segment tag (e.g. 'EQD', 'LOC', 'TDT').",
104
+ inputSchema: {
105
+ type: "object",
106
+ properties: {
107
+ tag: { type: "string", description: "3-letter segment tag." },
108
+ },
109
+ required: ["tag"],
110
+ },
111
+ },
112
+ {
113
+ name: "extract_containers",
114
+ description: "Extract every ISO 6346-shaped container number (4 letters + 7 digits) from anywhere in an EDIFACT message. Returns deduplicated list.",
115
+ inputSchema: {
116
+ type: "object",
117
+ properties: {
118
+ text: { type: "string", description: "Raw EDIFACT message text." },
119
+ },
120
+ required: ["text"],
121
+ },
122
+ },
123
+ {
124
+ name: "extract_locodes",
125
+ description: "Extract every 5-character UN/LOCODE (2-letter country + 3-char place) from LOC segments in an EDIFACT message. Returns deduplicated list.",
126
+ inputSchema: {
127
+ type: "object",
128
+ properties: {
129
+ text: { type: "string", description: "Raw EDIFACT message text." },
130
+ },
131
+ required: ["text"],
132
+ },
133
+ },
134
+ ];
135
+ server.setRequestHandler(ListToolsRequestSchema, async () => ({ tools: TOOLS }));
136
+ server.setRequestHandler(CallToolRequestSchema, async (request) => {
137
+ const { name, arguments: args } = request.params;
138
+ const a = (args || {});
139
+ try {
140
+ switch (name) {
141
+ case "parse_message": {
142
+ const text = String(a.text || "");
143
+ const parsed = parse(text);
144
+ return jsonResult({
145
+ message_type: detectMessageType(parsed),
146
+ interchange: parsed.interchange,
147
+ message: parsed.message,
148
+ segments: parsed.segments,
149
+ delimiters: parsed.delimiters,
150
+ envelope_warnings: parsed.envelopeWarnings,
151
+ });
152
+ }
153
+ case "diagnose_message": {
154
+ const text = String(a.text || "");
155
+ const parsed = parse(text);
156
+ const diagnostics = diagnoseSingle(parsed);
157
+ return jsonResult({
158
+ message_type: detectMessageType(parsed),
159
+ diagnostics,
160
+ counts: {
161
+ errors: diagnostics.filter((d) => d.level === "error").length,
162
+ warnings: diagnostics.filter((d) => d.level === "warn").length,
163
+ infos: diagnostics.filter((d) => d.level === "info").length,
164
+ },
165
+ });
166
+ }
167
+ case "reconcile_messages": {
168
+ const coprar = parse(String(a.coprar || ""));
169
+ const codeco = parse(String(a.codeco || ""));
170
+ return jsonResult({
171
+ report: reconcile(coprar, codeco),
172
+ coprar_warnings: coprar.envelopeWarnings,
173
+ codeco_warnings: codeco.envelopeWarnings,
174
+ });
175
+ }
176
+ case "validate_container_number": {
177
+ const code = String(a.code || "");
178
+ const valid = validateCheckDigit(code);
179
+ return jsonResult({ code, valid });
180
+ }
181
+ case "decode_size_type": {
182
+ const code = String(a.code || "");
183
+ const decoded = decodeISOSizeType(code);
184
+ return jsonResult({ code, decoded });
185
+ }
186
+ case "lookup_code": {
187
+ const list_name = String(a.list_name || "");
188
+ const code = String(a.code || "");
189
+ const decoded = lookup(list_name, code);
190
+ return jsonResult({ list_name, code, decoded });
191
+ }
192
+ case "segment_info": {
193
+ const tag = String(a.tag || "").toUpperCase();
194
+ return jsonResult({ tag, ...segmentInfo(tag) });
195
+ }
196
+ case "extract_containers": {
197
+ const parsed = parse(String(a.text || ""));
198
+ return jsonResult({ container_numbers: extractContainerNumbers(parsed) });
199
+ }
200
+ case "extract_locodes": {
201
+ const parsed = parse(String(a.text || ""));
202
+ return jsonResult({ un_locodes: extractUNLocodes(parsed) });
203
+ }
204
+ default:
205
+ return jsonResult({ error: `Unknown tool: ${name}` }, true);
206
+ }
207
+ }
208
+ catch (err) {
209
+ return jsonResult({ error: err instanceof Error ? err.message : "Unknown error", tool: name }, true);
210
+ }
211
+ });
212
+ function jsonResult(payload, isError = false) {
213
+ return {
214
+ content: [{ type: "text", text: JSON.stringify(payload, null, 2) }],
215
+ ...(isError ? { isError: true } : {}),
216
+ };
217
+ }
218
+ // -------------------------------------------------------------
219
+ // Resources
220
+ // -------------------------------------------------------------
221
+ const RESOURCES = [
222
+ {
223
+ uri: "edi://schema/codeco",
224
+ name: "CODECO schema",
225
+ description: "CODECO message metadata: name, longName, purpose, BGM codes, required segments.",
226
+ mimeType: "application/json",
227
+ },
228
+ {
229
+ uri: "edi://schema/coprar",
230
+ name: "COPRAR schema",
231
+ description: "COPRAR message metadata: name, longName, purpose, BGM codes, required segments.",
232
+ mimeType: "application/json",
233
+ },
234
+ {
235
+ uri: "edi://sample/codeco",
236
+ name: "CODECO sample",
237
+ description: "Real-shape SMDG D.00B CODECO sample message (gate-in, terminal → carrier, MSCU1234566 full 40HC NLRTM → USNYC).",
238
+ mimeType: "text/plain",
239
+ },
240
+ {
241
+ uri: "edi://sample/coprar",
242
+ name: "COPRAR sample",
243
+ description: "Real-shape SMDG D.00B COPRAR Load sample message (carrier → terminal, 3 containers including 1 reefer, matched-pair with the CODECO sample on MSCU1234566).",
244
+ mimeType: "text/plain",
245
+ },
246
+ {
247
+ uri: "edi://segments",
248
+ name: "Segment dictionary",
249
+ description: "Full 32-segment dictionary with operator-grade name + brief for every common CODECO/COPRAR segment.",
250
+ mimeType: "application/json",
251
+ },
252
+ {
253
+ uri: "edi://codes",
254
+ name: "Code lists index",
255
+ description: "Index of all 21 code lists available via lookup_code. Each list has 5-40 codes with English decodes.",
256
+ mimeType: "application/json",
257
+ },
258
+ ];
259
+ server.setRequestHandler(ListResourcesRequestSchema, async () => ({ resources: RESOURCES }));
260
+ server.setRequestHandler(ReadResourceRequestSchema, async (request) => {
261
+ const uri = request.params.uri;
262
+ switch (uri) {
263
+ case "edi://schema/codeco":
264
+ return {
265
+ contents: [{ uri, mimeType: "application/json", text: JSON.stringify(CODECO, null, 2) }],
266
+ };
267
+ case "edi://schema/coprar":
268
+ return {
269
+ contents: [{ uri, mimeType: "application/json", text: JSON.stringify(COPRAR, null, 2) }],
270
+ };
271
+ case "edi://sample/codeco":
272
+ return { contents: [{ uri, mimeType: "text/plain", text: SAMPLE_CODECO }] };
273
+ case "edi://sample/coprar":
274
+ return { contents: [{ uri, mimeType: "text/plain", text: SAMPLE_COPRAR }] };
275
+ case "edi://segments":
276
+ return {
277
+ contents: [{ uri, mimeType: "application/json", text: JSON.stringify(SEGMENTS, null, 2) }],
278
+ };
279
+ case "edi://codes": {
280
+ const index = Object.fromEntries(Object.entries(CODE_LISTS).map(([k, v]) => [
281
+ k,
282
+ { code_count: Object.keys(v).length, codes: v },
283
+ ]));
284
+ return {
285
+ contents: [{ uri, mimeType: "application/json", text: JSON.stringify(index, null, 2) }],
286
+ };
287
+ }
288
+ default:
289
+ throw new Error(`Unknown resource: ${uri}`);
290
+ }
291
+ });
292
+ // -------------------------------------------------------------
293
+ // Boot
294
+ // -------------------------------------------------------------
295
+ async function main() {
296
+ const transport = new StdioServerTransport();
297
+ await server.connect(transport);
298
+ process.stderr.write(`${SERVER_NAME} v${SERVER_VERSION} ready · 9 tools · 6 resources\n`);
299
+ }
300
+ main().catch((err) => {
301
+ process.stderr.write(`Fatal: ${err instanceof Error ? err.message : String(err)}\n`);
302
+ process.exit(1);
303
+ });
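Every tool handler in this file returns its payload through `jsonResult`, as a single text content item holding pretty-printed JSON, with `isError: true` set on failures. A client can therefore recover the payload by parsing that text. The sketch below copies `jsonResult` from the diff; `unwrapToolResult` is a hypothetical client-side helper and not part of the package or the MCP SDK.

```javascript
// Copy of jsonResult() from index.js above.
function jsonResult(payload, isError = false) {
  return {
    content: [{ type: "text", text: JSON.stringify(payload, null, 2) }],
    ...(isError ? { isError: true } : {}),
  };
}

// Hypothetical helper: turn a tool result back into a plain object,
// throwing when the server flagged the call as an error.
function unwrapToolResult(result) {
  const first = result.content.find((item) => item.type === "text");
  if (!first) throw new Error("Tool result has no text content");
  const payload = JSON.parse(first.text);
  if (result.isError) throw new Error(payload.error || "Tool call failed");
  return payload;
}

const okPayload = unwrapToolResult(jsonResult({ code: "MSCU1234566", valid: true }));
console.log(okPayload.valid);
```

Note that the error path in the handler carries its message in the `error` field of the same JSON envelope, which is why the helper parses before checking `isError`.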
package/dist/parser.js ADDED
@@ -0,0 +1,246 @@
1
+ // ============================================================
2
+ // @prefixcheck/edi · EDIFACT tokenizer + envelope handling
3
+ //
4
+ // Universal layer that turns raw EDIFACT text into a structured
5
+ // object tree. The schema layer (CODECO/COPRAR validation) runs
6
+ // on top of this and is loaded separately so new message types
7
+ // can be added without touching the parser.
8
+ //
9
+ // EDIFACT delimiter conventions:
10
+ // element separator default '+'
11
+ // composite separator default ':'
12
+ // segment terminator default "'"
13
+ // release character default '?' (escapes the next char)
14
+ // decimal default '.' (or ',')
15
+ // repetition default '*'
16
+ //
17
+ // The optional UNA segment at the start of an interchange overrides
18
+ // the defaults. Format: `UNA:+.? '` — exactly 6 single-character
19
+ // overrides in the order: composite, element, decimal, release,
20
+ // repetition, segment.
21
+ // ============================================================
22
+ export const DEFAULT_DELIMITERS = Object.freeze({
23
+ element: "+",
24
+ composite: ":",
25
+ segment: "'",
26
+ release: "?",
27
+ decimal: ".",
28
+ repetition: "*",
29
+ });
30
+ function parseUNA(raw) {
31
+ if (raw.length < 9 || raw.slice(0, 3) !== "UNA") {
32
+ return { delimiters: { ...DEFAULT_DELIMITERS }, rest: raw };
33
+ }
34
+ const spec = raw.slice(3, 9);
35
+ return {
36
+ delimiters: {
37
+ composite: spec[0],
38
+ element: spec[1],
39
+ decimal: spec[2],
40
+ release: spec[3],
41
+ repetition: spec[4],
42
+ segment: spec[5],
43
+ },
44
+ rest: raw.slice(9),
45
+ };
46
+ }
47
+ /**
48
+ * Split text by an unescaped delimiter character. A delimiter preceded
49
+ * by the release character (default `?`) is a literal, not a delimiter.
50
+ *
51
+ * Note: this function PRESERVES release-character escape sequences in
52
+ * the output. The escape (`?X`) stays as `?X` so that downstream splits
53
+ * (e.g., element → composite → sub-element) can also honour escapes.
54
+ * The final value layer strips escapes with `unescape()`.
55
+ */
56
+ function splitUnescaped(text, delim, release) {
57
+ const parts = [];
58
+ let buf = "";
59
+ for (let i = 0; i < text.length; i++) {
60
+ const c = text[i];
61
+ if (c === release && i + 1 < text.length) {
62
+ buf += c + text[i + 1];
63
+ i++;
64
+ continue;
65
+ }
66
+ if (c === delim) {
67
+ parts.push(buf);
68
+ buf = "";
69
+ continue;
70
+ }
71
+ buf += c;
72
+ }
73
+ parts.push(buf);
74
+ return parts;
75
+ }
76
+ /**
77
+ * Strip release-character escape sequences from a sub-element value.
78
+ * Called only at the leaf layer, after all splitting is done.
79
+ */
80
+ function unescape(text, release) {
81
+ let out = "";
82
+ for (let i = 0; i < text.length; i++) {
83
+ if (text[i] === release && i + 1 < text.length) {
84
+ out += text[i + 1];
85
+ i++;
86
+ }
87
+ else {
88
+ out += text[i];
89
+ }
90
+ }
91
+ return out;
92
+ }
93
+ /**
94
+ * Strip whitespace following each segment terminator without touching
95
+ * content inside segments. Human-readable transmission commonly
96
+ * inserts `\r\n` after each terminator; not part of the standard
97
+ * but ubiquitous in archived files and operator pastes.
98
+ */
99
+ function normalizeWhitespace(text, segDelim) {
100
+ let out = "";
101
+ for (let i = 0; i < text.length; i++) {
102
+ const c = text[i];
103
+ out += c;
104
+ if (c === segDelim) {
105
+ while (i + 1 < text.length && /\s/.test(text[i + 1]))
106
+ i++;
107
+ }
108
+ }
109
+ return out;
110
+ }
111
+ function extractEnvelopes(segments) {
112
+ let interchange = null;
113
+ let message = null;
114
+ const warnings = [];
115
+ const first = segments[0];
116
+ const last = segments[segments.length - 1];
117
+ if (first && first.tag === "UNB") {
118
+ interchange = {
119
+ syntaxId: (first.elements[0] || [])[0] || "",
120
+ syntaxVer: (first.elements[0] || [])[1] || "",
121
+ sender: (first.elements[1] || [])[0] || "",
122
+ senderQual: (first.elements[1] || [])[1] || "",
123
+ recipient: (first.elements[2] || [])[0] || "",
124
+ recipQual: (first.elements[2] || [])[1] || "",
125
+ dateTime: (first.elements[3] || []).join(":") || "",
126
+ controlRef: (first.elements[4] || [])[0] || "",
127
+ };
128
+ if (!last || last.tag !== "UNZ") {
129
+ warnings.push("UNB interchange header found but no UNZ trailer.");
130
+ }
131
+ }
132
+ for (let i = 0; i < segments.length; i++) {
133
+ if (segments[i].tag === "UNH") {
134
+ const unh = segments[i];
135
+ message = {
136
+ controlRef: (unh.elements[0] || [])[0] || "",
137
+ type: (unh.elements[1] || [])[0] || "",
138
+ version: (unh.elements[1] || [])[1] || "",
139
+ release: (unh.elements[1] || [])[2] || "",
140
+ agency: (unh.elements[1] || [])[3] || "",
141
+ assocCode: (unh.elements[1] || [])[4] || "",
142
+ };
143
+ break;
144
+ }
145
+ }
146
+ if (!message) {
147
+ warnings.push("No UNH message header found. Parsed as raw segment body.");
148
+ }
149
+ return { interchange, message, envelopeWarnings: warnings };
150
+ }
151
+ /**
152
+ * Tokenize a raw EDIFACT string into structured form.
153
+ *
154
+ * Accepts any of:
155
+ * - bare message body (UNH ... UNT)
156
+ * - full interchange (optional UNA, UNB ... UNZ wrapping one or more messages)
157
+ * - whitespace-separated segments (newlines between `'` terminators)
158
+ *
159
+ * @example
160
+ * ```ts
161
+ * import { parse } from "@prefixcheck/edi";
162
+ * const parsed = parse("UNH+1+CODECO:D:00B:UN:SMDG21'BGM+34+REF+9'...");
163
+ * console.log(parsed.message?.type); // "CODECO"
164
+ * ```
165
+ */
166
+ export function parse(rawInput) {
167
+ if (!rawInput || typeof rawInput !== "string") {
168
+ return {
169
+ interchange: null,
170
+ message: null,
171
+ segments: [],
172
+ delimiters: { ...DEFAULT_DELIMITERS },
173
+ envelopeWarnings: ["Empty input."],
174
+ };
175
+ }
176
+ // Strip BOM + outer whitespace
177
+ const trimmed = rawInput.replace(/^\uFEFF/, "").trim();
178
+ const unaResult = parseUNA(trimmed);
179
+ const delim = unaResult.delimiters;
180
+ const body = normalizeWhitespace(unaResult.rest, delim.segment);
181
+ const rawSegments = splitUnescaped(body, delim.segment, delim.release);
182
+ const segments = [];
183
+ let bodyIndex = 0;
184
+ for (let s = 0; s < rawSegments.length; s++) {
185
+ const segText = rawSegments[s].trim();
186
+ if (!segText)
187
+ continue;
188
+ const elements = splitUnescaped(segText, delim.element, delim.release);
189
+ const tagRaw = elements.shift() || "";
190
+ const tag = unescape(tagRaw, delim.release);
191
+ // Composite split → unescape each leaf sub-element value.
192
+ const composed = elements.map((el) => splitUnescaped(el, delim.composite, delim.release).map((v) => unescape(v, delim.release)));
193
+ segments.push({
194
+ tag,
195
+ index: bodyIndex++,
196
+ elements: composed,
197
+ raw: unescape(segText, delim.release),
198
+ });
199
+ }
200
+ const env = extractEnvelopes(segments);
201
+ return {
202
+ interchange: env.interchange,
203
+ message: env.message,
204
+ segments,
205
+ delimiters: delim,
206
+ envelopeWarnings: env.envelopeWarnings,
207
+ };
208
+ }
209
+ /**
210
+ * Extract every ISO 6346-shaped container number (4 letters + 7 digits)
211
+ * found anywhere in the parsed message. Useful for cross-referencing
212
+ * to a registry or driving downstream linking.
213
+ */
214
+ export function extractContainerNumbers(parsed) {
215
+ const pattern = /\b[A-Z]{4}\d{7}\b/g;
216
+ const seen = new Set();
217
+ const out = [];
218
+ for (const seg of parsed.segments) {
219
+ const matches = seg.raw.match(pattern) || [];
220
+ for (const m of matches) {
221
+ if (!seen.has(m)) {
222
+ seen.add(m);
223
+ out.push(m);
224
+ }
225
+ }
226
+ }
227
+ return out;
228
+ }
229
+ /**
230
+ * Extract every UN/LOCODE-shaped token (5 chars: 2-letter country
231
+ * code + 3-letter/digit location code) from LOC segments specifically.
232
+ */
233
+ export function extractUNLocodes(parsed) {
234
+ const seen = new Set();
235
+ const out = [];
236
+ for (const seg of parsed.segments) {
237
+ if (seg.tag !== "LOC")
238
+ continue;
239
+ const place = (seg.elements[1] || [])[0];
240
+ if (place && /^[A-Z]{2}[A-Z0-9]{3}$/.test(place) && !seen.has(place)) {
241
+ seen.add(place);
242
+ out.push(place);
243
+ }
244
+ }
245
+ return out;
246
+ }
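The two-stage escape handling in `parse` (split while preserving `?X` sequences, then strip them only at the leaf level) is easier to see on a concrete segment. The sketch below copies `splitUnescaped` and `unescape` from the diff above; the second is renamed `stripRelease` here only to avoid shadowing the legacy global `unescape`, and the FTX segment text is an invented example.

```javascript
// Copy of splitUnescaped() from parser.js: escapes are kept in the output
// so later splits on other delimiters can still honour them.
function splitUnescaped(text, delim, release) {
  const parts = [];
  let buf = "";
  for (let i = 0; i < text.length; i++) {
    const c = text[i];
    if (c === release && i + 1 < text.length) {
      buf += c + text[i + 1];
      i++;
      continue;
    }
    if (c === delim) {
      parts.push(buf);
      buf = "";
      continue;
    }
    buf += c;
  }
  parts.push(buf);
  return parts;
}

// Copy of unescape() from parser.js, renamed for this sketch.
function stripRelease(text, release) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    if (text[i] === release && i + 1 < text.length) {
      out += text[i + 1];
      i++;
    } else {
      out += text[i];
    }
  }
  return out;
}

// Invented free-text segment containing an escaped '+' and an escaped ':'.
const segment = "FTX+AAA+++DOOR SEAL ?+ TAPE?: OK";
const [tag, ...elements] = splitUnescaped(segment, "+", "?");
const composed = elements.map((el) =>
  splitUnescaped(el, ":", "?").map((v) => stripRelease(v, "?"))
);
console.log(tag, JSON.stringify(composed));
```

Stripping the release characters during the first split would turn the escaped `:` into a live composite separator at the second split, which is why the parser defers unescaping to the leaf values.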