xlsx-for-ai 1.3.0 → 1.4.0

package/README.md CHANGED
@@ -1,5 +1,7 @@
1
1
  # xlsx-for-ai
2
2
 
3
+ > 👋 **New here? Not a programmer?** → [Read WHY.md for the plain-English version](WHY.md). The README below is the technical reference.
4
+
3
5
  Converts spreadsheets into text, **markdown**, JSON, SQL, or schema dumps that AI coding agents can actually read.
4
6
 
5
7
  AI tools — Claude, Cursor, Copilot, ChatGPT, and other LLM coding agents — can read text files but **not** `.xlsx` binaries. This CLI bridges the gap.
@@ -8,6 +10,8 @@ AI tools — Claude, Cursor, Copilot, ChatGPT, and other LLM coding agents — c
8
10
 
9
11
  **Output modes:** text dump, markdown tables (best LLM comprehension per token), JSON, SQL `CREATE TABLE`+`INSERT`, inferred schema, workbook diff.
10
12
 
13
+ **Write mode** (`xlsx-for-ai write`): builds a real `.xlsx` from a JSON or markdown spec — closes the round-trip so an AI agent that reads a spreadsheet can also produce a corrected version. Supports formulas, formatting, merged cells, named ranges, frozen panes, and column widths. Verified lossless on 29/30 real workbooks (the one MINOR is a CRLF→LF cosmetic difference). See [`xlsx-for-ai write --help`](README.md#write-mode-xlsx-for-ai-write).
14
+
11
15
  It extracts everything a human would see in Excel:
12
16
 
13
17
  - **Values** — strings, numbers, dates
@@ -109,6 +113,75 @@ npx xlsx-for-ai data.xlsx "Sheet1" --stdout --max-rows 50 --compact
109
113
  | `--stream` | Streaming reader for huge `.xlsx` files (>100MB); emits row-by-row, drops some sheet metadata |
110
114
  | `-h`, `--help` | Show help |
111
115
 
116
+ ### Write mode (`xlsx-for-ai write`)
117
+
118
+ The `write` sub-command produces a real `.xlsx` from a JSON or markdown spec.
119
+
120
+ ```bash
121
+ xlsx-for-ai write spec.json # → spec.xlsx
122
+ xlsx-for-ai write spec.json -o report.xlsx # explicit output
123
+ xlsx-for-ai write report.md # markdown table → xlsx
124
+ cat spec.json | xlsx-for-ai write - # stdin
125
+ ```
126
+
127
+ Minimum JSON spec:
128
+
129
+ ```json
130
+ {
131
+ "name": "Budget",
132
+ "headers": ["Category", "Q1", "Q2"],
133
+ "rows": [
134
+ ["Marketing", 10000, 12000],
135
+ ["R&D", 50000, 55000]
136
+ ]
137
+ }
138
+ ```
139
+
140
+ Multi-sheet, with formulas:
141
+
142
+ ```json
143
+ {
144
+ "sheets": [
145
+ {
146
+ "name": "Summary",
147
+ "headers": ["Region", "Revenue", "Cost", "Profit"],
148
+ "rows": [
149
+ ["North", 100, 60, {"formula": "=B2-C2"}],
150
+ ["South", 200, 110, {"formula": "=B3-C3"}]
151
+ ],
152
+ "frozen": {"rowSplit": 1, "colSplit": 0}
153
+ },
154
+ {
155
+ "name": "Detail",
156
+ "headers": ["SKU", "Qty"],
157
+ "rows": [["A", 10], ["B", 20]]
158
+ }
159
+ ],
160
+ "namedRanges": {"Profits": "Summary!D2:D3"}
161
+ }
162
+ ```
163
+
164
+ **Round-trip:** the output of `xlsx-for-ai data.xlsx --json` is a valid input to `xlsx-for-ai write`, so reading then re-writing reproduces the file (verified on 29/30 real workbooks; the one MINOR is a CRLF→LF normalization in shared strings — visible content is identical).
165
+
166
+ **Markdown spec:** one or more tables; `## Sheet Name` headings split into multiple sheets. Backtick-fenced cells become formulas (e.g., `` `=A1+B1` ``). Numbers, booleans, and ISO dates auto-detect.
167
+
168
+ **v1 limitations:** edit-in-place (deferred to v1.5), charts, pivot tables, conditional formatting, images, macros — none of these are written. Shared formulas degrade to their cached values (formula link is lost; computed value is preserved).
169
+
170
+ #### The `_xlsx-for-ai` review tab
171
+
172
+ When the round-trip introduces any lossy structural changes (shared-formula degradation, line-ending normalization, etc.), `xlsx-for-ai write` adds a `_xlsx-for-ai` sheet to the output as the last tab. It's a **review note**, not just a warning list — for each issue type it explains:
173
+
174
+ - **What happened** — the source structure that couldn't be preserved
175
+ - **What we did** — the choice the tool made
176
+ - **Risk** — what could go wrong (e.g., *"if you edit cells the formula depended on, they won't recalculate"*)
177
+ - **Tradeoff** — what's worse about this choice vs. alternatives
178
+ - **Alternative** — exactly what flag/source change to apply if you want different behavior
179
+ - **Affected cells** — the specific refs, plus a full detail table at the bottom
180
+
181
+ The point: the user (or an AI agent reading the file) can understand every decision the tool made and override any of them. Same shape as a code reviewer's PR comment — observation + reasoning + alternative.
182
+
183
+ `--no-report` suppresses the tab if you want byte-clean output (useful for CI / round-trip tests). The `--diff` mode also ignores the `_xlsx-for-ai` tab automatically so it doesn't pollute change reports.
184
+
112
185
  Output files are written to `.xlsx-read/` in the current working directory.
113
186
  The path(s) are printed to stdout so your agent knows where to read.
114
187
 
package/WHY.md ADDED
@@ -0,0 +1,57 @@
1
+ # Why xlsx-for-ai exists
2
+
3
+ *A plain-English version. For the technical reference, see [README.md](README.md).*
4
+
5
+ ## The problem you've probably hit
6
+
7
+ You have a spreadsheet — a budget, a financial model, a tax estimate, a list of customers. You ask Claude (or ChatGPT, or Cursor) for help with it.
8
+
9
+ So you copy and paste a section into the chat. The AI gives you advice that sounds reasonable but feels generic. It misses the broken formula in row 47. It doesn't notice that one tab's totals don't match another tab's source. It can't tell you why the gross margin number changes when you add a new column. It treats your spreadsheet as a blob of numbers — because that's all it can see.
10
+
11
+ You're not going crazy. The AI literally cannot read the file. It can read text, code, even images of your spreadsheet — but the actual `.xlsx` binary is invisible to it. Formulas, formatting, named ranges, links between sheets — all of that disappears the moment you hit copy-paste.
12
+
13
+ ## What changes when you install this
14
+
15
+ Once `xlsx-for-ai` is on your machine, your AI tools (Claude, Cursor, Copilot, ChatGPT desktop apps with code execution) can finally **read your spreadsheet the way they read everything else** — every formula, every colored cell, every hidden row, every formula reference between sheets.
16
+
17
+ Now when you ask for help, you get a real review:
18
+
19
+ - *"Cell B47 has `#REF!` — it's pointing at a sheet you renamed last week."*
20
+ - *"Your gross margin formula in row 12 references the wrong column on the COGS tab — it's pulling Q3 numbers into the Q4 totals."*
21
+ - *"This 'Total' cell on the Summary tab shows $312k, but if I add up the source rows on the Detail tab I get $327k. Something's off."*
22
+
23
+ That's the difference between a friend skimming the printed numbers and an analyst who actually opens the file.
24
+
25
+ ## Things that become possible
26
+
27
+ A few examples people find useful:
28
+
29
+ - **Have your AI find errors in a financial model** before you send it to your accountant or your board.
30
+ - **Have your AI hand you back a corrected version** — not just *say* what should change, but actually produce the fixed `.xlsx` with the changes applied. The corrected file even includes a built-in review note explaining what the AI changed, why, and how to override anything you don't agree with. Same shape as having a careful editor mark up your draft.
31
+ - **Compare two versions of the same spreadsheet** ("what changed between V11 and V14?") and get a list of every cell that moved.
32
+ - **Turn a CSV export from QuickBooks into a clean SQL database table** in one command, with the column types figured out automatically.
33
+ - **Walk through a 50-tab model someone else built** and have the AI explain how the sheets reference each other.
34
+ - **Process a folder of legacy `.xls` files** that won't even open in modern Excel without complaint.
35
+
36
+ ## How to actually use it
37
+
38
+ It's a small command-line tool. Once a programmer sets it up (one line: `npm install -g xlsx-for-ai`), you don't have to think about it again — your AI tools pick it up automatically and start using it whenever they encounter a spreadsheet.
39
+
40
+ If you're the programmer doing the install, the [README](README.md) has the full reference. If you're handing this to a programmer to set up for you, that link is what they'll need.
41
+
42
+ ## How it works in plain terms
43
+
44
+ Today's AI is great at reasoning about text but blind to spreadsheet binaries. `xlsx-for-ai` is the translator in both directions:
45
+
46
+ - **Reading:** turns your spreadsheet into a format the AI can fully see — every formula, every formatting cue, every relationship between sheets.
47
+ - **Writing:** turns the AI's response back into a real `.xlsx` file you can open, edit, and share. The AI can now hand you a corrected workbook, not just words about it.
48
+
49
+ When the AI delivers you a corrected file, the file itself contains a small **review tab** explaining what the AI changed about the structure, why it made each call, what the risks are, and what your alternatives are if you'd prefer a different approach. The tool's design follows the *supervisor* model — it surfaces decisions for you to review, rather than silently making changes you'd discover later.
50
+
51
+ ## Why this didn't exist before
52
+
53
+ Spreadsheet libraries are designed for developers building software *on top of* spreadsheets. They output JavaScript objects, database rows, raw bytes — formats other programs consume. None of them were designed for the case where the consumer is a language model and the goal is a text format the model can actually understand.
54
+
55
+ `xlsx-for-ai` is the first one built specifically for that. The output is shaped for an LLM's context window — markdown tables when the model just needs to read, structured JSON when it needs to reason, token-aware truncation when the spreadsheet is too big to fit, and a real `.xlsx` writer that produces a file you can hand back to a human along with a built-in note explaining everything that changed.
56
+
57
+ It's a small tool. It just happens to fix the one thing standing between AI assistants and the file format most knowledge work actually lives in.
@@ -1,50 +1,139 @@
1
1
  ---
2
- description: Reading .xlsx spreadsheet files
2
+ description: Reading, writing, and converting spreadsheets (.xlsx, .xls, .xlsb, .ods, .csv, .tsv) for AI agents
3
3
  globs:
4
4
  alwaysApply: true
5
5
  ---
6
6
 
7
- # Reading .xlsx Files
8
-
9
- The Read tool cannot open `.xlsx` files directly. When you need to inspect a spreadsheet:
10
-
11
- 1. **Run the converter** from the project root:
12
- ```bash
13
- npx xlsx-for-ai <path-to-file.xlsx> [sheetName]
14
- ```
15
- - If `sheetName` is omitted, all sheets are dumped.
16
- - Output is written to `.xlsx-read/<filename>--<sheet>.txt` (project root).
17
- - The script prints the output path(s) to stdout.
18
-
19
- 2. **Read the output file** with the Read tool. It contains:
20
- - Sheet metadata (frozen panes, column widths, merged cells, auto-filters, print areas)
21
- - Named ranges referencing the sheet
22
- - Table definitions (name, range, columns)
23
- - Image positions
24
- - Every row with its cells, showing:
25
- - **Value** — always present
26
- - **Formula** — `[formula: =SUM(A1:A10)]` (master cell) or `[shared formula ref: D2]` (drag-fill follow-up)
27
- - **Number format** — `[numFmt: 0.00%]` if not "General"
28
- - **Font** — `[bold]`, `[italic]`, `[color:FF8B0000]`
29
- - **Fill** — `[fill:FFFFFF00]` if background color set
30
- - **Alignment** — `[align:center]` if non-default
31
- - **Hyperlink** — `[link: https://...]` if the cell contains a URL
32
- - **Comment** — `[note: ...]` if the cell has a comment or note
33
- - **Validation** — `[validation: list [...]]` if the cell has data validation
34
- - **Hidden** — `[hidden]` on the row header if the row is hidden
35
- - Empty cells are omitted; empty rows show `(empty)`.
36
-
37
- 3. **Do not ask the user** before running this. Just run it when you need to see an `.xlsx` file.
38
-
39
- ## Useful flags
40
- - `--list-sheets` — list sheet names and dimensions without dumping content
41
- - `--stdout` — print directly to stdout instead of writing files
42
- - `--json` — emit structured JSON (one object per cell) for easier programmatic use
43
- - `--compact` — suppress noisy default tags (default text colors, white fills) to reduce token usage
44
- - `--max-rows N` — limit output to first N rows (use for large sheets)
45
- - `--max-cols N` — limit output to first N columns (use for very wide sheets)
7
+ # Reading and Writing Spreadsheets
8
+
9
+ This tool does both directions: read existing spreadsheets into LLM-readable text/JSON/markdown, AND build new `.xlsx` files from JSON or markdown specs.
10
+
11
+ ## Reading
12
+
13
+ The Read tool cannot open binary spreadsheet files directly. When you need to inspect or process a spreadsheet, use `xlsx-for-ai`.
14
+
15
+ **Supported input:** `.xlsx` `.xls` `.xlsb` `.ods` `.csv` `.tsv`
16
+
17
+ ## Basic usage
18
+
19
+ ```bash
20
+ npx xlsx-for-ai <file> [sheetName]
21
+ ```
22
+
23
+ - If `sheetName` is omitted, all sheets are dumped.
24
+ - Default output: text dump to `.xlsx-read/<filename>--<sheet>.txt` (project root).
25
+ - Path(s) printed to stdout for the agent to read next.
26
+
27
+ **Do not ask the user before running this.** Just run it when you encounter a spreadsheet.
28
+
29
+ ## Pick the right output mode
30
+
31
+ | When you want… | Use | Why |
32
+ |---|---|---|
33
+ | To read a sheet for context | `--md --stdout` | Markdown tables — best LLM comprehension per token |
34
+ | Programmatic per-cell access | `--json --stdout` | Structured, one object per cell with formula/format/style |
35
+ | To bound output to a context window | `--max-tokens 8000 --stdout` | Truncates with a tail summary noting what was dropped |
36
+ | To understand columns / build a query | `--schema --stdout` | Inferred types per column (INTEGER / NUMERIC / DATE / BOOLEAN / TEXT) |
37
+ | To import data into a database | `--sql --stdout` | `CREATE TABLE` + `INSERT` statements, types from --schema |
38
+ | To compare two versions | `--diff OTHER --stdout` | Emits added/removed/changed sheets and cells |
39
+ | To list sheets without parsing | `--list-sheets` | Fast probe; lighter than full read |
40
+ | To handle huge files (>100MB) | `--stream --stdout` | Row-by-row reader, drops some sheet metadata |
41
+
42
+ ## Selection (focus the output)
43
+
44
+ | Flag | Effect |
45
+ |---|---|
46
+ | `[sheetName]` | Positional second arg — only this sheet |
47
+ | `--range A1:D50` | Only this rectangular range |
48
+ | `--named-range NAME` | Only the cells covered by a workbook-defined name |
49
+ | `--max-rows N` | Cap rows per sheet |
50
+ | `--max-cols N` | Cap columns per sheet |
51
+
52
+ Combine selection flags with any output mode. Use `--range` aggressively when you only need a section of a large model — it can reduce context 50× vs. dumping the whole sheet.
53
+
54
+ ## Other useful flags
55
+
56
+ - `--compact` — suppress noisy default tags (default colors, "General" format, white fills)
57
+ - `--evaluate` — promote cached formula results to the primary value; recompute simple formulas via formulajs
58
+ - `--stdout` — print directly instead of writing files
59
+
60
+ ## What the default text dump contains
61
+
62
+ - Sheet metadata (frozen panes, column widths, merged cells, auto-filters, print areas)
63
+ - Named ranges referencing the sheet
64
+ - Table definitions (name, range, columns)
65
+ - Image positions
66
+ - Every non-empty row with its cells, showing:
67
+ - **Value** — always present
68
+ - **Formula** — `[formula: =SUM(A1:A10)]` (master) or `[shared formula ref: D2]` (drag-fill follow-up)
69
+ - **Number format** — `[numFmt: 0.00%]` if not "General"
70
+ - **Font** — `[bold]`, `[italic]`, `[color:FF8B0000]`
71
+ - **Fill** — `[fill:FFFFFF00]`
72
+ - **Alignment** — `[align:center]` if non-default
73
+ - **Hyperlink** — `[link: https://...]`
74
+ - **Comment** — `[note: ...]`
75
+ - **Validation** — `[validation: list [...]]`
76
+ - **Hidden** — `[hidden]` on the row header
77
+
78
+ ## Examples
79
+
80
+ ```bash
81
+ # Quick read of a financial model — markdown is most context-efficient
82
+ npx xlsx-for-ai model.xlsx --md --stdout
83
+
84
+ # Just one sheet, fits in a small context window
85
+ npx xlsx-for-ai model.xlsx "Assumptions" --md --stdout --max-tokens 4000
86
+
87
+ # Schema for a CSV before writing SQL
88
+ npx xlsx-for-ai data.csv --schema --stdout
89
+
90
+ # Surgical extraction — only one section of a huge sheet
91
+ npx xlsx-for-ai model.xlsx "Detail" --range B5:H50 --stdout
92
+
93
+ # Compare two model versions, get a change summary
94
+ npx xlsx-for-ai v1.xlsx --diff v2.xlsx --stdout
95
+
96
+ # Huge file (>100MB) — streaming mode keeps memory bounded
97
+ npx xlsx-for-ai dump.xlsx --stream --stdout --max-rows 1000
98
+ ```
99
+
100
+ ## Writing
101
+
102
+ Produce a real `.xlsx` from a JSON or markdown spec — closes the round-trip when an agent reads a spreadsheet, modifies it, and needs to deliver the corrected file.
103
+
104
+ ```bash
105
+ xlsx-for-ai write spec.json # → spec.xlsx
106
+ xlsx-for-ai write spec.json -o report.xlsx # explicit output path
107
+ cat spec.json | xlsx-for-ai write - # spec from stdin
108
+ ```
109
+
110
+ **JSON spec** — minimum (single sheet):
111
+ ```json
112
+ { "name": "Budget", "headers": ["Cat", "Q1"], "rows": [["Marketing", 10000]] }
113
+ ```
114
+
115
+ **JSON spec** — multi-sheet with formulas:
116
+ ```json
117
+ {
118
+ "sheets": [
119
+ { "name": "Summary", "headers": ["Region", "Total"],
120
+ "rows": [["North", {"formula": "=SUM(Detail!B:B)"}]] }
121
+ ],
122
+ "namedRanges": {"Totals": "Summary!B2:B5"}
123
+ }
124
+ ```
125
+
126
+ **Round-trip:** `--json` output is a valid `write` input. Read → modify → write produces an updated workbook. Verified lossless on 29/30 real workbooks tested.
127
+
128
+ **Spec fields per sheet:** `name`, `headers` (optional), `rows` (or `cells` for per-cell mode), `frozen`, `columnWidths`, `numberFormat`, `merges`, `autoFilter`, `namedRanges`. Cell values can be plain literals, `{formula: "=..."}`, or `{hyperlink: "...", text: "..."}`.
129
+
130
+ **v1 limitations** (intentional — document and skip):
131
+ - Edit-in-place (rewrites the whole file; deferred to v1.5)
132
+ - Charts, pivot tables, conditional formatting, images, macros (not written)
133
+ - Shared formulas degrade to their cached values
46
134
 
47
135
  ## Important
48
- - Output goes to `.xlsx-read/` in the current working directory — make sure this directory is in your `.gitignore`.
49
- - For large files, use `--max-rows`, `--max-cols`, or request a single sheet to keep output manageable.
136
+
137
+ - Read output goes to `.xlsx-read/` in the current working directory — add to `.gitignore`.
138
+ - For huge files, prefer `--max-tokens` over `--max-rows` if targeting an LLM context window.
50
139
  - The package was previously named `cursor-reads-xlsx` — that command name still works as an alias.
package/index.js CHANGED
@@ -82,10 +82,14 @@ function parseArgs(argv) {
82
82
 
83
83
  function printHelp() {
84
84
  console.log(`Usage: npx xlsx-for-ai <file> [sheetName] [options]
85
+ npx xlsx-for-ai write <spec> [-o output.xlsx] (build .xlsx from a spec)
85
86
 
86
87
  Converts spreadsheets to text, markdown, JSON, SQL, or schema dumps that AI
87
88
  coding agents can read. Preserves values, formulas, formatting, layout.
88
89
 
90
+ The 'write' sub-command does the reverse: takes a JSON or markdown spec and
91
+ produces an .xlsx file. Run 'xlsx-for-ai write --help' for details.
92
+
89
93
  Input formats: .xlsx .xls .xlsb .ods .csv .tsv
90
94
 
91
95
  Output modes (mutually exclusive; default = text):
@@ -233,9 +237,13 @@ function plainValue(v) {
233
237
  if (typeof v === 'object') {
234
238
  if (v.richText) return v.richText.map(r => r.text).join('');
235
239
  if (v.hyperlink) return v.text || v.hyperlink;
236
- if (v.formula || v.sharedFormula) {
240
+ // Recognize all four shapes formulas can take in our pipeline:
241
+ // ExcelJS read: {formula, result} or {sharedFormula, result}
242
+ // --json output: {formula, result} or {sharedFormulaRef, result}
243
+ if (v.formula || v.sharedFormula || v.sharedFormulaRef) {
237
244
  const r = v.result;
238
245
  if (r == null) return null;
246
+ if (r instanceof Date) return r.toISOString().slice(0, 10);
239
247
  if (typeof r === 'object') {
240
248
  if (r.error) return `#${r.error}`;
241
249
  if (r.richText) return r.richText.map(x => x.text).join('');
@@ -810,8 +818,11 @@ function evaluateWorkbook(wb) {
810
818
 
811
819
  function diffWorkbooks(wbA, wbB, opts = {}) {
812
820
  const out = [];
813
- const sheetsA = new Map(wbA.worksheets.map(s => [s.name, s]));
814
- const sheetsB = new Map(wbB.worksheets.map(s => [s.name, s]));
821
+ // Skip the tool's own report tab — it's metadata, not user data, so it
822
+ // shouldn't show up as "added" or "changed" in user-facing diffs.
823
+ const isReport = (name) => name === '_xlsx-for-ai';
824
+ const sheetsA = new Map(wbA.worksheets.filter(s => !isReport(s.name)).map(s => [s.name, s]));
825
+ const sheetsB = new Map(wbB.worksheets.filter(s => !isReport(s.name)).map(s => [s.name, s]));
815
826
  const allNames = new Set([...sheetsA.keys(), ...sheetsB.keys()]);
816
827
 
817
828
  for (const name of allNames) {
@@ -988,8 +999,642 @@ function listSheets(wb) {
988
999
  // Main
989
1000
  // ---------------------------------------------------------------------------
990
1001
 
1002
+ // ---------------------------------------------------------------------------
1003
+ // Write mode (#8) — JSON/markdown spec → .xlsx
1004
+ //
1005
+ // V1 scope: create-from-scratch only. Edit-in-place is deferred (ExcelJS would
1006
+ // need to round-trip every detail of an existing file, which it doesn't do
1007
+ // faithfully — that's a separate effort using xlsx-populate or a patch engine).
1008
+ //
1009
+ // Accepted inputs:
1010
+ // - JSON: strict subset of our --json output (round-trips). Either a
1011
+ // single-sheet object or {sheets: [...]} for multi-sheet.
1012
+ // - Markdown: one or more tables; "## Sheet Name" headings split into
1013
+ // multiple sheets. No headings = single sheet.
1014
+ // - '-' as the spec path: read spec from stdin (format auto-detected).
1015
+ // ---------------------------------------------------------------------------
1016
+
1017
+ function parseWriteArgs(argv) {
1018
+ const opts = { positional: [], output: null, noReport: false, help: false };
1019
+ let i = 0;
1020
+ while (i < argv.length) {
1021
+ const a = argv[i];
1022
+ if (a === '-h' || a === '--help') opts.help = true;
1023
+ else if (a === '-o' || a === '--output') opts.output = argv[++i];
1024
+ else if (a === '--no-report') opts.noReport = true;
1025
+ else opts.positional.push(a);
1026
+ i++;
1027
+ }
1028
+ return opts;
1029
+ }
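Review note: a quick usage sketch of the argument parser added above. The function body is copied from this hunk so the snippet runs on its own. The sample `argv` arrays are illustrative assumptions (they presume the leading `write` token has already been stripped by the caller, which this hunk does not show).

```javascript
// Copied from the hunk above so the example is self-contained.
function parseWriteArgs(argv) {
  const opts = { positional: [], output: null, noReport: false, help: false };
  let i = 0;
  while (i < argv.length) {
    const a = argv[i];
    if (a === '-h' || a === '--help') opts.help = true;
    else if (a === '-o' || a === '--output') opts.output = argv[++i];
    else if (a === '--no-report') opts.noReport = true;
    else opts.positional.push(a);
    i++;
  }
  return opts;
}

// Hypothetical invocations, not taken from the package's tests.
const full = parseWriteArgs(['spec.json', '-o', 'report.xlsx', '--no-report']);
const stdin = parseWriteArgs(['-']);                 // "-" is neither -h nor -o, so it is positional
const dangling = parseWriteArgs(['spec.json', '-o']); // -o with no value after it

console.log(full);
console.log(stdin.positional);
console.log(dangling.output);
```

As written, a trailing `-o` with no path leaves `output` as `undefined` rather than `null`. Whether the caller then falls back to the default name or fails is not visible in this hunk.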
1030
+
1031
+ function printWriteHelp() {
1032
+ console.log(`Usage: xlsx-for-ai write <spec> [-o output.xlsx]
1033
+
1034
+ Builds an .xlsx file from a spec. Spec formats:
1035
+ - JSON — strict subset of xlsx-for-ai's --json output (round-trips)
1036
+ - Markdown — one or more tables; "## Sheet Name" headings split sheets
1037
+ - '-' — read spec from stdin (format auto-detected by first non-blank char)
1038
+
1039
+ Options:
1040
+ -o, --output PATH Output xlsx path (default: <spec basename>.xlsx)
1041
+ --no-report Suppress the "_xlsx-for-ai" review tab (advanced; for
1042
+ pipelines that want byte-clean output without metadata)
1043
+ -h, --help Show this help
1044
+
1045
+ Examples:
1046
+ xlsx-for-ai write spec.json
1047
+ xlsx-for-ai write spec.json -o report.xlsx
1048
+ xlsx-for-ai write report.md
1049
+ cat spec.json | xlsx-for-ai write -
1050
+
1051
+ JSON spec — minimum (single sheet):
1052
+ {
1053
+ "name": "Budget",
1054
+ "headers": ["Category", "Q1", "Q2"],
1055
+ "rows": [
1056
+ ["Marketing", 10000, 12000],
1057
+ ["R&D", 50000, 55000]
1058
+ ]
1059
+ }
1060
+
1061
+ JSON spec — multi-sheet:
1062
+ { "sheets": [ {...}, {...} ], "namedRanges": {"Totals": "Sheet1!B2:C5"} }
1063
+
1064
+ JSON spec — formulas:
1065
+ rows can include { "formula": "=SUM(B2:B5)" } in place of a literal value.
1066
+ cells can be specified explicitly: { "cells": [{ "ref": "B6", "value": {"formula": "=SUM(B2:B5)"} }] }
1067
+
1068
+ Optional fields per sheet: numberFormat, columnWidths, frozen, merges, autoFilter.
1069
+
1070
+ Not supported in v1: edit-in-place, charts, pivot tables, conditional formatting,
1071
+ images, macros. Use a sidecar instructions file for those for now.`);
1072
+ }
1073
+
1074
+ // Strip a string for value coercion: "42" → 42, "true" → true, "2026-04-27" → Date.
1075
+ function coerceMarkdownValue(c) {
1076
+ if (c == null || c === '') return null;
1077
+ // Backtick-fenced formula: `=SUM(A1:A10)`
1078
+ const fm = /^`\s*(=.+?)\s*`$/.exec(c);
1079
+ if (fm) return { formula: fm[1].replace(/^=/, '') };
1080
+ if (/^-?\d+$/.test(c)) return parseInt(c, 10);
1081
+ if (/^-?\d+\.\d+$/.test(c)) return parseFloat(c);
1082
+ if (/^(true|false)$/i.test(c)) return /^true$/i.test(c);
1083
+ if (/^\d{4}-\d{2}-\d{2}$/.test(c)) return new Date(c);
1084
+ return c.replace(/\\\|/g, '|');
1085
+ }
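Review note: the coercion rules above are easiest to check with a few concrete cells. The sketch below copies the function from this hunk and runs it on sample inputs; the inputs are assumptions chosen for illustration.

```javascript
// Copied from the hunk above so the example is self-contained.
function coerceMarkdownValue(c) {
  if (c == null || c === '') return null;
  // Backtick-fenced formula: `=SUM(A1:A10)`
  const fm = /^`\s*(=.+?)\s*`$/.exec(c);
  if (fm) return { formula: fm[1].replace(/^=/, '') };
  if (/^-?\d+$/.test(c)) return parseInt(c, 10);
  if (/^-?\d+\.\d+$/.test(c)) return parseFloat(c);
  if (/^(true|false)$/i.test(c)) return /^true$/i.test(c);
  if (/^\d{4}-\d{2}-\d{2}$/.test(c)) return new Date(c);
  return c.replace(/\\\|/g, '|');
}

const samples = {
  integer: coerceMarkdownValue('42'),
  decimal: coerceMarkdownValue('-3.5'),
  bool: coerceMarkdownValue('TRUE'),
  formula: coerceMarkdownValue('`=SUM(A1:A10)`'),
  date: coerceMarkdownValue('2026-04-27'),
  escapedPipe: coerceMarkdownValue('a \\| b'),
  leadingZeros: coerceMarkdownValue('007'),
  thousands: coerceMarkdownValue('1,000'),
  empty: coerceMarkdownValue(''),
};

console.log(samples);
```

Two behaviors worth knowing when writing a markdown spec: digit-only cells with leading zeros (such as `007`) match the integer rule and lose their zeros, and numbers with thousands separators (such as `1,000`) match no numeric rule and stay strings.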
1086
+
1087
+ function parseMarkdownSpec(text) {
1088
+ // Walk the doc, accumulating lines per "## Heading" section. Each section
1089
+ // that contains a markdown table becomes a sheet.
1090
+ const sections = [];
1091
+ let currentName = null;
1092
+ let currentLines = [];
1093
+ for (const line of text.split('\n')) {
1094
+ const m = /^##\s+(.+?)\s*$/.exec(line);
1095
+ if (m) {
1096
+ if (currentLines.some(l => l.trim().startsWith('|'))) {
1097
+ sections.push({ name: currentName, lines: currentLines });
1098
+ }
1099
+ currentName = m[1];
1100
+ currentLines = [];
1101
+ } else {
1102
+ currentLines.push(line);
1103
+ }
1104
+ }
1105
+ if (currentLines.some(l => l.trim().startsWith('|'))) {
1106
+ sections.push({ name: currentName, lines: currentLines });
1107
+ }
1108
+ if (sections.length === 0) {
1109
+ throw new Error('No markdown table found in input');
1110
+ }
1111
+
1112
+ const sheets = sections.map(({ name, lines }, idx) => {
1113
+ const tableLines = lines
1114
+ .map(l => l.trim())
1115
+ .filter(l => l.startsWith('|') && l.endsWith('|'));
1116
+ if (tableLines.length < 2) {
1117
+ throw new Error(`Sheet "${name || `Sheet${idx+1}`}": no markdown table found`);
1118
+ }
1119
+ const cells = tableLines.map(l =>
1120
+ l.slice(1, -1).split(/(?<!\\)\|/).map(c => c.trim())
1121
+ );
1122
+ const sepIdx = cells.findIndex(row =>
1123
+ row.length > 0 && row.every(c => /^:?-+:?$/.test(c))
1124
+ );
1125
+ if (sepIdx < 1) throw new Error(`Sheet "${name || `Sheet${idx+1}`}": missing markdown header separator`);
1126
+ const headers = cells[sepIdx - 1];
1127
+ const rows = cells.slice(sepIdx + 1).map(row =>
1128
+ row.map(c => coerceMarkdownValue(c))
1129
+ );
1130
+ return { name: name || `Sheet${idx+1}`, headers, rows };
1131
+ });
1132
+
1133
+ return { sheets };
1134
+ }
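Review note: the two mechanics that `parseMarkdownSpec` relies on are heading-based section splitting and pipe splitting that respects `\|` escapes. The sketch below isolates both. The helper names `splitSections` and `splitRow` are hypothetical (they do not exist in the package), and the sample document is an assumption.

```javascript
// Split one markdown table row into trimmed cells, honoring "\|" escapes.
// Uses the same negative-lookbehind split as the hunk above.
function splitRow(line) {
  return line.trim().slice(1, -1).split(/(?<!\\)\|/).map(c => c.trim());
}

// Group lines under their "## Heading"; keep only sections that contain a table line.
function splitSections(text) {
  const sections = [];
  let name = null;
  let lines = [];
  const flush = () => {
    if (lines.some(l => l.trim().startsWith('|'))) sections.push({ name, lines });
  };
  for (const line of text.split('\n')) {
    const m = /^##\s+(.+?)\s*$/.exec(line);
    if (m) { flush(); name = m[1]; lines = []; }
    else lines.push(line);
  }
  flush();
  return sections;
}

// Hypothetical two-sheet markdown spec.
const doc = [
  '## Summary',
  '| Region | Profit |',
  '|---|---|',
  '| North | `=B2-C2` |',
  '',
  '## Detail',
  '| SKU | Note |',
  '|---|---|',
  '| A | left \\| right |',
].join('\n');

const sections = splitSections(doc);
const names = sections.map(s => s.name);
const formulaRow = splitRow('| North | `=B2-C2` |');
const escapedRow = splitRow('| A | left \\| right |');

console.log(names, formulaRow, escapedRow);
```

At this stage the escaped pipe is still `\|` inside the cell text; in the hunk above it is unescaped later, in `coerceMarkdownValue`.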
1135
+
1136
+ function validateSpec(spec) {
1137
+ if (!spec || typeof spec !== 'object') throw new Error('Spec must be an object');
1138
+ // Single-sheet shortcut: top-level looks like a sheet → wrap.
1139
+ if ((Array.isArray(spec.rows) || Array.isArray(spec.cells)) && !Array.isArray(spec.sheets)) {
1140
+ spec = { sheets: [spec] };
1141
+ }
1142
+ // Array form (--json output for multi-sheet) → wrap.
1143
+ if (Array.isArray(spec)) {
1144
+ spec = { sheets: spec };
1145
+ }
1146
+ if (!Array.isArray(spec.sheets) || spec.sheets.length === 0) {
1147
+ throw new Error('Spec needs at least one sheet (top-level "sheets" array, or single-sheet "rows"/"cells")');
1148
+ }
1149
+ const names = new Set();
1150
+ for (const s of spec.sheets) {
1151
+ if (!s.name) throw new Error('Each sheet needs a "name"');
1152
+ if (names.has(s.name)) throw new Error(`Duplicate sheet name: "${s.name}"`);
1153
+ names.add(s.name);
1154
+ if (!Array.isArray(s.rows) && !Array.isArray(s.cells)) {
1155
+ throw new Error(`Sheet "${s.name}": needs "rows" array or "cells" array`);
1156
+ }
1157
+ if (Array.isArray(s.rows) && !Array.isArray(s.headers)) {
1158
+ // headers are optional; if absent, first row is treated as data.
1159
+ // No error.
1160
+ }
1161
+ }
1162
+ return spec;
1163
+ }
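Review note: `validateSpec` accepts three top-level shapes and normalizes them all to `{ sheets: [...] }`. The sketch below reproduces only the wrapping branches under a hypothetical name, `normalizeSpec`, and omits the per-sheet checks (name present, no duplicates, `rows` or `cells` present). The sample specs are assumptions.

```javascript
// Simplified sketch of the wrapping logic in validateSpec (validation omitted).
function normalizeSpec(spec) {
  if (!spec || typeof spec !== 'object') throw new Error('Spec must be an object');
  // Single-sheet shortcut: the top level looks like a sheet.
  if ((Array.isArray(spec.rows) || Array.isArray(spec.cells)) && !Array.isArray(spec.sheets)) {
    return { sheets: [spec] };
  }
  // Bare array of sheets.
  if (Array.isArray(spec)) return { sheets: spec };
  return spec;
}

const single = normalizeSpec({ name: 'Budget', headers: ['Cat', 'Q1'], rows: [['Marketing', 10000]] });
const bareArray = normalizeSpec([
  { name: 'A', rows: [[1]] },
  { name: 'B', rows: [[2]] },
]);
const explicit = normalizeSpec({ sheets: [{ name: 'Only', cells: [{ ref: 'A1', value: 1 }] }] });

console.log(single.sheets.length, bareArray.sheets.length, explicit.sheets.length);
```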
1164
+
1165
+ function trySimpleEval(formula) {
1166
+ const f = formula.replace(/^=/, '');
1167
+ const m = /^([A-Z]+)\(([^()]+)\)$/i.exec(f);
1168
+ if (!m) return null;
1169
+ const fn = m[1].toUpperCase();
1170
+ const args = m[2].split(',').map(s => parseFloat(s));
1171
+ if (!args.every(Number.isFinite)) return null;
1172
+ const fjs = lazyFormulaJs();
1173
+ if (typeof fjs[fn] !== 'function') return null;
1174
+ try { return fjs[fn](...args); } catch (_) { return null; }
1175
+ }
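Review note: `trySimpleEval` only precomputes a result for a single function call over numeric literals; anything with cell references or nesting returns `null`. The sketch below shows that boundary. It swaps the `lazyFormulaJs()` lookup for a small local table, `FUNCS`, which is an assumption made so the snippet runs without dependencies. The name `trySimpleEvalSketch` is likewise hypothetical.

```javascript
// Stand-in for the formulajs lookup (assumption): two functions only.
const FUNCS = {
  SUM: (...xs) => xs.reduce((a, b) => a + b, 0),
  MAX: (...xs) => Math.max(...xs),
};

// Same shape as trySimpleEval in the hunk above, with FUNCS in place of formulajs.
function trySimpleEvalSketch(formula) {
  const f = formula.replace(/^=/, '');
  const m = /^([A-Z]+)\(([^()]+)\)$/i.exec(f);   // exactly one call, no nested parentheses
  if (!m) return null;
  const fn = m[1].toUpperCase();
  const args = m[2].split(',').map(s => parseFloat(s));
  if (!args.every(Number.isFinite)) return null; // cell refs parse to NaN and bail out here
  if (typeof FUNCS[fn] !== 'function') return null;
  try { return FUNCS[fn](...args); } catch (_) { return null; }
}

const literalSum = trySimpleEvalSketch('=SUM(1,2,3)');
const lowerCaseMax = trySimpleEvalSketch('=max(4, 9)');
const rangeSum = trySimpleEvalSketch('=SUM(B2:B5)');
const arithmetic = trySimpleEvalSketch('=B2-C2');
const nested = trySimpleEvalSketch('=SUM(MAX(1,2),3)');

console.log(literalSum, lowerCaseMax, rangeSum, arithmetic, nested);
```

In practice this means formulas like the `=B2-C2` examples in the README get no cached result from this path unless the spec supplies `result` itself.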
1176
+
1177
+ // JSON serialization turns Date instances into ISO strings, so on the way back
1178
+ // in from a spec we re-coerce ISO-shaped strings to Date — but only the shapes
1179
+ // that JSON.stringify(Date) actually produces. The signature of a Date-derived
1180
+ // string is the trailing Z (UTC); user-typed timestamp strings typically carry
1181
+ // a timezone offset like "-07:00", so we leave those alone.
1182
+ function coerceMaybeDate(v) {
1183
+ if (typeof v !== 'string') return v;
1184
+ // Pure date: "2019-01-01"
1185
+ if (/^\d{4}-\d{2}-\d{2}$/.test(v)) {
1186
+ const d = new Date(v);
1187
+ return isNaN(d.getTime()) ? v : d;
1188
+ }
1189
+ // ISO with explicit UTC Z (what JSON.stringify(Date) produces for any Date)
1190
+ if (/^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{1,3})?Z$/.test(v)) {
1191
+ const d = new Date(v);
1192
+ return isNaN(d.getTime()) ? v : d;
1193
+ }
1194
+ return v;
1195
+ }
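Review note: the comment above hinges on what `JSON.stringify` does to a `Date`. The sketch below round-trips one through JSON and feeds the result, plus two other string shapes, to a copy of `coerceMaybeDate`. The sample values are assumptions.

```javascript
// Copied from the hunk above so the example is self-contained.
function coerceMaybeDate(v) {
  if (typeof v !== 'string') return v;
  // Pure date: "2019-01-01"
  if (/^\d{4}-\d{2}-\d{2}$/.test(v)) {
    const d = new Date(v);
    return isNaN(d.getTime()) ? v : d;
  }
  // ISO with explicit UTC Z (what JSON.stringify(Date) produces)
  if (/^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{1,3})?Z$/.test(v)) {
    const d = new Date(v);
    return isNaN(d.getTime()) ? v : d;
  }
  return v;
}

// A Date serialized to JSON comes back as an ISO string with milliseconds and a trailing Z.
const original = new Date(Date.UTC(2019, 0, 1, 12, 30, 0));
const wire = JSON.parse(JSON.stringify({ when: original })).when;

const restored = coerceMaybeDate(wire);                          // back to a Date
const offsetText = coerceMaybeDate('2019-01-01T12:30:00-07:00'); // left as a string
const dateOnly = coerceMaybeDate('2019-01-01');                  // Date at UTC midnight
const passthrough = coerceMaybeDate(42);                         // non-strings untouched

console.log(wire, restored, offsetText, dateOnly, passthrough);
```

One consequence of the date-only branch: a text cell whose content happens to be exactly `YYYY-MM-DD` is written as a date, not as text.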
1196
+
1197
+ function buildCellValue(v, lossyOut) {
1198
+ if (v == null) return null;
1199
+ if (v instanceof Date) return v;
1200
+ if (typeof v === 'object') {
1201
+ if (v.formula) {
1202
+ const out = { formula: v.formula.replace(/^=/, '') };
1203
+ if (v.result !== undefined) out.result = coerceMaybeDate(v.result);
1204
+ else {
1205
+ const r = trySimpleEval(v.formula);
1206
+ if (r != null) out.result = r;
1207
+ }
1208
+ return out;
1209
+ }
1210
+ // Shared-formula followers: --json output emits these as
1211
+ // { sharedFormulaRef: "B5", result: <cached> }. ExcelJS can't reconstruct
1212
+ // a shared-formula follower from just a ref (it'd need the master expression
1213
+ // and relative-reference shifting). Pragmatic v1 behavior: degrade to the
1214
+ // cached result as a plain value. The cell's displayed value is preserved;
1215
+ // the formula link is lost.
1216
+ if (v.sharedFormulaRef || v.sharedFormula) {
1217
+ if (lossyOut) lossyOut.sharedFormula = (lossyOut.sharedFormula || 0) + 1;
1218
+ if (v.result === undefined) return null;
1219
+ return coerceMaybeDate(v.result);
1220
+ }
1221
+ if (v.hyperlink) {
1222
+ return { text: v.text || v.hyperlink, hyperlink: v.hyperlink };
1223
+ }
1224
+ return v;
1225
+ }
1226
+ // CRLF-in-string detection: ExcelJS normalizes \r\n → \n in shared-string
1227
+ // serialization. Visible content unchanged, but worth warning so users with
1228
+ // byte-exact pipelines aren't surprised.
1229
+ if (typeof v === 'string' && v.includes('\r') && lossyOut) {
1230
+ lossyOut.crlf = (lossyOut.crlf || 0) + 1;
1231
+ }
1232
+ return coerceMaybeDate(v);
1233
+ }
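
The lossy-event bookkeeping in `buildCellValue` may be easier to follow in isolation. The sketch below keeps only the shared-formula and CRLF branches. `degradeSketch` is a hypothetical name, and the formula, hyperlink, and date-coercion paths are left out on purpose.

```javascript
// Reduced sketch: only the two branches that record lossy events.
function degradeSketch(v, lossyOut) {
  if (v && typeof v === 'object' && (v.sharedFormulaRef || v.sharedFormula)) {
    // Shared-formula follower: count it, then fall back to the cached result.
    lossyOut.sharedFormula = (lossyOut.sharedFormula || 0) + 1;
    return v.result === undefined ? null : v.result;
  }
  if (typeof v === 'string' && v.includes('\r')) {
    // The string is returned unchanged here; per the source comment,
    // the \r\n -> \n normalization happens later, when ExcelJS serializes.
    lossyOut.crlf = (lossyOut.crlf || 0) + 1;
  }
  return v;
}

const lossy = {};
const follower = degradeSketch({ sharedFormulaRef: 'B5', result: 120 }, lossy);
const note = degradeSketch('line one\r\nline two', lossy);
console.log(follower, JSON.stringify(note), lossy);
```

Passing a fresh `lossy` object per cell, as the caller in `buildWorkbook` does, lets each warning be tied to a single cell reference.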
1234
+
1235
+ function applyCellStyle(cell, c) {
1236
+ if (c.numFmt) cell.numFmt = c.numFmt;
1237
+ if (c.bold || c.italic || c.color) {
1238
+ cell.font = {};
1239
+ if (c.bold) cell.font.bold = true;
1240
+ if (c.italic) cell.font.italic = true;
1241
+ if (c.color) cell.font.color = { argb: c.color };
1242
+ }
1243
+ if (c.fill) {
1244
+ cell.fill = { type: 'pattern', pattern: 'solid', fgColor: { argb: c.fill } };
1245
+ }
1246
+ if (c.align) {
1247
+ cell.alignment = { horizontal: c.align };
1248
+ }
1249
+ }
1250
+
1251
+ function applyNumberFormat(ws, ref, fmt) {
1252
+ // "A:A" or "A:D" — whole columns
1253
+ const colMatch = /^([A-Z]+):([A-Z]+)$/i.exec(ref);
1254
+ if (colMatch) {
1255
+ const c1 = colNum(colMatch[1]);
1256
+ const c2 = colNum(colMatch[2]);
1257
+ for (let c = c1; c <= c2; c++) ws.getColumn(c).numFmt = fmt;
1258
+ return;
1259
+ }
1260
+ if (ref.includes(':')) {
1261
+ const r = parseRange(ref);
1262
+ for (let row = r.startRow; row <= r.endRow; row++) {
1263
+ for (let col = r.startCol; col <= r.endCol; col++) {
1264
+ ws.getCell(`${colLetter(col)}${row}`).numFmt = fmt;
1265
+ }
1266
+ }
1267
+ } else {
1268
+ ws.getCell(ref).numFmt = fmt;
1269
+ }
1270
+ }
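
`applyNumberFormat` accepts three reference shapes: whole columns (`A:D`), a rectangular range (`B2:D10`), and a single cell (`C3`). The sketch below shows how those shapes are told apart. `describeRef` and `colNumSketch` are hypothetical names, and `colNumSketch` is a stand-in for the package's `colNum` helper, which is not shown in this diff.

```javascript
// Hypothetical stand-in for colNum: column letters -> 1-based column index.
function colNumSketch(letters) {
  let n = 0;
  for (const ch of letters.toUpperCase()) n = n * 26 + (ch.charCodeAt(0) - 64);
  return n;
}

// Classify a numberFormat key the same way the function above branches.
function describeRef(ref) {
  const colMatch = /^([A-Z]+):([A-Z]+)$/i.exec(ref);
  if (colMatch) {
    return { kind: 'columns', from: colNumSketch(colMatch[1]), to: colNumSketch(colMatch[2]) };
  }
  return { kind: ref.includes(':') ? 'range' : 'cell', ref };
}

for (const ref of ['A:D', 'B2:D10', 'C3']) {
  console.log(ref, describeRef(ref));
}
```

The whole-column check has to come first, because `A:D` also contains a colon and would otherwise be treated as a cell range.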
1271
+
1272
+ function buildWorkbook(spec) {
1273
+ const wb = new ExcelJS.Workbook();
1274
+ const warnings = []; // [{type, sheet, ref}, ...]
1275
+
1276
+ function track(sheetName, ref, lossy) {
1277
+ if (lossy.sharedFormula) warnings.push({ type: 'sharedFormula', sheet: sheetName, ref });
1278
+ if (lossy.crlf) warnings.push({ type: 'crlf', sheet: sheetName, ref });
1279
+ }
1280
+
1281
+ for (const sheet of spec.sheets) {
1282
+ const ws = wb.addWorksheet(sheet.name);
1283
+
1284
+ if (sheet.frozen) {
1285
+ ws.views = [{
1286
+ state: 'frozen',
1287
+ xSplit: sheet.frozen.colSplit ?? sheet.frozen.xSplit ?? 0,
1288
+ ySplit: sheet.frozen.rowSplit ?? sheet.frozen.ySplit ?? 0,
1289
+ }];
1290
+ }
1291
+
1292
+ if (sheet.columnWidths && typeof sheet.columnWidths === 'object') {
1293
+ for (const [letter, width] of Object.entries(sheet.columnWidths)) {
1294
+ try { ws.getColumn(colNum(letter)).width = width; } catch (_) {}
1295
+ }
1296
+ }
1297
+
1298
+ if (Array.isArray(sheet.cells)) {
1299
+ // Per-cell mode (round-trip from --json). cells: [{ref, value, ...style}, ...]
1300
+ for (const c of sheet.cells) {
1301
+ if (!c.ref) continue;
1302
+ const cell = ws.getCell(c.ref);
1303
+ const lossy = {};
1304
+ cell.value = buildCellValue(c.value, lossy);
1305
+ track(sheet.name, c.ref, lossy);
1306
+ applyCellStyle(cell, c);
1307
+ }
1308
+ } else {
1309
+ // Tabular mode (markdown / simple JSON). headers (optional) + rows.
1310
+ let rowIdx = 1;
1311
+ if (Array.isArray(sheet.headers) && sheet.headers.length > 0) {
1312
+ const hdrRow = ws.getRow(rowIdx);
1313
+ sheet.headers.forEach((h, i) => {
1314
+ const cell = hdrRow.getCell(i + 1);
1315
+ cell.value = h;
1316
+ cell.font = { bold: true };
1317
+ });
1318
+ rowIdx++;
1319
+ }
1320
+ for (const r of sheet.rows) {
1321
+ const row = ws.getRow(rowIdx);
1322
+ if (Array.isArray(r)) {
1323
+ r.forEach((v, i) => {
1324
+ const lossy = {};
1325
+ row.getCell(i + 1).value = buildCellValue(v, lossy);
1326
+ if (lossy.sharedFormula || lossy.crlf) {
1327
+ track(sheet.name, `${colLetter(i+1)}${rowIdx}`, lossy);
1328
+ }
1329
+ });
1330
+ } else if (r && typeof r === 'object') {
1331
+ // Object form: { col1: val, col2: val }, keyed by header name.
1332
+ if (Array.isArray(sheet.headers)) {
1333
+ sheet.headers.forEach((h, i) => {
1334
+ if (r[h] !== undefined) {
1335
+ const lossy = {};
1336
+ row.getCell(i + 1).value = buildCellValue(r[h], lossy);
1337
+ if (lossy.sharedFormula || lossy.crlf) {
1338
+ track(sheet.name, `${colLetter(i+1)}${rowIdx}`, lossy);
1339
+ }
1340
+ }
1341
+ });
1342
+ }
1343
+ }
1344
+ rowIdx++;
1345
+ }
1346
+ }
1347
+
1348
+ if (sheet.numberFormat && typeof sheet.numberFormat === 'object') {
1349
+ for (const [ref, fmt] of Object.entries(sheet.numberFormat)) {
1350
+ try { applyNumberFormat(ws, ref, fmt); } catch (_) {}
1351
+ }
1352
+ }
1353
+
1354
+ if (Array.isArray(sheet.merges)) {
1355
+ for (const m of sheet.merges) {
1356
+ try { ws.mergeCells(m); } catch (_) {}
1357
+ }
1358
+ }
1359
+
1360
+ if (sheet.autoFilter) {
1361
+ ws.autoFilter = sheet.autoFilter;
1362
+ }
1363
+
1364
+ // Sheet-level named ranges (the shape --json output produces:
1365
+ // [{name, ranges: ["Sheet1!$A$1:$D$10"]}, ...])
1366
+ if (Array.isArray(sheet.namedRanges)) {
1367
+ for (const nr of sheet.namedRanges) {
1368
+ if (!nr.name || !Array.isArray(nr.ranges)) continue;
1369
+ for (const ref of nr.ranges) {
1370
+ try { wb.definedNames.add(ref, nr.name); } catch (_) {}
1371
+ }
1372
+ }
1373
+ }
1374
+ }
1375
+
1376
+ // Workbook-level named ranges (concise spec form: { "Totals": "Sheet1!B2:C5" })
1377
+ if (spec.namedRanges && typeof spec.namedRanges === 'object' && !Array.isArray(spec.namedRanges)) {
1378
+ for (const [name, ref] of Object.entries(spec.namedRanges)) {
1379
+ try { wb.definedNames.add(ref, name); } catch (_) {}
1380
+ }
1381
+ }
1382
+ // Workbook-level array form (also from --json)
1383
+ if (Array.isArray(spec.namedRanges)) {
1384
+ for (const nr of spec.namedRanges) {
1385
+ if (!nr.name || !Array.isArray(nr.ranges)) continue;
1386
+ for (const ref of nr.ranges) {
1387
+ try { wb.definedNames.add(ref, nr.name); } catch (_) {}
1388
+ }
1389
+ }
1390
+ }
1391
+
1392
+ return { wb, warnings };
1393
+ }
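
To make the two sheet modes of `buildWorkbook` concrete, here is an illustrative spec that uses only fields the function reads. The sheet names, values, and formats are invented for the example. `validateSpec` is not part of this hunk, so it may impose requirements this sketch does not cover.

```javascript
// Illustrative spec: one tabular sheet, one per-cell sheet.
const spec = {
  sheets: [
    {
      // Tabular mode: optional headers plus rows (arrays or header-keyed objects).
      name: 'Summary',
      frozen: { rowSplit: 1 },
      columnWidths: { A: 24, B: 12 },
      headers: ['Item', 'Amount'],
      rows: [
        ['Widgets', 120],
        { Item: 'Gadgets', Amount: 80 },
      ],
      numberFormat: { 'B:B': '#,##0.00' },
    },
    {
      // Per-cell mode: taken whenever `cells` is an array.
      name: 'Detail',
      cells: [
        { ref: 'A1', value: 'Total', bold: true },
        { ref: 'B1', value: { formula: '=SUM(B2:B3)' }, numFmt: '#,##0.00' },
        { ref: 'B2', value: 120 },
        { ref: 'B3', value: 80 },
      ],
      merges: ['A5:B5'],
    },
  ],
  // Concise workbook-level form: name -> range.
  namedRanges: { Totals: 'Detail!B1:B3' },
};

console.log(JSON.stringify(spec, null, 2));
```

A sheet that carries a `cells` array is always built in per-cell mode. Its `headers` and `rows`, if present, are ignored.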
1394
+
1395
+ // Per-issue review templates. Each entry follows the "supervisor leaves a
1396
+ // review note" shape: what happened, what we did, the risk, the tradeoff, and
1397
+ // how to override. Keeps the user in the decision seat.
1398
+ const REPORT_REVIEWS = {
1399
+ sharedFormula: {
1400
+ title: 'Shared formula degradation',
1401
+ whatHappened:
1402
+ "The source file used Excel's shared-formula optimization — one master cell carries the formula, follower cells reference the master. ExcelJS cannot reconstruct that link in the output file.",
1403
+ whatWeDid:
1404
+ 'Each follower cell was replaced with its cached value (or left empty when no cached value was present). You will see the same values in Excel as before; the formula link itself is gone.',
1405
+ risk:
1406
+ 'If you edit any cell the original formula depended on, the previously-shared cells will not recalculate; they are now hardcoded values, not formulas.',
1407
+ tradeoff:
1408
+ 'Smaller file, but the spreadsheet is "frozen": adding rows or changing inputs will not propagate the way they used to.',
1409
+ alternative:
1410
+ 'Rerun with --fix-shared-formulas=expand (planned for v1.5). Each follower becomes an explicit per-cell formula — slightly larger file, but each cell recalculates independently like hand-written formulas. Closest behavior to the original source.',
1411
+ },
1412
+ crlf: {
1413
+ title: 'CRLF → LF line-ending normalization',
1414
+ whatHappened:
1415
+ 'The source file had Windows-style CRLF line endings (\\r\\n) inside cell text. ExcelJS normalizes these to Unix-style LF (\\n) when writing shared strings.',
1416
+ whatWeDid:
1417
+ 'Each affected cell\'s text now uses LF instead of CRLF. Visible content is identical — Excel, Numbers, and LibreOffice render both the same way.',
1418
+ risk:
1419
+ 'No risk to the spreadsheet content itself. Only matters if a downstream tool does byte-exact comparison or specifically processes \\r\\n (e.g., greps for Windows-encoded text).',
1420
+ tradeoff:
1421
+ 'None visible in spreadsheet apps. The output is also marginally smaller.',
1422
+ alternative:
1423
+ 'If your pipeline requires CRLF preservation, pre-process source strings to substitute a placeholder before extracting --json, then restore after writing. Or simply ignore — this is the most cosmetic of the round-trip artifacts.',
1424
+ },
1425
+ };
1426
+
1427
+ // Add a "_xlsx-for-ai" tab to the workbook with a review-style report of any
1428
+ // round-trip lossy events. Embedded in the file (not just stderr) so the
1429
+ // feedback travels with the workbook. Each issue type gets a full review note
1430
+ // (what happened, what we did, risk, tradeoff, alternative) so the user can
1431
+ // understand the decisions and override if they prefer different behavior.
1432
+ function addReportSheet(wb, warnings) {
1433
+ if (warnings.length === 0) return;
1434
+
1435
+ const ws = wb.addWorksheet('_xlsx-for-ai');
1436
+
1437
+ // Header rows
1438
+ ws.getCell('A1').value = 'xlsx-for-ai write report';
1439
+ ws.getCell('A1').font = { bold: true, size: 14 };
1440
+ ws.mergeCells('A1:D1');
1441
+
1442
+ ws.getCell('A2').value = `Generated ${new Date().toISOString().slice(0, 19).replace('T', ' ')} UTC`;
1443
+ ws.getCell('A2').font = { italic: true, color: { argb: 'FF666666' } };
1444
+ ws.mergeCells('A2:D2');
1445
+
1446
+ ws.getCell('A3').value =
1447
+ 'This file passed through xlsx-for-ai write. The sections below describe what changed during the round-trip, why, and how to override if you want different behavior. Cell values you see in the rest of the workbook are correct — these notes describe structural changes (formulas, line endings, etc.) that may matter for future edits.';
1448
+ ws.getCell('A3').font = { italic: true, color: { argb: 'FF666666' } };
1449
+ ws.getCell('A3').alignment = { wrapText: true, vertical: 'top' };
1450
+ ws.mergeCells('A3:D3');
1451
+ ws.getRow(3).height = 60;
1452
+
1453
+ // Group warnings by type
1454
+ const byType = {};
1455
+ for (const w of warnings) (byType[w.type] = byType[w.type] || []).push(w);
1456
+
1457
+ let r = 5;
1458
+
1459
+ // Per-issue review block
1460
+ for (const [type, group] of Object.entries(byType)) {
1461
+ const review = REPORT_REVIEWS[type] || {
1462
+ title: type,
1463
+ whatHappened: 'Unspecified round-trip change.',
1464
+ whatWeDid: '(no template available)',
1465
+ risk: '(unknown)',
1466
+ tradeoff: '(unknown)',
1467
+ alternative: '(none)',
1468
+ };
1469
+
1470
+ // Issue heading bar
1471
+ ws.getCell(`A${r}`).value = `Issue: ${review.title} (${group.length} cell${group.length === 1 ? '' : 's'})`;
1472
+ ws.getCell(`A${r}`).font = { bold: true, size: 12 };
1473
+ ws.getCell(`A${r}`).fill = { type: 'pattern', pattern: 'solid', fgColor: { argb: 'FFE7F0F8' } };
1474
+ ws.mergeCells(`A${r}:D${r}`);
1475
+ r++;
1476
+
1477
+ const addProse = (label, body) => {
1478
+ ws.getCell(`A${r}`).value = label;
1479
+ ws.getCell(`A${r}`).font = { bold: true };
1480
+ ws.getCell(`A${r}`).alignment = { vertical: 'top', wrapText: true };
1481
+ ws.getCell(`B${r}`).value = body;
1482
+ ws.getCell(`B${r}`).alignment = { wrapText: true, vertical: 'top' };
1483
+ ws.mergeCells(`B${r}:D${r}`);
1484
+ // Approximate height: assume ~95 characters fit on one wrapped line of the
1485
+ // merged B:D span, at 15pt per line (row heights are in points). Minimum 2 lines, capped at 120pt.
1486
+ const lines = Math.max(2, Math.ceil(body.length / 95));
1487
+ ws.getRow(r).height = Math.min(lines * 15, 120);
1488
+ r++;
1489
+ };
1490
+
1491
+ addProse('What happened', review.whatHappened);
1492
+ addProse('What we did', review.whatWeDid);
1493
+ addProse('Risk', review.risk);
1494
+ addProse('Tradeoff', review.tradeoff);
1495
+ addProse('Alternative', review.alternative);
1496
+
1497
+ // Compact "affected cells" summary
1498
+ const cellList = group.map(w => `${w.sheet}!${w.ref}`);
1499
+ const cellSummary = cellList.length <= 10
1500
+ ? cellList.join(', ')
1501
+ : `${cellList.slice(0, 10).join(', ')}, ... and ${cellList.length - 10} more (full list at the bottom of this sheet)`;
1502
+ addProse('Affected cells', cellSummary);
1503
+
1504
+ // Spacer row between issue blocks
1505
+ r++;
1506
+ }
1507
+
1508
+ // Full detail table
1509
+ ws.getCell(`A${r}`).value = 'Full detail (one row per affected cell)';
1510
+ ws.getCell(`A${r}`).font = { bold: true, size: 12 };
1511
+ ws.mergeCells(`A${r}:D${r}`);
1512
+ r++;
1513
+
1514
+ ws.getCell(`A${r}`).value = 'Sheet';
1515
+ ws.getCell(`B${r}`).value = 'Cell';
1516
+ ws.getCell(`C${r}`).value = 'Issue type';
1517
+ ws.getCell(`D${r}`).value = 'Title';
1518
+ ['A','B','C','D'].forEach(c => {
1519
+ ws.getCell(`${c}${r}`).font = { bold: true };
1520
+ ws.getCell(`${c}${r}`).fill = { type: 'pattern', pattern: 'solid', fgColor: { argb: 'FFEEEEEE' } };
1521
+ });
1522
+ r++;
1523
+
1524
+ const MAX_DETAIL = 1000;
1525
+ const detailRows = warnings.slice(0, MAX_DETAIL);
1526
+ for (const w of detailRows) {
1527
+ ws.getCell(`A${r}`).value = w.sheet;
1528
+ ws.getCell(`B${r}`).value = w.ref;
1529
+ ws.getCell(`C${r}`).value = w.type;
1530
+ ws.getCell(`D${r}`).value = (REPORT_REVIEWS[w.type] && REPORT_REVIEWS[w.type].title) || w.type;
1531
+ r++;
1532
+ }
1533
+ if (warnings.length > MAX_DETAIL) {
1534
+ ws.getCell(`A${r}`).value = `... and ${warnings.length - MAX_DETAIL} more (totals shown in the issue blocks above)`;
1535
+ ws.getCell(`A${r}`).font = { italic: true };
1536
+ ws.mergeCells(`A${r}:D${r}`);
1537
+ }
1538
+
1539
+ // Column widths
1540
+ ws.getColumn(1).width = 18;
1541
+ ws.getColumn(2).width = 12;
1542
+ ws.getColumn(3).width = 18;
1543
+ ws.getColumn(4).width = 80;
1544
+ }
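
The grouping and "affected cells" summary inside `addReportSheet` can be sketched without ExcelJS. `summarizeWarnings` is a hypothetical helper, and the `limit` parameter stands in for the hard-coded 10 in the source.

```javascript
// Group warnings by type, then list up to `limit` cells per type.
function summarizeWarnings(warnings, limit = 10) {
  const byType = {};
  for (const w of warnings) (byType[w.type] = byType[w.type] || []).push(w);

  const summary = {};
  for (const [type, group] of Object.entries(byType)) {
    const cells = group.map(w => `${w.sheet}!${w.ref}`);
    summary[type] = cells.length <= limit
      ? cells.join(', ')
      : `${cells.slice(0, limit).join(', ')}, ... and ${cells.length - limit} more`;
  }
  return summary;
}

// Twelve shared-formula followers (C2..C13) plus one CRLF cell.
const warnings = [];
for (let row = 2; row <= 13; row++) {
  warnings.push({ type: 'sharedFormula', sheet: 'Data', ref: `C${row}` });
}
warnings.push({ type: 'crlf', sheet: 'Notes', ref: 'A1' });

console.log(summarizeWarnings(warnings));
```

Truncating the per-issue list keeps each review block short. The detail table at the bottom of the report sheet has its own, separate cap (`MAX_DETAIL`).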
1545
+
1546
+ function readStdinAll() {
1547
+ return new Promise((resolve, reject) => {
1548
+ let data = '';
1549
+ process.stdin.setEncoding('utf8');
1550
+ process.stdin.on('data', chunk => { data += chunk; });
1551
+ process.stdin.on('end', () => resolve(data));
1552
+ process.stdin.on('error', reject);
1553
+ });
1554
+ }
1555
+
1556
+ async function readSpecText(specPath) {
1557
+ if (specPath === '-') return readStdinAll();
1558
+ if (!fs.existsSync(specPath)) {
1559
+ throw new Error(`Spec file not found: ${specPath}`);
1560
+ }
1561
+ return fs.readFileSync(specPath, 'utf8');
1562
+ }
1563
+
1564
+ async function loadSpec(specPath) {
1565
+ const text = await readSpecText(specPath);
1566
+ const trimmed = text.trim();
1567
+ if (trimmed.startsWith('{') || trimmed.startsWith('[')) {
1568
+ let parsed;
1569
+ try { parsed = JSON.parse(trimmed); }
1570
+ catch (e) { throw new Error(`Spec is not valid JSON: ${e.message}`); }
1571
+ return parsed;
1572
+ }
1573
+ return parseMarkdownSpec(text);
1574
+ }
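
`loadSpec` picks the parser from the first non-whitespace character alone. A minimal sketch of that decision, with `detectSpecFormat` as a hypothetical name:

```javascript
// Same test as loadSpec: a leading '{' or '[' means JSON, anything else markdown.
function detectSpecFormat(text) {
  const trimmed = text.trim();
  return trimmed.startsWith('{') || trimmed.startsWith('[') ? 'json' : 'markdown';
}

console.log(detectSpecFormat('  {"sheets": []}'));
console.log(detectSpecFormat('| Item | Amount |\n| --- | --- |'));
console.log(detectSpecFormat('[See notes](notes.md)\n\n| a | b |'));
```

One consequence of this heuristic: a markdown spec whose first line starts with `[`, such as a link, is routed to `JSON.parse` and fails with the "Spec is not valid JSON" error instead of reaching the markdown parser.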
1575
+
1576
+ async function mainWrite(argv) {
1577
+ const opts = parseWriteArgs(argv);
1578
+ if (opts.help) { printWriteHelp(); process.exit(0); }
1579
+ if (opts.positional.length < 1) { printWriteHelp(); process.exit(1); }
1580
+
1581
+ const specPath = opts.positional[0];
1582
+
1583
+ let spec;
1584
+ try {
1585
+ spec = await loadSpec(specPath);
1586
+ spec = validateSpec(spec);
1587
+ } catch (e) {
1588
+ console.error(`Spec error: ${e.message}`);
1589
+ process.exit(1);
1590
+ }
1591
+
1592
+ let wb, warnings;
1593
+ try {
1594
+ ({ wb, warnings } = buildWorkbook(spec));
1595
+ } catch (e) {
1596
+ console.error(`Build error: ${e.message}`);
1597
+ process.exit(1);
1598
+ }
1599
+
1600
+ // Embed a review-style report tab in the file when there are round-trip
1601
+ // warnings, so the feedback travels with the workbook for the human or agent
1602
+ // that opens it. `--no-report` suppresses for pipelines that don't want the
1603
+ // extra sheet (e.g. round-trip CI tests).
1604
+ if (!opts.noReport) {
1605
+ addReportSheet(wb, warnings);
1606
+ }
1607
+
1608
+ let outPath = opts.output;
1609
+ if (!outPath) {
1610
+ if (specPath === '-') outPath = 'output.xlsx';
1611
+ else outPath = path.basename(specPath, path.extname(specPath)) + '.xlsx';
1612
+ }
1613
+ outPath = path.resolve(outPath);
1614
+
1615
+ try {
1616
+ await wb.xlsx.writeFile(outPath);
1617
+ } catch (e) {
1618
+ console.error(`Write error: ${e.message}`);
1619
+ process.exit(1);
1620
+ }
1621
+ console.log(outPath);
1622
+ if (warnings.length > 0) {
1623
+ console.error(`note: ${warnings.length} round-trip warning(s) written to '_xlsx-for-ai' sheet in the output.`);
1624
+ }
1625
+ }
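
When no output path is given, `mainWrite` derives one from the spec path. The sketch below isolates that rule. `defaultOutputName` is a hypothetical name, and the final `path.resolve` step from the source is omitted so the result does not depend on the current directory.

```javascript
const path = require('path');

// Default output file name, before it is resolved against the working directory.
function defaultOutputName(specPath) {
  if (specPath === '-') return 'output.xlsx'; // spec came from stdin
  return path.basename(specPath, path.extname(specPath)) + '.xlsx';
}

console.log(defaultOutputName('specs/report.json'));
console.log(defaultOutputName('table.md'));
console.log(defaultOutputName('-'));
```

Because `path.basename` drops the directory part, the default output is written to the current working directory, not next to the spec file.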
1626
+
1627
+ // ---------------------------------------------------------------------------
1628
+ // Main
1629
+ // ---------------------------------------------------------------------------
1630
+
991
1631
  async function main() {
992
- const opts = parseArgs(process.argv.slice(2));
1632
+ const argv = process.argv.slice(2);
1633
+
1634
+ // Sub-command dispatch
1635
+ if (argv[0] === 'write') return mainWrite(argv.slice(1));
1636
+
1637
+ const opts = parseArgs(argv);
993
1638
 
994
1639
  if (opts.help) { printHelp(); process.exit(0); }
995
1640
  if (opts.positional.length < 1) { printHelp(); process.exit(1); }
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "xlsx-for-ai",
3
- "version": "1.3.0",
3
+ "version": "1.4.0",
4
4
  "description": "CLI that converts .xlsx files into rich text or JSON dumps that AI coding agents (Claude, Cursor, Copilot, ChatGPT, etc.) can read — preserving values, formulas, formatting, colors, column widths, frozen panes, named ranges, tables, and more.",
5
5
  "main": "index.js",
6
6
  "bin": {
@@ -11,6 +11,7 @@
11
11
  "index.js",
12
12
  "cursor-rule-template",
13
13
  "README.md",
14
+ "WHY.md",
14
15
  "LICENSE"
15
16
  ],
16
17
  "keywords": [