xlsx-for-ai 1.3.0 → 1.3.1

package/README.md CHANGED
@@ -1,5 +1,7 @@
1
1
  # xlsx-for-ai
2
2
 
3
+ > 👋 **New here? Not a programmer?** → [Read WHY.md for the plain-English version](WHY.md). The README below is the technical reference.
4
+
3
5
  Converts spreadsheets into text, **markdown**, JSON, SQL, or schema dumps that AI coding agents can actually read.
4
6
 
5
7
  AI tools — Claude, Cursor, Copilot, ChatGPT, and other LLM coding agents — can read text files but **not** `.xlsx` binaries. This CLI bridges the gap.
package/WHY.md ADDED
@@ -0,0 +1,47 @@
1
+ # Why xlsx-for-ai exists
2
+
3
+ *A plain-English version. For the technical reference, see [README.md](README.md).*
4
+
5
+ ## The problem you've probably hit
6
+
7
+ You have a spreadsheet — a budget, a financial model, a tax estimate, a list of customers. You ask Claude (or ChatGPT, or Cursor) for help with it.
8
+
9
+ So you copy and paste a section into the chat. The AI gives you advice that sounds reasonable but feels generic. It misses the broken formula in row 47. It doesn't notice that one tab's totals don't match another tab's source. It can't tell you why the gross margin number changes when you add a new column. It treats your spreadsheet as a blob of numbers — because that's all it can see.
10
+
11
+ You're not going crazy. The AI literally cannot read the file. It can read text, code, even images of your spreadsheet — but the actual `.xlsx` binary is invisible to it. Formulas, formatting, named ranges, links between sheets — all of that disappears the moment you hit copy-paste.
12
+
13
+ ## What changes when you install this
14
+
15
+ Once `xlsx-for-ai` is on your machine, your AI tools (Claude, Cursor, Copilot, ChatGPT desktop apps with code execution) can finally **read your spreadsheet the way they read everything else** — every formula, every colored cell, every hidden row, every formula reference between sheets.
16
+
17
+ Now when you ask for help, you get a real review:
18
+
19
+ - *"Cell B47 has `#REF!` — it's pointing at a sheet you renamed last week."*
20
+ - *"Your gross margin formula in row 12 references the wrong column on the COGS tab — it's pulling Q3 numbers into the Q4 totals."*
21
+ - *"This 'Total' cell on the Summary tab shows $312k, but if I add up the source rows on the Detail tab I get $327k. Something's off."*
22
+
23
+ That's the difference between a friend skimming the printed numbers and an analyst who actually opens the file.
24
+
25
+ ## Things that become possible
26
+
27
+ A few examples people find useful:
28
+
29
+ - **Have your AI find errors in a financial model** before you send it to your accountant or your board.
30
+ - **Compare two versions of the same spreadsheet** ("what changed between V11 and V14?") and get a list of every cell that moved.
31
+ - **Turn a CSV export from QuickBooks into a clean SQL database table** in one command, with the column types figured out automatically.
32
+ - **Walk through a 50-tab model someone else built** and have the AI explain how the sheets reference each other.
33
+ - **Process a folder of legacy `.xls` files** that won't even open in modern Excel without complaint.
34
+
35
+ ## How to actually use it
36
+
37
+ It's a small command-line tool. Once a programmer sets it up (one line: `npm install -g xlsx-for-ai`), you don't have to think about it again — your AI tools pick it up automatically and start using it whenever they encounter a spreadsheet.
38
+
39
+ If you're the programmer doing the install, the [README](README.md) has the full reference. If you're handing this to a programmer to set up for you, that link is what they'll need.
40
+
41
+ ## Why this didn't exist before
42
+
43
+ Spreadsheet libraries are designed for developers building software *on top of* spreadsheets. They output JavaScript objects, database rows, raw bytes — formats other programs consume. None of them were designed for the case where the consumer is a language model and the goal is a text format the model can actually understand.
44
+
45
+ `xlsx-for-ai` is the first one built specifically for that. The output is shaped for an LLM's context window — markdown tables when the model just needs to read, structured JSON when it needs to reason, token-aware truncation when the spreadsheet is too big to fit.
46
+
47
+ It's a small tool. It just happens to fix the one thing standing between AI assistants and the file format most knowledge work actually lives in.
@@ -1,50 +1,100 @@
1
1
  ---
2
- description: Reading .xlsx spreadsheet files
2
+ description: Reading and converting spreadsheets (.xlsx, .xls, .xlsb, .ods, .csv, .tsv) for AI agents
3
3
  globs:
4
4
  alwaysApply: true
5
5
  ---
6
6
 
7
- # Reading .xlsx Files
8
-
9
- The Read tool cannot open `.xlsx` files directly. When you need to inspect a spreadsheet:
10
-
11
- 1. **Run the converter** from the project root:
12
- ```bash
13
- npx xlsx-for-ai <path-to-file.xlsx> [sheetName]
14
- ```
15
- - If `sheetName` is omitted, all sheets are dumped.
16
- - Output is written to `.xlsx-read/<filename>--<sheet>.txt` (project root).
17
- - The script prints the output path(s) to stdout.
18
-
19
- 2. **Read the output file** with the Read tool. It contains:
20
- - Sheet metadata (frozen panes, column widths, merged cells, auto-filters, print areas)
21
- - Named ranges referencing the sheet
22
- - Table definitions (name, range, columns)
23
- - Image positions
24
- - Every row with its cells, showing:
25
- - **Value** — always present
26
- - **Formula** — `[formula: =SUM(A1:A10)]` (master cell) or `[shared formula ref: D2]` (drag-fill follow-up)
27
- - **Number format** — `[numFmt: 0.00%]` if not "General"
28
- - **Font** — `[bold]`, `[italic]`, `[color:FF8B0000]`
29
- - **Fill** — `[fill:FFFFFF00]` if background color set
30
- - **Alignment** — `[align:center]` if non-default
31
- - **Hyperlink** — `[link: https://...]` if the cell contains a URL
32
- - **Comment** — `[note: ...]` if the cell has a comment or note
33
- - **Validation** — `[validation: list [...]]` if the cell has data validation
34
- - **Hidden** — `[hidden]` on the row header if the row is hidden
35
- - Empty cells are omitted; empty rows show `(empty)`.
36
-
37
- 3. **Do not ask the user** before running this. Just run it when you need to see an `.xlsx` file.
38
-
39
- ## Useful flags
40
- - `--list-sheets` — list sheet names and dimensions without dumping content
41
- - `--stdout` — print directly to stdout instead of writing files
42
- - `--json` — emit structured JSON (one object per cell) for easier programmatic use
43
- - `--compact` — suppress noisy default tags (default text colors, white fills) to reduce token usage
44
- - `--max-rows N` — limit output to first N rows (use for large sheets)
45
- - `--max-cols N` — limit output to first N columns (use for very wide sheets)
7
+ # Reading Spreadsheet Files
8
+
9
+ The Read tool cannot open binary spreadsheet files directly. When you need to inspect or process a spreadsheet, use `xlsx-for-ai`.
10
+
11
+ **Supported input:** `.xlsx` `.xls` `.xlsb` `.ods` `.csv` `.tsv`
12
+
13
+ ## Basic usage
14
+
15
+ ```bash
16
+ npx xlsx-for-ai <file> [sheetName]
17
+ ```
18
+
19
+ - If `sheetName` is omitted, all sheets are dumped.
20
+ - Default output: text dump to `.xlsx-read/<filename>--<sheet>.txt` (project root).
21
+ - Path(s) printed to stdout for the agent to read next.
22
+
23
+ **Do not ask the user before running this.** Just run it when you encounter a spreadsheet.
24
+
25
+ ## Pick the right output mode
26
+
27
+ | When you want… | Use | Why |
28
+ |---|---|---|
29
+ | To read a sheet for context | `--md --stdout` | Markdown tables — best LLM comprehension per token |
30
+ | Programmatic per-cell access | `--json --stdout` | Structured, one object per cell with formula/format/style |
31
+ | To bound output to a context window | `--max-tokens 8000 --stdout` | Truncates with a tail summary noting what was dropped |
32
+ | To understand columns / build a query | `--schema --stdout` | Inferred types per column (INTEGER / NUMERIC / DATE / BOOLEAN / TEXT) |
33
+ | To import data into a database | `--sql --stdout` | `CREATE TABLE` + `INSERT` statements, types from --schema |
34
+ | To compare two versions | `--diff OTHER --stdout` | Emits added/removed/changed sheets and cells |
35
+ | To list sheets without parsing | `--list-sheets` | Fast probe; lighter than full read |
36
+ | To handle huge files (>100MB) | `--stream --stdout` | Row-by-row reader, drops some sheet metadata |
37
+
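Reviewer note: the `--schema` row in the table above lists the type labels (INTEGER / NUMERIC / DATE / BOOLEAN / TEXT) but the diff does not show how a column's type is chosen. The sketch below is one plausible approach, written as an assumption and not taken from the package source: try each label from narrowest to widest and keep the first one that every non-empty cell satisfies. The function name `infer_type` and the single accepted date format are hypothetical.

```python
from datetime import datetime


def infer_type(values):
    """Return the narrowest type label that fits every non-empty value.

    Illustrative only: the real CLI may use different rules and date formats.
    """
    cells = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not cells:
        return "TEXT"  # nothing to infer from, fall back to the widest type

    def is_bool(cell):
        return cell.lower() in {"true", "false"}

    def is_int(cell):
        try:
            int(cell)
            return True
        except ValueError:
            return False

    def is_numeric(cell):
        try:
            float(cell)
            return True
        except ValueError:
            return False

    def is_date(cell):
        # Assumption: only ISO dates (YYYY-MM-DD) count as DATE in this sketch.
        try:
            datetime.strptime(cell, "%Y-%m-%d")
            return True
        except ValueError:
            return False

    checks = (
        ("BOOLEAN", is_bool),
        ("INTEGER", is_int),
        ("NUMERIC", is_numeric),
        ("DATE", is_date),
    )
    for label, check in checks:
        if all(check(cell) for cell in cells):
            return label
    return "TEXT"
```

A single stray value such as `n/a` in an otherwise numeric column pushes the whole column to TEXT under this rule, which is worth keeping in mind before feeding `--schema` output into `--sql`.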
38
+ ## Selection (focus the output)
39
+
40
+ | Flag | Effect |
41
+ |---|---|
42
+ | `[sheetName]` | Positional second arg — only this sheet |
43
+ | `--range A1:D50` | Only this rectangular range |
44
+ | `--named-range NAME` | Only the cells covered by a workbook-defined name |
45
+ | `--max-rows N` | Cap rows per sheet |
46
+ | `--max-cols N` | Cap columns per sheet |
47
+
48
+ Combine selection flags with any output mode. Use `--range` aggressively when you only need a section of a large model — it can reduce context 50× vs. dumping the whole sheet.
49
+
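Reviewer note: to make the size argument for `--range` concrete, the sketch below parses an A1-style range and counts the cells it covers. It is an illustration under stated assumptions, not the CLI's parser: the helper names (`col_to_index`, `parse_range`, `cell_count`) are hypothetical, and only plain `A1:D50` references are handled (no sheet prefixes, no `$` anchors).

```python
import re


def col_to_index(letters):
    """Convert a column label such as 'A' or 'AA' to a 1-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_range(ref):
    """Split 'B5:H50' into (first_row, first_col, last_row, last_col), 1-based."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", ref)
    if match is None:
        raise ValueError(f"not an A1-style range: {ref!r}")
    c1, r1, c2, r2 = match.groups()
    # Sort so that a reversed reference like 'H50:B5' still yields a valid box.
    rows = sorted((int(r1), int(r2)))
    cols = sorted((col_to_index(c1), col_to_index(c2)))
    return rows[0], cols[0], rows[1], cols[1]


def cell_count(ref):
    """Number of cells inside the rectangular range."""
    first_row, first_col, last_row, last_col = parse_range(ref)
    return (last_row - first_row + 1) * (last_col - first_col + 1)
```

For a hypothetical sheet of 1,000 rows by 26 columns (26,000 cells), `cell_count("B5:H50")` gives 322 cells, roughly 80 times fewer, which is the same order of magnitude as the reduction claimed in the added text.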
50
+ ## Other useful flags
51
+
52
+ - `--compact` — suppress noisy default tags (default colors, "General" format, white fills)
53
+ - `--evaluate` — promote cached formula results to the primary value; recompute simple formulas via formulajs
54
+ - `--stdout` — print directly instead of writing files
55
+
56
+ ## What the default text dump contains
57
+
58
+ - Sheet metadata (frozen panes, column widths, merged cells, auto-filters, print areas)
59
+ - Named ranges referencing the sheet
60
+ - Table definitions (name, range, columns)
61
+ - Image positions
62
+ - Every non-empty row with its cells, showing:
63
+ - **Value** — always present
64
+ - **Formula** — `[formula: =SUM(A1:A10)]` (master) or `[shared formula ref: D2]` (drag-fill follow-up)
65
+ - **Number format** — `[numFmt: 0.00%]` if not "General"
66
+ - **Font** — `[bold]`, `[italic]`, `[color:FF8B0000]`
67
+ - **Fill** — `[fill:FFFFFF00]`
68
+ - **Alignment** — `[align:center]` if non-default
69
+ - **Hyperlink** — `[link: https://...]`
70
+ - **Comment** — `[note: ...]`
71
+ - **Validation** — `[validation: list [...]]`
72
+ - **Hidden** — `[hidden]` on the row header
73
+
74
+ ## Examples
75
+
76
+ ```bash
77
+ # Quick read of a financial model — markdown is most context-efficient
78
+ npx xlsx-for-ai model.xlsx --md --stdout
79
+
80
+ # Just one sheet, fits in a small context window
81
+ npx xlsx-for-ai model.xlsx "Assumptions" --md --stdout --max-tokens 4000
82
+
83
+ # Schema for a CSV before writing SQL
84
+ npx xlsx-for-ai data.csv --schema --stdout
85
+
86
+ # Surgical extraction — only one section of a huge sheet
87
+ npx xlsx-for-ai model.xlsx "Detail" --range B5:H50 --stdout
88
+
89
+ # Compare two model versions, get a change summary
90
+ npx xlsx-for-ai v1.xlsx --diff v2.xlsx --stdout
91
+
92
+ # Huge file (>100MB) — streaming mode keeps memory bounded
93
+ npx xlsx-for-ai dump.xlsx --stream --stdout --max-rows 1000
94
+ ```
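Reviewer note: the `--diff` example above is described as emitting added, removed, and changed cells. The diff does not show the output format, so the sketch below only illustrates the comparison itself for a single sheet. The `{address: value}` input shape and the name `diff_cells` are assumptions made for this example.

```python
def diff_cells(old, new):
    """Compare two {address: value} mappings for one sheet.

    Returns cells present only in `new` (added), only in `old` (removed),
    and cells present in both whose value differs (changed, as old/new pairs).
    """
    added = {addr: new[addr] for addr in new.keys() - old.keys()}
    removed = {addr: old[addr] for addr in old.keys() - new.keys()}
    changed = {
        addr: (old[addr], new[addr])
        for addr in old.keys() & new.keys()
        if old[addr] != new[addr]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Usage with two tiny hypothetical versions of a sheet.
v1 = {"A1": 1, "B1": "=A1*2", "C1": 5}
v2 = {"A1": 1, "B1": "=A1*3", "D1": 7}
result = diff_cells(v1, v2)
```

Comparing formulas as strings, as done here, reports an edited formula as changed even when its computed value happens to stay the same.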
46
95
 
47
96
  ## Important
48
- - Output goes to `.xlsx-read/` in the current working directory — make sure this directory is in your `.gitignore`.
49
- - For large files, use `--max-rows`, `--max-cols`, or request a single sheet to keep output manageable.
97
+
98
+ - Output goes to `.xlsx-read/` in the current working directory — add this to `.gitignore`.
99
+ - For huge files, prefer `--max-tokens` over `--max-rows` if you're targeting an LLM context window — token count and row count don't correlate.
50
100
  - The package was previously named `cursor-reads-xlsx` — that command name still works as an alias.
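Reviewer note: the new guidance to prefer `--max-tokens` over `--max-rows` rests on row count being a poor proxy for token count. The sketch below shows why, using a deliberately crude estimate of about four characters per token. That ratio, the function names, and the wording of the tail summary are all assumptions for illustration; the diff does not show how the CLI counts tokens or phrases its summary.

```python
def estimate_tokens(text):
    """Very rough token estimate: about four characters per token (assumed)."""
    return max(1, len(text) // 4)


def truncate_rows(rows, max_tokens):
    """Keep leading rows while the running estimate stays within max_tokens.

    Returns the kept rows and a one-line summary of what was dropped
    (an empty string when nothing was dropped).
    """
    kept, used = [], 0
    for row in rows:
        cost = estimate_tokens(row)
        if used + cost > max_tokens:
            break
        kept.append(row)
        used += cost
    dropped = len(rows) - len(kept)
    summary = f"[truncated: {dropped} of {len(rows)} rows omitted]" if dropped else ""
    return kept, summary


# Same row count, very different token cost.
narrow_rows = ["1 | 2"] * 100       # about 1 estimated token per row
wide_rows = ["x" * 400] * 100       # about 100 estimated tokens per row
narrow_kept, narrow_summary = truncate_rows(narrow_rows, 50)
wide_kept, wide_summary = truncate_rows(wide_rows, 50)
```

With a 50-token budget, half of the narrow rows fit while not even one wide row does, so a fixed `--max-rows` value can either waste the budget or overflow it depending on the sheet.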
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "xlsx-for-ai",
3
- "version": "1.3.0",
3
+ "version": "1.3.1",
4
4
  "description": "CLI that converts .xlsx files into rich text or JSON dumps that AI coding agents (Claude, Cursor, Copilot, ChatGPT, etc.) can read — preserving values, formulas, formatting, colors, column widths, frozen panes, named ranges, tables, and more.",
5
5
  "main": "index.js",
6
6
  "bin": {
@@ -11,6 +11,7 @@
11
11
  "index.js",
12
12
  "cursor-rule-template",
13
13
  "README.md",
14
+ "WHY.md",
14
15
  "LICENSE"
15
16
  ],
16
17
  "keywords": [