@heylemon/lemonade 0.1.6 → 0.1.7

@@ -1,23 +1,511 @@
1
1
  ---
2
- name: xlsx
3
- description: "Create professional spreadsheets, dashboards, and financial models in .xlsx with deterministic scripts and formula validation."
4
- license: Proprietary. LICENSE.txt has complete terms
2
+ name: pro-sheets
3
+ description: "Create professional, formatted spreadsheets: dashboards, data tables, financial models, trackers, and reports. Use this skill whenever the user wants to create or edit a spreadsheet, Excel file, tracker, financial model, budget, forecast, data table, or dashboard. Trigger on 'spreadsheet', 'xlsx', 'excel', 'workbook', 'financial model', 'budget', 'forecast', 'tracker', 'dashboard'."
5
4
  ---
6
5
 
7
- # Pro Sheets for XLSX
6
+ # Pro Sheets: Professional Excel Development
8
7
 
9
- Use structured JSON specs and the script pair below:
8
+ Build polished, formula-driven Excel workbooks using openpyxl. The AI writes Python code directly each time—no wrapper scripts—tailoring each implementation to the user's request. All formulas are calculated via LibreOffice, ensuring accuracy.
9
+
10
+ ## Key Architecture
11
+
12
+ - **Direct openpyxl coding**: Write formulas inline, customize formatting, control layout without abstraction layers
13
+ - **Mandatory LibreOffice recalculation**: All formulas evaluated before delivering the file
14
+ - **Industry-standard financial modeling**: Color coding, number formatting, documentation standards
15
+ - **pandas for data prep**: Load, analyze, reshape before building the spreadsheet
16
+
17
+ ## Requirements for Outputs
18
+
19
+ Every Excel file must meet these baseline standards:
20
+
21
+ ### All Spreadsheets
22
+ - **Font**: Arial or similar professional font, consistent throughout
23
+ - **Formula integrity**: Zero formula errors—no #REF!, #DIV/0!, #VALUE!, #N/A, #NAME?, #NULL!, #NUM!
24
+ - **Template preservation**: When editing existing files, maintain their structure and style
25
+
26
+ ### Financial Models (strict standards)
27
+
28
+ **Color Coding** (Excel RGB values):
29
+ - **Blue text (0,0,255)**: Hardcoded input values—numbers the user will change for scenarios
30
+ - **Black text (0,0,0)**: Formula cells (calculations, SUM, cross-references)
31
+ - **Green text (0,128,0)**: Cross-sheet references (formulas that link between sheets)
32
+ - **Red text (255,0,0)**: External references (data pulled from outside sources)
33
+ - **Yellow background**: Key assumptions—the 5-10 drivers that change model outputs most
34
+
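
The RGB triples above have to be written as six-digit hex strings before they can be passed to openpyxl's `Font(color=...)`. A minimal sketch of that conversion follows; `rgb_to_hex` and the role names in `FONT_COLORS` are hypothetical helpers for illustration, not part of openpyxl or of this package.

```python
# Hypothetical helper: turn an RGB triple into the "RRGGBB" hex string
# form used for font colors elsewhere in this document (e.g. "0000FF").
def rgb_to_hex(red: int, green: int, blue: int) -> str:
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError(f"channel out of range: {channel}")
    return f"{red:02X}{green:02X}{blue:02X}"


# Role names are illustrative assumptions; the RGB values come from the list above.
FONT_COLORS = {
    "input": rgb_to_hex(0, 0, 255),        # blue: hardcoded inputs
    "formula": rgb_to_hex(0, 0, 0),        # black: formulas
    "cross_sheet": rgb_to_hex(0, 128, 0),  # green: cross-sheet references
    "external": rgb_to_hex(255, 0, 0),     # red: external references
}

print(FONT_COLORS)
```

Keeping the mapping in one place means a formatting pass can look up a color by role instead of repeating hex literals.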
35
+ **Number Formatting**:
36
+ - Currency: `$#,##0` or `$#,##0,"K"` for thousands (e.g., $45K)
37
+ - Percentages: `0.0%` (e.g., 15.2%)
38
+ - Multiples/ratios: `0.0x` (e.g., 3.2x)
39
+ - Years: Text format (2026, not 2,026)
40
+ - Zero values: Display as "–" not 0
41
+ - Negative numbers: Parentheses `($#,##0)` not minus sign
42
+
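
The zero and negative rules above can be combined into one Excel format code, because a format code may carry separate sections for positive, negative, and zero values, separated by semicolons. The sketch below only builds that string; `accounting_format` is a hypothetical helper name, and the result would be assigned to a cell's `number_format`.

```python
# Hypothetical helper: build a three-section Excel format code
# (positive;negative;zero) from a base pattern.
def accounting_format(positive: str = "$#,##0", zero_text: str = "–") -> str:
    negative = f"({positive})"  # negatives in parentheses, no minus sign
    return f'{positive};{negative};"{zero_text}"'


CURRENCY_FORMAT = accounting_format()
PERCENT_FORMAT = accounting_format("0.0%")

print(CURRENCY_FORMAT)
print(PERCENT_FORMAT)
```

With openpyxl this would be applied as `cell.number_format = CURRENCY_FORMAT`.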
43
+ **Formula Rules**:
44
+ - Assumptions always in separate cells, never hardcoded in formulas
45
+ - ❌ Bad: `=Revenue * 0.15` (hardcoded margin)
46
+ - ✅ Good: `=Revenue * B5` where B5 contains 0.15 with cell comment "Gross Margin"
47
+ - Use cell references, not hardcoded values: formulas should link to cells rather than embed numbers
48
+ - Document data sources: If a cell contains a hardcoded number, add a comment explaining its origin
49
+
50
+ **Documentation**:
51
+ - Complex formulas: Add Excel comments explaining logic
52
+ - Data sources: Cite external sources for hardcoded values
53
+ - Assumptions sheet: Keep all key inputs in one place, clearly labeled
54
+
55
+ ## Critical: Use Formulas, Not Hardcoded Values
56
+
57
+ This is the core difference between dynamic spreadsheets and static reports. The spreadsheet must recalculate automatically when users change inputs.
58
+
59
+ ### Wrong Approach
60
+ ```python
61
+ import pandas as pd
62
+ df = pd.read_csv('sales.csv')
63
+ total = df['Sales'].sum()
64
+ sheet['B10'] = total # ❌ Hardcoded—won't update if sales change
65
+ ```
66
+
67
+ ### Correct Approach
68
+ ```python
69
+ # Load the data and inspect its structure
+ df = pd.read_csv('sales.csv')
+ print(df.shape, df.columns)
+
+ # Write the raw data into the sheet (row 1 is the header, data starts at row 2)
+ for idx, row in df.iterrows():
+     sheet[f'A{idx+2}'] = row['Product']
+     sheet[f'B{idx+2}'] = row['Sales']
+
+ # Use formulas for calculations, sized to the data that was written
+ last_row = len(df) + 1
+ sheet[f'B{last_row + 1}'] = f'=SUM(B2:B{last_row})' # ✅ Dynamic: updates when sales change
80
+ ```
81
+
82
+ ## Reading and Analyzing Data
83
+
84
+ Use pandas to understand data before building the spreadsheet:
85
+
86
+ ```python
87
+ import pandas as pd
88
+
89
+ # Single sheet
90
+ df = pd.read_excel('file.xlsx')
91
+
92
+ # All sheets
93
+ all_sheets = pd.read_excel('file.xlsx', sheet_name=None)
94
+ for sheet_name, data in all_sheets.items():
+     print(f"{sheet_name}: {data.shape}")
+
+ # Specify dtypes and columns (every dtype key must also be listed in usecols)
+ df = pd.read_excel('file.xlsx', sheet_name='Sales',
+                    dtype={'Year': str, 'Amount': float},
+                    usecols=['Product', 'Year', 'Amount', 'Date'])
101
+ ```
102
+
103
+ ## Common Workflow
104
+
105
+ 1. **Choose the tool**: pandas for data analysis, openpyxl for formulas and formatting
106
+ 2. **Load/create workbook**: `from openpyxl import Workbook` or `load_workbook()`
107
+ 3. **Modify data, formulas, formatting**: Write cells with openpyxl
108
+ 4. **Save**: `wb.save('output.xlsx')`
109
+ 5. **Recalculate formulas** (MANDATORY): `python scripts/recalc.py output.xlsx`
110
+ 6. **Verify and fix**: Check recalc output for errors
111
+
112
+ ## Dashboard Design
113
+
114
+ Dashboards are high-visibility outputs. They must look polished when opened — not cramped, clipped, or hard to read.
115
+
116
+ ### KPI Cards That Breathe
117
+
118
+ The #1 mistake: cramming large KPI values into tiny merged cells. Excel clips text that doesn't fit, making stats invisible.
119
+
120
+ **Rules for KPI cards:**
121
+ - **Column width ≥ 18** per card (not 10-14). If merging 2 columns, each should be ≥ 18
122
+ - **Row height for large fonts**: `ws.row_dimensions[row].height = font_size * 2.2` minimum
123
+ - **Explicit row heights** for every KPI row (value, label, change indicator)
124
+ - **Spacer rows** (height 8-12) above and below the KPI band to separate visually
125
+ - **Max 4 KPIs per row** — more than that and they crowd. Use a second row if needed
126
+
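
The sizing rules above reduce to a little arithmetic. The sketch below computes the minimum row height for a given font size (2.2 times the size, rounded up) and the column span of each card when every card takes two columns with no gutter. `min_row_height` and `card_columns` are hypothetical helper names; exact fractions are used so that rounding up is not thrown off by floating-point error.

```python
import math
from fractions import Fraction


def min_row_height(font_size) -> int:
    """Smallest whole-number row height satisfying height >= font_size * 2.2."""
    exact = Fraction(str(font_size)) * Fraction(22, 10)
    return math.ceil(exact)


def card_columns(card_index: int, columns_per_card: int = 2) -> tuple:
    """1-based (first, last) column numbers for the card at card_index (0-based)."""
    first = card_index * columns_per_card + 1
    return first, first + columns_per_card - 1


for size in (9, 22):
    print(size, min_row_height(size))
print([card_columns(i) for i in range(4)])
```

Four cards therefore occupy columns 1 through 8, which is why the rule caps each row at four KPIs.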
127
+ **KPI card pattern** (2 merged columns per card, no gutter columns — keeps table alignment simple):
128
+ ```python
129
+ from openpyxl.styles import Alignment, Font, PatternFill
+ from openpyxl.utils import get_column_letter
+
+ # Assumes `ws` is an openpyxl worksheet and `thin_border` is a Border (see Borders below).
+ # 4 KPIs = 8 columns (2 per card). Tables below use columns A-D within the same grid.
+ kpis = [
+     ("12,400", "Current MAU", "+34% QoQ", True),
+     ("$48K", "MRR", "+52% QoQ", True),
+     ("62", "NPS", "+8 pts", True),
+     ("$42", "CAC", "↓ from $58", True),
+ ]
+
+ kpi_start = 3
+ ws.row_dimensions[kpi_start - 1].height = 10 # Spacer above
+
+ for i, (value, label, change, is_positive) in enumerate(kpis):
+     col = i * 2 + 1 # 2 columns per card, no gutter
+
+     # Merge 2 columns for each of the card's 3 rows
+     for dr in range(3):
+         ws.merge_cells(start_row=kpi_start + dr, start_column=col,
+                        end_row=kpi_start + dr, end_column=col + 1)
+
+     # Value: large, bold, colored
+     ws.cell(kpi_start, col, value).font = Font(name="Arial", size=22, bold=True, color="2E86AB")
+     ws.cell(kpi_start, col).alignment = Alignment(horizontal="center", vertical="center")
+
+     # Label: muted
+     ws.cell(kpi_start + 1, col, label).font = Font(name="Arial", size=9, color="7F8C8D")
+     ws.cell(kpi_start + 1, col).alignment = Alignment(horizontal="center")
+
+     # Change: green if positive, red otherwise
+     color = "27AE60" if is_positive else "E74C3C"
+     ws.cell(kpi_start + 2, col, change).font = Font(name="Arial", size=9, bold=True, color=color)
+     ws.cell(kpi_start + 2, col).alignment = Alignment(horizontal="center")
+
+     # Card background + border on every cell of the card
+     card_fill = PatternFill("solid", fgColor="FFF8E1")
+     for dr in range(3):
+         for dc in range(2):
+             ws.cell(kpi_start + dr, col + dc).fill = card_fill
+             ws.cell(kpi_start + dr, col + dc).border = thin_border
+
+     # Column widths: 18 minimum per column in the card
+     ws.column_dimensions[get_column_letter(col)].width = 18
+     ws.column_dimensions[get_column_letter(col + 1)].width = 18
+
+ # ROW HEIGHTS: the critical fix for merged cells
+ ws.row_dimensions[kpi_start].height = 50 # Value row: size-22 font needs at least 22 * 2.2 = 48.4
+ ws.row_dimensions[kpi_start + 1].height = 20 # Label row
+ ws.row_dimensions[kpi_start + 2].height = 20 # Change row
+
+ # Spacer rows below KPIs before tables start
+ ws.row_dimensions[kpi_start + 3].height = 12
+ ws.row_dimensions[kpi_start + 4].height = 6
180
+ ```
181
+
182
+ **Why no gutter columns:** Adding narrow spacer columns between KPI cards creates misalignment with the data tables below (which use columns A-D). Keep the column grid consistent — KPI cards and tables should share the same column structure where possible.
183
+
184
+ **Common dashboard mistakes to avoid:**
185
+ - Column width under 18 for KPI cards (text clips silently in merged cells)
186
+ - No `row_dimensions[].height` set (Excel auto-height fails with merged cells — this is the #1 cause of hidden stats)
187
+ - KPI cards touching the tables below (add spacer rows, height 10-16)
188
+ - Gutter columns between cards that break table alignment below
189
+ - Too many merged cells in data tables (breaks sorting and filtering)
190
+ - Using merged cells for data tables at all — only use them for titles and KPI cards
191
+
192
+ ### Section Titles and Table Spacing
193
+ - Section titles: 1 empty spacer row above (height 16-20), bold font size 13-14
194
+ - Tables: Start on the row immediately after the header row, no gap
195
+ - Between tables: 2 empty rows minimum (or 1 spacer row with height 24+)
196
+
197
+ ## Creating New Files
198
+
199
+ Use openpyxl directly. Here's a typical workflow:
200
+
201
+ ```python
202
+ from openpyxl import Workbook
203
+ from openpyxl.styles import Font, PatternFill, Alignment, numbers
204
+ from openpyxl.utils import get_column_letter
205
+
206
+ wb = Workbook()
207
+ ws = wb.active
208
+ ws.title = "Sales"
209
+
210
+ # Headers with styling
211
+ headers = ["Product", "Q1 Sales", "Q2 Sales", "Total"]
212
+ for col, header in enumerate(headers, 1):
+     cell = ws.cell(1, col)
+     cell.value = header
+     cell.font = Font(bold=True, color="FFFFFF", name="Arial")
+     cell.fill = PatternFill(start_color="1B3A5C", end_color="1B3A5C", fill_type="solid")
+     cell.alignment = Alignment(horizontal="center")
+
+ # Data rows
+ products = [("Widget A", 10000, 12000), ("Widget B", 8000, 9500)]
+ for row_idx, (product, q1, q2) in enumerate(products, 2):
+     ws.cell(row_idx, 1).value = product
+     ws.cell(row_idx, 2).value = q1
+     ws.cell(row_idx, 3).value = q2
+     # Formula for total
+     ws.cell(row_idx, 4).value = f"=B{row_idx}+C{row_idx}"
+     # Number formatting
+     ws.cell(row_idx, 2).number_format = "$#,##0"
+     ws.cell(row_idx, 3).number_format = "$#,##0"
+     ws.cell(row_idx, 4).number_format = "$#,##0"
231
+
232
+ # Totals row
233
+ total_row = len(products) + 2
234
+ ws.cell(total_row, 1).value = "TOTAL"
235
+ ws.cell(total_row, 2).value = f"=SUM(B2:B{total_row-1})"
236
+ ws.cell(total_row, 3).value = f"=SUM(C2:C{total_row-1})"
237
+ ws.cell(total_row, 4).value = f"=SUM(D2:D{total_row-1})"
238
+
239
+ # Column widths
240
+ ws.column_dimensions['A'].width = 15
241
+ ws.column_dimensions['B'].width = 12
242
+ ws.column_dimensions['C'].width = 12
243
+ ws.column_dimensions['D'].width = 12
244
+
245
+ # Freeze header
246
+ ws.freeze_panes = "A2"
247
+
248
+ # Auto-filter
249
+ ws.auto_filter.ref = f"A1:D{total_row - 1}" # Exclude the totals row so sorting and filtering leave it in place
250
+
251
+ wb.save('output.xlsx')
252
+ ```
253
+
254
+ ### Common Formatting Tasks
255
+
256
+ **Font colors for financial models**:
257
+ ```python
258
+ cell = ws['B5']
259
+ cell.font = Font(color="0000FF") # Blue for inputs
260
+ ```
261
+
262
+ **Cell number formats**:
263
+ ```python
264
+ ws['B2'].number_format = '$#,##0' # Currency
265
+ ws['C3'].number_format = '0.0%' # Percentage
266
+ ws['D4'].number_format = '0.0x' # Multiple
267
+ ws['E5'].number_format = '0' # Year as a plain integer (2026, not 2,026); 'YYYY' would treat 2026 as a date serial
268
+ ```
269
+
270
+ **Borders**:
271
+ ```python
272
+ from openpyxl.styles import Border, Side
273
+
274
+ thin_border = Border(
275
+ left=Side(style='thin'),
276
+ right=Side(style='thin'),
277
+ top=Side(style='thin'),
278
+ bottom=Side(style='thin')
279
+ )
280
+ ws['B2'].border = thin_border
281
+ ```
282
+
283
+ **Merged cells** (use sparingly, never in data tables):
284
+ ```python
285
+ ws.merge_cells('A1:D1')
286
+ ws['A1'] = 'Q1 2026 Performance'
287
+ ```
288
+
289
+ **Row heights for merged cells** (critical — Excel auto-height fails with merges):
290
+ ```python
291
+ # ALWAYS set explicit row heights when using large fonts or merged cells
292
+ ws.row_dimensions[1].height = 36 # Title row
293
+ ws.row_dimensions[3].height = 52 # Large KPI values (size 20-26 font)
294
+ ws.row_dimensions[4].height = 22 # KPI labels
295
+ ```
296
+
297
+ ## Editing Existing Files
298
+
299
+ ```python
300
+ from openpyxl import load_workbook
301
+
302
+ # Load without calculating (preserves formulas)
303
+ wb = load_workbook('existing.xlsx')
304
+ ws = wb['Sales']
305
+
306
+ # Modify cells
307
+ ws['B2'].value = 50000
308
+
309
+ # Insert rows (shifts cells below; openpyxl does not rewrite formula references, so re-check them)
310
+ ws.insert_rows(5, 3) # Insert 3 rows at row 5
311
+
312
+ # Delete rows
313
+ ws.delete_rows(5, 2) # Delete 2 rows starting at row 5
314
+
315
+ # Add new sheet
316
+ new_sheet = wb.create_sheet('Assumptions', 0) # Insert at position 0
317
+
318
+ # Save
319
+ wb.save('existing.xlsx')
320
+ ```
321
+
322
+ ### Critical Warning: data_only=True Destroys Formulas
323
+
324
+ ```python
325
+ # ❌ WRONG for editing: loads cached values only, so saving this workbook discards every formula
326
+ wb = load_workbook('file.xlsx', data_only=True)
327
+
328
+ # ✅ RIGHT: Preserves formulas for editing
329
+ wb = load_workbook('file.xlsx', data_only=False) # or just omit
330
+ ```
331
+
332
+ ## Formula Recalculation
333
+
334
+ After creating or editing any file with formulas, **always recalculate**:
10
335
 
11
336
  ```bash
12
- pip install openpyxl --break-system-packages 2>/dev/null
13
- python scripts/create_xlsx.py spec.json output.xlsx
14
- python scripts/validate_xlsx.py output.xlsx
337
+ python scripts/recalc.py output.xlsx [timeout_seconds]
338
+ ```
339
+
340
+ ### What It Does
341
+ - Sets up LibreOffice macro (RecalculateAndSave) on first run
342
+ - Runs soffice headless to open the file and recalculate all formulas
343
+ - Scans all cells for Excel error values: #VALUE!, #DIV/0!, #REF!, #NAME?, #NULL!, #NUM!, #N/A
344
+ - Returns JSON with status, error count, and locations
345
+
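
The scanning step described above can be pictured as a membership check against the seven error strings. The sketch below is an illustration only, not the actual `scripts/recalc.py`: it assumes cell values have already been read into a plain dictionary, and `find_errors` is a hypothetical name. The result fields mirror those named in the output format in this document.

```python
from collections import Counter

EXCEL_ERRORS = ("#VALUE!", "#DIV/0!", "#REF!", "#NAME?", "#NULL!", "#NUM!", "#N/A")


def find_errors(sheets: dict) -> dict:
    """sheets maps sheet name -> {cell address: recalculated value}."""
    details = []
    for sheet_name, cells in sheets.items():
        for address, value in cells.items():
            # Only string values can be Excel error markers.
            if isinstance(value, str) and value in EXCEL_ERRORS:
                details.append({"cell": address, "error": value, "sheet": sheet_name})
    summary = dict(Counter(item["error"] for item in details))
    return {
        "status": "errors_found" if details else "success",
        "total_errors": len(details),
        "error_summary": summary,
        "error_details": details,
    }


sample = {
    "Sales": {"B2": 1200, "C5": "#DIV/0!"},
    "Assumptions": {"F12": "#REF!", "A1": "Growth"},
}
report = find_errors(sample)
print(report["status"], report["total_errors"])
```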
346
+ ### Output Format
347
+ ```json
348
+ {
349
+ "status": "success",
350
+ "total_errors": 0,
351
+ "total_formulas": 145,
352
+ "error_summary": {},
353
+ "error_details": [],
354
+ "file": "output.xlsx"
355
+ }
356
+ ```
357
+
358
+ If errors exist:
359
+ ```json
360
+ {
361
+ "status": "errors_found",
362
+ "total_errors": 2,
363
+ "error_summary": {
364
+ "#DIV/0!": 1,
365
+ "#REF!": 1
366
+ },
367
+ "error_details": [
368
+ {"cell": "C5", "error": "#DIV/0!", "sheet": "Sales"},
369
+ {"cell": "F12", "error": "#REF!", "sheet": "Assumptions"}
370
+ ]
371
+ }
15
372
  ```
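
A caller can turn the JSON above into a short list of cells to fix. The sketch below assumes the report has the fields shown in this document (`status`, `total_errors`, `error_details`); `summarize_recalc` is a hypothetical helper, and how the JSON text is captured from the script is left out.

```python
import json


def summarize_recalc(report_text: str) -> list:
    """Return one 'Sheet!Cell: error' string per reported error (empty if clean)."""
    report = json.loads(report_text)
    if report.get("status") == "success" and report.get("total_errors", 0) == 0:
        return []
    return [
        f"{item['sheet']}!{item['cell']}: {item['error']}"
        for item in report.get("error_details", [])
    ]


example = """
{
  "status": "errors_found",
  "total_errors": 2,
  "error_summary": {"#DIV/0!": 1, "#REF!": 1},
  "error_details": [
    {"cell": "C5", "error": "#DIV/0!", "sheet": "Sales"},
    {"cell": "F12", "error": "#REF!", "sheet": "Assumptions"}
  ]
}
"""
problems = summarize_recalc(example)
print(problems)
```

An empty list means the workbook is ready to deliver; anything else names the cells to repair before rerunning the recalculation.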
16
373
 
17
- Rules:
18
- - Always use formulas (never hardcode calculated values).
19
- - Keep headers clear with units.
20
- - Keep formatting consistent.
21
- - For financial models: blue inputs, black formulas, yellow assumptions.
374
+ ## Formula Verification Checklist
375
+
376
+ ### Essential Tests
377
+ - **Cell references**: Verify all cell addresses are correct (B2, not B2:B2)
378
+ - **Column mapping**: Ensure formula references match data layout
379
+ - **Row offsets**: Check that formulas adjust correctly when rows are inserted/deleted
380
+ - **Sheet references**: Confirm cross-sheet formulas use correct syntax: `='Sheet Name'!A1`
381
+
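
Cross-sheet syntax is easy to get wrong when sheet names contain spaces or apostrophes. A minimal sketch of a builder follows; `sheet_ref` is a hypothetical helper that always quotes the sheet name (which Excel accepts even when quoting is not strictly needed) and doubles any apostrophe inside it.

```python
def sheet_ref(sheet_name: str, cell: str) -> str:
    """Build a quoted cross-sheet reference such as 'Sheet Name'!A1."""
    escaped = sheet_name.replace("'", "''")  # apostrophes are escaped by doubling
    return f"'{escaped}'!{cell}"


formula = "=" + sheet_ref("Sheet Name", "A1")
print(formula)
print(sheet_ref("Bob's Data", "B2"))
```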
382
+ ### Common Pitfalls
383
+ - **NaN in formulas**: If source data contains empty cells, wrap with IF: `=IF(ISBLANK(B2),0,B2*C2)`
384
+ - **Far-right columns**: Column Z, AA, AB—verify Excel alphabet mapping
385
+ - **Division by zero**: Always guard: `=IF(B2=0,0,A2/B2)`
386
+ - **Wrong sheet references**: `=SUM(Data!B:B)` not `=SUM(Data.B:B)`
387
+ - **Cross-sheet format mismatch**: If Formula sheet references Data sheet, ensure same row structure
388
+
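
Two of the pitfalls above are mechanical enough to sketch in code: mapping a column number past Z to its letters, and guarding a division. Both function names are hypothetical; openpyxl's own `get_column_letter`, used earlier in this document, performs the same mapping, and the version here is spelled out only to show how Z rolls over to AA.

```python
def column_letter(index: int) -> str:
    """Convert a 1-based column number to Excel letters (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index:
        # Shift to 0-based before dividing, because there is no zero digit.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def safe_divide(numerator: str, denominator: str) -> str:
    """Formula text that returns 0 instead of #DIV/0! when the denominator is 0."""
    return f"=IF({denominator}=0,0,{numerator}/{denominator})"


print([column_letter(n) for n in (26, 27, 28)])
print(safe_divide("A2", "B2"))
```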
389
+ ### Testing Strategy
390
+ 1. **Start small**: Create 5-row test version, verify formulas calculate
391
+ 2. **Verify dependencies**: If C = A + B, test that changing A updates C
392
+ 3. **Test edge cases**: Empty cells, zeros, negative numbers, very large numbers
393
+ 4. **Run recalc.py**: Always validate before delivering
394
+
395
+ ## Best Practices
396
+
397
+ ### Choosing Tools
398
+
399
+ **Use pandas when**:
400
+ - Loading CSV, Excel, database data
401
+ - Data cleaning: pivots, filters, groupby
402
+ - Analysis: aggregations, calculations on entire dataset
403
+ - Output: single DataFrame to spreadsheet
404
+
405
+ **Use openpyxl when**:
406
+ - Building custom layouts (merged cells, KPI cards)
407
+ - Complex formatting (fonts, colors, borders)
408
+ - Formulas and dynamic calculations
409
+ - Editing existing files
410
+ - Multi-sheet workbooks with cross-references
411
+
412
+ ### openpyxl Tips
413
+
414
+ **1-based indexing**: Row and column numbers start at 1
415
+ ```python
416
+ ws.cell(1, 1) # A1
417
+ ws.cell(5, 3) # C5
418
+ ws['A1'] # Also A1
419
+ ```
420
+
421
+ **Preserve formulas when loading**:
422
+ ```python
423
+ wb = load_workbook('file.xlsx') # Preserves formulas
424
+ # NOT: load_workbook('file.xlsx', data_only=True)
425
+ ```
426
+
427
+ **Large files**: Use read_only or write_only mode
428
+ ```python
429
+ wb = load_workbook('huge.xlsx', read_only=True) # For reading only
430
+ ws = wb.active
431
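+ for row in ws.iter_rows(values_only=True):
+     print(row) # Process each row as a tuple of cell values
+ wb.close() # Read-only workbooks keep the file open until closed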
433
+ ```
434
+
435
+ **Iterating ranges**:
436
+ ```python
437
+ # By row
438
+ for row in ws.iter_rows(min_row=2, max_row=100, values_only=False):
+     for cell in row:
+         print(cell.value)
+
+ # By column
+ for col in ws.iter_cols(min_col=1, max_col=5):
+     for cell in col:
+         print(cell.value)
446
+ ```
447
+
448
+ ### pandas Tips
449
+
450
+ **Specify data types** to avoid parsing errors:
451
+ ```python
452
+ df = pd.read_excel('file.xlsx', dtype={'Year': str, 'Amount': float})
453
+ ```
454
+
455
+ **Load specific columns** to reduce memory:
456
+ ```python
457
+ df = pd.read_excel('file.xlsx', usecols=['Product', 'Sales', 'Date'])
458
+ ```
459
+
460
+ **Parse dates**:
461
+ ```python
462
+ df = pd.read_excel('file.xlsx', parse_dates=['Close Date'])
463
+ ```
464
+
465
+ ## Code Style Guidelines
466
+
467
+ Write minimal, concise Python. Avoid unnecessary comments—the code should be clear. But DO add Excel cell comments for complex formulas.
468
+
469
+ ```python
470
+ # Good: Clear variable names, no fluff
+ df = pd.read_csv('sales.csv')
+ for idx, row in df.iterrows():
+     ws[f'A{idx+2}'] = row['Product']
+     ws[f'B{idx+2}'] = row['Amount']
+
+ # Bad: Excessive comments
+ # Load the CSV file from disk
+ df = pd.read_csv('sales.csv') # Read the CSV
+ # Loop through each row
+ for idx, row in df.iterrows(): # Iterate rows
+     ws[f'A{idx+2}'] = row['Product'] # Set product name
482
+ ```
483
+
484
+ **For complex formulas, add Excel comments**:
485
+ ```python
486
+ from openpyxl.comments import Comment
+
+ # Write the formula, then document it with a cell comment in the sheet
+ ws['E5'].value = '=IF(B5=0,0,(C5-B5)/B5)' # YoY growth, guarded against a zero prior value
+ ws['E5'].comment = Comment("YoY growth % = (Current - Prior) / Prior", author="Model")
489
+ ```
490
+
491
+ ## Financial Model Structure Example
492
+
493
+ ```
494
+ Row 1: [Metric] [2024A] [2025E] [2026E] [2027E]
+ Row 2: Revenue ($K) [1000] [1500] [2100] [2800] <- 2024A is a blue input; 2025E onward are formulas (black)
+ Row 3: Growth Rate – 50% 40% 33% <- Blue inputs, yellow bg
+ Row 4: COGS ($K) [400] [600] [820] [1090] <- Formulas (black)
+ Row 5: Gross Margin % 60.0% 60.0% 61.0% 61.0% <- Blue inputs, yellow bg
+ Row 6: OpEx ($K) [300] [350] [420] [500] <- Blue inputs
+ Row 7: EBITDA ($K) [300] [550] [860] [1210] <- Formulas (black)
+
+ Formulas:
+ C2 = B2 * (1 + C3) [Prior-year revenue * (1 + growth)]
+ C4 = C2 * (1 - C5) [Revenue * (1 - gross margin)]
+ C7 = C2 - C4 - C6 [Revenue - COGS - OpEx]
506
+ ```
22
507
 
23
- See `references/spec-format.md` for full JSON schema and examples.
508
+ Cell colors:
509
+ - B2, C3:E3, B5:E5, B6:E6: Blue text (inputs)
510
+ - C3:E3, B5:E5: Yellow background (key assumptions)
511
+ - C2:E2, B4:E4, B7:E7: Black text (formulas)
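
The three column-C formulas in the structure example repeat for each projection year, so they can be generated rather than typed. The sketch below only builds the formula strings for columns C through E, using the row layout from the example (row 2 revenue, row 3 growth, row 4 COGS, row 5 gross margin, row 6 OpEx, row 7 EBITDA). `projection_formulas` is a hypothetical helper; writing the strings into a worksheet is left to the openpyxl code shown earlier.

```python
def projection_formulas(column: str, previous_column: str) -> dict:
    """Formula text for one projection year, keyed by cell address."""
    return {
        f"{column}2": f"={previous_column}2*(1+{column}3)",  # revenue from prior year and growth
        f"{column}4": f"={column}2*(1-{column}5)",           # COGS from revenue and gross margin
        f"{column}7": f"={column}2-{column}4-{column}6",     # EBITDA = revenue - COGS - OpEx
    }


columns = ["B", "C", "D", "E"]  # B holds the 2024A actuals
formulas = {}
for previous, current in zip(columns, columns[1:]):
    formulas.update(projection_formulas(current, previous))

for address in sorted(formulas):
    print(address, formulas[address])
```

Each generated string references only other cells, so changing a growth rate or margin input flows through every projected year.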