numbers-parser 4.7.1__py3-none-any.whl → 4.8.0__py3-none-any.whl
- numbers_parser/__init__.py +2 -1
- numbers_parser/cell.py +385 -100
- numbers_parser/cell_storage.py +151 -162
- numbers_parser/constants.py +165 -1
- numbers_parser/document.py +932 -228
- numbers_parser/formula.py +10 -10
- numbers_parser/model.py +291 -124
- numbers_parser-4.8.0.dist-info/METADATA +378 -0
- {numbers_parser-4.7.1.dist-info → numbers_parser-4.8.0.dist-info}/RECORD +12 -12
- {numbers_parser-4.7.1.dist-info → numbers_parser-4.8.0.dist-info}/WHEEL +1 -1
- numbers_parser-4.7.1.dist-info/METADATA +0 -626
- {numbers_parser-4.7.1.dist-info → numbers_parser-4.8.0.dist-info}/LICENSE.rst +0 -0
- {numbers_parser-4.7.1.dist-info → numbers_parser-4.8.0.dist-info}/entry_points.txt +0 -0
|
@@ -1,626 +0,0 @@
|
|
|
1
|
-
Metadata-Version: 2.1
|
|
2
|
-
Name: numbers-parser
|
|
3
|
-
Version: 4.7.1
|
|
4
|
-
Summary: Read and write Apple Numbers spreadsheets
|
|
5
|
-
Home-page: https://github.com/masaccio/numbers-parser
|
|
6
|
-
License: MIT
|
|
7
|
-
Author: Jon Connell
|
|
8
|
-
Author-email: python@figsandfudge.com
|
|
9
|
-
Requires-Python: >=3.8,<4.0
|
|
10
|
-
Classifier: License :: OSI Approved :: MIT License
|
|
11
|
-
Classifier: Operating System :: OS Independent
|
|
12
|
-
Classifier: Programming Language :: Python :: 3
|
|
13
|
-
Classifier: Programming Language :: Python :: 3.8
|
|
14
|
-
Classifier: Programming Language :: Python :: 3.9
|
|
15
|
-
Classifier: Programming Language :: Python :: 3.10
|
|
16
|
-
Classifier: Programming Language :: Python :: 3.11
|
|
17
|
-
Classifier: Programming Language :: Python :: 3
|
|
18
|
-
Classifier: Topic :: Office/Business :: Financial :: Spreadsheet
|
|
19
|
-
Requires-Dist: compact-json (>=1.1.3,<2.0.0)
|
|
20
|
-
Requires-Dist: importlib-resources (>=6.1.1,<7.0.0)
|
|
21
|
-
Requires-Dist: pendulum (>=3.0,<4.0)
|
|
22
|
-
Requires-Dist: protobuf (>=4.21.1,<5.0.0)
|
|
23
|
-
Requires-Dist: python-snappy (>=0.6.1,<0.7.0)
|
|
24
|
-
Requires-Dist: regex (>=2022.9.13,<2023.0.0)
|
|
25
|
-
Requires-Dist: roman (>=3.3,<4.0)
|
|
26
|
-
Requires-Dist: setuptools (>=69.0.3,<70.0.0)
|
|
27
|
-
Requires-Dist: sigfig (>=1.3.2,<2.0.0)
|
|
28
|
-
Project-URL: Documentation, https://github.com/masaccio/numbers-parser/blob/main/README.md
|
|
29
|
-
Project-URL: Repository, https://github.com/masaccio/numbers-parser
|
|
30
|
-
Description-Content-Type: text/markdown
|
|
31
|
-
|
|
32
|
-
# numbers-parser
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
`numbers-parser` is a Python module for parsing [Apple Numbers](https://www.apple.com/numbers/) `.numbers` files. It supports Numbers files generated by Numbers version 10.3 and up, with the latest tested version being 13.2 (current as of September 2023).
|
|
40
|
-
|
|
41
|
-
It supports and is tested against Python versions from 3.8 onwards. It is not compatible with earlier versions of Python.
|
|
42
|
-
|
|
43
|
-
## Installation
|
|
44
|
-
|
|
45
|
-
``` bash
|
|
46
|
-
python3 -m pip install numbers-parser
|
|
47
|
-
```
|
|
48
|
-
|
|
49
|
-
A pre-requisite for this package is [python-snappy](https://pypi.org/project/python-snappy/) which will be installed by Python automatically, but python-snappy also requires that the binary libraries for snappy compression are present.
|
|
50
|
-
|
|
51
|
-
The most straightforward way to install the binary dependencies is to use [Homebrew](https://brew.sh) and source Python from Homebrew rather than from macOS as described in the [python-snappy github](https://github.com/andrix/python-snappy):
|
|
52
|
-
|
|
53
|
-
For Intel Macs:
|
|
54
|
-
|
|
55
|
-
``` bash
|
|
56
|
-
brew install snappy python3
|
|
57
|
-
CPPFLAGS="-I/usr/local/include -L/usr/local/lib" python3 -m pip install python-snappy
|
|
58
|
-
```
|
|
59
|
-
|
|
60
|
-
For Apple Silicon Macs:
|
|
61
|
-
|
|
62
|
-
``` bash
|
|
63
|
-
brew install snappy python3
|
|
64
|
-
CPPFLAGS="-I/opt/homebrew/include -L/opt/homebrew/lib" python3 -m pip install python-snappy
|
|
65
|
-
```
|
|
66
|
-
|
|
67
|
-
For Linux (your package manager may be different):
|
|
68
|
-
|
|
69
|
-
``` bash
|
|
70
|
-
sudo apt-get -y install libsnappy-dev
|
|
71
|
-
```
|
|
72
|
-
|
|
73
|
-
On Windows, you will need to either arrange for snappy to be found for VSC++ or you can install python binary libraries compiled by [Christoph Gohlke](https://www.lfd.uci.edu/~gohlke/pythonlibs/#python-snappy). You must select the correct python version for your installation. For example for python 3.11:
|
|
74
|
-
|
|
75
|
-
``` text
|
|
76
|
-
C:\Users\Jon>pip install C:\Users\Jon\Downloads\python_snappy-0.6.1-cp311-cp311-win_amd64.whl
|
|
77
|
-
```
|
|
78
|
-
|
|
79
|
-
## API changes in version 4.0
|
|
80
|
-
|
|
81
|
-
To better partition cell styles, background image data, which was supported in earlier versions through the methods `image_data` and `image_filename`, is now part of the new `cell_style` property. Using the deprecated methods `image_data` and `image_filename` will issue a `DeprecationWarning`. The legacy methods will be removed in a future version of numbers-parser.
|
|
82
|
-
|
|
83
|
-
`NumberCell` cell values are now limited to 15 significant figures to match the implementation of floating point numbers in Apple Numbers. For example, the value `1234567890123456` is rounded to `1234567890123460` in the same way as in Numbers. Previously, using native `float` with no checking resulted in rounding errors in unpacking internal numbers. Attempting to write a number with too many significant digits results in a `RuntimeWarning`.
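
The 15-significant-figure limit can be illustrated in plain Python. This is only a sketch of the rounding behaviour described above, not the way `numbers-parser` implements it; `round_to_significant` is a hypothetical helper name.

``` python
def round_to_significant(value: float, digits: int = 15) -> float:
    """Round value to the given number of significant figures."""
    # The "g" presentation type keeps `digits` significant figures;
    # converting the result back to float discards the string form.
    return float(f"{float(value):.{digits}g}")


# The README's example value has 16 significant digits
print(int(round_to_significant(1234567890123456)))
```

Values that already fit within 15 significant figures, such as `0.1`, pass through unchanged.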
|
|
84
|
-
|
|
85
|
-
The previously deprecated methods `Document.sheets()` and `Sheet.tables()` are now only available using the properties of the same name (see examples in this README).
|
|
86
|
-
|
|
87
|
-
## Usage
|
|
88
|
-
|
|
89
|
-
Reading documents:
|
|
90
|
-
|
|
91
|
-
``` python
|
|
92
|
-
from numbers_parser import Document
|
|
93
|
-
doc = Document("my-spreadsheet.numbers")
|
|
94
|
-
sheets = doc.sheets
|
|
95
|
-
tables = sheets[0].tables
|
|
96
|
-
rows = tables[0].rows()
|
|
97
|
-
```
|
|
98
|
-
|
|
99
|
-
### Referring to sheets and tables
|
|
100
|
-
|
|
101
|
-
Sheets and tables are iterables that can be indexed using either an integer index or using the name of the sheet/table:
|
|
102
|
-
|
|
103
|
-
``` python
|
|
104
|
-
# list access method
|
|
105
|
-
sheet_1 = doc.sheets[0]
|
|
106
|
-
print("Opened sheet", sheet_1.name)
|
|
107
|
-
|
|
108
|
-
# dict access method
|
|
109
|
-
table_1 = sheet_1.tables["Table 1"]
|
|
110
|
-
print("Opened table", table_1.name)
|
|
111
|
-
```
|
|
112
|
-
|
|
113
|
-
### Accessing data
|
|
114
|
-
|
|
115
|
-
`Table` objects have a `rows` method which returns a nested list with an entry for each row of the table. Each row is itself a list of the column values.
|
|
116
|
-
|
|
117
|
-
``` python
|
|
118
|
-
data = tables["Table 1"].rows()
|
|
119
|
-
print("Cell A1 contains", data[0][0])
|
|
120
|
-
print("Cell C2 contains", data[1][2])
|
|
121
|
-
```
|
|
122
|
-
|
|
123
|
-
Cells are objects with a common base class of `Cell`. All cell types have a property `value` which returns the contents of the cell as a Python datatype. For dates and durations, `numbers-parser` uses [pendulum](https://pendulum.eustace.io) instead of Python's built-in types. Available cell types are:
|
|
124
|
-
|
|
125
|
-
| Cell type | value type | Additional properties |
|
|
126
|
-
| ----------------- | ------------------- | ------------------------------------------- |
|
|
127
|
-
| NumberCell | `float` | |
|
|
128
|
-
| TextCell | `str` | |
|
|
129
|
-
| RichTextCell | `str` | See [Bullets and lists](#bullets-and-lists) |
|
|
130
|
-
| EmptyCell | `None` | |
|
|
131
|
-
| BoolCell | `bool` | |
|
|
132
|
-
| DateCell | `pendulum.datetime` | |
|
|
133
|
-
| DurationCell | `pendulum.duration` | |
|
|
134
|
-
| ErrorCell | `None` | |
|
|
135
|
-
| MergedCell | `None` | See [Merged cells](#merged-cells) |
|
|
136
|
-
|
|
137
|
-
Where cell values are not `None` the property `formatted_value` returns the cell value as a `str` as displayed in Numbers. Cells that have no values in a table are represented as `EmptyCell` and cells containing evaluation errors of any kind are represented as `ErrorCell`.
|
|
138
|
-
|
|
139
|
-
### Cell references
|
|
140
|
-
|
|
141
|
-
Data for single cells is accessed using `Table.cell()`. Cell references can be either zero-offset row/column integers or an Excel/Numbers cell reference using a column letter and row number:
|
|
142
|
-
|
|
143
|
-
``` python
|
|
144
|
-
doc = Document("my-spreadsheet.numbers")
|
|
145
|
-
sheets = doc.sheets
|
|
146
|
-
tables = sheets["Sheet 1"].tables
|
|
147
|
-
table = tables["Table 1"]
|
|
148
|
-
|
|
149
|
-
# row, column syntax
|
|
150
|
-
print("Cell A1 contains", table.cell(0, 0))
|
|
151
|
-
# Excel/Numbers-style cell references
|
|
152
|
-
print("Cell C2 contains", table.cell("C2"))
|
|
153
|
-
```
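
The relationship between the two reference styles can be sketched in plain Python. `ref_to_row_col` is a hypothetical helper written for illustration and is not part of the `numbers-parser` API; it assumes references consist of column letters followed by a 1-based row number.

``` python
import re
from typing import Tuple


def ref_to_row_col(ref: str) -> Tuple[int, int]:
    """Convert a reference such as "C2" to zero-offset (row, col)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"invalid cell reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    # Column letters are a base-26 number where A=1 ... Z=26
    for ch in letters.upper():
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


print(ref_to_row_col("C2"))
```

Under these assumptions, `"C2"` and the integer pair `(1, 2)` refer to the same cell: row index 1, column index 2.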
|
|
154
|
-
|
|
155
|
-
### Merged cells
|
|
156
|
-
|
|
157
|
-
`Cell.is_merged` returns `True` for any cell that is the result of merging rows and/or columns. Cells eliminated from the table by the merge can still be indexed using `Table.cell()` and are of type `MergedCell`.
|
|
158
|
-
|
|
159
|
-
Consider this example:
|
|
160
|
-
|
|
161
|
-
<!-- markdownlint-disable MD033 -->
|
|
162
|
-
<table>
|
|
163
|
-
<tr>
|
|
164
|
-
<td>A1</td>
|
|
165
|
-
<td rowspan=2>B1</td>
|
|
166
|
-
</tr>
|
|
167
|
-
<tr>
|
|
168
|
-
<td>A2</td>
|
|
169
|
-
</tr>
|
|
170
|
-
</table>
|
|
171
|
-
<!-- markdownlint-enable MD033 -->
|
|
172
|
-
|
|
173
|
-
Merged cells can be tested using the following properties:
|
|
174
|
-
|
|
175
|
-
| Cell | Type | `value` | `is_merged` | `size` | `rect` | `merge_range` |
|
|
176
|
-
| ----- | ---------- | -------- | ----------- | ------- | ------------ | ------------- |
|
|
177
|
-
| A1 | TextCell | `A1` | `False` | (1, 1) | `None` | `None` |
|
|
178
|
-
| A2 | TextCell | `A2` | `False` | (1, 1) | `None` | `None` |
|
|
179
|
-
| B1 | TextCell | `B1` | `True` | (2, 1) | `None` | `None` |
|
|
180
|
-
| B2 | MergedCell | `None` | `False` | `None` | (1, 0, 2, 0) | `"B1:B2"` |
|
|
181
|
-
|
|
182
|
-
The tuple values of the `rect` property of a `MergedCell` are also available using the properties `row_start`, `col_start`, `row_end`, and `col_end`.
|
|
183
|
-
|
|
184
|
-
### Row and column iterators
|
|
185
|
-
|
|
186
|
-
Tables have iterators for row-wise and column-wise iteration, with each iterator returning a list of the cells in that row or column:
|
|
187
|
-
|
|
188
|
-
``` python
|
|
189
|
-
for row in table.iter_rows(min_row=2, max_row=7, values_only=True):
|
|
190
|
-
    total = sum(v for v in row if isinstance(v, (int, float)))
|
|
191
|
-
for col in table.iter_cols(min_row=2, max_row=7):
|
|
192
|
-
    total = sum(c.value for c in col if isinstance(c.value, (int, float)))
|
|
193
|
-
```
|
|
194
|
-
|
|
195
|
-
### Formulas
|
|
196
|
-
|
|
197
|
-
Formula evaluation relies on Numbers storing current values which should usually be the case. In cells containing a formula, `value` returns the computed value of the formula. The formula itself is available using the `formula` property.
|
|
198
|
-
|
|
199
|
-
### Pandas
|
|
200
|
-
|
|
201
|
-
Since the return value of `rows()` is a list of lists, you can pass this directly to pandas. Assuming you have a Numbers table with a single header row which contains the names of the pandas series you want to create, you can construct a pandas dataframe using:
|
|
202
|
-
|
|
203
|
-
``` python
|
|
204
|
-
import pandas as pd
|
|
205
|
-
|
|
206
|
-
doc = Document("simple.numbers")
|
|
207
|
-
sheets = doc.sheets
|
|
208
|
-
tables = sheets[0].tables
|
|
209
|
-
data = tables[0].rows(values_only=True)
|
|
210
|
-
df = pd.DataFrame(data[1:], columns=data[0])
|
|
211
|
-
```
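
The same header/body split works without pandas. In the sketch below, `data` is a hand-written stand-in for the list of lists that `rows(values_only=True)` is described as returning, and `rows_to_records` is a hypothetical helper name.

``` python
from typing import Any, Dict, List


def rows_to_records(data: List[List[Any]]) -> List[Dict[Any, Any]]:
    """Treat the first row as column names and return one dict per row."""
    header, *body = data
    return [dict(zip(header, row)) for row in body]


# Stand-in for the output of rows(values_only=True)
data = [["Name", "Qty"], ["apples", 3], ["pears", 5]]
records = rows_to_records(data)
print(records[0]["Qty"])
```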
|
|
212
|
-
|
|
213
|
-
### Bullets and lists
|
|
214
|
-
|
|
215
|
-
Cells that contain bulleted or numbered lists can be identified by the `is_bulleted` property. Data from such cells is returned using the `value` property as with other cells, but can additionally be extracted using the `bullets` property. `bullets` returns a list of the paragraphs in the cell without the bullet or numbering character. Newlines are not included when bullet lists are extracted using `bullets`.
|
|
216
|
-
|
|
217
|
-
``` python
|
|
218
|
-
doc = Document("bullets.numbers")
|
|
219
|
-
sheets = doc.sheets
|
|
220
|
-
tables = sheets[0].tables
|
|
221
|
-
table = tables[0]
|
|
222
|
-
if not table.cell(0, 1).is_bulleted:
|
|
223
|
-
    print(table.cell(0, 1).value)
|
|
224
|
-
else:
|
|
225
|
-
    bullets = ["* " + s for s in table.cell(0, 1).bullets]
|
|
226
|
-
    print("\n".join(bullets))
|
|
227
|
-
```
|
|
228
|
-
|
|
229
|
-
Bulleted and numbered data can also be extracted with the bullet or number characters present in the text for each line in the cell in the same way as above but using the `formatted_bullets` property. A single space is inserted between the bullet character and the text string and in the case of bullets, this will be the Unicode character seen in Numbers, for example `"• some text"`.
|
|
230
|
-
|
|
231
|
-
### Hyperlinks
|
|
232
|
-
|
|
233
|
-
Numbers does not support hyperlinks to cells within a spreadsheet, but does allow embedding links in cells. When cells contain hyperlinks, `numbers_parser` returns the text version of the cell. The `hyperlinks` property of cells where `is_bulleted` is `True` is a list of text and URL tuples:
|
|
234
|
-
|
|
235
|
-
``` python
|
|
236
|
-
cell = table.cell(0, 0)
|
|
237
|
-
(text, url) = cell.hyperlinks[0]
|
|
238
|
-
```
|
|
239
|
-
|
|
240
|
-
### Styles
|
|
241
|
-
|
|
242
|
-
`numbers_parser` currently only supports paragraph styles and cell styles. The following paragraph styles are supported:
|
|
243
|
-
|
|
244
|
-
* font attributes: bold, italic, underline, strikethrough
|
|
245
|
-
* font selection and size
|
|
246
|
-
* text foreground color
|
|
247
|
-
* horizontal and vertical alignment
|
|
248
|
-
* cell background color
|
|
249
|
-
* cell indents (first line, left, right, and text inset)
|
|
250
|
-
|
|
251
|
-
Table styles that allow new tables to adopt a style across the whole table are not planned.
|
|
252
|
-
|
|
253
|
-
Numbers conflates style attributes that can be stored in paragraph styles (the style menu in the text panel) with the settings that are available on the Style tab of the Text panel. Some attributes in Numbers are not applied to new cells when a style is applied. To keep the API simple, `numbers-parser` packs all styling into a single `Style` object. When a document is saved, the attributes not stored in a paragraph style are applied to each cell that includes it. Attributes behaving in this way are currently `Cell.alignment.vertical` and `Cell.style.text_inset`. The cell background color `Cell.style.bg_color` also behaves this way, though this is in line with the separation in Numbers.
|
|
254
|
-
|
|
255
|
-
#### Reading styles
|
|
256
|
-
|
|
257
|
-
The cell property `style` returns a `Style` object containing all the style information for that cell. Cells with identical style settings contain references to a single style object.
|
|
258
|
-
|
|
259
|
-
Cell style attributes can be returned using a number of methods:
|
|
260
|
-
|
|
261
|
-
* `Cell.style.alignment`: the horizontal and vertical alignment of the cell as an `Alignment` named tuple
|
|
262
|
-
* `Cell.style.bg_color`: cell background color as an `RGB` named tuple, or a list of `RGB` values for gradients
|
|
263
|
-
* `Cell.style.bold`: `True` if the cell font is bold
|
|
264
|
-
* `Cell.style.font_color`: font color as an `RGB` named tuple
|
|
265
|
-
* `Cell.style.font_size`: font size in points (`float`)
|
|
266
|
-
* `Cell.style.font_name`: font name (`str`)
|
|
267
|
-
* `Cell.style.italic`: `True` if the cell font is italic
|
|
268
|
-
* `Cell.style.name`: cell style (`str`)
|
|
269
|
-
* `Cell.style.underline`: `True` if the cell font is underlined
|
|
270
|
-
* `Cell.style.strikethrough`: `True` if the cell font is struck through
|
|
271
|
-
* `Cell.style.first_indent`: first line indent in points (`float`)
|
|
272
|
-
* `Cell.style.left_indent`: left indent in points (`float`)
|
|
273
|
-
* `Cell.style.right_indent`: right indent in points (`float`)
|
|
274
|
-
* `Cell.style.text_inset`: text inset in points (`float`)
|
|
275
|
-
* `Cell.style.text_wrap`: `True` if text wrapping is enabled (default for new cells)
|
|
276
|
-
|
|
277
|
-
#### Cell images
|
|
278
|
-
|
|
279
|
-
The methods `style.bg_image.filename` and `style.bg_image.data` return data about the image used for a cell's background, where set. If a cell has no background image, `style.bg_image` is `None`.
|
|
280
|
-
|
|
281
|
-
``` python
|
|
282
|
-
cell = table.cell("B1")
|
|
283
|
-
with open(cell.style.bg_image.filename, "wb") as f:
|
|
284
|
-
    f.write(cell.style.bg_image.data)
|
|
285
|
-
```
|
|
286
|
-
|
|
287
|
-
Due to a limitation in Python's [ZipFile](https://docs.python.org/3/library/zipfile.html), Python versions older than 3.11 do not support image filenames with UTF-8 characters (see [issue 69](https://github.com/masaccio/numbers-parser/issues/69)). `cell.style.bg_image` returns `None` for such files and issues a `RuntimeWarning`.
|
|
288
|
-
|
|
289
|
-
### Formatting
|
|
290
|
-
|
|
291
|
-
In addition to rendering values as they are displayed in Numbers using the cell property `formatted_value`, `numbers-parser` has limited support for setting cell formats when saving files.
|
|
292
|
-
|
|
293
|
-
Formats are provided to the `Table.write` method:
|
|
294
|
-
|
|
295
|
-
``` python
|
|
296
|
-
date = datetime(2023, 4, 1, 13, 25, 42)
|
|
297
|
-
table.write(0, 0, date, formatting={"date_time_format": "EEEE, d MMMM yyyy"})
|
|
298
|
-
table.write(0, 1, 1234.560, formatting={"decimal_places": 3})
|
|
299
|
-
```
|
|
300
|
-
|
|
301
|
-
The following cell types are supported along with the associated formatting parameters:
|
|
302
|
-
|
|
303
|
-
<!-- markdownlint-disable MD033 -->
|
|
304
|
-
<table>
|
|
305
|
-
<thead>
|
|
306
|
-
<tr>
|
|
307
|
-
<th>Cell Type</th>
|
|
308
|
-
<th><code>formatting</code> parameter</th>
|
|
309
|
-
<th>Description</th>
|
|
310
|
-
</tr>
|
|
311
|
-
</thead>
|
|
312
|
-
<tbody>
|
|
313
|
-
<tr>
|
|
314
|
-
<td><code>DateCell</code></td>
|
|
315
|
-
<td><code>date_time_format</code></td>
|
|
316
|
-
<td>A POSIX <code>strftime</code>-like formatting string. See <a href="#datetime-formatting">Date/time formatting</a> for a list of supported directives</td>
|
|
317
|
-
</tr>
|
|
318
|
-
<tr>
|
|
319
|
-
<td rowspan=5><code>NumberCell</code></td>
|
|
320
|
-
<td><code>decimal_places</code></td>
|
|
321
|
-
<td>The number of decimal places, or <code>None</code> for automatic</td>
|
|
322
|
-
</tr>
|
|
323
|
-
<tr>
|
|
324
|
-
<td><code>negative_style</code></td>
|
|
325
|
-
<td>How negative numbers are represented</td>
|
|
326
|
-
</tr>
|
|
327
|
-
<tr>
|
|
328
|
-
<td><code>show_thousands_separator</code></td>
|
|
329
|
-
<td><code>True</code> if the number should include a thousands separator, e.g. <code>,</code></td>
|
|
330
|
-
</tr>
|
|
331
|
-
<tr>
|
|
332
|
-
<td><code>currency_code</code></td>
|
|
333
|
-
<td>An ISO currency code. When present, indicates that the number is formatted as a currency in Numbers rather than a plain decimal number.</td>
|
|
334
|
-
</tr>
|
|
335
|
-
<tr>
|
|
336
|
-
<td><code>use_accounting_style</code></td>
|
|
337
|
-
<td><code>True</code> if the currency symbol should be formatted to the left of the cell and separated from the number value by a tab. A <code>RuntimeWarning</code> is generated if this is combined with <code>negative_style</code>.</td>
|
|
338
|
-
</tr>
|
|
339
|
-
</tbody>
|
|
340
|
-
</table>
|
|
341
|
-
<!-- markdownlint-enable MD033 -->
|
|
342
|
-
|
|
343
|
-
#### Date/time formatting
|
|
344
|
-
|
|
345
|
-
`date_time_format` uses Numbers notation for date and time formatting rather than POSIX `strftime` as there are a number of extensions. Date components are specified using directives which must be separated by whitespace. Supported directives are:
|
|
346
|
-
|
|
347
|
-
| Directive | Meaning | Example |
|
|
348
|
-
| --------- | ------------------------------------------------------------- | ---------------------- |
|
|
349
|
-
| a | Locale’s AM or PM | am, pm |
|
|
350
|
-
| EEEE | Full weekday name | Monday, Tuesday, ... |
|
|
351
|
-
| EEE | Abbreviated weekday name | Mon, Tue, ... |
|
|
352
|
-
| yyyy | Year with century as a decimal number | 1999, 2023, etc. |
|
|
353
|
-
| yy | Year without century as a zero-padded decimal number | 00, 01, ... 99 |
|
|
354
|
-
| y | Year without century as a decimal number | 0, 1, ... 99 |
|
|
355
|
-
| MMMM | Full month name | January, February, ... |
|
|
356
|
-
| MMM | Abbreviated month name | Jan, Feb, ... |
|
|
357
|
-
| MM | Month as a zero-padded decimal number | 01, 02, ... 12 |
|
|
358
|
-
| M | Month as a decimal number | 1, 2, ... 12 |
|
|
359
|
-
| d | Day as a decimal number | 1, 2, ... 31 |
|
|
360
|
-
| dd | Day as a zero-padded decimal number | 01, 02, ... 31 |
|
|
361
|
-
| DDD | Day of the year as a zero-padded 3-digit number | 001 - 366 |
|
|
362
|
-
| DD | Day of the year as a minimum zero-padded 2-digit number | 01 - 366 |
|
|
363
|
-
| D | Day of the year | 1 - 366 |
|
|
364
|
-
| HH | Hour (24-hour clock) as a zero-padded decimal number | 00, 01, ... 23 |
|
|
365
|
-
| H | Hour (24-hour clock) as a decimal number | 0, 1, ... 23 |
|
|
366
|
-
| hh | Hour (12-hour clock) as a zero-padded decimal number | 01, 02, ... 12 |
|
|
367
|
-
| h | Hour (12-hour clock) as a decimal number | 1, 2, ... 12 |
|
|
368
|
-
| k | Hour (24-hour clock) as a decimal number to 24 | 1, 2, ... 24 |
|
|
369
|
-
| kk | Hour (24-hour clock) as a zero-padded decimal number to 24 | 01, 02, ... 24 |
|
|
370
|
-
| K | Hour (12-hour clock) as a decimal number from 0 | 0, 1, ... 11 |
|
|
371
|
-
| KK | Hour (12-hour clock) as a zero-padded decimal number from 0 | 00, 01, ... 11 |
|
|
372
|
-
| mm | Minutes as a zero-padded number | 00, 01, ... 59 |
|
|
373
|
-
| m | Minutes as a number | 0, 1, ... 59 |
|
|
374
|
-
| ss | Seconds as a zero-padded number | 00, 01, ... 59 |
|
|
375
|
-
| s | Seconds as a number | 0, 1, ... 59 |
|
|
376
|
-
| W | Week number in the month (first week is zero) | 0, 1, ... 5 |
|
|
377
|
-
| ww | Week number of the year (Monday as the first day of the week) | 0, 1, ... 53 |
|
|
378
|
-
| G | AD or BC (only AD is supported) | AD |
|
|
379
|
-
| F | How many times the day of the week falls in the month | 1, 2, ... 5 |
|
|
380
|
-
| S | Seconds to one decimal place | 0 - 9 |
|
|
381
|
-
| SS | Seconds to two decimal places | 00 - 99 |
|
|
382
|
-
| SSS | Seconds to three decimal places | 000 - 999 |
|
|
383
|
-
| SSSS | Seconds to four decimal places | 0000 - 9999 |
|
|
384
|
-
| SSSSS | Seconds to five decimal places | 00000 - 99999 |
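
To show how a few of these directives expand, here is a minimal sketch that renders a small subset of them with the standard library. It is not the `numbers-parser` implementation: `format_numbers_date` and `_DIRECTIVES` are hypothetical names, only the directives listed in the mapping are handled, and weekday and month names follow the current locale (English by default).

``` python
import re
from datetime import datetime

# Hypothetical subset of the directive table above
_DIRECTIVES = {
    "EEEE": lambda d: d.strftime("%A"),
    "EEE": lambda d: d.strftime("%a"),
    "yyyy": lambda d: f"{d.year:04d}",
    "yy": lambda d: f"{d.year % 100:02d}",
    "MMMM": lambda d: d.strftime("%B"),
    "MMM": lambda d: d.strftime("%b"),
    "MM": lambda d: f"{d.month:02d}",
    "M": lambda d: str(d.month),
    "dd": lambda d: f"{d.day:02d}",
    "d": lambda d: str(d.day),
    "HH": lambda d: f"{d.hour:02d}",
    "H": lambda d: str(d.hour),
    "mm": lambda d: f"{d.minute:02d}",
    "m": lambda d: str(d.minute),
    "ss": lambda d: f"{d.second:02d}",
    "s": lambda d: str(d.second),
}


def format_numbers_date(value: datetime, fmt: str) -> str:
    """Expand each run of identical letters in fmt as one directive."""

    def render(match: "re.Match[str]") -> str:
        token = match.group(0)
        if token not in _DIRECTIVES:
            raise ValueError(f"unsupported directive: {token!r}")
        return _DIRECTIVES[token](value)

    # Punctuation and whitespace between directives are kept as-is
    return re.sub(r"([A-Za-z])\1*", render, fmt)


print(format_numbers_date(datetime(2023, 4, 1, 13, 25, 42), "EEEE, d MMMM yyyy"))
```

Treating every run of identical letters as a directive keeps the sketch short, but it means literal words in a format string are rejected rather than passed through.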
|
|
385
|
-
|
|
386
|
-
#### Number formatting
|
|
387
|
-
|
|
388
|
-
All `formatting` parameters for `NumberCell` cells are optional and formatting defaults to automatic number of decimals, standard negative number notation and no thousands separator.
|
|
389
|
-
|
|
390
|
-
The `negative_style` must be a valid `constants.NegativeNumberStyle` enum. Supported values are:
|
|
391
|
-
|
|
392
|
-
<!-- markdownlint-disable MD033 -->
|
|
393
|
-
| Value | Examples |
|
|
394
|
-
| ----------------------| ----------------------------------------- |
|
|
395
|
-
| `MINUS` | -1234.560 |
|
|
396
|
-
| `RED` | <span style="color:red">1234.560</span> |
|
|
397
|
-
| `PARENTHESES` | (1234.560) |
|
|
398
|
-
| `RED_AND_PARENTHESES` | <span style="color:red">(1234.560)</span> |
|
|
399
|
-
<!-- markdownlint-enable MD033 -->
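
The combined effect of `decimal_places`, `negative_style` and `show_thousands_separator` on the displayed text can be approximated in plain Python. This is a sketch under stated assumptions, not the library's rendering code: `NegativeStyle` is a local stand-in for the `constants.NegativeNumberStyle` enum, `format_number` is a hypothetical helper, and the red colouring cannot be shown in plain text, so `RED` only drops the minus sign.

``` python
from enum import Enum
from typing import Optional


class NegativeStyle(Enum):
    """Local stand-in for the styles listed in the table above."""

    MINUS = "minus"
    RED = "red"
    PARENTHESES = "parentheses"
    RED_AND_PARENTHESES = "red_and_parentheses"


def format_number(
    value: float,
    decimal_places: Optional[int] = None,
    negative_style: NegativeStyle = NegativeStyle.MINUS,
    show_thousands_separator: bool = False,
) -> str:
    sep = "," if show_thousands_separator else ""
    if decimal_places is None:
        # Automatic: let Python choose the number of decimals
        text = f"{abs(value):{sep}}"
    else:
        text = f"{abs(value):{sep}.{decimal_places}f}"
    if value >= 0:
        return text
    if negative_style in (NegativeStyle.PARENTHESES, NegativeStyle.RED_AND_PARENTHESES):
        return f"({text})"
    if negative_style is NegativeStyle.RED:
        return text  # colour alone marks the value as negative
    return f"-{text}"


print(format_number(-1234.56, 3, NegativeStyle.PARENTHESES))
```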
|
|
400
|
-
|
|
401
|
-
### Borders
|
|
402
|
-
|
|
403
|
-
`numbers-parser` supports reading and writing cell borders, though the interface for each differs. Individual cells can have each of their four borders tested, but when drawing new borders, these are set for the table to allow for drawing borders across multiple cells. Setting the border of merged cells is not possible unless the edge of the cells is at the end of the merged region.
|
|
404
|
-
|
|
405
|
-
Borders are represented using the `Border` class that can be initialized with line width, color and line style:
|
|
406
|
-
|
|
407
|
-
``` python
|
|
408
|
-
border = Border(4.0, RGB(0, 162, 255), "solid")
|
|
409
|
-
```
|
|
410
|
-
|
|
411
|
-
Valid values for the line `style` parameter are `"solid"`, `"dashes"`, `"dots"` and `"none"`.
|
|
412
|
-
|
|
413
|
-
#### Reading Cell Borders
|
|
414
|
-
|
|
415
|
-
Cells have a property `border` which itself has the properties `top`, `right`, `bottom` and `left`, each of which is a `Border` class representing the line type for that cell. Cells with no border set at all, and merged cells which are inside the range of the merge, return `None` for these properties. The absence of a specified border is different from no border in Numbers, which is a valid `Border` class with `style="none"`.
|
|
416
|
-
|
|
417
|
-
#### Writing Cell Borders
|
|
418
|
-
|
|
419
|
-
The `Table` method `set_cell_border()` sets the border for a cell edge or a range of cells:
|
|
420
|
-
|
|
421
|
-
``` python
|
|
422
|
-
table.set_cell_border("C1", ["top", "left"], Border(0.0, RGB(0, 0, 0), "none"))
|
|
423
|
-
table.set_cell_border(0, 4, "right", Border(1.0, RGB(0, 0, 0), "solid"), 3)
|
|
424
|
-
```
|
|
425
|
-
|
|
426
|
-
The last positional parameter specifies the length of the border and defaults to 1. A single call to `set_cell_border()` can set the borders to one or more sides of the cell as above. Like `Table.write()`, `set_cell_border()` supports both row/column and Excel-style cell references.
|
|
427
|
-
|
|
428
|
-
## Writing Numbers files
|
|
429
|
-
|
|
430
|
-
Whilst support for writing Numbers files has been stable since version 3.4.0, it is highly recommended that you do not overwrite working Numbers files and instead save data to a new file.
|
|
431
|
-
|
|
432
|
-
### Limitations
|
|
433
|
-
|
|
434
|
-
Current limitations to write support are:
|
|
435
|
-
|
|
436
|
-
* Creating cells of type `BulletedTextCell` is not supported
|
|
437
|
-
* New tables are inserted with a fixed offset below the last table in a worksheet which does not take into account title or caption size
|
|
438
|
-
* New sheets insert tables with formats copied from the first table in the previous sheet rather than default table formats
|
|
439
|
-
|
|
440
|
-
### Cell values
|
|
441
|
-
|
|
442
|
-
`numbers-parser` will automatically add empty rows and columns for any cell references that are out of range of the current table. The `write` method accepts the same cell numbering notation as `cell` plus an additional argument representing the new cell value. The type of the new value will be used to determine the cell type.
|
|
443
|
-
|
|
444
|
-
``` python
|
|
445
|
-
doc = Document("old-sheet.numbers")
|
|
446
|
-
sheets = doc.sheets
|
|
447
|
-
tables = sheets[0].tables
|
|
448
|
-
table = tables[0]
|
|
449
|
-
table.write(1, 1, "This is new text")
|
|
450
|
-
table.write("B7", datetime(2020, 12, 25))
|
|
451
|
-
doc.save("new-sheet.numbers")
|
|
452
|
-
```
|
|
453
|
-
|
|
454
|
-
Sheet names and table names can be changed by assigning a new value to the `name` of each:
|
|
455
|
-
|
|
456
|
-
```python
|
|
457
|
-
sheets[0].name = "My new sheet"
|
|
458
|
-
tables[0].name = "Edited table"
|
|
459
|
-
```
|
|
460
|
-
|
|
461
|
-
### Adding tables and sheets
|
|
462
|
-
|
|
463
|
-
Additional tables and worksheets can be added to a `Document` before saving. If no sheet name or table name is supplied, `numbers-parser` will use `Sheet 1`, `Sheet 2`, etc.
|
|
464
|
-
|
|
465
|
-
```python
|
|
466
|
-
doc = Document()
|
|
467
|
-
doc.add_sheet("New Sheet", "New Table")
|
|
468
|
-
sheet = doc.sheets["New Sheet"]
|
|
469
|
-
table = sheet.tables["New Table"]
|
|
470
|
-
table.write(1, 1, 1000)
|
|
471
|
-
table.write(1, 2, 2000)
|
|
472
|
-
table.write(1, 3, 3000)
|
|
473
|
-
|
|
474
|
-
doc.save("sheet.numbers")
|
|
475
|
-
```
|
|
476
|
-
|
|
477
|
-
### Table geometries
|
|
478
|
-
|
|
479
|
-
`numbers-parser` can query and change the position and size of tables. Changes made to a table's row height or column width are retained when files are saved.
|
|
480
|
-
|
|
481
|
-
#### Row and column sizes
|
|
482
|
-
|
|
483
|
-
Row heights and column widths are queried and set using the `row_height` and `col_width` methods:
|
|
484
|
-
|
|
485
|
-
```python
|
|
486
|
-
doc = Document("sheet.numbers")
|
|
487
|
-
table = doc.sheets[0].tables[0]
|
|
488
|
-
print(f"Table size is {table.height} x {table.width}")
|
|
489
|
-
print(f"Table row 1 height is {table.row_height(0)}")
|
|
490
|
-
table.row_height(0, 40)
|
|
491
|
-
print(f"Table row 1 height is now {table.row_height(0)}")
|
|
492
|
-
print(f"Table column A width is {table.col_width(0)}")
|
|
493
|
-
table.col_width(0, 200)
|
|
494
|
-
print(f"Table column A width is now {table.col_width(0)}")
|
|
495
|
-
```
|
|
496
|
-
|
|
497
|
-
#### Header row and columns
|
|
498
|
-
|
|
499
|
-
When new tables are created, `numbers-parser` follows the Numbers convention of creating a table with one row header and one column header. You can change the number of headers by modifying the appropriate property:
|
|
500
|
-
|
|
501
|
-
```python
|
|
502
|
-
doc = Document("sheet.numbers")
|
|
503
|
-
table = doc.sheets[0].tables[0]
|
|
504
|
-
table.num_header_rows = 2
|
|
505
|
-
table.num_header_cols = 0
|
|
506
|
-
doc.save("saved.numbers")
|
|
507
|
-
```
|
|
508
|
-
|
|
509
|
-
A zero header count will remove the headers from the table. Attempting to set a negative number of headers, or using more headers than rows or columns in the table, will raise a `ValueError` exception.
|
|
510
|
-
|
|
511
|
-
#### Positioning tables
|
|
512
|
-
|
|
513
|
-
By default, new tables are positioned at a fixed offset below the last table vertically in a sheet and on the left side of the sheet. Large table headers and captions may result in new tables overlapping existing ones. The `add_table` method takes optional coordinates for positioning a table. A table's height and coordinates can also be queried to help aligning new tables:
|
|
514
|
-
|
|
515
|
-
```python
|
|
516
|
-
(x, y) = sheet.tables[0].coordinates
|
|
517
|
-
y += sheet.tables[0].height + 200.0
|
|
518
|
-
new_table = sheet.add_table("Offset Table", x, y)
|
|
519
|
-
```
|
|
520
|
-
|
|
521
|
-
### Editing paragraph styles
|
|
522
|
-
|
|
523
|
-
Cell text styles, known as paragraph styles, are those applied by the Text tab in the Numbers Format pane. To simplify the API, when writing documents, it is not possible to make ad hoc changes to cells without assigning an existing style or creating a new one. This differs from the Numbers interface, where cells can have modified styles on a per-cell basis. Such styles are read correctly when reading Numbers files.
|
|
524
|
-
|
|
525
|
-
Character styles, which allow formatting changes within cells such as "This is **bold** text" are not supported.
|
|
526
|
-
|
|
527
|
-
Styles are created using the `Document`'s `add_style` method, and can be applied to cells either as part of a `write` or using `set_cell_style`:
|
|
528
|
-
|
|
529
|
-
``` python
|
|
530
|
-
red_text = doc.add_style(
|
|
531
|
-
    name="Red Text",
|
|
532
|
-
    font_name="Lucida Grande",
|
|
533
|
-
    font_color=RGB(230, 25, 25),
|
|
534
|
-
    font_size=14.0,
|
|
535
|
-
    bold=True,
|
|
536
|
-
    italic=True,
|
|
537
|
-
    alignment=Alignment("right", "top"),
|
|
538
|
-
)
|
|
539
|
-
table.write("B2", "Red", style=red_text)
|
|
540
|
-
table.set_cell_style("C2", red_text)
|
|
541
|
-
```
|
|
542
|
-
|
|
543
|
-
New styles are automatically added to the list of styles selectable in the Numbers Text pane.
|
|
544
|
-
|
|
545
|
-
Cell styles can also be referred to by name in both `Table.write` and `Table.set_cell_style`. A `dict` of available styles is returned by `Document.styles`. This contains key-value pairs of style names and `Style` objects. Any changes to `Style` objects in the document are written back, so those styles change for all cells that use them.
|
|
546
|
-
|
|
547
|
-
``` python
|
|
548
|
-
doc = Document("styles.numbers")
|
|
549
|
-
styles = doc.styles
|
|
550
|
-
styles["Title"].font_size = 20.0
|
|
551
|
-
```
|
|
552
|
-
|
|
553
|
-
Since `Style` objects are shared, changing `Cell.style.font_size` will have the effect of changing the font size for that style and will in turn affect the styles of all cells using that style.
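The sharing behaviour described above is ordinary Python object aliasing. The sketch below illustrates it with hypothetical stand-in classes (`DemoStyle` and `DemoCell` are assumptions for illustration only and are not part of `numbers-parser`):

```python
from dataclasses import dataclass


@dataclass
class DemoStyle:
    """Hypothetical stand-in for a shared paragraph style."""
    name: str
    font_size: float


@dataclass
class DemoCell:
    """Hypothetical stand-in for a cell that references a style."""
    value: str
    style: DemoStyle


# Two cells reference the same style object rather than copies of it.
title = DemoStyle(name="Title", font_size=12.0)
cell_a = DemoCell("First", title)
cell_b = DemoCell("Second", title)

# Changing the style through one cell is visible through the other,
# because both attributes point at the same object.
cell_a.style.font_size = 20.0
print(cell_b.style.font_size)
```

To give one cell a different appearance without affecting others, the text above implies creating a separate style and assigning that instead.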
|
|
554
|
-
|
|
555
|
-
## Command-line scripts
|
|
556
|
-
|
|
557
|
-
When installed from [PyPI](https://pypi.org/project/numbers-parser/), a command-line script `cat-numbers` is installed in Python's scripts folder. This script dumps Numbers spreadsheets into Excel-compatible CSV format, iterating through all the spreadsheets passed on the command line.
|
|
558
|
-
|
|
559
|
-
``` text
|
|
560
|
-
usage: cat-numbers [-h] [-T | -S | -b] [-V] [--debug] [--formulas]
|
|
561
|
-
[--formatting] [-s SHEET] [-t TABLE] [document ...]
|
|
562
|
-
|
|
563
|
-
Export data from Apple Numbers spreadsheet tables
|
|
564
|
-
|
|
565
|
-
positional arguments:
|
|
566
|
-
document Document(s) to export
|
|
567
|
-
|
|
568
|
-
optional arguments:
|
|
569
|
-
-h, --help show this help message and exit
|
|
570
|
-
-T, --list-tables List the names of tables and exit
|
|
571
|
-
-S, --list-sheets List the names of sheets and exit
|
|
572
|
-
-b, --brief Don't prefix data rows with name of sheet/table (default: false)
|
|
573
|
-
-V, --version
|
|
574
|
-
--debug Enable debug output
|
|
575
|
-
--formulas Dump formulas instead of formula results
|
|
576
|
-
--formatting Dump formatted cells (durations) as they appear in Numbers
|
|
577
|
-
-s SHEET, --sheet SHEET Names of sheet(s) to include in export
|
|
578
|
-
-t TABLE, --table TABLE Names of table(s) to include in export
|
|
579
|
-
```
|
|
580
|
-
|
|
581
|
-
Note: `--formatting` will return different capitalization for 12-hour times due to differences between Numbers' representation of these dates and `datetime.strftime`. Numbers in English locales displays 12-hour times with 'am' and 'pm', but `datetime.strftime`, on macOS at least, cannot return lower-case versions of AM/PM.
|
|
582
|
-
|
|
583
|
-
## Numbers File Formats
|
|
584
|
-
|
|
585
|
-
Numbers uses a proprietary, compressed binary format to store its tables. This format consists of a zip file containing images, as well as
|
|
586
|
-
[Snappy](https://github.com/google/snappy)-compressed [Protobuf](https://github.com/protocolbuffers/protobuf) `.iwa` files containing metadata, text, and all other definitions used in the spreadsheet.
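As a rough illustration of the container layout described above, the following sketch lists the `.iwa` members of a zip archive using only the standard library. It builds a small in-memory archive as a stand-in for a real document; the member names are invented for the example, and `list_iwa_members` is a hypothetical helper, not a `numbers-parser` function. Decoding the members would additionally require Snappy decompression and the Protobuf definitions, which are not shown here.

```python
import io
import zipfile


def list_iwa_members(archive):
    """Return the sorted names of all .iwa entries in a zip archive."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(name for name in zf.namelist() if name.endswith(".iwa"))


# Build an in-memory zip as a stand-in for a real document.
# These member names are assumptions made for illustration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("Index/Document.iwa", b"")
    zf.writestr("Data/image.png", b"")
    zf.writestr("Index/Tables/DataList.iwa", b"")
buffer.seek(0)

iwa_files = list_iwa_members(buffer)
for name in iwa_files:
    print(name)
```

The same helper accepts a filesystem path in place of the in-memory buffer, provided the path points to a zip archive.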
|
|
587
|
-
|
|
588
|
-
### Protobuf updates
|
|
589
|
-
|
|
590
|
-
As `numbers-parser` includes private Protobuf definitions extracted from a copy of Numbers, new versions of Numbers will inevitably create `.numbers` files that cannot be read by `numbers-parser`. As new versions of Numbers are released, running `make bootstrap` will perform all the steps necessary to recreate the protobuf files used by `numbers-parser` to read Numbers spreadsheets.
|
|
591
|
-
|
|
592
|
-
The default protobuf package installation may not include the C++ optimized version which is required by the bootstrapping scripts to extract protobufs. You will receive the following error during build if this is the case:
|
|
593
|
-
|
|
594
|
-
`This script requires the Protobuf installation to use the C++ implementation. Please reinstall Protobuf with C++ support.`
|
|
595
|
-
|
|
596
|
-
To include the C++ support, download a released version of Google protobuf [from GitHub](https://github.com/protocolbuffers/protobuf). Build instructions are described in [`src/README.md`](https://github.com/protocolbuffers/protobuf/blob/main/src/README). These have changed greatly over time, but as of April 2023, the following steps were useful:
|
|
597
|
-
|
|
598
|
-
``` shell
|
|
599
|
-
bazel build :protoc :protobuf
|
|
600
|
-
cmake . -DCMAKE_CXX_STANDARD=14
|
|
601
|
-
cmake --build . --parallel 8
|
|
602
|
-
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp
|
|
603
|
-
export LD_LIBRARY_PATH=../bazel-bin/src/google
|
|
604
|
-
cd python
|
|
605
|
-
python3 setup.py -q bdist_wheel --cpp_implementation --warnings_as_errors --compile_static_extension
|
|
606
|
-
```
|
|
607
|
-
|
|
608
|
-
This can then be used with `make bootstrap` in the `numbers-parser` source tree. The signing workflow assumes that you have an Apple Developer Account and that you have created a provisioning profile that includes iCloud. Using a self-signed certificate does not seem to work, at least on Apple Silicon (a working PR contradicting this would be greatly appreciated).
|
|
609
|
-
|
|
610
|
-
`make bootstrap` requires [PyObjC](https://pypi.org/project/pyobjc/) to generate font maps, but this dependency is excluded from Poetry to ensure that tests can run on non-Mac OSes. You can run `poetry run pip install PyObjC` to get the required packages.
|
|
611
|
-
|
|
612
|
-
## Credits
|
|
613
|
-
|
|
614
|
-
`numbers-parser` was built by [Jon Connell](http://github.com/masaccio) but relies heavily on [prior work](https://github.com/psobot/keynote-parser) by [Peter Sobot](https://petersobot.com) to read the IWA format archives used by Apple's iWork family of applications, and to regenerate the mapping files required for Python. Both modules are derived from [previous work](https://github.com/obriensp/iWorkFileFormat/blob/master/Docs/index.md) by [Sean Patrick O'Brien](http://www.obriensp.com).
|
|
615
|
-
|
|
616
|
-
Decoding the data structures inside Numbers files was helped greatly by [Stingray-Reader](https://github.com/slott56/Stingray-Reader) by [Steven Lott](https://github.com/slott56).
|
|
617
|
-
|
|
618
|
-
Formula tests were adapted from JavaScript tests used in [fast-formula-parser](https://github.com/LesterLyu/fast-formula-parser).
|
|
619
|
-
|
|
620
|
-
Decimal128 conversion to and from byte storage was adapted from work done by the
|
|
621
|
-
[SheetJS project](https://github.com/SheetJS/sheetjs). SheetJS also helped greatly with some of the steps required to successfully save a Numbers spreadsheet.
|
|
622
|
-
|
|
623
|
-
## License
|
|
624
|
-
|
|
625
|
-
All code in this repository is licensed under the [MIT License](https://github.com/masaccio/numbers-parser/blob/master/LICENSE.rst).
|
|
626
|
-
|
|
File without changes
|
|
File without changes
|