read-excel-file 7.0.1 → 7.0.3

Files changed (3)
  1. package/README.md +21 -11
  2. package/package.json +4 -3
  3. package/types.d.ts +1 -1
package/README.md CHANGED
@@ -1,12 +1,22 @@
  # `read-excel-file`

- Read `*.xlsx` files of moderate size in a web browser or on a server.
+ Read `.xlsx` files in a web browser or in Node.js.

  It also supports parsing spreadsheet rows into JSON objects using a [schema](#schema).

  [Demo](https://catamphetamine.gitlab.io/read-excel-file/)

- Also check out [`write-excel-file`](https://www.npmjs.com/package/write-excel-file) for writing `*.xlsx` files.
+ Also check out [`write-excel-file`](https://www.npmjs.com/package/write-excel-file) for writing `.xlsx` files.
+
+ ## Performance
+
+ Here're the results of reading [sample `.xlsx` files](https://examplefile.com/document/xlsx) of different sizes:
+
+ |File Size| Browser | Node.js |
+ |---------|---------|-----------|
+ | 1 MB | 0.2 sec.| 0.25 sec. |
+ | 10 MB | 1.5 sec.| 2 sec. |
+ | 50 MB | 8.5 sec.| 14 sec. |

  ## Install

@@ -96,7 +106,7 @@ The one that works both in a web browser and Node.js. Only supports a [`Blob`](h
  // Import from '/universal' subpackage.
  import readXlsxFile from 'read-excel-file/universal'

- // Read data from a `Blob` with `*.xlsx` file contents.
+ // Read data from a `Blob` with `.xlsx` file contents.
  readXlsxFile(blob).then((rows) => {
  // `rows` is an array of "rows".
  // Each "row" is an array of "cells".
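Reviewer's aside on the hunk above: its comments describe `rows` as an array of rows, each row being an array of cells. A minimal sketch of consuming that shape, assuming the first row holds column titles. The `rowsToObjects` helper and the sample data are hypothetical illustrations and are not part of `read-excel-file`; the package's own `schema` option is the supported way to get objects.

```javascript
// Hypothetical helper (not part of read-excel-file): converts the
// array-of-arrays shape into an array of plain objects,
// assuming the first row contains the column titles.
function rowsToObjects(rows) {
  const [header, ...dataRows] = rows
  return dataRows.map((row) =>
    // Cells missing from a short row are filled with `null`.
    Object.fromEntries(header.map((title, i) => [title, row[i] ?? null]))
  )
}

// Sample data standing in for what `readXlsxFile(blob)` might resolve to.
const sampleRows = [
  ['Name', 'Age'],
  ['Alice', 30],
  ['Bob', null]
]

const objects = rowsToObjects(sampleRows)
console.log(objects)
```

Keeping the conversion separate from the file read means it can be exercised with plain arrays, without an actual `.xlsx` file.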
@@ -181,13 +191,13 @@ readSheetNames(file).then((sheetNames) => {

  ## Dates

- `*.xlsx` file format originally had no dedicated "date" type, so dates are in almost all cases stored simply as numbers, equal to the count of days since `01/01/1900`. To correctly interpret such numbers as dates, each date cell has a special ["format"](https://xlsxwriter.readthedocs.io/format.html#format-set-num-format) (example: `"d mmm yyyy"`) that instructs the spreadsheet viewer application to format the number in the cell as a date in a given format.
+ `.xlsx` file format originally had no dedicated "date" type, so dates are in almost all cases stored simply as numbers, equal to the count of days since `01/01/1900`. To correctly interpret such numbers as dates, each date cell has a special ["format"](https://xlsxwriter.readthedocs.io/format.html#format-set-num-format) (example: `"d mmm yyyy"`) that instructs the spreadsheet viewer application to format the number in the cell as a date in a given format.

  When using `readXlsxFile()` with a [`schema`](#schema) parameter, all columns having `type: Date` are automatically parsed as dates.

  When using `readXlsxFile()` without a `schema` parameter, it attempts to guess whether the cell value is a date or a number by looking at the cell's "format" — if the "format" is one of the [standard date formats](https://docs.microsoft.com/en-us/dotnet/api/documentformat.openxml.spreadsheet.numberingformat?view=openxml-2.8.1) then the cell value is interpreted as a date. So usually there's no need to configure anything and it usually works out-of-the-box.

- Sometimes though, an `*.xlsx` file might use a non-standard date format like `"mm/dd/yyyy"`. To read such files correctly, pass a `dateFormat` parameter to tell it to parse cells having such "format" as date cells.
+ Sometimes though, an `.xlsx` file might use a non-standard date format like `"mm/dd/yyyy"`. To read such files correctly, pass a `dateFormat` parameter to tell it to parse cells having such "format" as date cells.

  ```js
  readXlsxFile(file, { dateFormat: 'mm/dd/yyyy' })
@@ -195,7 +205,7 @@ readXlsxFile(file, { dateFormat: 'mm/dd/yyyy' })

  ## Numbers

- In `*.xlsx` files, numbers are stored as strings. `read-excel-file` manually parses such numeric cell values from strings to numbers. But there's an inherent issue with javascript numbers in general: their [floating-point precision](https://www.youtube.com/watch?v=2gIxbTn7GSc) might not be enough for applications that require 100% precision. An example would be finance and banking. To support such demanding use-cases, this library supports passing a custom `parseNumber(string)` function as an option.
+ In `.xlsx` files, numbers are stored as strings. `read-excel-file` manually parses such numeric cell values from strings to numbers. But there's an inherent issue with javascript numbers in general: their [floating-point precision](https://www.youtube.com/watch?v=2gIxbTn7GSc) might not be enough for applications that require 100% precision. An example would be finance and banking. To support such demanding use-cases, this library supports passing a custom `parseNumber(string)` function as an option.

  Example: Use "decimals" to represent numbers with 100% precision in banking applications.

@@ -221,7 +231,7 @@ Dynamically calculated cells using formulas (`SUM`, etc) are not supported.

  ## Performance

- There have been some reports about performance issues when reading extremely large `*.xlsx` spreadsheets using this library. It's true that this library's main point have been usability and convenience, and not performance when handling huge datasets. For example, the time of parsing a file with 100,000 rows could be up to 10 seconds. If your application has to quickly read huge datasets, perhaps consider using something like [`xlsx`](https://github.com/catamphetamine/read-excel-file/issues/38#issuecomment-544286628) package instead. There're no comparative benchmarks between the two packages, so we don't know how much the difference would be. If you'll be making any benchmarks, share those in the "Issues" so that we could include them in this readme.
+ There have been some reports about performance issues when reading extremely large `.xlsx` spreadsheets using this library. It's true that this library's main point have been usability and convenience, and not performance when handling huge datasets. For example, the time of parsing a file with 100,000 rows could be up to 10 seconds. If your application has to quickly read huge datasets, perhaps consider using something like [`xlsx`](https://github.com/catamphetamine/read-excel-file/issues/38#issuecomment-544286628) package instead. There're no comparative benchmarks between the two packages, so we don't know how much the difference would be. If you'll be making any benchmarks, share those in the "Issues" so that we could include them in this readme.

  ## Schema

@@ -287,7 +297,7 @@ If there're any errors during the conversion of spreadsheet data to JSON objects
  Below is an example of using a `schema`.

  ```js
- // An example *.xlsx document:
+ // An example .xlsx document:
  // -----------------------------------------------------------------------------------------
  // | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE | CONTACT | STATUS |
  // -----------------------------------------------------------------------------------------
@@ -458,9 +468,9 @@ function stringifyValue(value) {
  ```
  </details>

- ## Fix Spreadsheet Before Parsing With Schema
+ ## Fix Spreadsheet Structure When Using Schema

- Sometimes, a spreadsheet doesn't have the required structure to parse it with `schema`. For example, header row might be missing, or there could be some purely presentational / empty / "garbage" rows that should be removed before parsing. To fix that, pass a `transformData(data)` function as an option. It will modify spreadsheet content before it is parsed with `schema`.
+ Sometimes, a spreadsheet doesn't have the required structure to read it using a `schema`. For example, header row might be missing, or there could be some purely presentational / empty / "garbage" rows that should be skipped. To fix that, pass a `transformData(data)` function as an option. It will transform spreadsheet content before it is parsed with `schema`. The `data` argument is an array of rows, each row being an array of cell values.

  ```js
  readXlsxFile(file, {
@@ -477,7 +487,7 @@ readXlsxFile(file, {

  ## Browser Support

- An `*.xlsx` file is just a `*.zip` archive with an `*.xslx` file extension. This package uses [`fflate`](https://www.npmjs.com/package/fflate) for `*.zip` decompression. See `fflate`'s [browser support](https://www.npmjs.com/package/fflate#browser-support) for further details.
+ An `.xlsx` file is just a `*.zip` archive with an `*.xslx` file extension. This package uses [`fflate`](https://www.npmjs.com/package/fflate) for `*.zip` decompression. See `fflate`'s [browser support](https://www.npmjs.com/package/fflate#browser-support) for further details.

  ## CDN

package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
  "name": "read-excel-file",
- "version": "7.0.1",
- "description": "Read `*.xlsx` files of moderate size in a web browser or on a server.",
+ "version": "7.0.3",
+ "description": "Read `.xlsx` files in a web browser or in Node.js",
  "type": "module",
  "exports": {
  "./universal": {
@@ -28,8 +28,9 @@
  },
  "sideEffects": false,
  "scripts": {
- "test": "mocha --bail \"source/**/*.test.js\" \"test/*.test.js\" \"test/!(exports)/**/*.test.js\"",
+ "test": "mocha --bail \"source/**/*.test.js\" \"test/*.test.js\" \"test/!(exports|benchmark)/**/*.test.js\"",
  "test:exports": "mocha --bail \"test/exports/**/*.test.js\"",
+ "test:benchmark": "mocha --bail \"test/benchmark/**/*.test.js\"",
  "clean-for-build": "rimraf ./commonjs/**/* ./modules/**/*",
  "build-commonjs-modules": "better-npm-run build-commonjs-modules",
  "build-commonjs-package.json": "node build-scripts/create-commonjs-package-json.js",
package/types.d.ts CHANGED
@@ -52,7 +52,7 @@ export type Schema<Object = Record<string, any>, ColumnTitle extends string = st
  interface SchemaParseCellValueErrorGeneralProperties<CustomType = never> {
  row: number;
  column: string;
- type?: BasicType | CustomType;
+ type?: BaseType | CustomType;
  }

  export interface SchemaParseCellValueError<ParsedValue = any> extends SchemaParseCellValueErrorGeneralProperties<ParsedValue> {