read-excel-file 4.0.5 → 4.1.0

Files changed (42)
  1. package/.gitlab-ci.yml +15 -0
  2. package/CHANGELOG.md +27 -0
  3. package/LICENSE +1 -1
  4. package/README.md +110 -17
  5. package/bundle/index.html +22 -16
  6. package/bundle/read-excel-file.min.js +2 -2
  7. package/bundle/read-excel-file.min.js.map +1 -1
  8. package/commonjs/convertMapToSchema.js +41 -0
  9. package/commonjs/convertMapToSchema.js.map +1 -0
  10. package/commonjs/convertMapToSchema.test.js.map +1 -0
  11. package/commonjs/convertToJson.js +19 -9
  12. package/commonjs/convertToJson.js.map +1 -1
  13. package/commonjs/convertToJson.test.js.map +1 -1
  14. package/commonjs/readXlsx.js +6 -3
  15. package/commonjs/readXlsx.js.map +1 -1
  16. package/commonjs/readXlsxFileContents.js +17 -4
  17. package/commonjs/readXlsxFileContents.js.map +1 -1
  18. package/commonjs/readXlsxFileNode.test.js.map +1 -1
  19. package/index.d.ts.test +20 -0
  20. package/modules/convertMapToSchema.js +34 -0
  21. package/modules/convertMapToSchema.js.map +1 -0
  22. package/modules/convertMapToSchema.test.js.map +1 -0
  23. package/modules/convertToJson.js +19 -9
  24. package/modules/convertToJson.js.map +1 -1
  25. package/modules/convertToJson.test.js.map +1 -1
  26. package/modules/readXlsx.js +6 -3
  27. package/modules/readXlsx.js.map +1 -1
  28. package/modules/readXlsxFileContents.js +14 -4
  29. package/modules/readXlsxFileContents.js.map +1 -1
  30. package/modules/readXlsxFileNode.test.js.map +1 -1
  31. package/node/index.commonjs.js +6 -0
  32. package/node/index.d.ts.test +23 -0
  33. package/node/index.js +5 -0
  34. package/node/package.json +9 -0
  35. package/package.json +7 -6
  36. package/schema/index.commonjs.js +2 -0
  37. package/schema/index.d.ts.test +6 -0
  38. package/schema/index.js +1 -0
  39. package/schema/package.json +9 -0
  40. package/types.d.ts +80 -0
  41. package/website/index.html +105 -0
  42. package/node.js +0 -6
package/.gitlab-ci.yml ADDED
@@ -0,0 +1,15 @@
1
+ image: node:10
2
+
3
+ pages:
4
+ script:
5
+ - npm install
6
+ - npm run build
7
+ - mv ./bundle ./public
8
+ - cp --recursive ./website/* ./public/
9
+
10
+ artifacts:
11
+ paths:
12
+ - public
13
+
14
+ only:
15
+ - master
package/CHANGELOG.md CHANGED
@@ -1,3 +1,30 @@
1
+ <!--
2
+ 5.0.0 / 30.08.2020
3
+ ==================
4
+
5
+ * Added [TypeScript](https://github.com/catamphetamine/read-excel-file/issues/71) definitions.
6
+
7
+ * Removed deprecated `URL`, `Integer` and `Email` exports (use the string variants instead: `"URL"`, `"Integer"`, `"Email"`).
8
+
9
+ * Removed undocumented `convertToJson()` export.
10
+ -->
11
+
12
+ 4.1.0 / 09.11.2020
13
+ ==================
14
+
15
+ * Renamed schema entry `parse()` function: now it's called `type`. This way, `type` could be both a built-in type and a custom type.
16
+
17
+ * Changed the built-in `"Integer"`, `"URL"` and `"Email"` types: now they're exported functions again instead of strings. Strings still work.
18
+
19
+ * Added `map` parameter: similar to `schema` but doesn't perform any parsing or validation. Can be used to map an Excel file to an array of objects that could be parsed/validated using [`yup`](https://github.com/jquense/yup).
20
+
21
+ * `type` of a schema entry is no longer required: if no `type` is specified, then the cell value is returned "as is" (string, or number, or boolean, or `Date`).
22
+
23
+ 4.0.8 / 08.11.2020
24
+ ==================
25
+
26
+ * Updated `JSZip` to the latest version. The [issue](https://gitlab.com/catamphetamine/read-excel-file/-/issues/8). The [original issue](https://github.com/catamphetamine/read-excel-file/issues/54).
27
+
1
28
  4.0.0 / 25.05.2019
2
29
  ==================
3
30
 
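The 4.1.0 entry above renames the schema entry `parse()` function to `type`. As a rough illustration only, a schema written for 4.0.x could be adapted with a helper along these lines. `migrateSchema` and `migrateSchemaEntry` are hypothetical names that are not part of `read-excel-file`, and the sketch assumes nested schemas are plain objects under `type`, as in the README example.

```javascript
// Hypothetical migration helper, NOT part of read-excel-file.
// It renames a 4.0.x `parse()` function on a schema entry to `type`.

function isPlainObject(value) {
  return value !== null &&
    typeof value === 'object' &&
    Object.getPrototypeOf(value) === Object.prototype
}

function migrateSchemaEntry(entry) {
  const { parse, ...rest } = entry
  // An entry that only had `parse()` gets it moved to `type`.
  if (typeof parse === 'function' && rest.type === undefined) {
    return { ...rest, type: parse }
  }
  // Assumption: a plain object under `type` is a nested schema.
  if (isPlainObject(entry.type)) {
    return { ...entry, type: migrateSchema(entry.type) }
  }
  // Anything else (built-in types, already migrated entries) is left as is.
  return entry
}

function migrateSchema(schema) {
  const result = {}
  for (const [column, entry] of Object.entries(schema)) {
    result[column] = migrateSchemaEntry(entry)
  }
  return result
}

// Usage with a made-up 4.0.x style schema.
const oldSchema = {
  'CONTACT': {
    prop: 'contact',
    required: true,
    parse: (value) => value.trim()
  },
  'COURSE': {
    prop: 'course',
    type: {
      'COURSE TITLE': { prop: 'title', type: String }
    }
  }
}

const newSchema = migrateSchema(oldSchema)
```

The helper returns a new object and leaves the original schema untouched, so both forms can be compared while upgrading.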
package/LICENSE CHANGED
@@ -1,6 +1,6 @@
1
1
  MIT License
2
2
 
3
- Copyright (c) 2018 github.com/catamphetamine
3
+ Copyright (c) 2018 gitlab.com/catamphetamine
4
4
 
5
5
  Permission is hereby granted, free of charge, to any person obtaining a copy
6
6
  of this software and associated documentation files (the "Software"), to deal
package/README.md CHANGED
@@ -1,8 +1,16 @@
1
1
  # `read-excel-file`
2
2
 
3
- Read `*.xlsx` files in a browser or Node.js. Parse to JSON with a strict schema.
3
+ Read small to medium `*.xlsx` files in a browser or Node.js. Parse to JSON with a strict schema.
4
4
 
5
- [Demo](https://catamphetamine.github.io/read-excel-file/)
5
+ [Demo](https://catamphetamine.gitlab.io/read-excel-file/)
6
+
7
+ ## Restrictions
8
+
9
+ There have been some [complaints](https://github.com/catamphetamine/read-excel-file/issues/38#issuecomment-544286628) about this library not being able to handle large `*.xlsx` spreadsheets. It's true that this library's main point has been usability and convenience, and not performance or the ability to handle huge datasets. For example, parsing a file of 2000 rows and 20 columns takes about 3 seconds, and parsing a file of 30k+ rows may throw a `RangeError: Maximum call stack size exceeded`. So, for handling huge datasets, use something like the [`xlsx`](https://github.com/catamphetamine/read-excel-file/issues/38#issuecomment-544286628) package instead. This library is suitable for handling small to medium `*.xlsx` files.
10
+
11
+ ## GitHub
12
+
13
+ On March 9th, 2020, GitHub, Inc. silently [banned](https://medium.com/@catamphetamine/how-github-blocked-me-and-all-my-libraries-c32c61f061d3) my account (and all my libraries) without any notice. I opened a support ticket but they didn't answer. Because of that, I had to move all my libraries to [GitLab](https://gitlab.com/catamphetamine).
6
14
 
7
15
  ## Install
8
16
 
@@ -10,6 +18,8 @@ Read `*.xlsx` files in a browser or Node.js. Parse to JSON with a strict schema.
10
18
  npm install read-excel-file --save
11
19
  ```
12
20
 
21
+ If you're not using a bundler then use a [standalone version from a CDN](#cdn).
22
+
13
23
  ## Browser
14
24
 
15
25
  ```html
@@ -48,7 +58,11 @@ readXlsxFile(fs.createReadStream('/path/to/file')).then((rows) => {
48
58
 
49
59
  ## Dates
50
60
 
51
- XLSX format has no dedicated "date" type so dates are stored internally as simply numbers along with a "format" (e.g. `"MM/DD/YY"`). When using `readXlsx()` with `schema` parameter all dates get parsed correctly in any case. But if using `readXlsx()` without `schema` parameter (to get "raw" data) then this library attempts to guess whether a cell value is a date or not by examining the cell "format" (e.g. `"MM/DD/YY"`), so in most cases dates are detected and parsed automatically. For exotic cases one can pass an explicit `dateFormat` parameter (e.g. `"MM/DD/YY"`) to instruct the library to parse numbers with such "format" as dates.
61
+ XLSX format has no dedicated "date" type so dates are stored internally as simply numbers along with a "format" (e.g. `"MM/DD/YY"`). When using `readXlsx()` with `schema` parameter all dates get parsed correctly in any case. But if using `readXlsx()` without `schema` parameter (to get "raw" data) then this library attempts to guess whether a cell value is a date or not by examining the cell "format" (e.g. `"MM/DD/YY"`), so in most cases dates are detected and parsed automatically. For exotic cases one can pass an explicit `dateFormat` parameter (e.g. `"MM/DD/YY"`) to instruct the library to parse numbers with such "format" as dates:
62
+
63
+ ```js
64
+ readXlsxFile(file, { dateFormat: 'MM/DD/YY' })
65
+ ```
52
66
 
53
67
  ## JSON
54
68
 
@@ -56,11 +70,11 @@ To convert rows to JSON pass `schema` option to `readXlsxFile()`. It will return
56
70
 
57
71
  ```js
58
72
  // An example *.xlsx document:
59
- // -----------------------------------------------------------------------------
60
- // | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE | CONTACT |
61
- // -----------------------------------------------------------------------------
62
- // | 03/24/2018 | 123 | true | Chemistry | (123) 456-7890 |
63
- // -----------------------------------------------------------------------------
73
+ // -----------------------------------------------------------------------------------------
74
+ // | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE | CONTACT | STATUS |
75
+ // -----------------------------------------------------------------------------------------
76
+ // | 03/24/2018 | 123 | true | Chemistry | (123) 456-7890 | SCHEDULED |
77
+ // -----------------------------------------------------------------------------------------
64
78
 
65
79
  const schema = {
66
80
  'START DATE': {
@@ -75,6 +89,8 @@ const schema = {
75
89
  type: Number,
76
90
  required: true
77
91
  },
92
+ // 'COURSE' is not a real Excel file column name,
93
+ // it can be any string — it's just for code readability.
78
94
  'COURSE': {
79
95
  prop: 'course',
80
96
  type: {
@@ -94,13 +110,22 @@ const schema = {
94
110
  'CONTACT': {
95
111
  prop: 'contact',
96
112
  required: true,
97
- parse(value) {
113
+ type: (value) => {
98
114
  const number = parsePhoneNumber(value)
99
115
  if (!number) {
100
116
  throw new Error('invalid')
101
117
  }
102
118
  return number
103
119
  }
120
+ },
121
+ 'STATUS': {
122
+ prop: 'status',
123
+ type: String,
124
+ oneOf: [
125
+ 'SCHEDULED',
126
+ 'STARTED',
127
+ 'FINISHED'
128
+ ]
104
129
  }
105
130
  }
106
131
 
@@ -116,30 +141,74 @@ readXlsxFile(file, { schema }).then(({ rows, errors }) => {
116
141
  title: 'Chemistry'
117
142
  },
118
143
  contact: '+11234567890',
144
+ status: 'SCHEDULED'
119
145
  }]
120
146
  })
121
147
  ```
122
148
 
149
+ If no `type` is specified then the cell value is returned "as is".
150
+
123
151
  There are also some additional exported `type`s:
124
152
 
125
- * `"Integer"` for parsing integer `Number`s.
126
- * `"URL"` for parsing URLs.
127
- * `"Email"` for parsing email addresses.
153
+ * `Integer` for parsing integer `Number`s.
154
+ * `URL` for parsing URLs.
155
+ * `Email` for parsing email addresses.
156
+
157
+ A schema entry for a column may also define an optional `validate(value)` function for validating the parsed value: in that case, it must `throw` an `Error` if the `value` is invalid.
158
+
159
+ #### Map
160
+
161
+ Sometimes, a developer might want to use some other (more advanced) solution for schema parsing and validation (like [`yup`](https://github.com/jquense/yup)). If a developer passes a `map` instead of a `schema` to `readXlsxFile()`, then it would just map each data row to a JSON object without doing any parsing or validation.
162
+
163
+ ```js
164
+ // An example *.xlsx document:
165
+ // -----------------------------------------------------------------------------------------
166
+ // | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE | CONTACT | STATUS |
167
+ // -----------------------------------------------------------------------------------------
168
+ // | 03/24/2018 | 123 | true | Chemistry | (123) 456-7890 | SCHEDULED |
169
+ // -----------------------------------------------------------------------------------------
170
+
171
+ const map = {
172
+ 'START DATE': 'date',
173
+ 'NUMBER OF STUDENTS': 'numberOfStudents',
174
+ 'COURSE': {
175
+ 'course': {
176
+ 'IS FREE': 'isFree',
177
+ 'COURSE TITLE': 'title'
178
+ }
179
+ },
180
+ 'CONTACT': 'contact',
181
+ 'STATUS': 'status'
182
+ }
183
+
184
+ readXlsxFile(file, { map }).then(({ rows }) => {
185
+ rows === [{
186
+ date: new Date(2018, 2, 24),
187
+ numberOfStudents: 123,
188
+ course: {
189
+ isFree: true,
190
+ title: 'Chemistry'
191
+ },
192
+ contact: '(123) 456-7890',
193
+ status: 'SCHEDULED'
194
+ }]
195
+ })
196
+ ```
128
197
 
129
- A schema entry for a column can also have a `validate(value)` function for validating the parsed value. It must `throw` an `Error` if the value is invalid.
198
+ #### Displaying schema errors
130
199
 
131
- A React component for displaying error info could look like this:
200
+ A React component for displaying schema parsing/validation errors could look like this:
132
201
 
133
202
  ```js
134
203
  import { parseExcelDate } from 'read-excel-file'
135
204
 
136
205
  function ParseExcelError({ children: error }) {
137
- // Human-readable value.
206
+ // Get a human-readable value.
138
207
  let value = error.value
139
208
  if (error.type === Date) {
140
209
  value = parseExcelDate(value).toString()
141
210
  }
142
- // Error summary.
211
+ // Render error summary.
143
212
  return (
144
213
  <div>
145
214
  <code>"{error.error}"</code>
@@ -156,6 +225,8 @@ function ParseExcelError({ children: error }) {
156
225
  }
157
226
  ```
158
227
 
228
+ #### Transforming rows/columns before schema is applied
229
+
159
230
  When using a `schema` there's also an optional `transformData(data)` parameter which can be used for the cases when the spreadsheet rows/columns aren't in the correct format. For example, the heading row may be missing, or there may be some purely presentational or empty rows. Example:
160
231
 
161
232
  ```js
@@ -163,13 +234,17 @@ readXlsxFile(file, {
163
234
  schema,
164
235
  transformData(data) {
165
236
  // Adds header row to the data.
166
- return ['ID', 'NAME', ...].concat(data)
237
+ return [['ID', 'NAME', ...]].concat(data)
167
238
  // Removes empty rows.
168
239
  return data.filter(row => row.filter(column => column !== null).length > 0)
169
240
  }
170
241
  })
171
242
  ```
172
243
 
244
+ ## TypeScript
245
+
246
+ See [testing `index.d.ts`](https://github.com/catamphetamine/read-excel-file/issues/71#issuecomment-675140448).
247
+
173
248
  ## Browser compatibility
174
249
 
175
250
  Node.js `*.xlsx` parser uses `xpath` and `xmldom` packages for XML parsing. [All modern browsers](https://caniuse.com/#search=domparser) (except IE 11) have a native `DOMParser` built in, which could be used instead (meaning a smaller footprint and better performance), but since Internet Explorer 11 support is still required, the browser version doesn't use the native `DOMParser` and instead uses the `xpath` and `xmldom` packages for XML parsing, just like the Node.js version.
@@ -204,6 +279,24 @@ readXlsxFile(file, { getSheets: true }).then((sheets) => {
204
279
  })
205
280
  ```
206
281
 
282
+ ## CDN
283
+
284
+ One can use any npm CDN service, e.g. [unpkg.com](https://unpkg.com) or [jsdelivr.net](https://jsdelivr.net):
285
+
286
+ ```html
287
+ <script src="https://unpkg.com/read-excel-file@4.x/bundle/read-excel-file.min.js"></script>
288
+
289
+ <script>
290
+ var input = document.getElementById('input')
291
+ input.addEventListener('change', function() {
292
+ readXlsxFile(input.files[0]).then(function(rows) {
293
+ // `rows` is an array of rows
294
+ // each row being an array of cells.
295
+ })
296
+ })
297
+ </script>
298
+ ```
299
+
207
300
  ## References
208
301
 
209
302
  For XML parsing [`xmldom`](https://github.com/jindw/xmldom) and [`xpath`](https://github.com/goto100/xpath) are used.
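The `map` option introduced in the README changes above maps each data row to an object without any parsing or validation. The sketch below is one way to picture that behaviour. It is not the package's implementation (the package ships its own `convertMapToSchema.js`), and `mapRowsToObjects` and `mapRow` are hypothetical names. It assumes the first row is the header row, and it assumes a missing column yields `null`, which the README does not state.

```javascript
// Illustrative only: NOT the implementation used by read-excel-file.

// Maps one row to an object according to `map`.
function mapRow(row, header, map) {
  const object = {}
  for (const [column, target] of Object.entries(map)) {
    if (typeof target === 'string') {
      // Plain mapping: copy the cell under `column` to property `target` as is.
      const index = header.indexOf(column)
      // Assumption: a column missing from the header yields `null`.
      object[target] = index === -1 ? null : row[index]
    } else {
      // Nested mapping, e.g. `'COURSE': { course: { ...nested map } }`.
      // The outer key is only a label; the inner key is the property name.
      const [property] = Object.keys(target)
      object[property] = mapRow(row, header, target[property])
    }
  }
  return object
}

// Assumption: `data[0]` is the header row and the rest are data rows.
function mapRowsToObjects(data, map) {
  const [header, ...rows] = data
  return rows.map((row) => mapRow(row, header, map))
}

// Usage with the example spreadsheet from the README.
const data = [
  ['START DATE', 'NUMBER OF STUDENTS', 'IS FREE', 'COURSE TITLE', 'CONTACT', 'STATUS'],
  [new Date(2018, 2, 24), 123, true, 'Chemistry', '(123) 456-7890', 'SCHEDULED']
]

const map = {
  'START DATE': 'date',
  'NUMBER OF STUDENTS': 'numberOfStudents',
  'COURSE': {
    'course': {
      'IS FREE': 'isFree',
      'COURSE TITLE': 'title'
    }
  },
  'CONTACT': 'contact',
  'STATUS': 'status'
}

const objects = mapRowsToObjects(data, map)
```

Because nothing is parsed, `contact` stays the raw cell text `'(123) 456-7890'`, which is where a validator such as `yup` would take over.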
package/bundle/index.html CHANGED
@@ -52,14 +52,14 @@
52
52
  </head>
53
53
 
54
54
  <body>
55
- <a id="main-link" href="https://github.com/catamphetamine/read-excel-file">
55
+ <a id="main-link" href="https://gitlab.com/catamphetamine/read-excel-file">
56
56
  read-excel-file
57
57
  </a>
58
58
 
59
59
  <input type="file" id="input" />
60
60
 
61
61
  <div style="font-size: 12px">
62
- * Parsing to JSON with a strict schema is supported. <a target="_blank" href="https://github.com/catamphetamine/read-excel-file#json" style="color: #0093C4; text-decoration: none">Read more</a>.
62
+ * Parsing to JSON with a strict schema is supported. <a target="_blank" href="https://gitlab.com/catamphetamine/read-excel-file#json" style="color: #0093C4; text-decoration: none">Read more</a>.
63
63
  </div>
64
64
 
65
65
  <div id="result-table"></div>
@@ -75,20 +75,26 @@
75
75
  // each row being an array of cells.
76
76
  document.getElementById('result').innerText = JSON.stringify(data, null, 2)
77
77
 
78
- document.getElementById('result-table').innerHTML =
79
- '<table>' +
80
- '<tbody>' +
81
- data.map(function (row) {
82
- return '<tr>' +
83
- row.map(function (cell) {
84
- return '<td>' +
85
- (cell === null ? '' : cell) +
86
- '</td>'
87
- }).join('') +
88
- '</tr>'
89
- }).join('') +
90
- '</tbody>' +
91
- '</table>'
78
+ // Applying `innerHTML` hangs the browser when there're a lot of rows/columns.
79
+ // For example, for a file having 2000 rows and 20 columns on a modern
80
+ // mid-tier CPU it parses the file (using a "schema") for 3 seconds
81
+ // (blocking) with 100% single CPU core usage.
82
+ // Then applying `innerHTML` hangs the browser.
83
+
84
+ // document.getElementById('result-table').innerHTML =
85
+ // '<table>' +
86
+ // '<tbody>' +
87
+ // data.map(function (row) {
88
+ // return '<tr>' +
89
+ // row.map(function (cell) {
90
+ // return '<td>' +
91
+ // (cell === null ? '' : cell) +
92
+ // '</td>'
93
+ // }).join('') +
94
+ // '</tr>'
95
+ // }).join('') +
96
+ // '</tbody>' +
97
+ // '</table>'
92
98
  }, function (error) {
93
99
  console.error(error)
94
100
  alert("Error while parsing Excel file. See console output for the error stack trace.")